Inverting Retargeting: Humanoid Datasets Remember Their Operators

TL;DR We show that human-to-humanoid motion retargeting preserves operator movement dynamics despite discarding body shape, enabling recovery of gender, age, height, and identity from robot joint-angle trajectories alone. We measure and interpret this effect at scale, and ask what it implies for how the robotics community collects and shares demonstration data.

Example UNVEIL predictions for three humanoid trajectories (Δ = prediction − ground truth):

Trajectory   Attribute   UNVEIL prediction   Δ         Ground truth
1            Height      168 cm              −4 cm     172 cm
1            Weight      68 kg               0 kg      68 kg
1            Age         29 yrs              +1 yr     28 yrs
1            Gender      Female              match     Female
2            Height      163 cm              −4 cm     167 cm
2            Weight      57 kg               +5 kg     52 kg
2            Age         26 yrs              −3 yrs    29 yrs
2            Gender      Female              match     Female
3            Height      172 cm              +4 cm     168 cm
3            Weight      68 kg               +7 kg     61 kg
3            Age         30 yrs              +3 yrs    27 yrs
3            Gender      Female              match     Female
How it's rendered: The humanoid robot plays back its retargeted joint-angle trajectory. The human mesh is the original motion, converted from the BVH body model to the SMPL body model, whose shape parameters expose height, weight, and gender.

Abstract

Human-to-humanoid motion retargeting is the central operation in teleoperation data collection: it projects operator demonstrations onto a shared robot skeleton, discarding body shape, to produce training data for whole-body controllers and foundation models. We report a surprising property of this transform: retargeting normalizes body proportions but preserves operator movement dynamics (joint velocity profiles, ranges of motion, and coordination patterns shaped by the operator’s physiology). On BONES-SEED (522 operators, 142K sequences), retargeted Unitree G1 trajectories support gender classification at 96.0% accuracy and operator re-identification at 97.2% Top-1 on held-out demonstrations of seen operators; on operators never seen during training, gender accuracy holds at 83.4%, and age and height are regressed to mean absolute errors of 4.2 yr and 5.7 cm. Partial correlation analysis reveals emergent, biomechanically interpretable structure: the signals are task-invariant across activity categories and hold across retargeting implementations. We introduce UNVEIL, a skeleton-aware spatiotemporal graph network, to measure and interpret this effect, take initial steps toward operator anonymization, and ask what this implies for data practices and the operator-invariance of learned policies. Code, models, and anonymized trajectories are available on our project website, project-unveil.github.io.
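
As a rough illustration of what a partial correlation measures, the sketch below computes the first-order partial correlation between a motion feature and an operator attribute while controlling for a third variable. The variable names and toy numbers are hypothetical and are not drawn from BONES-SEED; the exact covariates controlled for in the analysis above are not specified here, so treat the choice of confound as an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy data: both the feature and the attribute depend on the
# confound, but share nothing else.
confound = [1.0, 2.0, 3.0, 4.0]
feature = [2.0, 1.0, 2.0, 5.0]    # confound + [1, -1, -1, 1]
attribute = [0.0, 5.0, 0.0, 5.0]  # confound + [-1, 3, -3, 1]

raw = pearson(feature, attribute)
controlled = partial_corr(feature, attribute, confound)
```

In this toy case the raw correlation is 1/3, while the partial correlation is approximately zero: the apparent link between feature and attribute is explained entirely by the shared confound. A feature that still correlates with an attribute after such control is the kind of signal the analysis is looking for.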

View teaser figure (PDF)

Operator movement signatures survive the retargeting transform. Retargeting projects diverse teleoperators onto a uniform robot skeleton (Unitree G1), stripping body-shape parameters entirely. Operator-specific movement dynamics nevertheless persist in the joint-angle trajectories, and our proposed UNVEIL recovers the operator’s attributes directly from the humanoid data.

How does it work?

UNVEIL overview. A retargeted humanoid trajectory is encoded into per-joint motion features and processed by two hierarchical spatial graph convolutions (intra-limb, then limb-torso). The result is aggregated across time by N spatiotemporal layers into a trajectory embedding, which task-specific heads decode into operator attributes.
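
The data flow described above can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in, not the UNVEIL implementation: the limb grouping and joint names are hypothetical, unweighted mean aggregation replaces the learned graph convolutions, temporal mean pooling replaces the N spatiotemporal layers, and the task-specific heads are omitted.

```python
# Hypothetical limb grouping; joint names are illustrative, not the G1 joint list.
LIMBS = {
    "torso": ["waist_yaw"],
    "left_arm": ["l_shoulder", "l_elbow"],
    "right_arm": ["r_shoulder", "r_elbow"],
}

def joint_features(traj, fps):
    """traj: {joint: [angle per frame]} -> {joint: [(angle, velocity) per frame]}."""
    feats = {}
    for joint, angles in traj.items():
        # Finite-difference velocity; the first frame has no predecessor.
        vel = [0.0] + [(b - a) * fps for a, b in zip(angles, angles[1:])]
        feats[joint] = list(zip(angles, vel))
    return feats

def mean_vec(vectors):
    """Component-wise mean of equal-length vectors."""
    vectors = list(vectors)
    return tuple(sum(c) / len(vectors) for c in zip(*vectors))

def intra_limb(feats, t):
    """Stage 1: aggregate the joints of each limb at frame t."""
    return {limb: mean_vec(feats[j][t] for j in joints)
            for limb, joints in LIMBS.items()}

def limb_torso(limb_feats):
    """Stage 2: mix every limb node with the torso node."""
    torso = limb_feats["torso"]
    return {limb: mean_vec([f, torso]) for limb, f in limb_feats.items()}

def embed(traj, fps=30.0):
    """Trajectory -> fixed-length embedding via spatial stages and temporal pooling."""
    feats = joint_features(traj, fps)
    n_frames = len(next(iter(traj.values())))
    per_frame = []
    for t in range(n_frames):
        nodes = limb_torso(intra_limb(feats, t))
        # Concatenate limb nodes in a fixed order.
        per_frame.append(tuple(v for limb in LIMBS for v in nodes[limb]))
    return mean_vec(per_frame)

traj = {
    "waist_yaw": [0.0, 0.0, 0.0],
    "l_shoulder": [1.0, 1.0, 1.0],
    "l_elbow": [3.0, 3.0, 3.0],
    "r_shoulder": [0.0, 0.0, 0.0],
    "r_elbow": [0.0, 0.0, 0.0],
}
embedding = embed(traj)
```

The embedding has one (angle, velocity) pair per limb node, so its length here is 6. In a learned model each aggregation step would carry trainable weights, and attribute heads (for example a gender classifier or a height regressor) would read from this embedding.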

Interpretable Features

Beyond raw classification accuracy, the leaked attributes carry interpretable signatures: older operators rotate their trunk less, heavier operators use smaller hip ranges during effortful motion, taller operators cover less root distance per stride, and female and male operators exhibit consistently different upper-body dynamics. Each row below pairs two operators performing different instances of the same activity category; the motion feature plotted in the middle of each panel tracks the attribute shown on the left.
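
To make the kinds of motion features named above concrete, here is a minimal sketch of three simple statistics over a retargeted trajectory: a joint's range of motion, its mean absolute angular velocity, and root distance per stride. The function names are hypothetical, the stride count is assumed to be known from elsewhere (stride detection is out of scope), and these are not necessarily the exact feature definitions used for the panels.

```python
import math

def range_of_motion(angles):
    """Spread between the largest and smallest joint angle (radians)."""
    return max(angles) - min(angles)

def mean_abs_velocity(angles, fps):
    """Mean absolute finite-difference angular velocity (radians per second)."""
    steps = [abs(b - a) * fps for a, b in zip(angles, angles[1:])]
    return sum(steps) / len(steps)

def root_distance_per_stride(root_xy, n_strides):
    """Ground-plane path length of the root, divided by an assumed stride count."""
    path = sum(math.dist(p, q) for p, q in zip(root_xy, root_xy[1:]))
    return path / n_strides

# Hypothetical toy values, for illustration only.
trunk_yaw = [-0.2, 0.1, 0.3]
rom = range_of_motion(trunk_yaw)
vel = mean_abs_velocity([0.0, 0.1, 0.0], fps=30.0)
stride = root_distance_per_stride([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)], n_strides=2)
```

Each function reduces a trajectory to a single number per clip, which is the form needed to compare it against an operator attribute such as age or height.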

Panels (one per attribute): Age, Weight, Height, Gender.

How it's rendered: Each panel shows two real operators playing back different instances of the same activity: the Unitree G1 humanoids on top are driven by their retargeted joint-angle trajectories, and the SMPL human meshes below are the source motions in each operator's true body shape. The plot strip between them traces the per-frame motion feature.

Results

UNVEIL’s performance across inference tasks and evaluation settings. Unseen operators are people never seen in training (zero-shot biometrics); unseen demos of seen operators are held-out activities from operators seen in training (cross-activity generalization). (↑ higher is better; ↓ lower is better.)

Evaluation setting               Gender      Weight     Height    Age       Re-ID              Linkage
                                 Acc (%) ↑   MAE ↓      MAE ↓     MAE ↓     Top-1 / Top-5 ↑    Acc (%) ↑
Unseen operators                 83.4        9.1 kg     5.7 cm    4.2 yr    -                  -
Unseen demos of seen operators   96.0        4.0 kg     3.6 cm    2.5 yr    97.2 / 98.8        92.4

Performance range across 20 activity categories
Most leaking activity            100.0       4.7 kg     3.3 cm    0.4 yr    100.0 / 100.0      100.0
2nd-most leaking                 94.1        5.3 kg     3.9 cm    1.5 yr    98.0 / 99.0        99.0
2nd-least leaking                74.4        11.1 kg    7.4 cm    3.8 yr    70.1 / 89.5        74.9
Least leaking activity           72.9        14.8 kg    7.8 cm    3.9 yr    73.0 / 84.2        69.5
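
For readers less familiar with the metrics reported above, the following sketch shows how mean absolute error (MAE) and Top-k accuracy are conventionally computed. The height values are the three example predictions shown at the top of this page; the re-identification scores and operator IDs are hypothetical, and the helper names are illustrative rather than taken from the released code.

```python
def mean_absolute_error(pred, truth):
    """Average of |prediction - ground truth| over all samples."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scored IDs.

    scores: list of {operator_id: score} dicts, one per sample.
    labels: list of true operator IDs, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(row, key=row.get, reverse=True)[:k]
        hits += label in ranked
    return hits / len(labels)

# Height predictions and ground truth (cm) from the three example trajectories.
height_mae = mean_absolute_error([168, 163, 172], [172, 167, 168])

# Hypothetical re-identification scores for two query trajectories.
scores = [
    {"op_a": 0.7, "op_b": 0.2, "op_c": 0.1},
    {"op_a": 0.5, "op_b": 0.1, "op_c": 0.4},
]
labels = ["op_a", "op_c"]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

On the three examples the height MAE is 4.0 cm. In the toy re-identification case, Top-1 is 0.5 and Top-2 is 1.0, because the second query ranks its true operator second.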

View comparison figure (PDF)

Comparison between UNVEIL and two well-tested action-recognition architectures adapted with our framework (SGN and DSGCN). UNVEIL consistently improves over both adaptations across all tasks. Gray shows task-specific chance performance for reference.

BibTeX

@inproceedings{anonymous2026unveil,
    title={Inverting Retargeting: Humanoid Datasets Remember Their Operators},
    author={Anonymous},
    booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
    year={2026},
    note={Under review}
}