Humanoid Loco-Manipulation

Active Frontier

humanoid, locomotion, manipulation, whole-body

Loco-manipulation is the unified challenge of coordinating locomotion (walking, balancing, navigating) and manipulation (grasping, carrying, placing) simultaneously on a humanoid robot. Unlike industrial arms bolted to a floor, humanoids must maintain dynamic balance while their arms exert forces on the environment — making this one of the hardest open problems in robotics.

Gu et al. provide a comprehensive survey spanning three decades of research, from early model-based ZMP (Zero Moment Point) controllers through modern learning-based methods. The survey documents the progression from separate locomotion and manipulation pipelines toward integrated whole-body frameworks.
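
As a point of reference for the ZMP controllers mentioned above, the sketch below computes the ZMP under the linear inverted pendulum model (constant center-of-mass height, no angular momentum about the CoM), where p = x - (z_c / g) * x_ddot along each horizontal axis, and checks it against a rectangular support region. This is a textbook simplification rather than a formulation taken from the survey; the function names, the rectangular foot bounds, and the numeric values are illustrative assumptions.

```python
G = 9.81  # gravitational acceleration in m/s^2


def zmp_lipm(com_pos, com_acc, com_height, g=G):
    """Return the ZMP coordinate along one horizontal axis.

    Linear inverted pendulum model: p = x - (z_c / g) * x_ddot.
    """
    return com_pos - (com_height / g) * com_acc


def zmp_in_support(zmp_x, zmp_y, x_bounds, y_bounds):
    """Check whether the ZMP lies inside an axis-aligned support rectangle."""
    return (x_bounds[0] <= zmp_x <= x_bounds[1]
            and y_bounds[0] <= zmp_y <= y_bounds[1])


if __name__ == "__main__":
    # Hypothetical single-support stance: foot spans 0.25 m by 0.10 m.
    foot_x, foot_y = (-0.10, 0.15), (-0.05, 0.05)
    com_height = 0.8  # assumed constant CoM height in metres

    for acc_x in (1.0, 2.0):  # forward CoM acceleration in m/s^2
        px = zmp_lipm(0.0, acc_x, com_height)
        py = zmp_lipm(0.0, 0.0, com_height)
        print(f"acc_x={acc_x:.1f} zmp_x={px:+.3f} "
              f"inside={zmp_in_support(px, py, foot_x, foot_y)}")
```

With these assumed numbers, the larger forward acceleration pushes the ZMP behind the rear edge of the foot, which is the situation a ZMP-based controller is designed to avoid, for example by limiting CoM acceleration or taking a step.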

Wen et al. take a fundamentally different approach with Humanoid-COA (Chain of Action), using foundation models to achieve zero-shot loco-manipulation without task-specific training. Their framework decomposes natural language instructions into executable whole-body behaviors, achieving 96.6% grasping accuracy and 90% mobile pick success on Unitree H1-2 and G1 platforms.

Key Claims

Three decades of model-based approaches surveyed — Gu et al. comprehensively cover the evolution from ZMP controllers to modern learning-based loco-manipulation, establishing the landscape of model fidelity vs. computational efficiency trade-offs. Evidence: strong (Humanoid Locomotion & Manipulation Survey)
96.6% grasping and 90% mobile pick achieved zero-shot — Wen et al.'s CoA framework decomposes high-level language instructions into executable whole-body behaviors without any task-specific training data. Evidence: strong (Humanoid-COA: Chain of Action)
Long-horizon combined tasks remain at 56-63% success — While individual subtasks show high accuracy, chaining locomotion and manipulation over extended task sequences remains significantly harder. Evidence: strong (Humanoid-COA: Chain of Action)
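
One way to build intuition for why chained tasks score lower than their subtasks: if a task needs n steps that each succeed independently with probability p, the whole chain succeeds with probability p^n. The sketch below works through that arithmetic. The independence assumption, the function names, and the per-step value of 0.90 are illustrative only and are not an analysis taken from the Humanoid-COA paper.

```python
def chain_success(per_step_success, n_steps):
    """Probability that n independent steps all succeed (p ** n)."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be in [0, 1]")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return per_step_success ** n_steps


def max_steps_above(per_step_success, target):
    """Largest n for which the chain success stays at or above target."""
    if not (0.0 < per_step_success < 1.0 and 0.0 < target <= 1.0):
        raise ValueError("need 0 < per_step_success < 1 and 0 < target <= 1")
    n = 0
    while chain_success(per_step_success, n + 1) >= target:
        n += 1
    return n


if __name__ == "__main__":
    p = 0.90  # illustrative per-step success rate (assumption)
    for n in (1, 3, 5):
        print(f"{n} step(s): {chain_success(p, n):.3f}")
    print("steps that keep success >= 0.60:", max_steps_above(p, 0.60))
```

Under these assumptions, five steps at 90% each already compound to roughly 59%. Real subtasks are not independent, so this is a rough intuition for error accumulation rather than a prediction of any reported result.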

Open Questions

Can long-horizon combined tasks be improved beyond 56-63% success without sacrificing zero-shot generalization?
How do model-based and learning-based approaches compare on robustness across diverse real-world environments?
What role does tactile feedback play in closing the gap for contact-rich manipulation during locomotion?
How can computational efficiency be balanced against model fidelity for real-time whole-body control?

Related Concepts

Sim-to-Real Transfer — Training loco-manipulation controllers in simulation before deployment
Foundation Models for Robotics — Zero-shot loco-manipulation via language models
Whole-Body Control — The underlying control framework enabling unified locomotion and manipulation
Imitation Learning — Learning loco-manipulation from human demonstrations

Related Entities

Unitree — H1-2 and G1 platforms used for CoA experiments
Figure AI — Pursuing loco-manipulation via imitation learning

Backlinks

Pages that reference this concept:

Related Concepts

Foundation Models for Robotics
Imitation Learning
Sim-to-Real Transfer
Whole-Body Control

Sources

humanoid-locomotion-manipulation-survey
humanoid-coa-chain-of-action
