[Research Case] Carnegie Mellon University's New Breakthrough in Studying Humano

2024-05-08

A research team at Carnegie Mellon University recently published a study presenting a real-time Human to Humanoid (H2O) full-body remote control system. Notably, the humanoid robot used in the experiments is the H1 ReS from Unitree, which has itself recently become faster (3.3 m/s) and can perform a backflip in place.

H2O is a reinforcement learning (RL) based framework that enables real-time full-body remote operation of a full-size humanoid robot using only an RGB camera. To create a large-scale dataset of human motions retargeted to humanoid robots, the team proposes a scalable "sim-to-data" process that filters and selects feasible motions using a privileged motion imitator. These refined motions are then used to train a robust real-time humanoid motion imitator in simulation, which is transferred to the real humanoid robot in a zero-shot manner. The team reports successful remote operation of dynamic full-body movements in real-world scenes, including walking, back jumping, kicking, turning, waving, pushing, boxing, etc. As far as the team knows, this is the first demonstration of learning-based, real-time, full-body remote operation of a humanoid robot.
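The filtering idea behind the sim-to-data process can be illustrated with a minimal sketch. It is not the team's implementation: `MotionClip`, `max_tracking_error`, `filter_feasible`, the error metric, and the threshold are all hypothetical names and choices made for illustration, and `toy_rollout` merely stands in for running a privileged imitator in a physics simulator.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Pose = Sequence[float]  # one frame of reference coordinates (flattened); an assumption


@dataclass
class MotionClip:
    name: str
    frames: List[Pose]


def max_tracking_error(reference: List[Pose], achieved: List[Pose]) -> float:
    """Largest per-frame mean absolute difference between reference and achieved poses."""
    worst = 0.0
    for ref, act in zip(reference, achieved):
        err = sum(abs(r - a) for r, a in zip(ref, act)) / len(ref)
        worst = max(worst, err)
    return worst


def filter_feasible(
    clips: List[MotionClip],
    rollout: Callable[[MotionClip], List[Pose]],
    threshold: float,
) -> List[MotionClip]:
    """Keep only clips whose simulated rollout stays within `threshold` of the reference."""
    kept = []
    for clip in clips:
        achieved = rollout(clip)  # hypothetical: imitator tracking the clip in simulation
        if max_tracking_error(clip.frames, achieved) <= threshold:
            kept.append(clip)
    return kept


def toy_rollout(clip: MotionClip) -> List[Pose]:
    """Toy stand-in for the simulator: no coordinate can leave the range [-1, 1]."""
    return [[max(-1.0, min(1.0, x)) for x in frame] for frame in clip.frames]


clips = [
    MotionClip("wave", [[0.1, 0.2], [0.3, 0.4]]),
    MotionClip("cartwheel", [[0.5, 2.5], [3.0, 0.0]]),
]
feasible = filter_feasible(clips, toy_rollout, threshold=0.25)
print([c.name for c in feasible])  # ['wave']
```

In this toy setup the "cartwheel" clip is dropped because the stand-in robot cannot reach its reference poses, which mirrors the intent of discarding motions the humanoid cannot physically perform.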

Human Motion Retargeting

Reinforcement Learning

Because they resemble humans in form, humanoid robots are well suited to real-time remote control by a human operator. The research team aims to use an RGB camera to convert human motion into humanoid robot behavior in real time. The same technology would also let teams collect large amounts of high-quality human operation data for robots, to which imitation learning can then be applied for humanoid robot tasks.

However, full-body control of humanoid robots has long been a challenge in robotics, and the complexity increases when the robot must mimic freeform human movements in real time.

Recent progress in reinforcement learning (RL) for humanoid control offers a feasible alternative. RL was first applied in the graphics community to generate complex human motions, perform multiple tasks, and follow real-time human motion recorded by a camera, all in simulation.

However, because these methods rely on state-space designs that are impractical on real hardware and neglect hardware constraints such as torque and joint limits, it has not been confirmed whether they can be applied to full-size humanoid robots. Separately, RL has achieved stable and fast bipedal walking in real-world environments, but to date there has been no research on RL-based full-body remote control of humanoid robots.

Progress in real-time full body remote control of humanoid robots

The team's Human to Humanoid (H2O) system is a scalable, learning-based system that uses a single RGB camera to achieve real-time full-body remote control of a humanoid robot. The researchers claim that, through a new sim-to-data process and reinforcement learning, their approach solves the complex problem of converting human motions into actions that a humanoid robot can perform.

Using a comprehensive full-body motion imitator, similar to the Perpetual Humanoid Controller (PHC), the team proposes a method that trains in simulation and transfers seamlessly to real-world deployment in a zero-shot manner.

As seen in the video shared by the team, the setup uses a Unitree H1 humanoid robot equipped with a D435 camera and a 360-degree LiDAR. A human operator can control the robot in real time through a simple webcam interface, directing it to perform diverse tasks such as picking and placing, kicking and striking, and pushing a stroller.

Method

H2O framework

Retargeting: In the first stage, the process aligns the SMPL body model to a humanoid's structure by optimizing shape and motion parameters. The second stage refines this by removing artifacts and infeasible motions using a trained privileged imitation policy, yielding a realistic and cleaned motion dataset for the humanoid.
Sim-to-Real Training: An imitation policy is trained to track motion goals sampled from the cleaned retargeted dataset.
Real-time Teleoperation Deployment: Human motion is captured through an RGB camera and a pose estimator, and the humanoid robot mimics it using the trained sim-to-real imitation policy.
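The deployment stage described above can be sketched as a simple per-frame loop. This is only an illustration under stated assumptions: `teleoperate`, `clamp_to_limits`, and every callback name are hypothetical, the stubs replace the real camera, pose estimator, policy, and robot interface, and clamping targets to joint limits is an added safety step that the article does not describe.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = List[float]


def clamp_to_limits(
    targets: Sequence[float], limits: Sequence[Tuple[float, float]]
) -> Vector:
    """Clip each joint target into its (lower, upper) range (assumed safety step)."""
    return [max(lo, min(hi, t)) for t, (lo, hi) in zip(targets, limits)]


def teleoperate(
    frames: Iterable[Sequence[float]],
    estimate_pose: Callable[[Sequence[float]], Vector],
    read_robot_state: Callable[[], Vector],
    policy: Callable[[Vector, Vector], Vector],
    send_targets: Callable[[Vector], None],
    limits: Sequence[Tuple[float, float]],
) -> int:
    """Run one policy step per camera frame and return the number of steps sent."""
    steps = 0
    for frame in frames:
        goal = estimate_pose(frame)      # human pose estimated from the RGB frame
        state = read_robot_state()       # current robot state
        targets = policy(state, goal)    # trained imitation policy (stubbed here)
        send_targets(clamp_to_limits(targets, limits))
        steps += 1
    return steps


# Toy stand-ins so the loop runs without a camera or a robot.
sent: List[Vector] = []
steps = teleoperate(
    frames=[[0.0, 0.5], [2.0, -3.0]],
    estimate_pose=lambda frame: list(frame),
    read_robot_state=lambda: [0.0, 0.0],
    policy=lambda state, goal: list(goal),
    send_targets=sent.append,
    limits=[(-1.0, 1.0), (-1.5, 1.5)],
)
print(steps, sent)  # 2 [[0.0, 0.5], [1.0, -1.5]]
```

Passing the estimator, policy, and robot interface as callbacks keeps the loop testable offline; in a real system each would wrap the corresponding hardware or model.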