- HOVER WBC
- Table of Contents
- Overview
- Installation
- Training
- Testing
- Evaluation
- Overwriting Configuration Values
- Sim-to-Sim Validation
- Developing
- License
- Contributors
- Acknowledgments
This repository contains the IsaacLab extension used to train neural whole-body controllers for humanoids, as explained in the OmniH2O and HOVER papers. For video demonstrations and links to the original Isaac Gym implementation, please visit the OmniH2O project website and the HOVER project website.
- Install Isaac Lab; see the installation guide.

  Note: Currently Isaac Lab 2.0.0 is supported. After you clone the Isaac Lab repository, check out the `v2.0.0` tag before installation:

  ```bash
  git fetch origin
  git checkout v2.0.0
  ```

  Also note that the `rsl_rl` package is renamed to `rsl_rl_lib` with the current `v2.0.0` tag of Isaac Lab, causing installation issues. This will be fixed once a new tag is created on the Isaac Lab repo. The issue does not affect this repo, as we have our own customized `rsl_rl` package.
- Define the following environment variable to specify the path to your IsaacLab installation:

  ```bash
  # Set the ISAACLAB_PATH environment variable to point to your IsaacLab installation directory
  export ISAACLAB_PATH=<your_isaac_lab_path>
  ```
- Clone the repo and its submodules:

  ```bash
  git clone --recurse-submodules <REPO_URL>
  ```
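  If you already cloned the repo without the `--recurse-submodules` flag, you can fetch the submodules afterwards with standard git commands (not specific to this repo):

  ```bash
  # Initialize and update all submodules in an existing clone.
  git submodule update --init --recursive
  ```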
- Install this repo and its dependencies by running the following command from the root of this repo:

  ```bash
  ./install_deps.sh
  ```
NOTE: Due to the license limitations of the AMASS dataset, we are not able to provide the retargeted dataset directly. All the following training and evaluation scripts use the `stable_punch.pkl` dataset (also not included) as a toy example. It is a small subset of the AMASS dataset in which the upper body performs punching motions. We modified the motion data to minimize the lower body's motion and create a simpler example. We suggest that users retarget a small subset of the AMASS dataset to the Unitree H1 robot and use that for trial training. Retargeting the whole dataset can take up to 4 days on a 32-core CPU machine; more cores reduce the time correspondingly.
We utilize the AMASS dataset to train our models. The AMASS dataset is a comprehensive collection of motion capture (mocap) datasets. To develop control policies for a humanoid robot, it is essential to retarget the motion data in the dataset to fit the desired robot. We provide a bash script that retargets the dataset specifically for the Unitree H1 robot. This script is based on the scripts from the human2humanoid repository. Due to the license limitations of the AMASS dataset, we do not provide a retargeted dataset directly. To access the dataset, you will need to create an account.
To get started, follow these steps:
- Create a folder to save the datasets in:

  ```bash
  mkdir -p third_party/human2humanoid/data/AMASS/AMASS_Complete
  ```

- Download the dataset(s) you are interested in from the "SMPL+H G" format section on the AMASS download page and place the archive files in `third_party/human2humanoid/data/AMASS/AMASS_Complete`. This will take some time due to the number of datasets and the fact that they apparently don't allow parallel downloads. You don't need to extract the files manually; the script will handle that for you.
- Download the SMPL model from this link and place the zip file in `third_party/human2humanoid/data/smpl`.
- Finally, run the provided script by executing `./retarget_h1.sh`. The script extracts the downloaded files to the desired locations and prepares the necessary files and dependencies for retargeting. If you want to retarget only specific motions, you can provide a YAML file with the list of motions by running `./retarget_h1.sh --motions-file <path_to_yaml_file>`. See `punch.yaml` for an example. This will only process the motion files specified in the YAML file instead of the full dataset. Note that the script installs pip dependencies and might build some of them, which requires a matching version of `python-dev` to be installed.
- Before proceeding with training and evaluation, run `./install_deps.sh` again to ensure the correct dependencies are installed.
The retargeted dataset will be found at `third_party/human2humanoid/data/h1/amass_all.pkl`. Rename it and move it to your desired location.
While the exact path to the reference motion is not important, we recommend placing it in the `neural_wbc/data/data/motions/` folder, as the included data library will handle relative path searching, which is useful for unit testing.
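For example, a minimal sketch of moving the file into the recommended folder (the destination filename is just a placeholder; pick any name you like):

```bash
# Move the retargeted dataset produced by retarget_h1.sh into the recommended motions folder.
# The destination filename is arbitrary.
mv third_party/human2humanoid/data/h1/amass_all.pkl \
   neural_wbc/data/data/motions/amass_retargeted_h1.pkl
```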
For more details, refer to the human2humanoid repository.
To train the teacher policy, run the following in the project's root directory:

```bash
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_teacher_policy.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl
```

The teacher policy is trained for 10000000 iterations, or until the user interrupts the training. The resulting checkpoint is stored in `neural_wbc/data/data/policy/h1:teacher/` with filename `model_<iteration_number>.pt`.
To train the student policy, run the following in the project's root directory:

```bash
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_student_policy.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --teacher_policy.resume_path neural_wbc/data/data/policy/h1:teacher \
    --teacher_policy.checkpoint model_<iteration_number>.pt
```
This assumes that you have already trained the teacher policy, as no teacher policy is provided in the repo. Change the filename to match the checkpoint you trained. The exact path of the teacher policy does not matter, but we recommend storing it in the data folder. If it is stored outside the data folder, you might need to provide the full path.
- The examples above use a low number of environments as a toy demo. For good results we recommend training with at least 4096 environments.
- The examples above use the `stable_punch.pkl` dataset as a toy demo. For good results we recommend training with the full AMASS dataset.
- By default the trained checkpoints are stored in `logs/teacher/` or `logs/student/`.
- If you don't want to train from scratch, you can resume training from a checkpoint using the options `--teacher_policy.resume_path`/`--student_policy.resume_path` and `--teacher_policy.checkpoint`/`--student_policy.checkpoint`. For example, to resume training of the teacher use the following (a similar sketch for the student follows below):

  ```bash
  ${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_teacher_policy.py \
      --num_envs 10 \
      --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
      --teacher_policy.resume_path neural_wbc/data/data/policy/h1:teacher \
      --teacher_policy.checkpoint model_<iteration_number>.pt
  ```
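Similarly, a sketch for resuming student training: it reuses the student-training command from above and adds the student resume options named in the list. The resume path below is a placeholder; point it at the folder that actually contains your student checkpoint:

```bash
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_student_policy.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --teacher_policy.resume_path neural_wbc/data/data/policy/h1:teacher \
    --teacher_policy.checkpoint model_<iteration_number>.pt \
    --student_policy.resume_path <path_to_folder_with_student_checkpoint> \
    --student_policy.checkpoint model_<iteration_number>.pt
```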
The codebase allows training both generalist and specialist policies:
- Generalist policies can track different command configurations (or modes) with a single policy, as shown in the HOVER paper.
- Specialist policies track a single, specific command configuration, as shown in the OmniH2O paper.

By default the codebase trains a specialist policy in OmniH2O mode (tracking head and hand positions):

```python
distill_mask_sparsity_randomization_enabled = False
distill_mask_modes = {"omnih2o": DISTILL_MASK_MODES_ALL["omnih2o"]}
```
A specialist in a different mode can be trained by modifying `distill_mask_modes` in the config file. For example, to train a specialist that tracks the joint angles, root linear velocity, and root yaw orientation, use:

```python
distill_mask_sparsity_randomization_enabled = False
distill_mask_modes = {"humanplus": DISTILL_MASK_MODES_ALL["humanplus"]}
```
A generalist can be trained by removing/commenting out the specialist mask modes in the config file, i.e.:

```python
distill_mask_sparsity_randomization_enabled = False
distill_mask_modes = DISTILL_MASK_MODES_ALL
```
In the current implementation, we hand-picked four modes that are discussed in the original paper for proof-of-life purposes. The user is free to add more modes to the `DISTILL_MASK_MODES_ALL` dictionary to make the generalist policy more general. We recommend turning off sparsity randomization, as the currently implemented randomization strategy (as described in the paper) might lead to motion ambiguity.
In both cases the same commands from above can be used to launch the training.
To play the trained teacher policy, run the following in the project's root directory:

```bash
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/play.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --teacher_policy.resume_path neural_wbc/data/data/policy/h1:teacher \
    --teacher_policy.checkpoint model_<iteration_number>.pt
```
To play the trained student policy, run the following in the project's root directory:

```bash
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/play.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --student_player \
    --student_path neural_wbc/data/data/policy/h1:student \
    --student_checkpoint model_<iteration_number>.pt
```
The evaluation iterates through all the reference motions included in the dataset specified by the `--reference_motion_path` option and exits when all motions are evaluated. Randomization is turned off during evaluation. At the end of execution, the script summarizes the results with the following reference motion tracking metrics:

NOTE: Running evaluation with 1024 environments will require approximately 13 GB of GPU memory. Adjust the `--num_envs` parameter based on your available GPU resources.
- Success Rate [%]: The percentage of motion tracking episodes that are completed successfully. An episode is considered successful if it follows the reference motion from start to finish without losing balance and without collisions on specific body parts.
- mpjpe_g [mm]: The global mean per-joint position error, which measures the policy's ability to imitate the reference motion globally.
- mpjpe_l [mm]: The root-relative mean per-joint position error, which measures the policy's ability to imitate the reference motion locally.
- mpjpe_pa [mm]: The Procrustes-aligned mean per-joint position error, which aligns the links with the ground truth before calculating the errors.
- accel_dist [mm/frame^2]: The average joint acceleration error.
- vel_dist [mm/frame]: The average joint velocity error.
- upper_body_joints_dist [radians]: The average distance between the predicted and ground truth upper body joint positions.
- lower_body_joints_dist [radians]: The average distance between the predicted and ground truth lower body joint positions.
- root_r_error [radians]: The average torso roll error.
- root_p_error [radians]: The average torso pitch error.
- root_y_error [radians]: The average torso yaw error.
- root_vel_error [m/frame]: The average torso velocity error.
- root_height_error [m]: The average torso height error.
The metrics are reported multiple times for different configurations:
- The metrics are computed over all environments or only the successful ones.
- The metrics are computed over all bodies or only the tracked (= masked) bodies.
By default the masked evaluation uses the OmniH2O mode (tracking head and hand positions). This can be configured by changing `distill_mask_modes` in the config file.
For an example of how to change the configuration, see also the Generalist vs. Specialist Policy section above.
The evaluation script, `scripts/rsl_rl/eval.py`, uses the same arguments as the play script, `scripts/rsl_rl/play.py`. You can use it for both teacher and student policies, e.g. for the teacher:

```bash
${ISAACLAB_PATH}/isaaclab.sh -p scripts/rsl_rl/eval.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --teacher_policy.resume_path neural_wbc/data/data/policy/h1:teacher \
    --teacher_policy.checkpoint model_<iteration_number>.pt
```
To customize and overwrite default environment configuration values, you can provide a YAML file with the desired settings. The structure of the YAML file should reflect the hierarchical structure of the configuration. For nested configuration parameters, use a dot (`.`) to separate the levels. For instance, to update the `dt` value within the `sim` configuration, you would use `sim.dt` as the key.
Here's an example YAML file demonstrating how to set and overwrite various configuration values:

```yaml
# scripts/rsl_rl/config_overwrites/sample.overwrite.yaml
sim.dt: 0.017
decimation: 4
add_policy_obs_noise: False
default_rfi_limit: 0.1
ctrl_delay_step_range: [0, 3]
```
To apply these custom settings, pass the path to your YAML file using the `--env_config_overwrite` option when running the script. If the YAML file contains keys that do not exist in the default configuration, those keys will be ignored.
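For example, a sketch of applying the sample overwrite file from above to a teacher training run (assuming the training scripts accept the `--env_config_overwrite` option as described):

```bash
# Apply the sample configuration overwrites when training the teacher policy.
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_teacher_policy.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --env_config_overwrite scripts/rsl_rl/config_overwrites/sample.overwrite.yaml
```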
We also provide a Mujoco environment for conducting sim-to-sim validation of the trained policy. To run the Sim2Sim evaluation:

```bash
${ISAACLAB_PATH:?}/isaaclab.sh -p neural_wbc/inference_env/scripts/eval.py \
    --num_envs 1 \
    --headless \
    --student_path neural_wbc/data/data/policy/h1:student/ \
    --student_checkpoint model_<iteration_number>.pt
```

Please be aware that the `mujoco_wrapper` only supports one environment at a time. For reference, it will take up to 5 hours to evaluate 8k reference motions. The `inference_env` is designed for maximum versatility; it is also suitable for Sim2Real deployment when the user provides the real robot interface.
In the root of each module directory (e.g. `neural_wbc/core`), run the following command:

```bash
cd neural_wbc/core
${ISAACLAB_PATH:?}/isaaclab.sh -p -m unittest
```

We also provide a `run_unit_tests.sh` script to run all unit tests.
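For example (assuming the script is invoked from the repository root):

```bash
# Run the unit tests of all modules in one go.
./run_unit_tests.sh
```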
We have a pre-commit template to automatically lint and format the code. To install pre-commit:

```bash
pip install pre-commit
```

Then you can run pre-commit with:

```bash
pre-commit run --all-files
```
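Optionally, you can also install the git hook so the checks run automatically on every commit (standard pre-commit usage, not specific to this repo):

```bash
# Register pre-commit as a git hook in this clone.
pre-commit install
```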
To set up the IDE, please follow these instructions:
- Run VSCode Tasks by pressing `Ctrl+Shift+P`, selecting `Tasks: Run Task` and running `setup_python_env` in the drop-down menu. When running this task, you will be prompted to add the absolute path to your Isaac Lab installation.

If everything executes correctly, it should create a file `.python.env` in the `.vscode` directory. The file contains the Python paths to all the extensions provided by Isaac Sim and Omniverse. This helps in indexing all the Python modules for intelligent suggestions while writing code.
You can run scripts in a Docker container without using the Isaac Sim GUI. Follow these steps:
- Install the NVIDIA Container Toolkit:
  - Follow the installation guide here.
- Access the NGC Container Registry:
  - Ensure you have access by following the instructions here.
- Start the Docker container:
  - Use the following command to start the container:

    ```bash
    docker run -it --rm \
        --runtime=nvidia --gpus all \
        -v $PWD:/workspace/neural_wbc \
        --entrypoint /bin/bash \
        --name neural_wbc \
        nvcr.io/nvidian/isaac-lab:IsaacLab-main-b120
    ```

- Set up the container:
  - Navigate to the workspace and install dependencies:

    ```bash
    cd /workspace/neural_wbc
    ./install_deps.sh
    ```

- You can now run scripts in headless mode by passing the `--headless` option (see the example below).
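For example, a sketch of launching the teacher training headless inside the container; it combines the training command from above with the `--headless` flag and assumes `ISAACLAB_PATH` is set inside the container:

```bash
# Headless teacher training inside the container (toy settings).
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_teacher_policy.py \
    --num_envs 10 \
    --reference_motion_path neural_wbc/data/data/motions/stable_punch.pkl \
    --headless
```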
HOVER WBC is released under the Apache License 2.0. See LICENSE for additional details.
The names are listed in alphabetical order by last name:
Joydeep Biswas, Yan Chang, Jim Fan, Pulkit Goyal, Lionel Gulich, Tairan He, Rushane Hua, Chenran Li, Wei Liu, Zhengyi Luo, Billy Okal, Stephan Pleines, Soha Pouya, Peter Varvak, Wenli Xiao, Huihua Zhao, Yuke Zhu
We would like to acknowledge the following projects from which parts of the code in this repo are derived: