PPO#
RoboVerse provides three PPO implementations with different features and use cases:
1. Stable-Baselines3 PPO (Recommended for Beginners)#
Based on Stable-Baselines3, this implementation provides a more user-friendly interface with comprehensive configuration options.
Usage#
# Basic PPO training with Franka robot
python get_started/rl/0_ppo.py --task reach_origin --robot franka --sim isaacgym
# PPO with Gym interface
python get_started/rl/0_ppo_gym_style.py --sim mjx --num-envs 256
Configuration#
Check the file header in get_started/rl/0_ppo.py for available configuration options including:
Task selection (
--task)Robot type (
--robot)Simulator backend (
--sim)Environment settings
2. CleanRL PPO#
Based on CleanRL, this implementation provides a more minimal and educational approach with direct algorithm implementation.
Usage#
# CleanRL PPO with RoboVerse environment
python roboverse_learn/rl/clean_rl/ppo.py --task reach_origin --robot franka --sim mjx --num_envs 2048
Configuration#
Configuration defaults live in roboverse_learn/rl/configs/clean_rl/ppo.py (parsed with tyro). Use --help for all options, including:
Task selection (
--task)Robot type (
--robot)Simulator backend (
--sim)Training hyperparameters (
--num_envs,--learning_rate, etc.)
3. RSL-RL PPO (OnPolicyRunner)#
Based on rsl_rl for high-throughput on-policy training with asymmetric observations.
Usage#
# RSL-RL PPO for Unitree G1 walking
python -m roboverse_learn.rl.rsl_rl.ppo --task walk_g1_dof29 --robot g1 --sim isaacgym --num-envs 4096
Configuration#
Install dependency:
pip install rsl_rlCLI defaults:
roboverse_learn/rl/configs/rsl_rl/ppo.py(tyro). Run with--helpto see environment, training, and PPO hyperparameters (--num-steps-per-env,--max-iterations,--clip-param, etc.).Outputs: checkpoints, TensorBoard logs, and final scripted policy are saved under
models/{exp_name}/{task}/by default (override with--model-dir).
Quick Start Examples#
For detailed tutorials and infrastructure setup:
Infrastructure Overview: See RL Infrastructure for complete setup
Quick Examples: See Quick Start Examples for ready-to-run commands