You can use EAGERx (Engine Agnostic Graph Environments for Robotics) to easily define new, Gymnasium-compatible environments with modular robot definitions.
It enables users to:
- Define environments as graphs of nodes (see the sketch below)
- Visualize these graph environments interactively in a GUI
- Use a single graph environment both in reality and with various simulators
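The graph-of-nodes idea can be sketched in a few lines; this is a condensed preview of the full training example further below, using the Pendulum object from the eagerx_tutorials package:

import eagerx
from eagerx_tutorials.pendulum.objects import Pendulum

# Condensed sketch: build a graph with one object and wire it to the agent
pendulum = Pendulum.make("pendulum", actuators=["u"], sensors=["theta", "theta_dot"], states=["model_state"])
graph = eagerx.Graph.create()
graph.add(pendulum)
graph.connect(action="voltage", target=pendulum.actuators.u)
graph.connect(source=pendulum.sensors.theta, observation="angle")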
EAGERx explicitly addresses the differences in learning between simulation and reality, with native support for essential features such as:
- Safety layers and various other state, action and time-scale abstractions
- Delay simulation & domain randomization
- Real-world reset routines
- Synchronized parallel computation within a single environment
Full documentation and tutorials are available here.
Sim2Real: Policies trained in simulation and evaluated zero-shot on real systems using EAGERx. The top left shows the successful transfer of a policy for the classic pendulum swing-up problem, the top right a box-pushing task, and the bottom a policy that lands a quadrotor on a moving inclined platform.
Modular: The modular design of EAGERx allows users to create complex environments easily through composition.
GUI: Users can visualize their graph environment. Here we visualize the graph environment that we built in this tutorial. See the documentation for more information.
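As a sketch, assuming the optional eagerx-gui package is installed (e.g. via pip3 install eagerx-gui), a graph built as above can be opened in the GUI:

# Sketch: open the interactive GUI for a graph (assumes the optional eagerx-gui package)
graph.gui()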
Live plotting: In robotics it is crucial to monitor the robot's behavior during the learning process. Luckily, inter-node communication within EAGERx can be listened to externally, so that any relevant information stream can be trivially monitored on-demand. See the documentation for more information.
Applications beyond RL: The modular design and unified software pipeline of the framework have utility beyond reinforcement learning. We explored two such instances: interactive language-conditioned imitation learning (left) and classical control with deep-learning-based perception in a swimming-pool environment (right).
You can do a minimal installation of EAGERx with:
pip3 install eagerx
We provide other options (Docker, Conda) for installing EAGERx in the documentation.
The following tutorials are currently available in the form of Google Colabs:
Introduction to EAGERx
The solutions are available here.
Developer tutorials
- Tutorial 1: Environment Creation and Training with EAGERx
- Tutorial 2: Reset and Step Function
- Tutorial 3: Space and Processors
- Tutorial 4: Nodes and Graph Validity
- Tutorial 5: Adding Engine Support for an Object
- Tutorial 6: Defining a new Object
- Tutorial 7: More Informative Rendering
- Tutorial 8: Reset Routines
The solutions are available here.
For more information see the docs or the eagerx_tutorials package.
Below you can find a code example of environment creation and training using Stable-Baselines3. To run this code, you should install eagerx_tutorials, which can be done by running:
pip3 install eagerx_tutorials
Detailed explanation of the code can be found in this Colab tutorial.
import eagerx
from eagerx.backends.single_process import SingleProcess
from eagerx.wrappers import Flatten
from eagerx_tutorials.pendulum.objects import Pendulum
from eagerx_ode.engine import OdeEngine
import stable_baselines3 as sb3
import numpy as np
from typing import Dict


class PendulumEnv(eagerx.BaseEnv):
    def __init__(self, name: str, rate: float, graph: eagerx.Graph, engine: eagerx.specs.EngineSpec,
                 backend: eagerx.specs.BackendSpec):
        self.max_steps = 100
        self.steps = None
        super().__init__(name, rate, graph, engine, backend, force_start=True)

    def step(self, action: Dict):
        observation = self._step(action)
        self.steps += 1

        # Calculate the reward and check if the episode should be truncated
        th = observation["angle"][0]
        thdot = observation["angular_velocity"][0]
        u = float(action["voltage"])
        th -= 2 * np.pi * np.floor((th + np.pi) / (2 * np.pi))  # Wrap angle to [-pi, pi]
        cost = th ** 2 + 0.1 * thdot ** 2 + 0.01 * u ** 2
        truncated = self.steps > self.max_steps
        terminated = False

        # Render
        if self.render_mode == "human":
            self.render()
        return observation, -cost, terminated, truncated, {}

    def reset(self, seed=None, options=None):
        # Sample the desired states (e.g. the initial pendulum state) and reset
        states = self.state_space.sample()
        observation = self._reset(states)
        self.steps = 0

        # Render
        if self.render_mode == "human":
            self.render()
        return observation, {}


if __name__ == "__main__":
    rate = 30.0

    # Create the pendulum object and add it to a new graph
    pendulum = Pendulum.make("pendulum", actuators=["u"], sensors=["theta", "theta_dot"], states=["model_state"])
    graph = eagerx.Graph.create()
    graph.add(pendulum)

    # Connect the pendulum to an action and observations
    graph.connect(action="voltage", target=pendulum.actuators.u)
    graph.connect(source=pendulum.sensors.theta, observation="angle")
    graph.connect(source=pendulum.sensors.theta_dot, observation="angular_velocity")

    # Define the engine (i.e. the simulator) and the backend
    engine = OdeEngine.make(rate=rate)
    backend = SingleProcess.make()

    # Initialize the environment and flatten its action/observation spaces
    env = PendulumEnv(name="PendulumEnv", rate=rate, graph=graph, engine=engine, backend=backend)
    env = Flatten(env)

    # Train a policy with SAC from Stable-Baselines3
    model = sb3.SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=int(150 * rate))

    env.shutdown()
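After training, the environment behaves like any other Gymnasium environment. As a minimal sketch (not part of the original example; run it before env.shutdown()), the trained policy can be evaluated with the standard Gymnasium loop:

# Sketch: evaluate the trained SAC policy for roughly 10 seconds of interaction
observation, info = env.reset()
for _ in range(int(10 * rate)):
    action, _ = model.predict(observation, deterministic=True)
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()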
EAGERx allows you to create engine-agnostic environments, such that a single environment can be used both in simulation and in reality. The following engines are available for training and evaluation:
- RealEngine for real-world experiments
- PybulletEngine for PyBullet simulations
- OdeEngine for simulations based on ordinary differential equations (ODEs)
Users can also create their own (custom) engines.
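Because the environment definition is engine agnostic, switching the pendulum example above from ODE simulation to PyBullet amounts to swapping the engine spec. A sketch, assuming the eagerx_pybullet package is installed and the object implements PyBullet support:

from eagerx_pybullet.engine import PybulletEngine

# Sketch: only the engine changes; the graph and environment code stay the same
engine = PybulletEngine.make(rate=rate)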
If you are using EAGERx for your scientific publications, please cite:
@article{vanderheijden2024eagerx,
  author={van der Heijden, Bas and Luijkx, Jelle and Ferranti, Laura and Kober, Jens and Babuska, Robert},
  journal={IEEE Robotics \& Automation Magazine},
  title={Engine Agnostic Graph Environments for Robotics (EAGERx): A Graph-Based Framework for Sim2real Robot Learning},
  year={2024},
  volume={},
  number={},
  pages={2-15},
  keywords={Robots;Engines;Robot sensing systems;Delays;Robot learning;Physics;Codes},
  doi={10.1109/MRA.2024.3433172}
}
EAGERx is currently maintained by Bas van der Heijden (@bheijden) and Jelle Luijkx (@jelledouwe).
For any questions, send an e-mail to [email protected].
EAGERx is funded by the OpenDR Horizon 2020 project.