Simplify registration to NoFrameskip-v4 and v5 environments #561

Open
wants to merge 13 commits into
base: master
56 changes: 28 additions & 28 deletions docs/_scripts/environment-docs.json

Large diffs are not rendered by default.

33 changes: 19 additions & 14 deletions docs/_scripts/gen_environments_md.py
@@ -4,7 +4,7 @@
import ale_py
import gymnasium
import tabulate
from ale_py.registration import _rom_id_to_name
from ale_py.registration import rom_id_to_name
from tqdm import tqdm

gymnasium.register_envs(ale_py)
@@ -40,9 +40,22 @@ def shortened_repr(values):
atari_data = json.load(file)

for rom_id in tqdm(ALL_ATARI_GAMES):
env_name = _rom_id_to_name(rom_id)
env_name = rom_id_to_name(rom_id)

env = gymnasium.make(f"ALE/{env_name}-v5").unwrapped

general_info_table = tabulate.tabulate(
[
["Make", f'gymnasium.make("ALE/{env_name}-v5")'],
["Action Space", str(env.action_space)],
["Observation Space", str(env.observation_space)],
],
headers=["", ""],
tablefmt="github",
)

if rom_id in atari_data:
env_data = atari_data[rom_id]
env_data = atari_data[rom_id]

env_description = env_data["env_description"]
@@ -101,7 +114,7 @@ def shortened_repr(values):
env_spec.id,
f'`"{env_spec.kwargs["obs_type"]}"`',
f'`{env_spec.kwargs["frameskip"]}`',
f'`{env_spec.kwargs["repeat_action_probability"]}`',
f'`{env_spec.kwargs["repeat_action_probability"]:.2f}`',
]
for env_spec in env_specs
]
@@ -129,16 +142,6 @@ def shortened_repr(values):
difficulty_mode_row, headers=difficulty_mode_header, tablefmt="github"
)

top_table = tabulate.tabulate(
[
["Action Space", str(env.action_space)],
["Observation Space", str(env.observation_space)],
["Import", f'`gymnasium.make("{env.spec.id}")`'],
],
headers=["", ""],
tablefmt="github",
)

env.close()

TEMPLATE = f"""---
@@ -154,7 +157,7 @@ def shortened_repr(values):

This environment is part of the <a href='..'>Atari environments</a>. Please read that page first for general information.

{top_table}
{general_info_table}

For more {env_name} variants with different observation and action spaces, see the variants section.

@@ -187,6 +190,8 @@ def shortened_repr(values):

{env_variant_table}

See the [version history page](https://ale.farama.org/environments/#version-history-and-naming-schemes) to recreate previously registered environments, e.g., `{env_name}NoFrameskip-v4`.

## Difficulty and modes

It is possible to specify various flavors of the environment via the keyword arguments `difficulty` and `mode`.
67 changes: 30 additions & 37 deletions docs/environments.md
@@ -154,21 +154,19 @@ The Atari environments observation can be

## Rewards

The exact reward dynamics depend on the environment and are usually documented in the game's manual. You can
find these manuals on [AtariAge](https://atariage.com/).
The exact reward dynamics depend on the environment and are usually documented in the game's manual. You can find these manuals on [AtariAge](https://atariage.com/).

## Stochasticity

As the Atari games are entirely deterministic, agents could achieve
state-of-the-art performance by simply memorizing an optimal sequence of actions while completely ignoring observations from the environment.
As the Atari games are entirely deterministic, agents can achieve state-of-the-art performance by simply memorizing an optimal sequence of actions while completely ignoring observations from the environment.

There are several methods to avoid this.

1. Sticky actions: Instead of always simulating the action passed to the environment, there is a small
probability that the previously executed action is used instead. In the v0 and v5 environments, the probability of
repeating an action is `25%` while in v4 environments, the probability is `0%`. Users can specify the repeat action
probability by passing `repeat_action_probability` to `make`.
2. Frameskipping: On each environment step, the action can be repeated for a random number of frames. This behavior
2. Frame-skipping: On each environment step, the action can be repeated for a random number of frames. This behavior
may be altered by setting the keyword argument `frameskip` to either a positive integer or
a tuple of two positive integers. If `frameskip` is an integer, frame skipping is deterministic, and in each step the action is
repeated `frameskip` many times. Otherwise, if `frameskip` is a tuple, the number of skipped frames is chosen uniformly at
@@ -207,38 +205,33 @@ Actions passed into the environment are then thresholded to discrete using the `

## Version History and Naming Schemes

All Atari games are available in three versions. They differ in the default settings of the arguments above.
The differences are listed in the following table:

| Version | `frameskip=` | `repeat_action_probability=` | `full_action_space=` |
|---------|--------------|------------------------------|----------------------|
| v0 | `(2, 5,)` | `0.25` | `False` |
| v4 | `(2, 5,)` | `0.0` | `False` |
| v5 | `4` | `0.25` | `False` |

> Version v5 follows the best practices outlined in [[2]](#2). Thus, it is recommended to transition to v5 and
customize the environment using the arguments above, if necessary.

For each Atari game, several different configurations are registered in Gymnasium. The naming schemes are analogous for
v0 and v4. Let us take a look at all variations of Amidar-v0 that are registered with gymnasium:

| Name | `obs_type=` | `frameskip=` | `repeat_action_probability=` |
|----------------------------|-------------|--------------|------------------------------|
| Amidar-v0 | `"rgb"` | `(2, 5,)` | `0.25` |
| AmidarDeterministic-v0 | `"rgb"` | `4` | `0.0` |
| AmidarNoFrameskip-v0 | `"rgb"` | `1` | `0.25` |
| Amidar-ram-v0 | `"ram"` | `(2, 5,)` | `0.25` |
| Amidar-ramDeterministic-v0 | `"ram"` | `4` | `0.0` |
| Amidar-ramNoFrameskip-v0 | `"ram"` | `1` | `0.25` |

Things change in v5: The suffixes "Deterministic" and "NoFrameskip" are no longer available. Instead, you must specify the
environment configuration via arguments passed to `gymnasium.make`. Moreover, the v5 environments
are in the "ALE" namespace. The suffix "-ram" is still available. Thus, we get the following table:

| Name | `obs_type=` | `frameskip=` | `repeat_action_probability=` |
|-------------------|-------------|--------------|------------------------------|
| ALE/Amidar-v5 | `"rgb"` | `4` | `0.25` |
| ALE/Amidar-ram-v5 | `"ram"` | `4` | `0.25` |
In v0.11, the number of registered Atari environments was reduced from 960 to 210. Only two environments per game are now registered: `{rom_name}NoFrameskip-v4`, the most popular environment, and `ALE/{rom_name}-v5`, which follows the best practices outlined in [[2]](#2).

| Name | `obs_type=` | `frameskip=` | `repeat_action_probability=` | `full_action_space=` |
|-------------------------|-------------|--------------|------------------------------|----------------------|
| AdventureNoFrameskip-v4 | `"rgb"` | `1` | `0.00` | `False` |
| ALE/Adventure-v5 | `"rgb"` | `4` | `0.25` | `False` |

Importantly, `repeat_action_probability=0.25` can negatively impact agent performance, so when comparing training graphs, check which parameters were used to ensure a fair comparison.

To recreate a previously registered environment, pass the parameters from the following table to `gymnasium.make(env_id, obs_type=..., frameskip=..., repeat_action_probability=..., full_action_space=...)`.

| Name | `obs_type=` | `frameskip=` | `repeat_action_probability=` | `full_action_space=` |
|-------------------------------|-------------|--------------|------------------------------|----------------------|
| Adventure-v0 | `"rgb"` | `(2, 5,)` | `0.25` | `False` |
| AdventureDeterministic-v0 | `"rgb"` | `4` | `0.25` | `False` |
| AdventureNoFrameskip-v0 | `"rgb"` | `1` | `0.25` | `False` |
| Adventure-ram-v0 | `"ram"` | `(2, 5,)` | `0.25` | `False` |
| Adventure-ramDeterministic-v0 | `"ram"` | `4` | `0.25` | `False` |
| Adventure-ramNoFrameskip-v0 | `"ram"` | `1` | `0.25` | `False` |
| Adventure-v4 | `"rgb"` | `(2, 5,)` | `0.0` | `False` |
| AdventureDeterministic-v4 | `"rgb"` | `4` | `0.0` | `False` |
| AdventureNoFrameskip-v4 | `"rgb"` | `1` | `0.0` | `False` |
| Adventure-ram-v4 | `"ram"` | `(2, 5,)` | `0.0` | `False` |
| Adventure-ramDeterministic-v4 | `"ram"` | `4` | `0.0` | `False` |
| Adventure-ramNoFrameskip-v4 | `"ram"` | `1` | `0.0` | `False` |
| ALE/Adventure-v5 | `"rgb"` | `4` | `0.25` | `False` |
| ALE/Adventure-ram-v5 | `"ram"` | `4` | `0.25` | `False` |
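
The table above can be read as a mapping from a legacy environment id to `gymnasium.make` keyword arguments. The following is a minimal sketch of that mapping, not part of `ale_py`: the helper name `legacy_make_kwargs` and the regular expression are hypothetical, and the values are taken only from the table rows shown here.

```python
import re

# Hypothetical pattern (an assumption, not an ale_py API): splits a legacy id
# such as "Adventure-ramDeterministic-v4" into game, "-ram", mode and version.
_LEGACY_ID = re.compile(
    r"^(?:ALE/)?(?P<game>[A-Za-z0-9]+?)(?P<ram>-ram)?"
    r"(?P<mode>Deterministic|NoFrameskip)?-(?P<version>v0|v4|v5)$"
)


def legacy_make_kwargs(env_id: str) -> dict:
    """Return the keyword arguments that the table lists for a legacy id."""
    match = _LEGACY_ID.match(env_id)
    if match is None:
        raise ValueError(f"Unrecognised environment id: {env_id!r}")

    version, mode = match["version"], match["mode"]
    if version == "v5":
        # The table lists no "Deterministic" or "NoFrameskip" ids for v5.
        if mode is not None:
            raise ValueError(f"{mode} suffix is not listed for v5: {env_id!r}")
        frameskip = 4
    elif mode == "Deterministic":
        frameskip = 4
    elif mode == "NoFrameskip":
        frameskip = 1
    else:
        # No suffix in v0 and v4: frameskip is sampled from this range.
        frameskip = (2, 5)

    return {
        "obs_type": "ram" if match["ram"] else "rgb",
        "frameskip": frameskip,
        # Per the table: v4 disables sticky actions, v0 and v5 use 0.25.
        "repeat_action_probability": 0.0 if version == "v4" else 0.25,
        "full_action_space": False,
    }


kwargs = legacy_make_kwargs("Adventure-ramDeterministic-v4")
# Assuming ale_py is installed and registered, the result could be passed on:
# env = gymnasium.make("ALE/Adventure-v5", **kwargs)
```

Keeping the mapping as a plain function makes it possible to check the parameters against the table without installing the ROMs.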

## Flavors

32 changes: 11 additions & 21 deletions docs/environments/adventure.md
@@ -11,11 +11,11 @@ title: Adventure

This environment is part of the <a href='..'>Atari environments</a>. Please read that page first for general information.

| | |
|-------------------|--------------------------------------|
| Action Space | Discrete(18) |
| Observation Space | Box(0, 255, (250, 160, 3), uint8) |
| Import | `gymnasium.make("ALE/Adventure-v5")` |
| | |
|-------------------|------------------------------------|
| Make | gymnasium.make("ALE/Adventure-v5") |
| Action Space | Discrete(18) |
| Observation Space | Box(0, 255, (250, 160, 3), uint8) |

For more Adventure variants with different observation and action spaces, see the variants section.

@@ -56,22 +56,12 @@ See variants section for the type of observation used by each environment id by
Adventure has the following variants of the environment id which have the following differences in observation,
the number of frame-skips and the repeat action probability.

| Env-id | obs_type= | frameskip= | repeat_action_probability= |
|-------------------------------|-------------|--------------|------------------------------|
| Adventure-v0 | `"rgb"` | `(2, 5)` | `0.25` |
| Adventure-ram-v0 | `"ram"` | `(2, 5)` | `0.25` |
| Adventure-ramDeterministic-v0 | `"ram"` | `4` | `0.25` |
| Adventure-ramNoFrameskip-v0 | `"ram"` | `1` | `0.25` |
| AdventureDeterministic-v0 | `"rgb"` | `4` | `0.25` |
| AdventureNoFrameskip-v0 | `"rgb"` | `1` | `0.25` |
| Adventure-v4 | `"rgb"` | `(2, 5)` | `0.0` |
| Adventure-ram-v4 | `"ram"` | `(2, 5)` | `0.0` |
| Adventure-ramDeterministic-v4 | `"ram"` | `4` | `0.0` |
| Adventure-ramNoFrameskip-v4 | `"ram"` | `1` | `0.0` |
| AdventureDeterministic-v4 | `"rgb"` | `4` | `0.0` |
| AdventureNoFrameskip-v4 | `"rgb"` | `1` | `0.0` |
| ALE/Adventure-v5 | `"rgb"` | `4` | `0.25` |
| ALE/Adventure-ram-v5 | `"ram"` | `4` | `0.25` |
| Env-id | obs_type= | frameskip= | repeat_action_probability= |
|-------------------------|-------------|--------------|------------------------------|
| AdventureNoFrameskip-v4 | `"rgb"` | `1` | `0.00` |
| ALE/Adventure-v5 | `"rgb"` | `4` | `0.25` |

See the [version history page](https://ale.farama.org/environments/#version-history-and-naming-schemes) to recreate previously registered environments, e.g., `AdventureNoFrameskip-v4`.

## Difficulty and modes

32 changes: 11 additions & 21 deletions docs/environments/air_raid.md
@@ -11,11 +11,11 @@ title: AirRaid

This environment is part of the <a href='..'>Atari environments</a>. Please read that page first for general information.

| | |
|-------------------|------------------------------------|
| Action Space | Discrete(6) |
| Observation Space | Box(0, 255, (250, 160, 3), uint8) |
| Import | `gymnasium.make("ALE/AirRaid-v5")` |
| | |
|-------------------|-----------------------------------|
| Make | gymnasium.make("ALE/AirRaid-v5") |
| Action Space | Discrete(6) |
| Observation Space | Box(0, 255, (250, 160, 3), uint8) |

For more AirRaid variants with different observation and action spaces, see the variants section.

@@ -51,22 +51,12 @@ See variants section for the type of observation used by each environment id by
AirRaid has the following variants of the environment id which have the following differences in observation,
the number of frame-skips and the repeat action probability.

| Env-id | obs_type= | frameskip= | repeat_action_probability= |
|-----------------------------|-------------|--------------|------------------------------|
| AirRaid-v0 | `"rgb"` | `(2, 5)` | `0.25` |
| AirRaid-ram-v0 | `"ram"` | `(2, 5)` | `0.25` |
| AirRaid-ramDeterministic-v0 | `"ram"` | `4` | `0.25` |
| AirRaid-ramNoFrameskip-v0 | `"ram"` | `1` | `0.25` |
| AirRaidDeterministic-v0 | `"rgb"` | `4` | `0.25` |
| AirRaidNoFrameskip-v0 | `"rgb"` | `1` | `0.25` |
| AirRaid-v4 | `"rgb"` | `(2, 5)` | `0.0` |
| AirRaid-ram-v4 | `"ram"` | `(2, 5)` | `0.0` |
| AirRaid-ramDeterministic-v4 | `"ram"` | `4` | `0.0` |
| AirRaid-ramNoFrameskip-v4 | `"ram"` | `1` | `0.0` |
| AirRaidDeterministic-v4 | `"rgb"` | `4` | `0.0` |
| AirRaidNoFrameskip-v4 | `"rgb"` | `1` | `0.0` |
| ALE/AirRaid-v5 | `"rgb"` | `4` | `0.25` |
| ALE/AirRaid-ram-v5 | `"ram"` | `4` | `0.25` |
| Env-id | obs_type= | frameskip= | repeat_action_probability= |
|-----------------------|-------------|--------------|------------------------------|
| AirRaidNoFrameskip-v4 | `"rgb"` | `1` | `0.00` |
| ALE/AirRaid-v5 | `"rgb"` | `4` | `0.25` |

See the [version history page](https://ale.farama.org/environments/#version-history-and-naming-schemes) to recreate previously registered environments, e.g., `AirRaidNoFrameskip-v4`.

## Difficulty and modes

26 changes: 8 additions & 18 deletions docs/environments/alien.md
@@ -13,15 +13,15 @@ This environment is part of the <a href='..'>Atari environments</a>. Please read

| | |
|-------------------|-----------------------------------|
| Make | gymnasium.make("ALE/Alien-v5") |
| Action Space | Discrete(18) |
| Observation Space | Box(0, 255, (210, 160, 3), uint8) |
| Import | `gymnasium.make("ALE/Alien-v5")` |

For more Alien variants with different observation and action spaces, see the variants section.

## Description

You are stuck in a maze-like space ship with three aliens. You goal is to destroy their eggs that are scattered all over the ship while simultaneously avoiding the aliens (they are trying to kill you). You have a flamethrower that can help you turn them away in tricky situations. Moreover, you can occasionally collect a power-up (pulsar) that gives you the temporary ability to kill aliens.
You are stuck in a maze-like spaceship with three aliens. Your goal is to destroy their eggs that are scattered all over the ship while simultaneously avoiding the aliens (they are trying to kill you). You have a flamethrower that can help you turn them away in tricky situations. Moreover, you can occasionally collect a power-up (pulsar) that gives you the temporary ability to kill aliens.

For more detailed documentation, see [the AtariAge page](https://atariage.com/manual_html_page.php?SoftwareID=815)

@@ -60,22 +60,12 @@ You score points by destroying eggs, killing aliens, using pulsars, and collecti
Alien has the following variants of the environment id which have the following differences in observation,
the number of frame-skips and the repeat action probability.

| Env-id | obs_type= | frameskip= | repeat_action_probability= |
|---------------------------|-------------|--------------|------------------------------|
| Alien-v0 | `"rgb"` | `(2, 5)` | `0.25` |
| Alien-ram-v0 | `"ram"` | `(2, 5)` | `0.25` |
| Alien-ramDeterministic-v0 | `"ram"` | `4` | `0.25` |
| Alien-ramNoFrameskip-v0 | `"ram"` | `1` | `0.25` |
| AlienDeterministic-v0 | `"rgb"` | `4` | `0.25` |
| AlienNoFrameskip-v0 | `"rgb"` | `1` | `0.25` |
| Alien-v4 | `"rgb"` | `(2, 5)` | `0.0` |
| Alien-ram-v4 | `"ram"` | `(2, 5)` | `0.0` |
| Alien-ramDeterministic-v4 | `"ram"` | `4` | `0.0` |
| Alien-ramNoFrameskip-v4 | `"ram"` | `1` | `0.0` |
| AlienDeterministic-v4 | `"rgb"` | `4` | `0.0` |
| AlienNoFrameskip-v4 | `"rgb"` | `1` | `0.0` |
| ALE/Alien-v5 | `"rgb"` | `4` | `0.25` |
| ALE/Alien-ram-v5 | `"ram"` | `4` | `0.25` |
| Env-id | obs_type= | frameskip= | repeat_action_probability= |
|---------------------|-------------|--------------|------------------------------|
| AlienNoFrameskip-v4 | `"rgb"` | `1` | `0.00` |
| ALE/Alien-v5 | `"rgb"` | `4` | `0.25` |

See the [version history page](https://ale.farama.org/environments/#version-history-and-naming-schemes) to recreate previously registered environments, e.g., `AlienNoFrameskip-v4`.

## Difficulty and modes
