Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move function make_optimizer_and_scheduler to policy #401

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

michel-aractingi
Copy link
Collaborator

What this does

Move function make_optimizer_and_scheduler in train.py to the policy class. The function becomes increasingly long as we add new policies. After this PR the optimizer and scheduler of each policy can be created by calling policy.make_optimizer_and_scheduler().

How it was tested

Tested by running a training command for diffusion policy, act and tdmpc to make sure there are not syntax issues. Also the unit tests in test policies.

#ACT
python lerobot/scripts/train.py     policy=act     env=aloha     env.task=AlohaInsertion-v0 dataset_repo_id=lerobot/aloha_sim_insertion_human

#diffusion
python lerobot/scripts/train.py    hydra.run.dir=outputs/train/diffusion_pusht   hydra.job.name=diffusion_pusht   policy=diffusion env=pusht   env.task=PushT-v0   dataset_repo_id=lerobot/pusht device=cuda

#tdmpc 
python lerobot/scripts/train.py env=pusht policy=tdmpc_pusht device=cuda

Copy link
Collaborator

@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I wanted to avoid this design (method from the policy instantiating the optimizer), but I dont think we can ^^

Comment on lines -217 to +216
optimizer, _ = make_optimizer_and_scheduler(cfg, policy)
optimizer, _ = policy.make_optimizer_and_scheduler(cfg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add optimizer, _ = policy.make_optimizer_and_scheduler(cfg) to another place in our unit tests?

It feels like this code logic should be tested for all policies, not just act. Thanks ;)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cadene Should we change test_act_backbone_lr to a more general function for all policies like:

@pytest.mark.parametrize(
    "env_name,policy_name",
    [
        ("pusht", "tdmpc"),
        ("pusht", "diffusion"),
        ("pusht", "vqbet"),
        ("aloha", "act")
    ],
)
def test_policy_backbone_lr(env_name, policy_name):
    """
    Test that the ACT policy can be instantiated with a different learning rate for the backbone.
    """
    cfg = init_hydra_config(
        DEFAULT_CONFIG_PATH,
        overrides=[
            f"env={env_name}",
            f"policy={policy_name}",
            f"device={DEVICE}",
            "training.lr_backbone=0.001",
            "training.lr=0.01",
        ],
    )
....

Copy link
Collaborator

@alexander-soare alexander-soare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for drafting this up @michel-aractingi.

So, thinking about this more deeply, I don't think the policy should return the optimizer and scheduler objects. This is because someone should be able to freely decide what these should be. Eg: Want to use exponential decaying LR on ACT? Fine.

What I do think should be provided by the policy is a list of parameter groups for optimization (which the user may use or may ignore - but they are there as a suggestion and for convenience).

So, supposing you agree with me, I would consider:

  • Changing the method name to get_optimizer_param_groups, and returning a list of param groups for an optimizer.

  • Using a standard training.lr_scheduler parameter in all policies. It can be None.

  • Using a standard optimizer parameter in all policies. It is required.

  • Using Hydra's class instantiatiation tooling to handle creating the optimizer and scheduler objects. https://hydra.cc/docs/advanced/instantiate_objects/overview/

    • And using the param groups from the policy to pass to the optimizer class on instantiation.

    What we would achieve with this is 1 common interface for all policies for setting the optimizer and scheduler via the hydra config.

@alexander-soare alexander-soare self-assigned this Sep 2, 2024
@michel-aractingi michel-aractingi marked this pull request as draft September 6, 2024 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants