Resume training from checkpoints #20361

Open
ArkashJ opened this issue Oct 23, 2024 · 2 comments
Labels
docs Documentation related needs triage Waiting to be triaged by maintainers

Comments

ArkashJ commented Oct 23, 2024

📚 Documentation

There's a lot of documentation out there about passing the resume_from_checkpoint keyword to the PyTorch Lightning Trainer; however, this is wrong. In the latest version, one needs to pass the path to the checkpoint (.ckpt file) itself to the trainer's fit function to resume training. Here are some popular incorrect references:

  1. https://stackoverflow.com/questions/71961436/pytorch-lightning-resuming-from-checkpoint-with-new-data
  2. https://lightning.ai/forums/t/how-to-resume-training/432
  3. Resume training from checkpoint with new data #12845
  4. https://www.youtube.com/watch?v=V5KGEzIwAxQ
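
For reference, here is a minimal sketch of the working call. It assumes a recent Lightning release in which, as far as I know, `resume_from_checkpoint` is no longer accepted by `Trainer` and the checkpoint is passed through the `ckpt_path` argument of `fit`. The helper name `resume_fit`, the `MyModel` class, and the checkpoint path are hypothetical placeholders, not part of Lightning.

```python
from pathlib import Path


def resume_fit(trainer, model, ckpt_path, **fit_kwargs):
    """Resume training by handing the checkpoint file to fit(), not to Trainer().

    `resume_fit` is a hypothetical helper; `trainer` is expected to be a
    Lightning Trainer and `model` a LightningModule.
    """
    ckpt = Path(ckpt_path)
    # Fail early with a clear message if the path is not a checkpoint file
    # (for example, if a directory was passed instead of the .ckpt itself).
    if not ckpt.is_file():
        raise FileNotFoundError(f"Checkpoint file not found: {ckpt}")
    # Outdated pattern seen in older answers:
    #   Trainer(resume_from_checkpoint="path/to/checkpoint.ckpt")
    # Current pattern: pass the path via the ckpt_path argument of fit().
    return trainer.fit(model, ckpt_path=str(ckpt), **fit_kwargs)


# Usage sketch (assumes `lightning` is installed; MyModel and the path
# are placeholders):
#   import lightning as L
#   trainer = L.Trainer(max_epochs=20)
#   resume_fit(trainer, MyModel(), "path/to/checkpoint.ckpt")
```

The existence check is only a convenience I added so that passing a directory or a mistyped path fails before training starts; the essential part is the single `trainer.fit(model, ckpt_path=...)` call.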

ChatGPT and Claude also got this wrong.

I wanted this to get visibility because knowing how to resume training from checkpoints is imperative and there's a lot of wrong information out there!

cc @Borda

@ArkashJ ArkashJ added docs Documentation related needs triage Waiting to be triaged by maintainers labels Oct 23, 2024
@arijit-hub

Hi,

The correct definition is indeed mentioned in the official documentation: https://lightning.ai/docs/pytorch/stable/common/checkpointing_basic.html#resume-training-state

I think the wrong solutions have been popularized because of a previous version.


ArkashJ commented Oct 25, 2024 via email
