When fine-tuning llama-7b, approximately how much GPU memory is required for training? #2

Open
zty07 opened this issue Oct 15, 2023 · 4 comments

zty07 commented Oct 15, 2023

When fine-tuning LLaMA-7B, approximately how much GPU memory is required for training?

MJ10 (Contributor) commented Feb 13, 2024

Hi @zty07, sorry for the extremely late response. Could you please clarify which experiment you are interested in running? The memory requirement depends on the task (specifically the sequence length). The quantization code is unfortunately somewhat broken but will be fixed soon, which should help lower the memory requirements.
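
For rough scale: full fine-tuning of a 7B model with Adam in mixed precision takes on the order of 16 bytes per parameter for weights, gradients, and optimizer states, i.e. over 100 GB before activations, so quantizing the base weights and training only small adapters is the usual way to fit on a single GPU. Below is a minimal sketch of such a setup using the Hugging Face transformers and peft libraries; this is not this repo's quantization code, and the checkpoint name and LoRA settings are illustrative.

```python
# Sketch: load LLaMA-7B in 4-bit and train only LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # assumed checkpoint name
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # a common choice for LLaMA blocks
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the LoRA weights require grads
```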

isaacbmiller commented

@MJ10 Did the quantization code ever get fixed?


abdalgader-a commented Apr 28, 2024

@MJ10 -- running the next-sentence code with a 2B/3B-parameter model throws an OOM error. Any suggestions for resolving this?
(PS: I used 8× A100 80GB GPUs.)

isaacbmiller commented

@abdalgader-a I have managed to get it running on a single A100, but my num_samples is way less than 20.
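
A generic sketch of the usual OOM levers in this situation, assuming a standard transformers causal-LM training setup; this is not the repo's training script, and `num_samples` is taken from the comment above rather than from its config.

```python
# Sketch: common ways to lower peak training memory (assumptions noted inline).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",                               # stand-in checkpoint for illustration
    torch_dtype=torch.bfloat16,
)
model.gradient_checkpointing_enable()     # recompute activations in the backward pass
model.config.use_cache = False            # KV cache is unnecessary during training

# Fewer generated samples per step and a small per-device batch with gradient
# accumulation hold fewer activations in memory while preserving the effective
# batch size. The names below are illustrative, not the repo's flags.
num_samples = 4                           # e.g. well below 20, as noted above
per_device_batch_size = 1
gradient_accumulation_steps = 8
```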
