readme updates for full DPO distributed recipe #2363

ebsmothers · 2025-02-07T21:56:23Z

Update our README to include the DPO full finetune distributed recipe now that #2275 has landed.

pytorch-bot · 2025-02-07T21:56:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2363

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a0b3a51 with merge base fb52557:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

SalmanMohammadi · 2025-02-08T14:13:23Z

README.md

@@ -72,7 +72,8 @@ torchtune provides the following finetuning recipes for training on one or more
 | DoRA/QDoRA Finetuning | ✅ | ✅ | ❌ | [lora_finetune_single_device](recipes/lora_finetune_single_device.py) <br> [lora_finetune_distributed](recipes/lora_finetune_distributed.py)| [Llama3 8B QDoRA single-device](recipes/configs/llama3/8B_qdora_single_device.yaml) <br> [Llama3 8B DoRA distributed](recipes/configs/llama3/8B_dora.yaml)
 | Quantization-Aware Training | ❌ | ✅ | ❌ | [qat_distributed](recipes/qat_distributed.py)| [Llama3 8B QAT](recipes/configs/llama3/8B_qat_full.yaml)
 | Quantization-Aware Training and LoRA Finetuning | ❌ | ✅ | ❌ | [qat_lora_finetune_distributed](recipes/qat_lora_finetune_distributed.py)| [Llama3 8B QAT](recipes/configs/llama3/8B_qat_lora.yaml)
-| Direct Preference Optimization | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)
+| Direct Preference Optimization: Full Finetuning | ❌ | ✅ | ❌ | [full_dpo_distributed](recipes/full_dpo_distributed.py) | [Llama3.1 8B DPO](recipes/configs/llama3_1/8B_full_dpo.yaml)
+| Direct Preference Optimization with LoRA | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)


Suggested change

| Direct Preference Optimization with LoRA | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)

| LoRA Direct Preference Optimization | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)

Also, we have more up-to-date Llama3.1 8B configs we could point to, if you'd like :)

readme updates for full DPO distributed recipe

bc32679

ebsmothers requested a review from joecummings February 7, 2025 21:56

facebook-github-bot added the CLA Signed label Feb 7, 2025

ebsmothers requested review from felipemello1 and acisseJZhong February 7, 2025 21:56

SalmanMohammadi reviewed Feb 8, 2025

SalmanMohammadi approved these changes Feb 8, 2025

felipemello1 approved these changes Feb 8, 2025

comments

a0b3a51

ebsmothers merged commit b3964af into pytorch:main Feb 10, 2025
17 checks passed
