Question on fine-tuning Stable Diffusion #5

Open
TomLucidor opened this issue Sep 9, 2024 · 0 comments

TomLucidor commented Sep 9, 2024

  1. Would Diffusion-RPO be a better fit for Stable Diffusion than DPO? In LLMs, ORPO can replace SFT/RLHF/DPO, since with plain fine-tuning even "bad examples" (artifacts and misalignment) get baked into the model. https://arxiv.org/abs/2406.06382 https://arxiv.org/abs/2403.07691
  2. Can RPO and DPO be applied to LoRA refinement, rather than just full checkpoint models? That way, LoRAs trained on low-resource topics could be further adjusted after the initial training is done (not even getting into POA and other down-sizing methods); see the rough sketch after this list. https://www.arxiv.org/abs/2408.01031
  3. What other use cases are there for DPO? For example, this paper describes using it alongside "negative prompts": https://arxiv.org/abs/2407.01606v1
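
For question 2, a rough sketch of what the core objective could look like with only LoRA adapter weights trainable is below. The per-sample noise-prediction errors `err_w`/`err_l` would come from the LoRA-adapted UNet and the `ref_err_*` terms from the frozen base model; the function name, the `beta` value, and the overall wiring are illustrative assumptions following a Diffusion-DPO-style loss, not code from any of the linked papers.

```python
import torch.nn.functional as F

def diffusion_dpo_loss(err_w, err_l, ref_err_w, ref_err_l, beta=2500.0):
    """DPO-style preference loss over per-sample noise-prediction errors.

    err_w, err_l         -- MSE between the true noise and the trainable
                            (e.g. LoRA-adapted) UNet's prediction for the
                            preferred (w) and dispreferred (l) image.
    ref_err_w, ref_err_l -- the same errors under the frozen reference UNet
                            (computed without gradients).
    beta                 -- preference strength; illustrative value only.
    """
    # How much the trainable model favours the preferred sample over the
    # dispreferred one, measured relative to the frozen reference.
    model_diff = err_w - err_l
    ref_diff = ref_err_w - ref_err_l
    logits = -beta * (model_diff - ref_diff)
    # Lowering the error on the preferred image more than the reference does
    # drives the loss down; only the LoRA parameters would receive gradients.
    return -F.logsigmoid(logits).mean()
```

Because this loss only depends on how the trainable model's errors move relative to a frozen reference, nothing in it seems tied to full-checkpoint training, which is what makes LoRA-only preference refinement plausible; whether an RPO-style ranking objective does this better is exactly what question 1 asks.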