
Reproduction of the performance #154

Open
cubeyoung opened this issue Jan 18, 2025 · 1 comment
Labels
Answered Answered the question


@cubeyoung

cubeyoung commented Jan 18, 2025

I cannot reproduce the results of the paper:

for 0.6B, FID 5.81
for 1.6B, FID 5.92

My reproduced results,

for 0.6B, FID 15.31
for 1.6B, FID 20.23

are significantly different from the reported ones.

I used the checkpoints and pipelines "Sana_600M_1024px_diffusers" and "Sana_1600M_1024px_MultiLing_diffusers" for 0.6B and 1.6B, respectively. Following your code and arguments (steps = 20, guidance scale = 4.5, etc.), I generated the MJHQ-30K samples and computed FID against the reference image folder using clean-fid.
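For reference, the evaluation step described above can be sketched roughly as follows. This is only an assumed setup: the folder paths are hypothetical, and it presumes the third-party clean-fid package is installed.

```python
def compute_mjhq_fid(gen_dir: str, ref_dir: str) -> float:
    """FID between a folder of generated samples and a reference image folder.

    Imports clean-fid lazily so the helper can be defined without the
    dependency present (install with: pip install clean-fid).
    """
    from cleanfid import fid  # third-party dependency
    return fid.compute_fid(gen_dir, ref_dir)

# Hypothetical usage (paths are placeholders, not from the repo):
# score = compute_mjhq_fid("samples/mjhq30k", "data/mjhq30k_reference")
# print(f"FID: {score:.2f}")
```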

Did you average the FID score over the 10 categories of the MJHQ-30K dataset, or compute it once over all 30K samples?

Also, could using float32 affect the results?

Furthermore, does the reported score use additional guidance, such as PAG?

@lawrence-cj
Collaborator

'float32' will definitely affect the performance. Please use the default precision for inference.

Refer to this report: huggingface/diffusers#10336
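Following that advice, inference would be run in the checkpoint's default precision rather than float32. A minimal loading sketch, assuming the diffusers `SanaPipeline` and the Hub ids from the issue; bf16 as the default precision is an assumption here, not confirmed by this thread:

```python
def load_sana(model_id: str = "Efficient-Large-Model/Sana_600M_1024px_diffusers"):
    """Load a Sana pipeline in reduced precision (assumed default: bf16).

    Imports are lazy so the helper is definable without a GPU environment.
    The model id above is assumed from the checkpoint name in the issue.
    """
    import torch
    from diffusers import SanaPipeline

    # Assumption: bf16 is the intended default; float32 reportedly hurts FID.
    pipe = SanaPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return pipe.to("cuda")

if __name__ == "__main__":
    pipe = load_sana()
    # Settings from the issue: 20 steps, guidance scale 4.5.
    image = pipe("a photo of a cat", num_inference_steps=20,
                 guidance_scale=4.5).images[0]
    image.save("sample.png")
```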

@lawrence-cj added the Answered label on Jan 20, 2025