
Reproduction of the performance #154

Open
cubeyoung opened this issue Jan 18, 2025 · 1 comment
Labels
Answered Answered the question


@cubeyoung

cubeyoung commented Jan 18, 2025

I cannot reproduce the results of the paper:

for 0.6B, FID 5.81
for 1.6B, FID 5.92

My reproduced results,

for 0.6B, FID 15.31
for 1.6B, FID 20.23

are significantly different from the reported ones.

I used the checkpoints and pipelines "Sana_600M_1024px_diffusers" and "Sana_1600M_1024px_MultiLing_diffusers" for 0.6B and 1.6B, respectively. Following your code and arguments (steps = 20, guidance scale = 4.5, etc.), I generated the MJHQ-30K samples and computed FID against the reference image folder using clean-fid.
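For reference, the evaluation step described above can be sketched roughly as follows. This is only an assumed setup: the folder paths are hypothetical, and it presumes the third-party clean-fid package is installed.

```python
def compute_mjhq_fid(gen_dir: str, ref_dir: str) -> float:
    """FID between a folder of generated samples and a reference image folder.

    Imports clean-fid lazily so the helper can be defined without the
    dependency present (install with: pip install clean-fid).
    """
    from cleanfid import fid  # third-party dependency
    return fid.compute_fid(gen_dir, ref_dir)

# Hypothetical usage (paths are placeholders, not from the repo):
# score = compute_mjhq_fid("samples/mjhq30k", "data/mjhq30k_reference")
# print(f"FID: {score:.2f}")
```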

Did you average the FID score over the 10 categories of the MJHQ-30K dataset, or compute it once over all 30K samples?

Also, could using float32 affect the results?

Furthermore, does the reported score use additional guidance, such as PAG?

@lawrence-cj
Collaborator

'float32' will definitely affect the performance. Please use the default precision for inference.

Refer to this report: huggingface/diffusers#10336
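Following that advice, inference would be run in the checkpoint's default precision rather than float32. A minimal loading sketch, assuming the diffusers `SanaPipeline` and the Hub ids from the issue; bf16 as the default precision is an assumption here, not confirmed by this thread:

```python
def load_sana(model_id: str = "Efficient-Large-Model/Sana_600M_1024px_diffusers"):
    """Load a Sana pipeline in reduced precision (assumed default: bf16).

    Imports are lazy so the helper is definable without a GPU environment.
    The model id above is assumed from the checkpoint name in the issue.
    """
    import torch
    from diffusers import SanaPipeline

    # Assumption: bf16 is the intended default; float32 reportedly hurts FID.
    pipe = SanaPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return pipe.to("cuda")

if __name__ == "__main__":
    pipe = load_sana()
    # Settings from the issue: 20 steps, guidance scale 4.5.
    image = pipe("a photo of a cat", num_inference_steps=20,
                 guidance_scale=4.5).images[0]
    image.save("sample.png")
```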

@lawrence-cj added the Answered label on Jan 20, 2025