[SD3 ControlNet] bug in pipeline 'controlnet_pooled_projections' #9686

tobiasfshr · 2024-10-15T16:52:46Z

Describe the bug

Hi,

I think I found an issue that causes a misalignment between training and inference in SD3 ControlNet.

diffusers/src/diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet.py

Line 977 in a3e8d3f

if controlnet_pooled_projections is None:

I think the if-else block starting there is incorrect. It should be:

        if controlnet_pooled_projections is None and pooled_prompt_embeds is None:
            controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds)
        elif controlnet_pooled_projections is None:
            controlnet_pooled_projections = pooled_prompt_embeds

Given that in training, the pooled_prompt_embeds are fed to the model:

diffusers/examples/controlnet/train_controlnet_sd3.py

Line 1293 in a3e8d3f

pooled_projections=pooled_prompt_embeds,
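A minimal sketch of the fallback proposed above, written as a standalone helper rather than as a patch to the pipeline. The name `resolve_controlnet_pooled_projections` is hypothetical and not part of diffusers. It assumes `pooled_prompt_embeds` has already been computed when the fallback runs; if both inputs are `None` it raises instead of calling `torch.zeros_like` on `None`, which would fail. The helper only performs `None` checks, so it works with any tensor-like object.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_controlnet_pooled_projections(
    controlnet_pooled_projections: Optional[T],
    pooled_prompt_embeds: Optional[T],
) -> T:
    """Hypothetical helper (not a diffusers API).

    Picks the pooled projections to feed the ControlNet at inference so that
    they match what the training script passes (pooled_prompt_embeds).
    """
    # An explicitly supplied value is always used unchanged.
    if controlnet_pooled_projections is not None:
        return controlnet_pooled_projections

    # Assumption: the prompt has already been encoded at this point.
    if pooled_prompt_embeds is None:
        raise ValueError(
            "Both controlnet_pooled_projections and pooled_prompt_embeds are None; "
            "there is nothing to fall back to."
        )

    # Fall back to the same embeddings used during training.
    return pooled_prompt_embeds
```

An explicitly passed value always wins, so a caller can still supply zeros when a particular checkpoint expects them.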

Additionally, I am wondering if this line:

diffusers/examples/controlnet/train_controlnet_sd3.py

Line 1287 in a3e8d3f

controlnet_image = controlnet_image * vae.config.scaling_factor

Should be aligned with this line:

diffusers/examples/controlnet/train_controlnet_sd3.py

Line 1257 in a3e8d3f

model_input = (model_input - vae.config.shift_factor) * vae.config.scaling_factor

This seems to be the more sensible approach, but will probably not make much difference since the ControlNet can also learn the shift. It might speed up convergence slightly.
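To make the size of that difference concrete, here is a small sketch comparing the two normalizations. The function names are hypothetical, and the two constants are placeholder values chosen for illustration; the real ones would come from `vae.config.scaling_factor` and `vae.config.shift_factor`. Because the functions use only `-` and `*`, they behave the same way on floats and on tensors.

```python
import math


def scale_only(latents, scaling_factor):
    """Normalization currently applied to controlnet_image in the training script."""
    return latents * scaling_factor


def shift_and_scale(latents, scaling_factor, shift_factor):
    """Normalization applied to model_input in the training script."""
    return (latents - shift_factor) * scaling_factor


# Placeholder values for illustration only (assumption, not read from a real VAE).
SCALING_FACTOR = 1.5
SHIFT_FACTOR = 0.05


def normalization_gap(latent_value):
    """Difference between the two normalizations for a single latent value."""
    return scale_only(latent_value, SCALING_FACTOR) - shift_and_scale(
        latent_value, SCALING_FACTOR, SHIFT_FACTOR
    )


# The gap does not depend on the latent value: it is always shift * scale.
expected_gap = SHIFT_FACTOR * SCALING_FACTOR
for value in (-2.0, 0.0, 0.3, 4.0):
    assert math.isclose(normalization_gap(value), expected_gap, abs_tol=1e-12)
```

The two versions differ only by a constant offset of `shift_factor * scaling_factor`, which is consistent with the expectation that the ControlNet can absorb it during training.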

Best,
Tobias

Reproduction

Train an SD3 ControlNet; the affected code path is executed during log_validation.

Logs

No response

System Info

diffusers==0.30.3

Who can help?

@yiyixuxu @sayakpaul

xduzhangjiayu · 2024-10-16T01:55:54Z

I have tried training an SD3 ControlNet, but the validation results are really bad and the training loss oscillates the whole time. You can see the results in discussion #9675.

Do you have any suggestions for getting better results when training an SD3 ControlNet? Thank you!

egbertYeah · 2024-10-18T06:07:29Z

I also found this bug, but when I test the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) is correct.

xduzhangjiayu · 2024-10-18T10:16:41Z

> I also found this bug, but when I test the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) is correct.

Could you please describe the bug? I may be running into the same one.

egbertYeah · 2024-10-18T10:33:29Z

> > I also found this bug, but when I test the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) is correct.
>
> Could you please describe the bug? I may be running into the same one.

The controlnet_pooled_projections variable differs between training and inference: training uses pooled_prompt_embeds, while inference uses torch.zeros_like(pooled_prompt_embeds).

tobiasfshr added the bug label (Something isn't working) on Oct 15, 2024
