[SD3 ControlNet] bug in pipeline 'controlnet_pooled_projections' #9686

Open
tobiasfshr opened this issue Oct 15, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@tobiasfshr

Describe the bug

Hi,

I think I found an issue that causes a misalignment between training and inference in SD3 ControlNet.

I think the if-else block in the pipeline that sets controlnet_pooled_projections is not correct. It should be:

        if controlnet_pooled_projections is None and pooled_prompt_embeds is None:
            controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds)
        elif controlnet_pooled_projections is None:
            controlnet_pooled_projections = pooled_prompt_embeds

Given that in training, the pooled_prompt_embeds are fed to the model:

pooled_projections=pooled_prompt_embeds,
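
To make the training-consistent fallback concrete, here is a minimal sketch. The helper name `resolve_controlnet_pooled_projections` is hypothetical and not part of diffusers, and plain lists stand in for tensors so that it runs without torch. It uses explicit `is None` checks because truth-testing a multi-element tensor raises an error, and it assumes `pooled_prompt_embeds` is normally available at this point in the pipeline.

```python
def resolve_controlnet_pooled_projections(controlnet_pooled_projections, pooled_prompt_embeds):
    """Hypothetical helper: choose the pooled projections given to the ControlNet.

    Mirrors training, where the ControlNet receives pooled_prompt_embeds,
    unless the caller passes an explicit override.
    """
    # An explicit value always wins (this could be an all-zeros tensor for
    # checkpoints that were trained with zeroed pooled projections).
    if controlnet_pooled_projections is not None:
        return controlnet_pooled_projections
    # Without prompt embeddings there is nothing to fall back to (and nothing
    # whose shape could be copied), so fail loudly instead of guessing.
    if pooled_prompt_embeds is None:
        raise ValueError("pooled_prompt_embeds is required when no override is given")
    # Default: same input as in training.
    return pooled_prompt_embeds


# Stand-ins for tensors, for illustration only.
prompt_embeds = [0.1, 0.2, 0.3]
zeros_override = [0.0, 0.0, 0.0]

default_choice = resolve_controlnet_pooled_projections(None, prompt_embeds)
override_choice = resolve_controlnet_pooled_projections(zeros_override, prompt_embeds)
```

With this shape, a checkpoint that expects zeros can still be served by passing the zero tensor explicitly, while the default matches what the training script feeds.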

Additionally, I am wondering if this line:

controlnet_image = controlnet_image * vae.config.scaling_factor

should be aligned with this line:
model_input = (model_input - vae.config.shift_factor) * vae.config.scaling_factor

This seems to be the more sensible approach, but will probably not make much difference since the ControlNet can also learn the shift. It might speed up convergence slightly.
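
A small sketch of why the missing shift is probably harmless but still a mismatch: the two normalizations differ only by a constant offset of shift_factor * scaling_factor. The function names are hypothetical and the numeric values are illustrative, not read from any VAE config; plain floats stand in for latent tensors.

```python
def scale_only(latent, scaling_factor):
    # Normalization currently applied to the ControlNet conditioning latent.
    return latent * scaling_factor


def shift_and_scale(latent, shift_factor, scaling_factor):
    # Normalization applied to the model input latent.
    return (latent - shift_factor) * scaling_factor


# Illustrative values only (assumptions, not taken from a real config).
shift_factor = 0.1
scaling_factor = 1.5

offsets = []
for latent in (-2.0, 0.0, 0.5, 3.0):
    difference = scale_only(latent, scaling_factor) - shift_and_scale(
        latent, shift_factor, scaling_factor
    )
    offsets.append(difference)

# Every entry equals shift_factor * scaling_factor (up to float rounding),
# i.e. the gap between the two conventions does not depend on the latent.
expected_offset = shift_factor * scaling_factor
```

Because the gap is a fixed bias, the ControlNet's first layer can absorb it, which is consistent with the expectation that aligning the two lines would mainly affect convergence speed.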

Best,
Tobias

Reproduction

Train an SD3 ControlNet; the affected code path is executed during log_validation.

Logs

No response

System Info

diffusers==0.30.3

Who can help?

@yiyixuxu @sayakpaul

@xduzhangjiayu
Contributor

I have tried to train an SD3 ControlNet, but the validation results are really bad and the training loss oscillates the whole time. You can take a look at the results in discussion #9675.

Do you have any suggestions for getting better results when training an SD3 ControlNet? Thank you!

@egbertYeah

I also found this bug, but when I tested the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) turned out to be right for that checkpoint.

@xduzhangjiayu
Contributor

xduzhangjiayu commented Oct 18, 2024

Could you please describe the bug? I may be running into the same one.

@egbertYeah

The controlnet_pooled_projections variable differs between training and inference: training uses pooled_prompt_embeds, while inference uses torch.zeros_like(pooled_prompt_embeds).
