I’ve been trying to use the FLUX dev and schnell VAEs to decode averaged latents.
A full example workflow is:
1. Open two images
2. Get their respective latents with the VAE
3. Average the latents
4. Decode the result with the VAE
With close enough input images this can produce decent enough outputs, effectively interpolating between the two images.
So far, though, the best results I’ve gotten still show some blur and/or patches.
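For reference, the averaging workflow above can be sketched roughly as follows. This is only a minimal sketch: `encode` and `decode` are hypothetical callables standing in for whatever VAE wrapper is used, and `lerp_latents` and `interpolate_images` are names made up for illustration. The arithmetic is duck-typed, so it should work the same on floats, NumPy arrays, or tensors.

```python
def lerp_latents(z_a, z_b, t=0.5):
    """Linear interpolation between two latents; t=0.5 is the plain average."""
    return (1.0 - t) * z_a + t * z_b


def interpolate_images(encode, decode, img_a, img_b, t=0.5):
    """Encode both images, mix their latents, decode the mix.

    `encode` and `decode` are hypothetical callables (assumptions, not a
    specific library API): encode(image) -> latent, decode(latent) -> image.
    """
    z_a = encode(img_a)
    z_b = encode(img_b)
    return decode(lerp_latents(z_a, z_b, t))
```

If the VAE comes from diffusers' `AutoencoderKL` (an assumption on my side), `encode` could be something like `lambda x: vae.encode(x).latent_dist.mode()` and `decode` something like `lambda z: vae.decode(z).sample`. Using the mode rather than a sample keeps sampling noise out of the average.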
Another experiment I've been doing is sampling from the latent distribution to try and get close variations of an image.
The workflow is:
1. Get an image
2. Encode it with the VAE to get the tensor of underlying Gaussian distribution parameters
3. Pick several samples from that distribution
4. Decode the samples
Hopefully this produces different images. In practice this is not the case with the FLUX VAE. The standard deviations are very small relative to the means, and perturbing the samples by adding many standard deviations does not help in my experience.
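To make the sampling step concrete, here is a minimal sketch of what I mean by drawing samples and perturbing them by several standard deviations. It works on flat lists of floats purely for illustration; `sample_variations` and its `scale` parameter are made-up names, not part of any library.

```python
import random


def sample_variations(mean, std, n, scale=1.0, seed=0):
    """Draw n latents z = mean + scale * std * eps, with eps ~ N(0, 1) per element.

    `mean` and `std` are flat sequences of floats (the flattened posterior
    parameters). `scale` > 1 pushes samples further from the mean than the
    posterior itself would.
    """
    rng = random.Random(seed)
    return [
        [m + scale * s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]
        for _ in range(n)
    ]
```

With diffusers (again an assumption), `vae.encode(x).latent_dist` exposes `mean` and `std`, and its `sample()` method corresponds to the `scale=1.0` case. Since every sample stays within `scale * std` of the mean per element, a tiny `std` keeps all samples close to the same latent, which is consistent with the decodes looking nearly identical.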
Does anyone have knowledge about the FLUX VAE that could help me refine this, e.g. knowledge about the latent space? (I know the shape of the latents; I'm thinking more of its statistical properties.) For example, knowing the KL loss weight would help.
I understand that in an image generation context, the VAE is used mostly for performance reasons and priority is given to the reconstruction loss over the KL loss.
Happy to provide a snippet.
Thanks,
@apolinario maybe?