Dear authors,
Thanks for the amazing work. I'm reaching out to ask your opinion on a problem I'm currently facing. I'm working with an architecture based on StyleGAN2-ADA. My training dataset used to be uni-modal; I then moved to two image modalities, and I currently have three different classes.

To condition the generation, I adopted the simple approach of embedding the label into a 512-dimensional vector and concatenating it with z (sketched below). The same label is also fed to the discriminator, so the generation becomes conditional. Up to a point this conditioning works well, but performance starts to deteriorate on the 3-class dataset, which shows high variability between classes.

Now to the question. You did a great job improving the conditioning with StyleGAN-XL, and here with StyleGAN-T; the improvement is large enough that you can handle multi-class datasets as varied as ImageNet. However, I cannot migrate to these architectures, as they are too expensive in terms of parameters. I also don't need a large performance boost: my dataset is nowhere near as diverse as ImageNet, and I'd just like a small improvement in that direction.

Among all the solutions you proposed in your papers (pretrained embeddings, using a ViT, classifier guidance, and now CLIP guidance), which do you suggest would be the easiest to plug into StyleGAN2 to boost the conditioning?
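For reference, here is a minimal sketch of the conditioning I'm currently using (PyTorch; simplified, not the actual StyleGAN2-ADA code, and the module and argument names are mine):

```python
import torch
import torch.nn as nn

class LabelConditioning(nn.Module):
    """Embed the class label and concatenate it with z before the mapping network."""
    def __init__(self, num_classes=3, z_dim=512, embed_dim=512):
        super().__init__()
        self.embed = nn.Embedding(num_classes, embed_dim)  # learned per-class vector

    def forward(self, z, labels):
        c = self.embed(labels)            # (batch, 512) label embedding
        # StyleGAN2-ADA additionally normalizes z and c to unit second moment
        # before concatenating; omitted here for brevity.
        return torch.cat([z, c], dim=1)   # (batch, 1024), fed to the mapping network
```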
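And, if I understand the pretrained-embedding option correctly, the swap could be as small as this (a sketch assuming OpenAI's `clip` package; the class names are placeholders for my actual classes):

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

model, _ = clip.load("ViT-B/32", device="cpu")
class_names = ["class A", "class B", "class C"]  # placeholders for my 3 classes
with torch.no_grad():
    text_features = model.encode_text(clip.tokenize(class_names)).float()  # (3, 512)

# Frozen pretrained embedding as a drop-in replacement for the learned one above.
embed = torch.nn.Embedding.from_pretrained(text_features, freeze=True)
```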
Thanks a lot in advance,
Francesco