Dear authors,
Thanks for the amazing work. I'm reaching out to ask your opinion on a problem I'm currently facing. I'm working with an architecture based on StyleGAN2-ADA. My training dataset used to be uni-modal; I then moved to two image modalities, and I currently have three different classes.

To condition the generation, I adopted the simple approach of embedding the label into a 512-dimensional vector and concatenating it with z (sketched below). The same label is also fed to the discriminator, so the generation becomes conditional. Up to a point this conditioning works well, but performance starts to deteriorate on the 3-class dataset, which shows high variability between classes.

Now to the question. You did a great job improving the conditioning with StyleGAN-XL, and here with StyleGAN-T; the improvement is large enough that you can handle multi-class datasets as varied as ImageNet. However, I cannot migrate to these architectures, as they are too expensive in terms of parameters. I also don't need a large performance boost: my dataset is nowhere near as diverse as ImageNet, and I'd just like a small improvement in that direction.

Among all the solutions you proposed in your papers (pretrained embeddings, using a ViT, classifier guidance, and now CLIP guidance), which do you suggest would be the easiest to plug into StyleGAN2 to boost the conditioning?
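For reference, here is a minimal sketch of the conditioning I'm currently using (PyTorch; simplified, not the actual StyleGAN2-ADA code, and the module and argument names are mine):

```python
import torch
import torch.nn as nn

class LabelConditioning(nn.Module):
    """Embed the class label and concatenate it with z before the mapping network."""
    def __init__(self, num_classes=3, z_dim=512, embed_dim=512):
        super().__init__()
        self.embed = nn.Embedding(num_classes, embed_dim)  # learned per-class vector

    def forward(self, z, labels):
        c = self.embed(labels)            # (batch, 512) label embedding
        # StyleGAN2-ADA additionally normalizes z and c to unit second moment
        # before concatenating; omitted here for brevity.
        return torch.cat([z, c], dim=1)   # (batch, 1024), fed to the mapping network
```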
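And, if I understand the pretrained-embedding option correctly, the swap could be as small as this (a sketch assuming OpenAI's `clip` package; the class names are placeholders for my actual classes):

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

model, _ = clip.load("ViT-B/32", device="cpu")
class_names = ["class A", "class B", "class C"]  # placeholders for my 3 classes
with torch.no_grad():
    text_features = model.encode_text(clip.tokenize(class_names)).float()  # (3, 512)

# Frozen pretrained embedding as a drop-in replacement for the learned one above.
embed = torch.nn.Embedding.from_pretrained(text_features, freeze=True)
```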
Thanks a lot in advance,
Francesco