
Releasing tokenizer checkpoints without VF loss and with VF loss (MAE) #8

Open
xingjianleng opened this issue Jan 8, 2025 · 5 comments


@xingjianleng

Hi authors,

Thank you for the interesting work.

Section 5.1 of the paper mentions three variants of tokenizer checkpoints, but at the moment I can only find the VF loss (DINOv2) checkpoint in this repository. I'm wondering whether the authors plan to release the other two soon.

Thanks!

@JingfengYao
Member

Thanks for your interest in our work.

In fact, in Section 5, we trained 9 variants of VAEs with different latent dimensions and VF losses (see Table 2). These checkpoints have not been released because they were primarily exploratory and were trained for only a limited number of epochs. The VA-VAE we released was ultimately trained for a longer period to ensure its final performance, and it is the one used in Table 3. We may consider releasing these experimental checkpoints in the future.

@xingjianleng
Author

Thank you for your response.

I have further questions about the hyperparameters used to train the released VAE. As you mentioned, the different VAE variants were trained for the ablation studies.

So, is the following sentence from the paper referring only to the hyperparameters for the ablation studies? "To accelerate convergence, we adjust the learning rate and global batch size to 1e-4 and 256, respectively. In contrast to previous settings, each tokenizer is trained on ImageNet 256 × 256 for 50 epochs."

If so, would it be possible to disclose the hyperparameters used to train the released VAE, e.g., training epochs, learning rate, learning rate scaling, batch size, etc.?

@JingfengYao
Member

Thank you for your reminder.

Indeed, we mentioned this in Section 5.4: we employed a progressive training strategy. Specifically, we used a fixed batch size of 256 and a learning rate of 1e-4. In the early stage of training, we used a larger w_hyper = 0.5 and did not apply the margin to the losses. Subsequently, at the 100th and 115th epochs, we reduced w_hyper to 0.1 and activated the margin strategy, respectively.
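
In pseudocode, the schedule looks roughly like this (a minimal sketch; `w_hyper` and `margin_enabled` are illustrative names here, not the actual config keys in this repo):

```python
# Rough sketch of the progressive schedule described above.
# Variable names are illustrative, not the repository's actual config keys.
BATCH_SIZE = 256      # fixed throughout training
LEARNING_RATE = 1e-4  # fixed throughout training

def vf_loss_schedule(epoch: int):
    """Return (w_hyper, margin_enabled) for a given training epoch."""
    if epoch < 100:
        return 0.5, False   # early stage: larger w_hyper, no margin
    elif epoch < 115:
        return 0.1, False   # from epoch 100: reduce w_hyper to 0.1
    else:
        return 0.1, True    # from epoch 115: activate the margin strategy
```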

We will include a more detailed description of this part in the next version of the paper.

@txytju

txytju commented Jan 12, 2025

> We may consider releasing these experimental checkpoints in the future.

Looking forward to it!

@JingfengYao
Member

Hi, thanks for your attention.

We have released more VA-VAE experimental variants here. Hope you like them. 😊
