How to obtain the Instruments.index codebook files? #14

Open
mitao-cat opened this issue Sep 7, 2024 · 1 comment

Comments

@mitao-cat

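Hello authors, how were the .index files (the item code files) in your Google Drive obtained? I tried to reproduce the item codes by following the experimental settings section of the paper and the structure of this repository, using these steps: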

  1. Set plm_checkpoint on line 115 of amazon_text_emb.py to the Hugging Face model huggyllama/llama-7B and run it, producing dataset.emb-llama-td.npy
  2. Run run.sh to produce the RQ-VAE checkpoints (including best_loss_model.pth and best_collision_model.pth)
  3. Set line 45 of generate_indices.py to best_loss_model.pth and run it, producing dataset.index

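Unless otherwise noted, all of the steps above use the default parameters. The item code distribution generated this way differs from that of the codes you provide, and the final recommendation performance is lower. Which of the steps above should I change to get a codebook similar to the one in Google Drive? Thank you very much!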
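For anyone trying to quantify the gap described above, here is a minimal sketch of how two index files could be compared. It assumes (this is not confirmed anywhere in the thread) that a .index file is JSON mapping each item id to a list of code tokens, one token per RQ-VAE level. The helper names `load_index`, `collision_rate`, and `level_usage` are hypothetical and are not part of the repository.

```python
import json
from collections import Counter


def load_index(path):
    """Load an index file. ASSUMPTION: JSON of item id -> list of code tokens."""
    with open(path) as f:
        return json.load(f)


def collision_rate(index):
    """Return 1 - (distinct code sequences / number of items)."""
    total = len(index)
    if total == 0:
        return 0.0
    unique = len({tuple(codes) for codes in index.values()})
    return (total - unique) / total


def level_usage(index):
    """Count how often each code token is used at each level."""
    usage = []
    for codes in index.values():
        for level, code in enumerate(codes):
            if level == len(usage):
                usage.append(Counter())  # first time this level is seen
            usage[level][code] += 1
    return usage


# Toy data standing in for a real file (made up for illustration).
toy = {
    "0": ["<a_1>", "<b_2>"],
    "1": ["<a_1>", "<b_2>"],
    "2": ["<a_3>", "<b_2>"],
}
print(collision_rate(toy))          # two of three sequences are distinct
print(len(level_usage(toy)[0]))     # number of distinct codes used at level 0
```

Running `collision_rate` and `level_usage` on both the reproduced file and the downloaded one would show whether the difference lies in collisions, in how evenly codes are used per level, or both.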

@zhengbw0324
Collaborator

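@mitao-cat Hello!
In our experiments we did not strictly use best_loss_model.pth or best_collision_model.pth. Instead, we picked one of the last few checkpoints for index generation, taking both loss and collision into account. In addition, the current RQ-VAE implementation has changed somewhat from the original version (for example, it now uses an lr_scheduler during training), so the resulting codebook indeed cannot be exactly the same as the one in Google Drive.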
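The reply does not say exactly how loss and collision were weighed against each other. As one possible heuristic only, the sketch below ranks the last few checkpoints by loss and by collision rate separately, then picks the one with the best combined rank, breaking ties by lower collision rate. The function `pick_checkpoint`, the record format, and the checkpoint names are all hypothetical; the numbers are made up for illustration.

```python
def pick_checkpoint(records, last_k=5):
    """Pick a checkpoint name from the last `last_k` records.

    records: list of (name, loss, collision_rate) tuples in training order.
    ASSUMPTION: a rank-sum rule; the authors' actual criterion is not stated.
    """
    tail = records[-last_k:]
    # Rank 0 is best (lowest loss / lowest collision rate).
    loss_rank = {r[0]: i for i, r in enumerate(sorted(tail, key=lambda r: r[1]))}
    coll_rank = {r[0]: i for i, r in enumerate(sorted(tail, key=lambda r: r[2]))}
    best = min(tail, key=lambda r: (loss_rank[r[0]] + coll_rank[r[0]], r[2]))
    return best[0]


# Hypothetical training log: (checkpoint name, loss, collision rate).
log = [
    ("epoch_1.pth", 0.50, 0.10),
    ("epoch_2.pth", 0.30, 0.05),
    ("epoch_3.pth", 0.28, 0.02),
    ("epoch_4.pth", 0.27, 0.04),
]
print(pick_checkpoint(log, last_k=3))
```

A rank-sum avoids having to choose a weight between two quantities on different scales, but inspecting the candidates by hand, as the reply suggests the authors did, is equally reasonable.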
