Hyperparams for results reported in the paper #3

Open
pclucas14 opened this issue Oct 11, 2018 · 4 comments
Labels
question Further information is requested

Comments

@pclucas14

Hello,

Is it possible to get the hyperparameter values that reproduce the NVRNN and LM results on the PTB dataset?

Many thanks,
Lucas

@jiacheng-xu
Owner

Hi Lucas,
Thanks for your interest. I am sorry that I didn’t put all the commands on the GitHub page: there are too many tables, and it’s hard to keep track of the configuration behind every result.
Here are some of my intuitions about training on PTB: 1) use a large learning rate with decay (--lr 10, for example); 2) train longer with SGD (--epochs 100); 3) use gradient clipping and dropout. The PyTorch word language model example provides an excellent configuration for training an LM.
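As a rough illustration of points 1) and 3), here is a minimal pure-Python sketch of decaying the learning rate when validation loss stops improving, plus clipping gradients by their global norm. The function names, the decay factor of 4, the clipping threshold of 0.25, and the validation losses are all illustrative assumptions, not values taken from this repo.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total == 0.0 or total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


def anneal_lr(lr, val_loss, best_val_loss, factor=4.0):
    """Return (lr, best_val_loss), dividing lr by `factor` when val_loss did not improve."""
    if best_val_loss is None or val_loss < best_val_loss:
        return lr, val_loss
    return lr / factor, best_val_loss


# Start from the large learning rate suggested above (--lr 10).
lr, best = 10.0, None
history = []
for val_loss in [5.2, 4.9, 5.0, 4.8, 4.85]:  # made-up validation losses
    lr, best = anneal_lr(lr, val_loss, best)
    history.append(lr)
print(history)  # [10.0, 10.0, 2.5, 2.5, 0.625]

# A gradient of norm 5 is scaled down to the assumed threshold of 0.25.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=0.25)
```

In a real PyTorch training loop the clipping step would normally be done with the library's own gradient-clipping utility rather than by hand; the sketch only shows the arithmetic.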

Below are the instance names of the saved results. You can probably reproduce the results from the hyperparameters given in each name.
Detailed Configuration:
Note: zero=lm; nor=gaussian; vmf=von Mises-Fisher
Standard Setting:

  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa35.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_cdbit0_cdbow0
  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa5.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_cdbit0_cdbow200

Inputless Setting:
Conditioned on bag-of-words:

  • Dataptb_Distzero_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200
  • Dataptb_Distnor_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200
  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa50.0_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200

Not conditioned on bag-of-words:

  • Dataptb_Distzero_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
  • Dataptb_Distnor_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
  • Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa80.0_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
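To make the instance names above easier to compare, a small parser along these lines could split a name into its fields. This is only a sketch: the prefix list is read off the names in this thread, `parse_instance_name` is a hypothetical helper that is not part of the repo, and the split of `EnclstmBiFalse` into an encoder type plus a bidirectional flag is my reading of the name. How each field maps to a command-line flag is not stated here, so the sketch keeps the prefixes as keys.

```python
import re

# Field prefixes as they appear in the instance names listed in this thread.
PREFIXES = ["Data", "Dist", "Model", "Emb", "Hid", "lat", "lr", "drop",
            "kappa", "auxw", "normf", "nlay", "mixunk", "inpz", "cdbit", "cdbow"]


def _convert(value):
    """Turn 'True'/'False' into bool and numeric strings into int or float."""
    if value in ("True", "False"):
        return value == "True"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_instance_name(name):
    """Split an underscore-separated instance name into a {prefix: value} dict."""
    fields = {}
    for token in name.split("_"):
        # Assumed reading: 'EnclstmBiFalse' = encoder type 'lstm', bidirectional False.
        enc = re.fullmatch(r"Enc(.+)Bi(True|False)", token)
        if enc:
            fields["Enc"] = enc.group(1)
            fields["Bi"] = enc.group(2) == "True"
            continue
        # Try longer prefixes first so a short prefix never shadows a longer one.
        for prefix in sorted(PREFIXES, key=len, reverse=True):
            if token.startswith(prefix):
                fields[prefix] = _convert(token[len(prefix):])
                break
        else:
            raise ValueError(f"unrecognised field: {token!r}")
    return fields


name = ("Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_"
        "drop0.5_kappa35.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_"
        "cdbit0_cdbow0")
config = parse_instance_name(name)
```

Parsing two names and comparing the resulting dicts is a quick way to see which fields differ between runs, for example `kappa` and `cdbow` between the two Standard Setting entries.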

@jiacheng-xu jiacheng-xu added the question Further information is requested label Oct 11, 2018
@pclucas14
Author

Hi,

Thanks for the fast response, and for your insight on PTB! Is it also possible to get the LM parameters in the standard setting?

@jiacheng-xu
Owner

The configuration of the word language model example will help.

@pclucas14
Author

Great! Thanks again.
