Hyperparams for results reported in the paper #3

pclucas14 · 2018-10-11T15:58:50Z

Hello,

is it possible to have the hyperparameter values that reproduce the NVRNN and LM results on the PTB dataset?

Many thanks,
Lucas

jiacheng-xu · 2018-10-11T18:25:58Z

Hi Lucas,
thanks for your interest. I am sorry that I didn't put all the commands on the GitHub page: there are too many tables, and it is hard to keep track of the configuration behind every result.
Here is some of my intuition about training on PTB: 1) use a large learning rate with decay (--lr 10, for example); 2) train longer with SGD (--epochs 100); 3) use gradient clipping and dropout. The PyTorch word language model example provides a very good configuration for training an LM.
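
As a rough illustration of point 1, here is a minimal sketch of a learning rate that starts large and decays. The thread does not say which decay rule was used, so the rule below (divide by a fixed factor whenever validation loss fails to improve) and the factor of 4 are assumptions, and `anneal_schedule` is a hypothetical helper, not part of the repo.

```python
def anneal_schedule(val_losses, lr=10.0, factor=4.0):
    """Return the learning rate used in each epoch.

    Assumed rule: keep the rate while validation loss improves,
    otherwise divide it by `factor` for the following epochs.
    """
    best = float("inf")
    rates = []
    for loss in val_losses:
        rates.append(lr)      # rate that was used for this epoch
        if loss < best:
            best = loss       # improvement: keep the current rate
        else:
            lr /= factor      # no improvement: decay the rate
    return rates


# Made-up validation losses, only to show the shape of the schedule.
print(anneal_schedule([5.2, 4.9, 5.0, 4.8, 4.85]))
# → [10.0, 10.0, 10.0, 2.5, 2.5]
```

Starting at 10 only works together with gradient clipping, which is why points 1 and 3 go hand in hand.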

Below are the instance names of the saved results. You can probably reproduce the results from the hyperparameters encoded in them.
Detailed Configuration:
Note: zero = LM; nor = Gaussian; vmf = von Mises-Fisher
Standard Setting:

Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa35.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_cdbit0_cdbow0
Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa5.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_inpzTrue_cdbit0_cdbow200

Inputless Setting
Conditioned on bag-of-words:

Dataptb_Distzero_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200
Dataptb_Distnor_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200
Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa50.0_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow200

Not conditioned on bag-of-words:

Dataptb_Distzero_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
Dataptb_Distnor_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat100_lr10.0_drop0.5_kappa0.1_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_lr10.0_drop0.5_kappa80.0_auxw0.0001_normfFalse_nlay1_mixunk1.0_inpzTrue_cdbit0_cdbow0
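
The instance names above pack each setting as a prefix followed by its value, joined by underscores. A minimal sketch of splitting one back into a dictionary might look like the following. `parse_instance_name` is a hypothetical helper, the prefix list is read off the names in this thread, and how each key maps to a command-line flag is not stated here (only --lr and --epochs are mentioned), so that mapping is left out.

```python
import re

# Prefixes as they appear in the instance names listed in this thread.
PREFIXES = ["Data", "Dist", "Model", "Emb", "Hid", "lat", "lr", "drop",
            "kappa", "auxw", "normf", "nlay", "mixunk", "inpz",
            "cdbit", "cdbow"]


def _convert(value):
    """Turn 'True'/'False' into bool and numeric strings into int or float."""
    if value in ("True", "False"):
        return value == "True"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_instance_name(name):
    """Split an instance name into a {prefix: value} dictionary."""
    config = {}
    for token in name.split("_"):
        # 'EnclstmBiFalse' holds two settings in a single token.
        enc = re.fullmatch(r"Enc(.+)Bi(True|False)", token)
        if enc:
            config["Enc"] = enc.group(1)
            config["Bi"] = enc.group(2) == "True"
            continue
        for prefix in PREFIXES:
            if token.startswith(prefix):
                config[prefix] = _convert(token[len(prefix):])
                break
        else:
            raise ValueError(f"unrecognised token: {token!r}")
    return config


name = ("Dataptb_Distvmf_Modelnvrnn_EnclstmBiFalse_Emb100_Hid400_lat50_"
        "lr10.0_drop0.5_kappa35.0_auxw0.0001_normfFalse_nlay1_mixunk0.0_"
        "inpzTrue_cdbit0_cdbow0")
config = parse_instance_name(name)
print(config["Dist"], config["lr"], config["kappa"], config["cdbow"])
```

Comparing two parsed dictionaries is a quick way to see which settings differ between runs, for example kappa and cdbow between the two standard-setting names.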

pclucas14 · 2018-10-11T18:44:06Z

Hi,

thanks for the fast response, and for your insight on PTB! Is it also possible to get the LM parameters in the standard setting?

jiacheng-xu · 2018-10-11T18:45:57Z

> Hi,
>
> thanks for the fast response, and for your insight on PTB! Is it also possible to get the LM parameters in the standard setting?

The configuration of the word language model example will help.

pclucas14 · 2018-10-11T18:50:09Z

great! thanks again

jiacheng-xu added the question label (Further information is requested) Oct 11, 2018

jiacheng-xu closed this as completed Oct 11, 2018

jiacheng-xu reopened this Oct 12, 2018

dongqian0206 mentioned this issue Jan 23, 2019

About Implementation #7

Closed

thequilo mentioned this issue Mar 9, 2019

Reconstruct Results / Implementation Details #8

Open
