Why not output samples from student to teacher? #12

neverjoe · 2018-03-27T07:31:58Z

Line 259 in 0913964

    
           cross_entropy = discretized_mix_logistic_loss(target_distribution, student_samples)

neverjoe · 2018-03-27T07:33:06Z

I think target_distribution should be computed from the student's samples, i.e. by feeding the samples output by the student into the teacher.

vincentherrmann · 2018-03-27T07:58:29Z

I don't quite understand what you mean. Do you think we should sample many inputs for the teacher network from the mu and s outputs of the student network? Then we would have to evaluate the whole teacher network multiple times, which would be very computationally expensive. Also, if we sample the student output, we lose the conditioning on the previous time samples, so I don't think it makes sense. The mu and s outputs of the student network exist only to compare the distributions of the student and teacher networks.

neverjoe · 2018-03-27T08:05:41Z

I get your point and I have the same worry, but the paper says we need to estimate the teacher and student distributions by sampling. By the way, target_distribution and student_samples have different shapes. Is that a bug? Have you gotten any reasonable results?

vincentherrmann · 2018-03-27T08:14:53Z

The paper says that x = g(z), where z is the input noise. I think the whole point of equations (9)-(13) in the paper is to save us from having to evaluate the teacher network multiple times.
The target_distribution is a parameterization of the teacher distribution, and student_samples are multiple samples from the student distribution, so they should have different shapes. For me it seems to work reasonably well, although the output of the parallel wavenet is noisier than that of the original one (I haven't implemented the additional loss terms yet, so that might help).
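
To illustrate why the two arguments differ in shape, here is a minimal pure-Python sketch. It is not the repository's code. It assumes a single continuous logistic per time step instead of the discretized mixture handled by `discretized_mix_logistic_loss`, and all function names are hypothetical. The teacher contributes one `(mu, s)` pair per time step, while the student contributes `n_samples` draws per time step. The cross-entropy is estimated as the average negative teacher log-density over those draws, so the teacher parameters are computed only once.

```python
import math
import random


def sample_logistic(mu, s, rng):
    """Draw one sample from Logistic(mu, s) via the inverse CDF."""
    u = rng.random()
    # Clamp away from 0 and 1 so the logit stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mu + s * (math.log(u) - math.log1p(-u))


def logistic_log_pdf(x, mu, s):
    """Log-density of Logistic(mu, s) at x.

    The density is symmetric around mu, so |z| can be used for stability.
    """
    a = abs((x - mu) / s)
    return -a - math.log(s) - 2.0 * math.log1p(math.exp(-a))


def cross_entropy_per_timestep(student_params, teacher_params,
                               n_samples=64, seed=0):
    """Monte Carlo estimate of H(p_student, p_teacher) at each time step.

    student_params, teacher_params: lists of (mu, s) pairs, one per time step.
    Returns a list with one cross-entropy estimate per time step.
    """
    rng = random.Random(seed)
    estimates = []
    for (mu_s, s_s), (mu_t, s_t) in zip(student_params, teacher_params):
        total = 0.0
        for _ in range(n_samples):
            # Several student draws per step, one teacher parameterization.
            x = sample_logistic(mu_s, s_s, rng)
            total -= logistic_log_pdf(x, mu_t, s_t)
        estimates.append(total / n_samples)
    return estimates


if __name__ == "__main__":
    student = [(0.0, 1.0), (0.5, 0.2)]   # hypothetical student outputs
    teacher = [(0.0, 1.0), (0.4, 0.3)]   # hypothetical teacher outputs
    print(cross_entropy_per_timestep(student, teacher, n_samples=1000))
```

When the student and teacher parameters coincide, the estimate approaches the entropy of the logistic distribution, ln(s) + 2, which gives a quick sanity check for such an estimator.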

neverjoe · 2018-03-27T08:19:20Z

Great! I think the power loss and the contrastive loss are very important for good voice quality.

neverjoe · 2018-03-27T08:55:17Z

Can you show me your loss plot? My loss hasn't converged for days.
