[ssl/bestrq] happy ending---stable training #2614

Merged: 2 commits merged on Aug 16, 2024

Conversation

Mddct (Collaborator) commented Aug 16, 2024

No description provided.

Mddct marked this pull request as ready for review August 16, 2024 11:05
Mddct merged commit 2adf651 into main Aug 16, 2024
6 checks passed
Mddct deleted the Mddct-bestrq-happy-ending branch August 16, 2024 14:15
ncakhoa commented Aug 19, 2024

@Mddct Can you explain why, when stacking the fbank features, you changed from normalizing over the frequency dimension to normalizing over the time dimension? Please also provide some training results comparing the two approaches. I think normalizing over the frequency dimension is more correct than normalizing over the time dimension, since the model wants to learn local distributions.

Mddct (Collaborator, Author) commented Aug 19, 2024

> @Mddct Can you explain why, when stacking the fbank features, you changed from normalizing over the frequency dimension to normalizing over the time dimension? Please also provide some training results comparing the two approaches. I think normalizing over the frequency dimension is more correct than normalizing over the time dimension, since the model wants to learn local distributions.

Thank you for your attention

[Image: Screenshot 2024-08-19 11 55 32]

The input is normalized only to obtain a richer set of code IDs. Here the stacked input is normalized along dim=T to mimic how CMVN would be computed on the stacked input, and experiments show that this works better than normalizing along d: it yields a larger proportion of unique codes.
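
For readers following the thread, the two options can be illustrated with a minimal sketch. This is not the code from this PR: it uses plain Python lists instead of tensors, all function names (`stack_frames`, `norm_over_time`, `norm_over_features`, `unique_code_ratio`) are hypothetical, the input is synthetic noise, and the quantizer is a simplified random-projection, nearest-codeword lookup. It assumes "dim=T" means statistics are taken per feature dimension across frames (CMVN-like) and "d" means statistics are taken per frame across feature dimensions.

```python
import random
import statistics


def stack_frames(feats, stack):
    """Concatenate every `stack` consecutive frames: [T][D] -> [T // stack][D * stack]."""
    usable = len(feats) - len(feats) % stack  # drop the incomplete tail
    return [
        [x for frame in feats[i:i + stack] for x in frame]
        for i in range(0, usable, stack)
    ]


def standardize(values, eps=1e-5):
    """Scale one sequence of floats to zero mean and (almost) unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / (std + eps) for v in values]


def norm_over_features(stacked):
    """Normalize each frame on its own: statistics over the feature axis (d)."""
    return [standardize(frame) for frame in stacked]


def norm_over_time(stacked):
    """Normalize each feature dimension across frames: statistics over T (CMVN-like)."""
    columns = [standardize(col) for col in zip(*stacked)]
    return [list(frame) for frame in zip(*columns)]


def unit(vec):
    """L2-normalize a vector (left unchanged if it is all zeros)."""
    length = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / length for v in vec]


def unique_code_ratio(frames, projection, codebook):
    """Project each frame, pick the most similar codeword, return unique ids / frames."""
    ids = set()
    for frame in frames:
        z = unit([sum(w * x for w, x in zip(row, frame)) for row in projection])
        ids.add(max(range(len(codebook)),
                    key=lambda k: sum(a * b for a, b in zip(z, codebook[k]))))
    return len(ids) / len(frames)


# Synthetic stand-in for fbank features; every size below is an arbitrary assumption.
rng = random.Random(0)
T, D, STACK, PROJ_DIM, NUM_CODES = 200, 8, 4, 16, 64
feats = [[rng.gauss(d, 1.0 + d) for d in range(D)] for _ in range(T)]
stacked = stack_frames(feats, STACK)
projection = [[rng.gauss(0, 1) for _ in range(D * STACK)] for _ in range(PROJ_DIM)]
codebook = [unit([rng.gauss(0, 1) for _ in range(PROJ_DIM)]) for _ in range(NUM_CODES)]

time_normed = norm_over_time(stacked)
feat_normed = norm_over_features(stacked)
print("unique-code ratio, norm over T:", unique_code_ratio(time_normed, projection, codebook))
print("unique-code ratio, norm over d:", unique_code_ratio(feat_normed, projection, codebook))
```

The ratios printed for this toy input say nothing about real speech data; the sketch only shows which axis each normalization takes its statistics over and how a unique-code proportion could be measured.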

ncakhoa commented Aug 19, 2024

Can you show the proportion of unique codes for the two approaches? Also, have you tried them on a downstream model?

Mddct (Collaborator, Author) commented Aug 19, 2024

> Can you show the proportion of unique codes for the two approaches? Also, have you tried them on a downstream model?

Yes, some promising results will be released soon.
