Language translation example added (#1131) #1240

Merged
merged 9 commits into from
Apr 2, 2024

Conversation

NoahSchiro
Contributor

Hello all, this PR seeks to add language translation to the repo as requested in issue #1131. It utilizes transformers and closely follows the "Attention Is All You Need" paper.

I understand there is a desire to reduce dependencies for the sake of simplicity; however, I had to use the spacy library. Currently, torchtext does not offer many language tokenizers, just basic English, so we need to lean on other tokenizers, which spacy provides. The word from the torchtext devs is that spacy is an "optional dependency" (source: pytorch/text#178), so I hope you will accept its necessity here in this example. The alternative would be to write tokenizers for every language we want to support, which I feel is antithetical to simplicity in these examples.
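
To illustrate the "optional dependency" idea, here is a minimal sketch, not the code from this PR. The helper name `make_tokenizer` is hypothetical. It assumes spaCy's `spacy.blank(lang)` pipeline (rule-based tokenization only, no model download) and falls back to naive whitespace splitting when spaCy is unavailable.

```python
from typing import Callable, List


def make_tokenizer(lang: str) -> Callable[[str], List[str]]:
    """Hypothetical helper: return a tokenizer for the language code `lang`.

    Prefers spaCy when it is installed; otherwise falls back to
    whitespace splitting, which is only a rough approximation.
    """
    try:
        import spacy  # optional dependency

        # A blank pipeline contains only the language's rule-based tokenizer.
        nlp = spacy.blank(lang)
        return lambda text: [token.text for token in nlp(text)]
    except Exception:
        # spaCy is missing or does not support `lang`: split on whitespace.
        return lambda text: text.split()


tokenize_de = make_tokenizer("de")
print(tokenize_de("Ein Hund läuft"))  # ['Ein', 'Hund', 'läuft']
```

The fallback keeps the example usable without spaCy, but it will not separate punctuation from words, so results on real sentences may differ between the two paths.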

Other than that, this example tries to rely heavily on the tools provided in torchtext to better showcase the library as well as get users familiar with transformers and how they work!

Please let me know if there is anything you'd like to see changed.


netlify bot commented Mar 24, 2024

Deploy Preview for pytorch-examples-preview canceled.

🔨 Latest commit: 15d4a09
🔍 Latest deploy log: https://app.netlify.com/sites/pytorch-examples-preview/deploys/660c23af7f827400081c9b99

@msaroufim
Member

Looks like the test failed; could you please take a look?

@NoahSchiro
Contributor Author

Hi! I think the issue was that the model (with default settings) is quite large. I'm thinking the test server was potentially running out of VRAM and crashed. In any case, I have passed the appropriate flags so that it is a lot smaller.

@NoahSchiro
Contributor Author

Thanks @msaroufim. I was going to fix that tonight after work.

In the future, how can I make sure my environment matches the testing environment so we don't have to troubleshoot here? The tests were working on my machine, but it seems my environment was different.

@msaroufim
Member

TBH I test in CI XD - but on your end you can turn https://github.com/pytorch/examples/blob/main/.github/workflows/main_python.yml into a shell script and use that, should be mostly fine

@msaroufim msaroufim merged commit 7df10c2 into pytorch:main Apr 2, 2024
7 checks passed