Support MADLAD-400 - multilingual machine translation model based on the T5 architecture #1560
Comments
I think @Ehsan-Jahanbakhsh converted it using the T5 template. How did it go?
Closing then; this will be fine for the next release.
@Ehsan-Jahanbakhsh How do I use MADLAD-400 with CTranslate2? I converted the model and ran it with:
Result:
I would be grateful for any tips. Greetings!
See this.
Same here. Conversion:
Test code:
Result:
config.json:
I was not able to reproduce these results. Please run the inference on CPU with float32 or int8 and see if the problem persists.
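For anyone repeating this check, here is a minimal sketch of how the precision can be switched at load time. It assumes the model was already converted to a local directory and that the tokenizer is a Hugging Face tokenizer for the same checkpoint. The names `load_translator` and `translate` are hypothetical helpers, and the `<2xx>` target-language prefix follows the MADLAD-400 model card rather than anything stated in this thread.

```python
def load_translator(model_dir, device="cpu", compute_type="int8"):
    """Load a converted model with an explicit device and compute type."""
    # Imported lazily so translate() below can be exercised without the library.
    import ctranslate2

    return ctranslate2.Translator(model_dir, device=device, compute_type=compute_type)


def translate(translator, tokenizer, text, target_lang="de"):
    """Translate one sentence with a T5-style model through CTranslate2."""
    # Assumption: MADLAD-400 selects the target language with a "<2xx>" prefix.
    prompt = f"<2{target_lang}> {text}"
    # CTranslate2 works on token strings, not on token ids.
    input_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
    results = translator.translate_batch([input_tokens])
    output_tokens = results[0].hypotheses[0]
    output_ids = tokenizer.convert_tokens_to_ids(output_tokens)
    return tokenizer.decode(output_ids, skip_special_tokens=True)
```

Calling `load_translator(model_dir, device="cpu", compute_type="float32")` and then the same with `compute_type="int8"` or `device="cuda", compute_type="float16"` should make it easy to compare outputs for the same input sentence.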
@Ehsan-Jahanbakhsh int8 CPU is ok, result:
Do you have any idea what might be causing it? Maybe something with the conversion? Can you share the models you converted? Thanks ✌️
I don't have the means, but it would be cool if someone tested float32 on a GPU.
@Ehsan-Jahanbakhsh float32 on GPU is ok, result:
So I guess there is something wrong with the float16 conversion. Any idea?
It's a known issue with T5 models. Search for it and you will find discussions on this.
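The thread does not say which T5 issue is meant. A commonly reported one is that some activations exceed the float16 range and overflow, and the sketch below assumes that this is the cause here. It uses only the standard library to show what happens to a value once it passes the largest finite half-precision number. `to_float16` is a hypothetical helper, not part of CTranslate2.

```python
import math
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_float16(x):
    """Round-trip x through half precision, saturating to +/-inf on overflow."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack out-of-range values; hardware would yield inf.
        return math.copysign(math.inf, x)


# A small activation survives, a large one becomes inf.
activations = [to_float16(v) for v in (12.5, 70000.0)]

# Once inf is in the tensor, later arithmetic such as inf - inf produces NaN,
# which is one way a decoder can end up emitting garbage tokens.
poisoned = activations[1] - activations[1]
```

float32 and int8 runs avoid this particular failure, which would be consistent with the results reported above.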
OK, I'll look around. Thanks.
@vince62s thanks for that ✌️
Hi Team,
Please consider adding support for the models in this collection: https://huggingface.co/collections/jbochi/madlad-400-65491e6a78726cac9a4b84b7
Short description:
MADLAD-400 is a multilingual machine translation model based on the T5 architecture that was trained on 250 billion tokens covering over 450 languages using publicly available data. It is competitive with models that are significantly larger.
Paper: https://huggingface.co/papers/2309.04662
Thank you very much for your work. Best regards 👍 🥇