Mixed precision conversion getting Assertion Error #89
Comments
Hi @ice-americano,
Yeah, when I was using T5EncoderModel from my checkpoint (3b), the script was able to convert (I was reshaping the encoder input ids before the forward call). However, the output tensor of the ONNX FP16 model was all NaN.
So basically, before calling the original forward, we reshape both input_ids and attention_mask.
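The wrapper is roughly of this shape (a minimal sketch with a placeholder reshape layout and model name, not our exact code):

```python
import torch
from transformers import T5EncoderModel


class ReshapingT5Encoder(torch.nn.Module):
    # Sketch only: the reshape target below is a placeholder, not our actual
    # pre/post-processing logic.
    def __init__(self, model_name: str = "t5-3b"):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(model_name)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        batch_size = input_ids.shape[0]
        # Flatten a (batch, chunks, seq_len) layout into (batch * chunks, seq_len)
        # before calling the original forward.
        input_ids = input_ids.reshape(-1, input_ids.shape[-1])
        attention_mask = attention_mask.reshape(-1, attention_mask.shape[-1])
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        # Reshape / re-pad back to the original batch layout after the original forward.
        return hidden.reshape(batch_size, -1, hidden.shape[-1])
```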
Are you using T5 3b with external data? If yes, retry with ORT compiled from master.
I was using T5-3b, so I just assumed the package would default to using external data, wouldn't it?
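My understanding (an assumption on my side, not something the package documents) is that protobuf caps a single .onnx file at 2 GB, so a 3b export has to store its weights as external data anyway; something like:

```python
import onnx
import onnxruntime as ort

# Assumption: a 3b model exceeds protobuf's 2 GB limit for a single .onnx file,
# so the weights must live in a separate external-data file next to the graph.
model = onnx.load("t5-3b-encoder.onnx")  # placeholder path
onnx.save_model(
    model,
    "t5-3b-encoder-ext.onnx",
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="t5-3b-encoder-ext.data",
)

# ORT then resolves the external-data file from the same directory as the .onnx file.
session = ort.InferenceSession("t5-3b-encoder-ext.onnx", providers=["CPUExecutionProvider"])
```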
Actually, I saw your issue on the ORT repo, so I built the ORT package from master two days ago. So ORT should be pretty up-to-date, just missing 2~3 commits. Also, I just decided to use the default encoder, and everything looks fine (including merging graphs with the If node) without converting them to FP16. Are you saying the FP16 model generating NaN is expected?
😎 Yeah I pulled from master after seeing that
Any clue on what I could be doing wrong?
You may want to increase the tolerance value.
Oh okay. The default value I was using was 100, so I might try 150, 200, etc.
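The check I have in mind is just the max absolute difference between the FP32 and FP16 outputs against that threshold; a rough sketch with made-up paths, input names, and shapes:

```python
import numpy as np
import onnxruntime as ort

# Placeholder paths, input names, and shapes, just to illustrate the comparison.
inputs = {
    "input_ids": np.random.randint(0, 32100, size=(1, 128), dtype=np.int64),
    "attention_mask": np.ones((1, 128), dtype=np.int64),
}

fp32_out = ort.InferenceSession("encoder-fp32.onnx", providers=["CPUExecutionProvider"]).run(None, inputs)[0]
fp16_out = ort.InferenceSession("encoder-fp16.onnx", providers=["CPUExecutionProvider"]).run(None, inputs)[0]

# The tolerance has to grow with model depth: every layer adds a bit of FP16 rounding error.
tolerance = 100.0  # the value discussed above
max_diff = np.abs(fp32_out - fp16_out.astype(np.float32)).max()
print("max abs diff:", max_diff)
assert max_diff < tolerance
```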
Just something that comes to mind: have you checked that your transformations are correctly exported in ONNX (by using the onnx lib or Netron)?
I think the FP32 checkpoint is exported correctly, since the generated tensor was equal to that of the torch model.
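For reference, a minimal version of that kind of check with the onnx lib (placeholder paths, not my exact script):

```python
from collections import Counter

import onnx

# Placeholder path; passing a path to check_model also handles external data.
onnx.checker.check_model("encoder-fp32.onnx")

# Load only the graph (skip the weights) to see which ops were actually exported,
# e.g. to confirm the custom reshape / re-padding ended up in the graph.
model = onnx.load("encoder-fp32.onnx", load_external_data=False)
print(Counter(node.op_type for node in model.graph.node).most_common(15))
```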
Just wondering, did you update the tolerance value? It has to be increased with the size of the model to take into account the rounding errors accumulating with the number of layers (there is no quality issue, these models are usually trained in FP16; it's just required to compare with FP32).
As mentioned in the comments above, I was using the default value.
We have completely rewritten this stuff. Can you recheck?
Sorry, I was busy 😅 Yeah, I was going to try again, but just found the new notebook.
TBH, I think you will be disappointed (at least we were): the precision is very low. I haven't tried, but I would expect it to show in the accuracy measures whatever your task. Our understanding is that it's supposed to be used with mixed precision in mind, so you still have some casting here and there, just none for out-of-range values.
Firstly, thanks for this deployment package! @pommedeterresautee
I was trying to optimize our fine-tuned T5 model (with modified pre- and post-forward code) using your recent T5 optimization notebook.
When converting the encoder to mixed precision, I was getting an assertion error:
`Graph is not a DAG`
Here is the snippet for conversion:
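(The original snippet did not survive here; below is a rough stand-in using the generic onnxconverter-common FP16 helper rather than the notebook's exact conversion code, with placeholder paths.)

```python
import onnx
from onnxconverter_common import float16

# Generic FP16 conversion as a stand-in for the notebook helper; paths are placeholders.
model_fp32 = onnx.load("encoder-fp32.onnx")

model_fp16 = float16.convert_float_to_float16(
    model_fp32,
    keep_io_types=True,        # keep FP32 graph inputs/outputs, insert casts inside the graph
    disable_shape_infer=True,  # default shape inference can struggle on large graphs
)

onnx.save_model(model_fp16, "encoder-fp16.onnx")
```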
Stacktrace:
I installed transformer-deploy from the most recent repo as well. Could it be happening due to our modified forward call, which just involves reshaping and re-padding after the original forward?