
Optimising ONNX Graph either takes too long or doesn't seem to work #109

Open
accountForIssues opened this issue Jul 4, 2022 · 4 comments

@accountForIssues

Using the GPT2 notebook, I am trying to convert a gpt2 model to an optimised ONNX graph, and I'm running into what seems to be random behaviour.

The export to ONNX works fine. However, while optimising the ONNX graph, I usually see warnings similar to this:
WARNING:symbolic_shape_infer:Cannot determine if Reshape_560_o0__d1 - sequence < 0
over and over again until I have to stop the kernel.

It did work once or twice (in the same environment) and took about 30 seconds, so I have no idea what changed.

I barely even changed the code. I'm just following the notebook.

What does the warning mean, and how can I get back to a stable optimisation?
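
For reference, the step that hangs is the graph optimisation. Standalone it is roughly equivalent to onnxruntime's transformer optimizer call below (a sketch from my side, not the exact notebook code; num_heads/hidden_size assume the base gpt2 checkpoint and "model.onnx" is a placeholder for the exported file):

    from onnxruntime.transformers.optimizer import optimize_model

    # Fuse and optimise the exported GPT-2 graph; this is the step where the
    # symbolic_shape_infer warnings appear and the kernel seems to loop.
    optimized = optimize_model(
        "model.onnx",      # placeholder path to the exported ONNX file
        model_type="gpt2",
        num_heads=12,      # base gpt2 checkpoint
        hidden_size=768,
        use_gpu=True,
    )
    optimized.save_model_to_file("model-optimized.onnx")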

@pommedeterresautee
Member

Which version of PyTorch are you using?

@accountForIssues
Author

accountForIssues commented Jul 4, 2022 via email

@pommedeterresautee
Member

I just reran the notebook and had no issue.
I imagine it's a dependency version issue.
The ones I would check are those related to ONNX and PyTorch, as they are the only two things involved in building the ONNX graph.

❯ pip list | grep onnx
onnx                      1.12.0
onnx-graphsurgeon         0.3.19
onnxconverter-common      1.9.0
onnxruntime-gpu           1.12.0
onnxruntime-tools         1.7.0
tf2onnx                   1.11.1
❯ pip list | grep torch
pytorch-quantization      2.1.2
torch                     1.11.0+cu113
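
If you want to narrow it down, the warning itself comes from onnxruntime's symbolic shape inference pass, so you could try running only that pass on the exported graph (a minimal sketch, with "model.onnx" standing in for your exported file):

    import onnx
    from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

    # Run only the shape inference pass on the exported (un-optimised) graph.
    # If this call alone prints the warnings and never returns, the problem is
    # in the exported graph / dependency versions rather than in the fusion step.
    model = onnx.load("model.onnx")
    inferred = SymbolicShapeInference.infer_shapes(model, auto_merge=True, guess_output_rank=True)
    onnx.save(inferred, "model-with-shapes.onnx")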

@pommedeterresautee pommedeterresautee added the bug Something isn't working label Jul 5, 2022
@pommedeterresautee pommedeterresautee self-assigned this Jul 5, 2022
@accountForIssues
Author

Maybe solved.

I created a new Docker image using the latest CUDA runtime and installed each package.

I can confirm that the latest torch causes this issue, but I remember getting this error with an older torch image as well, so I think another package could also be involved.

In any case, I will keep testing to see if it breaks again. Hopefully, you come across this as well when you update the Docker image and solve it :)
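
For now I'm just printing the versions at the top of the notebook so that runs that hang can be compared with runs that finish (nothing fancy, just a quick check):

    import onnx
    import onnxruntime
    import torch
    import transformers

    # Record the versions the kernel actually sees, to compare a hanging run
    # against a run that completes in ~30 seconds.
    for name, module in [("torch", torch), ("onnx", onnx),
                         ("onnxruntime", onnxruntime), ("transformers", transformers)]:
        print(f"{name}: {module.__version__}")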
