
GPT-J support #40

Open
JonathanLehner opened this issue Jan 5, 2022 · 5 comments

Comments

@JonathanLehner

How can we use GPT-J for inference?

@pommedeterresautee
Member

pommedeterresautee commented Jan 12, 2022

Work to support GPT-2/T5 architectures has just started (right now it's focused on encoder-only models). GPT-J follows the GPT-2 architecture, so hopefully it should be ready in the next few weeks.

@oborchers

@pommedeterresautee: I caution that this is probably more difficult than expected (currently trying to figure that out) due to
onnx/onnx-tensorrt#818 and maybe also https://forums.developer.nvidia.com/t/how-to-convert-a-large-pytorch-model-to-trt-model/182782/4

@pommedeterresautee
Member

@oborchers Indeed, the master branch of ORT currently has a bug with external data. An issue has been opened, and someone from Microsoft is on it.

@YarrDOpanas

Hi, everyone! @pommedeterresautee, are there any updates? I can't convert large models like GPT-J or GPT-Neo 2.7B, while gpt-2 and openai-gpt work correctly.
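For context on why the large models in particular fail: ONNX serializes a model as a single protobuf, and protobuf caps a file at 2 GiB, so anything bigger (like GPT-J's fp32 weights) must be exported with external data files (e.g. `use_external_data_format=True` in `torch.onnx.export`), which is exactly the code path where the ORT bug above bites. A rough back-of-the-envelope check (the constant and helper below are illustrative, not part of any library):

```python
# ONNX graphs are serialized as protobuf, which limits a single file to 2 GiB.
ONNX_PROTOBUF_LIMIT = 2 * 1024**3

def needs_external_data(num_params: int, bytes_per_param: int = 4) -> bool:
    """Rough check: do the raw weights alone exceed the 2 GiB protobuf cap?

    Assumes fp32 (4 bytes/param) by default and ignores graph overhead,
    so it slightly underestimates the real serialized size.
    """
    return num_params * bytes_per_param > ONNX_PROTOBUF_LIMIT

# gpt-2 (~124M params, ~0.5 GB fp32) fits in one file.
print(needs_external_data(124_000_000))    # False
# GPT-J (~6B params, ~24 GB fp32) does not.
print(needs_external_data(6_000_000_000))  # True
```

This is why gpt-2 and openai-gpt export cleanly while GPT-J and GPT-Neo 2.7B hit the external-data path.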

@pommedeterresautee
Member

Right now we are focusing more on Kernl / Triton, and plan (eventually, not now) to merge the two libraries.
