
GPT-J support #40

Open
JonathanLehner opened this issue Jan 5, 2022 · 5 comments

Comments

@JonathanLehner

How can we use GPT-J for inference?

@pommedeterresautee
Member

pommedeterresautee commented Jan 12, 2022

Work to support GPT-2/T5 architectures has just started (right now it's focused on encoder-only models). GPT-J follows the GPT-2 architecture, so hopefully it should be ready in the next few weeks.

@oborchers

@pommedeterresautee: I caution that this is probably more difficult than expected (currently trying to figure that out) due to
onnx/onnx-tensorrt#818 and maybe also https://forums.developer.nvidia.com/t/how-to-convert-a-large-pytorch-model-to-trt-model/182782/4

@pommedeterresautee
Member

@oborchers Indeed, the master branch of ORT currently has a bug with external data. An issue has been opened, and someone from Microsoft is on it.

@YarrDOpanas

Hi, everyone! @pommedeterresautee, are there any updates? I can't convert large models like GPT-J or GPT-Neo 2.7B, while gpt-2 and openai-gpt work correctly.
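For context on why the large models in particular fail: ONNX serializes a model as a single protobuf, and protobuf caps a file at 2 GiB, so anything bigger (like GPT-J's fp32 weights) must be exported with external data files (e.g. `use_external_data_format=True` in `torch.onnx.export`), which is exactly the code path where the ORT bug above bites. A rough back-of-the-envelope check (the constant and helper below are illustrative, not part of any library):

```python
# ONNX graphs are serialized as protobuf, which limits a single file to 2 GiB.
ONNX_PROTOBUF_LIMIT = 2 * 1024**3

def needs_external_data(num_params: int, bytes_per_param: int = 4) -> bool:
    """Rough check: do the raw weights alone exceed the 2 GiB protobuf cap?

    Assumes fp32 (4 bytes/param) by default and ignores graph overhead,
    so it slightly underestimates the real serialized size.
    """
    return num_params * bytes_per_param > ONNX_PROTOBUF_LIMIT

# gpt-2 (~124M params, ~0.5 GB fp32) fits in one file.
print(needs_external_data(124_000_000))    # False
# GPT-J (~6B params, ~24 GB fp32) does not.
print(needs_external_data(6_000_000_000))  # True
```

This is why gpt-2 and openai-gpt export cleanly while GPT-J and GPT-Neo 2.7B hit the external-data path.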

@pommedeterresautee
Member

Right now we are focusing more on Kernl / Triton, and plan (eventually, not now) to merge the two libraries.
