
ERROR: expected number of inputs between 1 and 3 but got 9 inputs for model #38

Open
samzong opened this issue Apr 15, 2024 · 8 comments

Comments

@samzong

samzong commented Apr 15, 2024

{"timestamp":"2024-04-15T05:20:55.796456Z","level":"ERROR","error":"AppError(error message received from triton: [request id: <id_unknown>] expected number of inputs between 1 and 3 but got 9 inputs for model 'myserving')","target":"openai_trtllm::routes::completions","span":{"headers":"{"host": "localhost:3030", "user-agent": "OpenAI/Python 1.17.1", "content-length": "55", "accept": "application/json", "accept-encoding": "gzip, deflate", "authorization": "Bearer test", "content-type": "application/json", "x-stainless-arch": "arm64", "x-stainless-async": "false", "x-stainless-lang": "python", "x-stainless-os": "MacOS", "x-stainless-package-version": "1.17.1", "x-stainless-runtime": "CPython", "x-stainless-runtime-version": "3.10.5"}","name":"non-streaming completions"},"spans":[{"http.request.method":"POST","http.route":"/v1/completions","network.protocol.version":"1.1","otel.kind":"Server","otel.name":"POST /v1/completions","server.address":"localhost:3030","span.type":"web","url.path":"/v1/completions","url.scheme":"","user_agent.original":"OpenAI/Python 1.17.1","name":"HTTP request"},{"headers":"{"host": "localhost:3030", "user-agent": "OpenAI/Python 1.17.1", "content-length": "55", "accept": "application/json", "accept-encoding": "gzip, deflate", "authorization": "Bearer test", "content-type": "application/json", "x-stainless-arch": "arm64", "x-stainless-async": "false", "x-stainless-lang": "python", "x-stainless-os": "MacOS", "x-stainless-package-version": "1.17.1", "x-stainless-runtime": "CPython", "x-stainless-runtime-version": "3.10.5"}","name":"completions"},{"headers":"{"host": "localhost:3030", "user-agent": "OpenAI/Python 1.17.1", "content-length": "55", "accept": "application/json", "accept-encoding": "gzip, deflate", "authorization": "Bearer test", "content-type": "application/json", "x-stainless-arch": "arm64", "x-stainless-async": "false", "x-stainless-lang": "python", "x-stainless-os": "MacOS", "x-stainless-package-version": "1.17.1", "x-stainless-runtime": "CPython", "x-stainless-runtime-version": "3.10.5"}","name":"non-streaming completions"}]}

Reproduced with client/openai_completion.py.
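
For reference, a minimal sketch of the failing call, roughly what client/openai_completion.py does (endpoint, API key, and model name taken from the log above; the prompt is a placeholder):

from openai import OpenAI

# Proxy endpoint and credentials from the log above; the prompt is hypothetical.
client = OpenAI(base_url="http://localhost:3030/v1", api_key="test")

completion = client.completions.create(
    model="myserving",           # Triton model name behind openai_trtllm
    prompt="Hello, my name is",  # placeholder prompt
    max_tokens=16,
)
print(completion.choices[0].text)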

@samzong
Author

samzong commented Apr 15, 2024

I think I know the problem: my Triton backend uses Triton with the vLLM backend.

Is there a plan to support it?

@npuichigo
Owner

It's not planned yet, but I think it's trivial to adapt the code for your use case.

@samzong
Author

samzong commented Apr 15, 2024

> It's not planned yet, but I think it's trivial to adapt the code for your use case.

Do you have any suggestions? I can try to implement it, and if it works I can contribute this part of the code.

@npuichigo
Owner

Can you show how to call the vLLM-based Triton backend? For example, the gRPC interface and the parameters used to call the service.

@samzong
Author

samzong commented Apr 16, 2024

Okay, @npuichigo, you can see an example here:

https://github.com/triton-inference-server/vllm_backend/blob/a01475157290bdf6fd0f50688f69aafea41b04c5/samples/client.py#L192

import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-m",
        "--model",
        type=str,
        required=False,
        default="vllm_model",
        help="Model name",
    )
    parser.add_argument(
        "-v",
        "--verbose",
        action="store_true",
        required=False,
        default=False,
        help="Enable verbose output",
    )
    parser.add_argument(
        "-u",
        "--url",
        type=str,
        required=False,
        default="localhost:8001",
        help="Inference server URL and its gRPC port. Default is localhost:8001.",
    )
    parser.add_argument(
        "-t",
        "--stream-timeout",
        type=float,
        required=False,
        default=None,
        help="Stream timeout in seconds. Default is None.",
    )
    parser.add_argument(
        "--offset",
        type=int,
        required=False,
        default=0,
        help="Add offset to request IDs used",
    )
    parser.add_argument(
        "--input-prompts",
        type=str,
        required=False,
        default="prompts.txt",
        help="Text file with input prompts",
    )
    parser.add_argument(
        "--results-file",
        type=str,
        required=False,
        default="results.txt",
        help="The file with output results",
    )
    parser.add_argument(
        "--iterations",
        type=int,
        required=False,
        default=1,
        help="Number of iterations through the prompts file",
    )
    parser.add_argument(
        "-s",
        "--streaming-mode",
        action="store_true",
        required=False,
        default=False,
        help="Enable streaming mode",
    )
    parser.add_argument(
        "--exclude-inputs-in-outputs",
        action="store_true",
        required=False,
        default=False,
        help="Exclude prompt from outputs",
    )
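
For context, the argparse block above is only the sample's CLI; the request itself is built further down in that file. A minimal sketch of the gRPC call the vLLM backend expects, assuming the input/output names from the sample (text_input, stream, sampling_parameters, text_output), a placeholder model name, and the streaming gRPC API (the vLLM backend runs in decoupled mode):

import json
from functools import partial

import numpy as np
import tritonclient.grpc as grpcclient

responses = []

def callback(collector, result, error):
    # Collect either the decoded text_output tensor or the error.
    collector.append(error if error else result.as_numpy("text_output"))

client = grpcclient.InferenceServerClient(url="localhost:8001")

# The three inputs the vLLM backend accepts (hence "between 1 and 3"):
text_input = grpcclient.InferInput("text_input", [1], "BYTES")
text_input.set_data_from_numpy(
    np.array(["Hello, my name is".encode("utf-8")], dtype=np.object_)
)

stream = grpcclient.InferInput("stream", [1], "BOOL")
stream.set_data_from_numpy(np.array([False], dtype=bool))

sampling_parameters = grpcclient.InferInput("sampling_parameters", [1], "BYTES")
sampling_parameters.set_data_from_numpy(
    np.array(
        [json.dumps({"temperature": 0.7, "max_tokens": 64}).encode("utf-8")],
        dtype=np.object_,
    )
)

client.start_stream(callback=partial(callback, responses))
client.async_stream_infer(
    model_name="vllm_model",  # placeholder; use your deployed model name
    inputs=[text_input, stream, sampling_parameters],
    outputs=[grpcclient.InferRequestedOutput("text_output")],
)
client.stop_stream()  # closes the stream and drains remaining responses

for r in responses:
    print(r)

openai_trtllm currently sends the nine tensors the TensorRT-LLM ensemble expects, which is presumably why the vLLM model rejects the request with the error above.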

@liyan77

liyan77 commented Jul 5, 2024

My Triton backend also uses Triton with vLLM. Is there a plan to support it?

@crslen

crslen commented Jul 28, 2024

It would be great if the vLLM option were supported.

@ChaseDreamInfinity

I made some changes to support the vLLM backend:
https://github.com/ChaseDreamInfinity/openai_triton_vllm
