
[Bug] Performance benchmarks for FaqGen / DocSum are calculated incorrectly (we see negative numbers and very high values) #189

Open
amikoai opened this issue Nov 5, 2024 · 1 comment

amikoai commented Nov 5, 2024

We are adapting OPEA applications for the AMD platform and have run into an issue: the eval tests report negative values for Input Tokens per Second and for Input Tokens, while Tokens per Second is implausibly high (~25K). We use the TGI LLM engine.
Our process:
From evals/benchmark/ we modify benchmark.yaml (screenshot attached):

[Screenshot: benchmark.yaml settings, 2024-11-05 17:44]

Main settings (see the sketch below):
- examples is set to ["faqgen"].
- deployment_type should be set to "docker", since we deploy via Docker.
- service_port is the backend port where the service API is exposed; it can be found in the Docker Compose output when the service starts. We currently use 18881 for the GPU deployment and 19888 for the CPU deployment.
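For reference, here is a minimal sketch of how those fields would look in benchmark.yaml. Since the screenshot is not reproduced here, the surrounding section name (test_suite_config) and the service_ip value are assumptions, not a copy of our file:

```yaml
test_suite_config:            # assumed section name; check your benchmark.yaml
  examples: ["faqgen"]        # run only the FaqGen benchmark
  deployment_type: "docker"   # deployment via Docker, not Kubernetes
  service_ip: "127.0.0.1"     # illustrative; the host running the backend
  service_port: 18881         # 18881 for the GPU deployment, 19888 for CPU
```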

To start the test, use the command:
`python benchmark.py`

And here are the results from the benchmark tests, with the strange numbers:
[Screenshot: benchmark results]

Can you please check that it works fine on your side?

@joshuayao joshuayao added the bug Something isn't working label Nov 8, 2024
@joshuayao joshuayao changed the title Performance benchmarks for FaqGen / DocSum are calculated incorrectly (we see negative numbers and very high values) [Bug] Performance benchmarks for FaqGen / DocSum are calculated incorrectly (we see negative numbers and very high values) Dec 6, 2024
@wangkl2 wangkl2 self-assigned this Dec 9, 2024
wangkl2 (Collaborator) commented Dec 23, 2024

@amikoai Thanks for reporting this issue. I was able to reproduce the faqgen benchmark issue and have fixed it in commit 5d717e8. Please try again with the main branch and let us know if it works on your end.
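For anyone hitting the same symptom before picking up the fix: a negative tokens-per-second figure generally means the start/end timestamps (or the token counts) feeding the throughput formula are swapped or left unset. Below is a minimal sketch of that failure mode with hypothetical names (tokens_per_second and the simple tokens/elapsed formula are assumptions for illustration, not the actual GenAIEval code; the real fix is in the commit above):

```python
import time

def tokens_per_second(num_tokens: int, start: float, end: float) -> float:
    """Throughput = tokens / elapsed seconds, with sanity checks."""
    elapsed = end - start
    if elapsed <= 0 or num_tokens < 0:
        # Swapped timestamps or an uninitialized token count would
        # otherwise produce negative or absurdly large rates, like
        # the values reported in this issue.
        raise ValueError(f"bad measurement: tokens={num_tokens}, elapsed={elapsed:.6f}s")
    return num_tokens / elapsed

start = time.perf_counter()
time.sleep(0.25)                    # stand-in for streaming tokens from TGI
end = time.perf_counter()
print(f"{tokens_per_second(128, start, end):.1f} tokens/s")  # ~512 tokens/s

# The bug pattern: passing the timestamps in the wrong order flips the sign.
try:
    tokens_per_second(128, end, start)
except ValueError as exc:
    print(exc)
```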
