[Bug] Benchmarking script failing to capture logs in EKS #207

Open
srinarayan-srikanthan opened this issue Nov 27, 2024 · 2 comments

srinarayan-srikanthan commented Nov 27, 2024

I am using this script (https://github.com/opea-project/GenAIEval/tree/main/evals/benchmark) to capture data for ChatQnA. It works on bare metal but fails with the following error on EKS:

Exception: Tried to stop User in an unexpected state: stopping. This should never happen.

2024-11-27T00:39:12Z <Greenlet at 0x7fb4215eac20: <bound method User.stop of <aistress.AiStressUser object at 0x7fb428353610>>(force=False)> failed with Exception


[]

[]

[]

Exception when retrieving ConfigMap: (404)

Reason: Not Found

HTTP response headers: HTTPHeaderDict({'Audit-Id': 'bc671b52-6979-4a63-a49b-f62d27d59f6d', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '4916798f-ba14-43f8-a168-6ef5d36d86e2', 'X-Kubernetes-Pf-Prioritylevel-Uid': '22c70de0-7569-474e-b81b-bf3e48a02906', 'Date': 'Wed, 27 Nov 2024 00:39:54 GMT', 'Content-Length': '208'})

HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"configmaps \"kubelet-config\" not found","reason":"NotFound","details":{"name":"kubelet-config","kind":"configmaps"},"code":404} 

eero-t commented Nov 27, 2024

The kubelet-config ConfigMap seems to be created by kubeadm init when the cluster is created: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#workflow-when-using-kubeadm-init

I assume it's missing when something else is used to create the cluster.

The failing benchmark code should handle this case better:
https://github.com/opea-project/GenAIEval/blob/main/evals/benchmark/stresscli/commands/utils.py
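
A minimal sketch of what such handling could look like, not the actual utils.py code. The function name `read_kubelet_config` and its parameters are hypothetical. It assumes the caller passes in a reader such as `CoreV1Api().read_namespaced_config_map` from the Kubernetes Python client, that the client's `ApiException` exposes the HTTP code as `.status`, and that kubeadm stores the config under the `kubelet` key in the `kube-system` namespace:

```python
from typing import Callable, Optional


def read_kubelet_config(
    read_configmap: Callable[..., object],
    name: str = "kubelet-config",
    namespace: str = "kube-system",
) -> Optional[str]:
    """Return the kubelet config text, or None when the ConfigMap is absent."""
    try:
        config_map = read_configmap(name=name, namespace=namespace)
    except Exception as exc:
        # The Kubernetes client's ApiException carries the HTTP code in `.status`.
        if getattr(exc, "status", None) == 404:
            # Cluster was not created with kubeadm (e.g. EKS): nothing to read.
            return None
        # Any other failure is unexpected, so let it propagate.
        raise
    data = getattr(config_map, "data", None) or {}
    # Assumption: kubeadm stores the kubelet configuration under the "kubelet" key.
    return data.get("kubelet")
```

With this shape, a caller can treat `None` as "policy unknown" and carry on with the benchmark instead of aborting on the 404.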

Btw, another bug in that same code is that it assumes the cluster uses the static policy whenever the policy is != none, instead of using the policy name returned by the called function...
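
A rough sketch of the alternative, under stated assumptions: the helper names `cpu_manager_policy` and `describe_policy` are hypothetical, the kubelet configuration is assumed to be already parsed into a mapping, and the policy in question is assumed to be the kubelet `cpuManagerPolicy` field. The point is only that the reported name comes from the value that was read, never from a hardcoded "static":

```python
from typing import Mapping, Optional


def cpu_manager_policy(kubelet_config: Optional[Mapping[str, object]]) -> str:
    """Return the configured policy name, or "none" when it is not set."""
    if not kubelet_config:
        # No kubelet config available (e.g. the ConfigMap is missing).
        return "none"
    return str(kubelet_config.get("cpuManagerPolicy", "none"))


def describe_policy(kubelet_config: Optional[Mapping[str, object]]) -> str:
    """Build a report line from the policy name that was actually read."""
    policy = cpu_manager_policy(kubelet_config)
    # No `if policy != "none": policy = "static"` shortcut here.
    return f"cpu manager policy: {policy}"
```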

Collaborator

poussa commented Nov 28, 2024

EKS clusters don't have a kubelet-config ConfigMap. The benchmark script needs to handle that case instead of failing as it does now.

poussa assigned poussa and gavinlichn and unassigned poussa Nov 29, 2024