bump nvidia-device-plugin to v0.16.1 #242
Comments
Testing this on the Cato provider, which has had 0 leases since yesterday.
Figured out the issue: the new nvidia-device-plugin 0.16.x helm charts (0.16.0 rc1, 0.16.0, 0.16.1) are dropping
Let's keep using nvidia-device-plugin 0.15.1 until NVIDIA/k8s-device-plugin#856 gets fixed or a better workaround is found, instead of modifying/customizing the helm chart manually.
For the record: Restarting
And it does not change the reported CUDA version upon
Workaround
The quick workaround is to pass
Going to update our docs once a better fix for issue 856 is released.
k8s-device-plugin v0.16.1 got released 3 days ago:
They have updated the CUDA base image version to 12.5.1, among other changes: https://github.com/NVIDIA/k8s-device-plugin/releases

Need to test the following:
- nvidia-device-plugin helm chart up to 0.16.1 without impacting existing GPU deployments (can probably pick some provider with the least used GPUs; the sandbox will probably do best)
- nvidia-smi | grep Version (probably this isn't related, but still worth checking)
- 0.15.1 to 0.16.1 version in the docs https://akash.network/docs/providers/build-a-cloud-provider/gpu-resource-enablement/nvidia-device-plugin
- across all the GPU providers
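
The nvidia-smi | grep Version check from the list above could be scripted so that the versions reported before and after the chart upgrade are compared mechanically. The following is only a minimal sketch, assuming the usual nvidia-smi banner line that contains "Driver Version:" and "CUDA Version:". The function names and the sample banner values are illustrative and do not come from this issue.

```python
import re
import subprocess

# Matches the version fields of the nvidia-smi banner line, for example:
# "| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2 |"
# (sample values are illustrative only)
_VERSION_RE = re.compile(r"(Driver Version|CUDA Version):\s*([0-9][0-9.]*)")


def parse_versions(smi_output: str) -> dict:
    """Return the driver and CUDA versions found in nvidia-smi output."""
    return {name: value for name, value in _VERSION_RE.findall(smi_output)}


def changed_versions(before: str, after: str) -> dict:
    """Map each version field that differs to a (before, after) tuple."""
    b, a = parse_versions(before), parse_versions(after)
    return {
        key: (b.get(key), a.get(key))
        for key in sorted(set(b) | set(a))
        if b.get(key) != a.get(key)
    }


def capture_nvidia_smi() -> str:
    """Run nvidia-smi where it is available (GPU host or GPU pod)."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

One way to use it: save the output of capture_nvidia_smi() before the upgrade, capture it again afterwards, and pass both strings to changed_versions(). An empty result means the reported versions did not change. A field reported as N/A is not matched by the pattern and will show up as None.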