Try to install NVIDIA driver if not present in the machine #328
Detect whether the NVIDIA driver is present by checking for the /proc/driver/nvidia/version file and:
* if present, don't try to install it
* if not present, try to install it using the ubuntu-drivers package
* even if installed, the machine might require a reboot. To detect this, we check whether the nvidia-smi command works.
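For readers skimming the thread, the flow described above could be sketched roughly as follows. This is a minimal illustration, not the charm's actual code: the function names, the returned status strings, and the injectable `run` parameter are hypothetical, and the exact `ubuntu-drivers install` invocation is an assumption.

```python
import subprocess
from pathlib import Path

# Path named in the PR description; its presence signals a loaded NVIDIA driver.
NVIDIA_VERSION_FILE = Path("/proc/driver/nvidia/version")


def nvidia_smi_works(run=subprocess.run) -> bool:
    """Return True if the nvidia-smi command runs successfully."""
    try:
        run(["nvidia-smi"], check=True, capture_output=True)
    except (OSError, subprocess.CalledProcessError):
        # Binary missing or driver not usable yet (for example, reboot pending).
        return False
    return True


def ensure_nvidia_driver(version_file=NVIDIA_VERSION_FILE, run=subprocess.run) -> str:
    """Install the driver only if it is absent, then check that it is usable.

    Returns one of the hypothetical status strings:
    "already-present", "installed", or "reboot-required".
    """
    installed_now = False
    if not version_file.exists():
        # Assumption: the exact ubuntu-drivers arguments used by the charm may differ.
        run(["ubuntu-drivers", "install"], check=True)
        installed_now = True

    if not nvidia_smi_works(run):
        return "reboot-required"
    return "installed" if installed_now else "already-present"
```

Passing `run` and `version_file` as parameters is only there so the logic can be exercised without touching a real machine.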
In what cases would the nvidia driver be present already?
If the machine requires a reboot, can the charm set a blocked status with a message asking the user to reboot the machine?
The strategy needs to be refactored into two.
The driver is required for NVIDIA GPUs to receive workloads, so it's very likely that you will install it manually as soon as you receive the server, even before installing juju or other necessary tools on it.
I asked the server team, and if the driver is installed with ubuntu-drivers a reboot is not necessary.
Ah right of course, and it's a subordinate charm so it will definitely be sharing a server. 👍
(non-blocking) overall lgtm, but if you follow the existing code base, the nvidia driver strategy should be in this list
Overall this looks good to me. Thanks for changing NVIDIADriverStrategy. I also noticed that multiple strategies don't fit into the SnapExporter class, so instead of overriding the exporter functions, please consider refactoring the SnapExporter class.
See more info about the approach at this doc