Supporting licenses #3260
Replies: 5 comments 1 reply
-
In many of our jobspec-related discussions, I think the vision was always to do away with licenses as a unique thing and instead treat them as any other resource. I don't think it would be hard at all to include them as a child resource of
-
It might also be good to consider how we currently handle license acquisition in our datacenter. If I had to guess, we don't partition licenses across clusters but instead have one big pool for the entire datacenter, which is then polled to check license status and acquire them. I may be totally off there. Maybe @ryanday36 can comment?
-
Would be nice to support something more generic than license, but I'm not sure I have a good idea for an option that requests a set of resources at the top level.
-
This is an odd one, and probably shouldn't be a high priority for you all. The way that we theoretically use Slurm's license feature is to define a large number of licenses for each Lustre file system in the slurm.conf and have users specify a license if their job needs access to a specific file system. The theory is then that, when a file system is down for updates, the system admins can update slurm.conf to set the number of licenses for that file system to 0, and jobs that need the file system won't start until the update is done and the slurm.conf gets updated again. In practice, this would require the system admins to modify the slurm.conf on each cluster that mounts the file system and then reconfigure Slurm a couple of times, and I don't think anyone ever does it. I'd personally rather see a solution to the underlying problem that we tried to hack around with Slurm's licenses than a re-implementation of Slurm's licenses. I'm not really sure what that would look like, but since the license hack isn't really being used anyway, I wouldn't spend a lot of time on this now. Maybe failover will get good enough to allow some sort of rolling updates to Lustre clusters and solve the whole problem for us...
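To make the hack described above a bit more concrete, here is a rough Python sketch of the gating logic. This is not Slurm or Flux code: `LicensePool`, its methods, and the `lustre1`/`lustre2` names are all made up for illustration, and the count of 1000 is just a stand-in for "a large number".

```python
from dataclasses import dataclass, field


@dataclass
class LicensePool:
    """Toy model of per-file-system license counts (hypothetical, not a real API)."""

    total: dict[str, int]
    in_use: dict[str, int] = field(default_factory=dict)

    def can_start(self, needed: dict[str, int]) -> bool:
        # A job may start only if every license it asks for has enough free units.
        return all(
            self.total.get(name, 0) - self.in_use.get(name, 0) >= count
            for name, count in needed.items()
        )

    def acquire(self, needed: dict[str, int]) -> bool:
        # Reserve the licenses for a starting job; refuse if any are unavailable.
        if not self.can_start(needed):
            return False
        for name, count in needed.items():
            self.in_use[name] = self.in_use.get(name, 0) + count
        return True

    def set_total(self, name: str, count: int) -> None:
        # Stand-in for the admin editing the configured license count.
        self.total[name] = count


if __name__ == "__main__":
    pool = LicensePool(total={"lustre1": 1000, "lustre2": 1000})
    job = {"lustre1": 1}          # job says it needs the lustre1 file system
    print(pool.can_start(job))    # file system is up, so the job can start
    pool.set_total("lustre1", 0)  # admins take lustre1 down for updates
    print(pool.can_start(job))    # job now waits until the count is restored
```

The sketch also shows why the approach is awkward in practice: the count lives in per-cluster configuration, so every cluster that mounts the file system would need its own `set_total` equivalent applied by hand.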
-
Some out-of-band emails that I got from Phil:
-
How do we want to support the equivalent to Slurm's --license, which is defined as