diviyank/slurmlib

A Python interface for clusters managed by SLURM: run jobs on a cluster easily.

slurmlib sends code to the cluster over SSH and creates an interactive job (with an option for batch submission).

  1. Define a user config.
  2. Check whether the job can run right away: if so, run it; otherwise, ask whether to queue it.
  3. Send back the printed output of the script.
  4. Once the job has terminated successfully, retrieve all output data over SSH and clean everything up.
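
The steps above can be sketched as a list of local commands, one per stage of the upload, run, fetch and clean cycle. This is a minimal illustration and not slurmlib's actual implementation: the helper names (`slurm_command`, `over_ssh`, `job_plan`), the host alias `cluster` and the remote directory `slurmlib_job` are all hypothetical. It only builds the command lines (using the standard `srun`, `sbatch`, `ssh` and `scp` tools) and does not execute them or check whether resources are free.

```python
import shlex


def slurm_command(script, workdir, gpus=0, feature=None, batch=False):
    """Build the SLURM command line for one job (nothing is executed here)."""
    # srun runs the job in the foreground; sbatch queues it instead.
    cmd = ["sbatch" if batch else "srun", f"--chdir={workdir}"]
    if gpus:
        cmd.append(f"--gres=gpu:{gpus}")       # request GPUs
    if feature:
        cmd.append(f"--constraint={feature}")  # request a node feature
    if batch:
        cmd.append(f"--wrap=python {shlex.quote(script)}")
    else:
        cmd += ["python", script]
    return cmd


def over_ssh(host, cmd):
    """Wrap a command so that it runs on the cluster login node."""
    return ["ssh", host, shlex.join(cmd)]


def job_plan(host, script, outputs, workdir="slurmlib_job", **resources):
    """Return the ordered local commands for one upload/run/fetch/clean cycle."""
    plan = [
        over_ssh(host, ["mkdir", "-p", workdir]),           # remote work directory
        ["scp", script, f"{host}:{workdir}/{script}"],      # send the code
        over_ssh(host, slurm_command(script, workdir, **resources)),  # run the job
    ]
    # Pull each declared output file back to the local machine.
    plan += [["scp", f"{host}:{workdir}/{name}", name] for name in outputs]
    plan.append(over_ssh(host, ["rm", "-r", workdir]))      # clean up
    return plan


if __name__ == "__main__":
    for command in job_plan("cluster", "train.py", ["results.csv"],
                            gpus=1, feature="tesla"):
        print(shlex.join(command))
```

Passing `batch=True` switches the job step from `srun` to `sbatch --wrap`, which corresponds to the queuing fallback in step 2. Each entry of the plan could then be handed to `subprocess.run`.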

Example of what usage would look like:

import slurmlib
import torch
import pandas


# Define the model used by the job here:
class Net(torch.nn.Module):
    ...

# Define the function to run here:
def train(arg1, arg2, kwarg1=v, kwarg2=v2):
    net = Net()
    for i in ...:
        ...
    output.to_csv("results.csv")

if __name__ == '__main__':
    slurmlib.Job(gpu=1, feature='tesla')
    slurmlib.run(train, arg1, arg2, kwarg1=v, kwarg2=v2)
    # The function has to save its results to a file so they can be pulled back from the cluster.
    output = pandas.read_csv("results.csv")

Installation:

You first need a working SSH setup with the cluster!

git clone https://github.com/Diviyan-Kalainathan/slurmlib.git
cd slurmlib
python setup.py install
cd ~/.ssh  # Configure the files default_ssh_config and slurm_config.yml here
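
The SSH prerequisite can be checked from Python before submitting anything. The sketch below is an assumption about how such a check might look and is not part of slurmlib: `ssh_check_command` and `cluster_reachable` are hypothetical helpers, and `cluster` stands for whatever host alias your SSH configuration defines. It uses the standard OpenSSH options `BatchMode` (fail instead of prompting for a password) and `ConnectTimeout`, and runs `sinfo --version` remotely to confirm that the SLURM client tools are available on the login node.

```python
import subprocess


def ssh_check_command(host, timeout=5):
    """Build a non-interactive ssh command that asks the cluster for its SLURM version."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # never prompt for a password
        "-o", f"ConnectTimeout={timeout}",   # give up quickly if unreachable
        host,
        "sinfo", "--version",
    ]


def cluster_reachable(host, timeout=5):
    """Return True if key-based ssh works and sinfo runs on the remote host."""
    try:
        result = subprocess.run(
            ssh_check_command(host, timeout),
            capture_output=True,
            text=True,
            timeout=timeout + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed locally, or the connection hung.
        return False
    return result.returncode == 0
```

For example, `cluster_reachable("cluster")` should return `True` only when passwordless login works and SLURM is installed on the remote side.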
