
ukceh-rse/orchid-ollama-testbed


This repository demonstrates how LLMs can be run on JASMIN's Orchid GPU cluster using Ollama served using Singularity.

Requirements

  • JASMIN uses Singularity v3.7, so it is recommended that you install the same version locally to build and test the runner before deploying (see documentation).
  • To run on JASMIN you will need to sign up and request access to the jasmin-login service to be able to access the login servers and the scientific computing VMs. This will also give you access to the LOTUS batch computing cluster, but to access the Orchid (GPU) cluster you will also need to request access to the orchid service.

Setup

Create runner image

To get started, first build the runner locally:

./build-runner.sh

This will create a .sif file, which is an image file that Singularity can run as a container.

On some Linux systems, building with the --fakeroot option may not be possible. To work around this, you can either try configuring user namespaces or remove the --fakeroot option from build-runner.sh and build with sudo.
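If you take the sudo route, the build boils down to a single command; the definition and image file names below are illustrative, so check build-runner.sh for the actual names. The --fakeroot route instead requires subordinate ID ranges for your user.

```
# Option 1 (hypothetical file names; check build-runner.sh for the real ones):
# drop --fakeroot and build the image as root.
sudo singularity build ollama-runner.sif runner.def

# Option 2: keep --fakeroot, but grant your user subordinate ID ranges by
# adding entries to /etc/subuid and /etc/subgid, e.g. (substitute your username):
#   yourname:100000:65536
```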

Running locally

Once the runner is created, you should test it locally to ensure everything works correctly. run.sbatch contains Slurm directives so that it can be submitted to Orchid, but it can also be run locally as a script. The runner will query the LLM with the prompts defined in input.txt; this file is a list of inputs for the LLM and can be edited to contain whatever prompts you would like to give.

Note: these prompts are independent; they do not constitute a chat history, i.e. the LLM will not be aware of previous prompts when it responds to subsequent ones.
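For illustration, input.txt might look like the following (hypothetical prompts, assuming one prompt per line; check the bundled input.txt for the exact expected format):

```
What is the capital of Sri Lanka?
Summarise the water cycle in one sentence.
```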

To run:

./run.sbatch

Assuming everything worked correctly, this should produce output.txt, which contains the output of all the prompts submitted to the LLM in the format:

Query: What is the capital of Sri Lanka?

Response: The capital of Sri Lanka is Colombo, and it's also the largest city in the country by population.


Query:

...

Configuration

You can configure the parameters for your run by editing params.sh. The input and output files are defined using the INPUT_FILE and OUTPUT_FILE variables respectively. You can also change which LLM to use by modifying MODEL. The model you select must be available on Ollama, e.g. to use the llama3.2 model, modify the line in params.sh to:

MODEL=llama3.2

Note: larger models can take some time to download and will be very slow to run if your GPU does not have enough VRAM. It is best to test locally with a smaller model (e.g. tinyllama, llama3.2) and then use larger models when running on Orchid.
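Putting the options together, a minimal params.sh for a local test run might look like this sketch; only MODEL is confirmed above, so check the bundled params.sh for the exact variable names.

```shell
# Hypothetical sketch of params.sh -- check the bundled file for the
# exact variable names used by the runner.
MODEL=tinyllama          # small model, good for local testing
INPUT_FILE=input.txt     # prompts for the LLM
OUTPUT_FILE=output.txt   # where query/response pairs are written
```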

Running on JASMIN

If you've not used JASMIN before, it is best to familiarise yourself via the getting-started steps in the documentation. The following instructions assume you have set up an SSH key for secure access.

Transfer files

To run on JASMIN's Orchid cluster, first bundle up the local files and transfer them to your JASMIN workspace via the login server. A convenience script, tarball.sh, is provided to tarball only the necessary files (in the commands below, $JASMIN_USER is your JASMIN username).

./tarball.sh
scp orchid-ollama-testbed.tar.gz $JASMIN_USER@login-02.jasmin.ac.uk:/home/users/$JASMIN_USER/

Submit slurm job

Now you can log in to JASMIN and access one of the sci VMs, e.g. to access sci-vm-03 via login-02:

ssh -A $JASMIN_USER@login-02.jasmin.ac.uk
ssh sci-vm-03.jasmin.ac.uk

Next extract the bundled files:

tar -xf orchid-ollama-testbed.tar.gz

Finally, submit the slurm job:

cd orchid-ollama-testbed
sbatch run.sbatch

By default, the standard error and output should appear in files $JOB_NUMBER.err and $JOB_NUMBER.out.

Check status

The status of jobs submitted on JASMIN can be checked via:

squeue -u $JASMIN_USER

Typical status codes for the job include:

  • PD=pending
  • R=running
  • CD=completed
  • F=failed

Once the job is complete, check to ensure the query responses have been successfully sent to output.txt.
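One quick way to check is to count the query/response pairs in output.txt; the "Query:" and "Response:" prefixes are taken from the output format shown earlier. The snippet below writes a stand-in output.txt so it is self-contained; on JASMIN you would run the greps against the real file.

```shell
# Stand-in output.txt mimicking the documented format (illustration only)
printf 'Query: What is 2+2?\n\nResponse: 2+2 equals 4.\n' > output.txt

# Every prompt should have produced a matching response
queries=$(grep -c '^Query:' output.txt)
responses=$(grep -c '^Response:' output.txt)
echo "queries=$queries responses=$responses"
```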
