
4. Installation

Felix Thalén edited this page Oct 20, 2023 · 33 revisions

1. Resolving Dependencies

Patchwork requires the sequence aligner DIAMOND and the programming language Julia to run.

1.1 Installing Julia

Download and install Julia by following the instructions at https://julialang.org/downloads/. The commands below install Julia 1.9.3 on a Linux-based operating system (OS).

mkdir -p ~/opt # create the directory if it doesn't already exist
cd ~/opt
VERSION=1.9.3
wget "https://julialang-s3.julialang.org/bin/linux/x64/1.9/julia-${VERSION}-linux-x86_64.tar.gz"
tar -xvzf "julia-${VERSION}-linux-x86_64.tar.gz"
rm "julia-${VERSION}-linux-x86_64.tar.gz"
echo export PATH=\"\$PATH:$HOME/opt/julia-$VERSION/bin\" >> ~/.bashrc
unset VERSION
exec bash      # restart the current Bash session
cd -           # return to the previous directory
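
Note that the echo ... >> ~/.bashrc line above appends a new entry every time it is run, so repeating the installation leaves duplicate PATH lines in ~/.bashrc. The following is a minimal sketch of one way to avoid that, assuming a POSIX-compatible shell. The helper name append_once is hypothetical and is not part of Julia or Patchwork.

```shell
# Append a line to a file only if that exact line is not already present.
# Usage: append_once LINE FILE
# (append_once is a hypothetical helper name used for illustration.)
append_once() {
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$1" "$2" 2>/dev/null; then
        printf '%s\n' "$1" >> "$2"
    fi
}
```

For example, append_once "export PATH=\"\$PATH:$HOME/opt/julia-1.9.3/bin\"" ~/.bashrc writes the same line as the echo command above, but only once.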

Next, check whether Julia was successfully installed by running the following.

which julia
julia --version

1.2 Installing DIAMOND

Instructions for installing DIAMOND can be found on their GitHub page: (https://github.com/bbuchfink/diamond/wiki/2.-Installatio).

To install DIAMOND 2.1.8 on Linux into the directory ~/bin, run:

mkdir -p ~/bin # create the directory if it doesn't already exist
cd ~/bin
VERSION=2.1.8
wget "https://github.com/bbuchfink/diamond/releases/download/v${VERSION}/diamond-linux64.tar.gz"
tar -zxvf diamond-linux64.tar.gz
rm diamond-linux64.tar.gz
echo export PATH=\"\$PATH:${HOME}/bin\" >> ~/.bashrc
unset VERSION
exec bash      # restart the current Bash session
cd -           # return to the previous directory

Display the location and version of the DIAMOND installation by running:

which diamond
diamond --version
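
If you prefer a single check for both dependencies, the which calls above can be wrapped in a small function. This is only a sketch under the assumption of a POSIX-compatible shell. The name missing_tools is hypothetical.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Returns 0 if all commands were found and 1 otherwise.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Running missing_tools julia diamond prints nothing and returns 0 when both programs can be found. Otherwise it lists the ones that still need to be installed or added to your PATH.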

1.3 Installing dependencies via Conda

Alternatively, you can install both DIAMOND and Julia with the package manager Conda.

conda create -n patchwork -c bioconda -c conda-forge julia diamond

This will produce a Conda environment called patchwork which contains installations of DIAMOND and Julia. Both programs will only be accessible after activating the environment, which you can do by typing the following:

conda activate patchwork

Once the environment is active, you can go ahead and run Patchwork. When done, you can deactivate the environment again by running:

conda deactivate
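
Because DIAMOND and Julia are only reachable while the environment is active, a script that calls Patchwork may want to check this first. The sketch below relies on the CONDA_DEFAULT_ENV variable, which conda activate sets to the name of the active environment. The function name conda_env_active is hypothetical.

```shell
# Succeed only if the named Conda environment is the active one.
# conda activate stores the active environment's name in CONDA_DEFAULT_ENV.
# (conda_env_active is a hypothetical helper name used for illustration.)
conda_env_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}
```

A possible use is conda_env_active patchwork || echo "Run: conda activate patchwork" >&2 at the top of a script.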

2. Installing Patchwork

You can install Patchwork in four different ways:

  1. Run Patchwork without compiling (recommended)
  2. Compile Patchwork (optional)
  3. Run Patchwork using Docker
  4. Run Patchwork using Apptainer/Singularity

2.1 Obtaining the source code

Regardless of how you want to install Patchwork, start by cloning the GitHub repository into your directory of choice:

git clone https://github.com/fethalen/patchwork
cd patchwork   # move into the repository's top folder

2.2 Run Patchwork without compilation (recommended)

Before running Patchwork, we need to ensure that all required packages are installed with the correct versions. This is done by running Pkg.instantiate() while in the repository's top folder:

julia --project=. -e "import Pkg; Pkg.instantiate()"

To test that the environment was resolved while also displaying a help menu, run:

julia --project=. src/Patchwork.jl --help
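
The command above has to be run from the repository's top folder. To call Patchwork from any directory, one option is a small shell function such as the sketch below. It assumes that the script does not depend on the current working directory, and the names patchwork_jl, PATCHWORK_DIR and JULIA are hypothetical, with ~/patchwork as an assumed clone location.

```shell
# Run the uncompiled Patchwork script from any directory.
# PATCHWORK_DIR: path to the cloned repository (assumed default: ~/patchwork)
# JULIA:         Julia executable to use (default: julia)
# (All three names are hypothetical and only used for illustration.)
patchwork_jl() {
    "${JULIA:-julia}" --project="${PATCHWORK_DIR:-$HOME/patchwork}" \
        "${PATCHWORK_DIR:-$HOME/patchwork}/src/Patchwork.jl" "$@"
}
```

With this function defined, for example in ~/.bashrc, patchwork_jl --help should display the same help menu as the command above.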

2.3 Compiling Patchwork (optional)

Julia uses just-in-time (JIT) compilation, meaning that code is compiled as needed at runtime. Within a session, compiled functions are reused, but every time you restart Julia, the compiled work is lost. In practice, this results in long startup times.

If you expect to use Patchwork often, compiling it can reduce latency at startup. The downsides of this approach are that (i) the compilation process takes a few minutes and (ii) the resulting executable is quite large (just over 1 GB at the time of writing), since it has to contain the Julia executable in addition to Patchwork itself. The upside is that you get an executable that can be copied to and run on other machines without Julia being installed on them.

To compile Patchwork, go into the build directory and run build_app.jl:

cd build
./build_app.jl

The resulting binary, patchwork, will be stored in /path/to/patchwork/build/compiled/bin/, where /path/to/ is the path leading up to patchwork. You may also add this executable to your path by adding the following line to your ~/.bashrc file:

export PATH="/path/to/patchwork/build/compiled/bin:$PATH"

For example, my copy of Patchwork is located in ~/opt, so I put the following line into my ~/.bashrc:

export PATH="$HOME/opt/patchwork/build/compiled/bin:$PATH"

To reload the settings within your ~/.bashrc file, type source ~/.bashrc in your current terminal.
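
To confirm that the directory was picked up after reloading, you can test whether it appears in PATH. The following is a minimal sketch for a POSIX-compatible shell. The function name path_contains is hypothetical.

```shell
# Succeed if DIR is one of the colon-separated entries in PATH.
# Usage: path_contains DIR
# (path_contains is a hypothetical helper name used for illustration.)
path_contains() {
    # Wrapping both sides in colons makes the first and last entries
    # match the same way as entries in the middle.
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For the example above, path_contains "$HOME/opt/patchwork/build/compiled/bin" && which patchwork should print the location of the compiled binary.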

2.4 Run Patchwork using Docker

Patchwork is available in the Biocontainers repository on Docker Hub. To pull the Patchwork Image to your PC and run it inside a Docker Container, you must have Docker installed.

Installing Docker

Go to the Docker Docs, select your platform and follow the instructions to set up the Docker repositories and install Docker Engine or Docker Desktop. On Ubuntu, you can use the following commands to install Docker:

Install packages to allow apt access to repositories over HTTPS:

$ sudo apt-get update
$ sudo apt-get install apt-transport-https ca-certificates curl gnupg lsb-release

Add Docker's official GPG key:

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

Set up the stable repository:

$ echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Then you can install Docker Engine using the following command:

$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io

Verify the installation by running the hello-world image:

$ sudo docker run hello-world

Pulling the Patchwork Image from Docker Hub

To obtain the image from the Biocontainers repository, run:

$ sudo docker pull biocontainers/patchwork:0.5.0_cv1

This image contains everything you need to run Patchwork inside a Docker Container, i.e. the Patchwork binary itself as well as its runtime dependency DIAMOND. You can have a look at the currently available version tags if you want to pull a specific version of the Patchwork Image. View your available images by running the following command:

$ sudo docker images

Running Patchwork inside a Docker Container

Now that you have the Patchwork Image on your local computer, you can run the program inside a Docker Container. To create a container from the image, modify the following command according to your preferences. The options and flags used in the command are explained in more detail below.

$ sudo docker run -it --name NAME --user $(id -u) \
  --mount type=bind,src=SOURCE,dst=DESTINATION REPO/IMAGE:TAG COMMAND
  • -it: The container will start in interactive mode with an open terminal window.
  • --name NAME: The name you want to give your container, e.g. patchwork.
  • --user $(id -u): This option is needed if you want Patchwork, running inside the container, to create files (e.g. an output directory) that you can access from your local computer afterwards. On your local computer, the files and directories belong to you, which allows you to modify them. Inside the Docker Container, however, the pre-configured user might have a different ID, which would prevent it from modifying your files on the host computer. The --user option overrides the pre-configured user, so you enter the container with your local user ID. You can then change any files from your host system that are accessible from inside the container. Which files are accessible is determined by the next option, --mount.
  • --mount type=bind,src=SOURCE,dst=DESTINATION: Patchwork requires at least one contigs FASTA file and one reference FASTA file or database to run. These data are normally not available inside the Docker Container, but you can make them accessible with a bind mount. src requires the absolute path to the directory on the host computer that contains your data. dst is the path inside the container from which you want to access the data. For example, if your contigs and reference are in src=/home/yourname/data and you set dst=/home/patchwork/data, you can access all files and subdirectories of /home/yourname/data via /home/patchwork/data. To learn more about bind mounts, see the Docker Docs. You would then call the program like this:
$ patchwork --contigs /home/patchwork/data/contigs.fa --reference /home/patchwork/data/reference.fa \
  --output-dir /home/patchwork/data/out
  • REPO/IMAGE:TAG: The image you would like to use, e.g. biocontainers/patchwork:v0.1.2_cv1. The tag is optional; when omitted, the default tag latest is used.
  • COMMAND: The command you want to run whenever the container starts. After executing this command, the container will stop automatically. For example, using bash as a command will keep the container's terminal open and the container running until you type exit. While the container is running, you can enter other commands, e.g. call Patchwork, in the terminal.
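
The way a bind mount maps host paths to container paths, and how the placeholders in the docker run command are filled in, can be sketched as follows. This is an illustration only: the function names to_container_path and print_docker_run are hypothetical, the second function merely prints a command instead of running Docker, and the paths are the example values from the list above.

```shell
# Translate a host path below SRC into the matching path below DST,
# mirroring what a bind mount with src=SRC,dst=DST exposes in the container.
# Usage: to_container_path SRC DST HOST_PATH
# (Hypothetical helper name used for illustration.)
to_container_path() {
    case "$3" in
        "$1")
            printf '%s\n' "$2" ;;
        "$1"/*)
            rest=${3#"$1"}              # strip the SRC prefix
            printf '%s\n' "$2$rest" ;;
        *)
            echo "error: $3 is not inside $1" >&2
            return 1 ;;
    esac
}

# Print (without running) a docker run command for the given bind mount.
# SRC must be an absolute path on the host.
# Usage: print_docker_run NAME SRC DST IMAGE COMMAND
# (Hypothetical helper name used for illustration.)
print_docker_run() {
    case "$2" in
        /*) ;;
        *)  echo "error: src must be an absolute path: $2" >&2
            return 1 ;;
    esac
    printf 'sudo docker run -it --name %s --user %s --mount type=bind,src=%s,dst=%s %s %s\n' \
        "$1" "$(id -u)" "$2" "$3" "$4" "$5"
}
```

For example, to_container_path /home/yourname/data /home/patchwork/data /home/yourname/data/contigs.fa prints /home/patchwork/data/contigs.fa, which is the path to pass to --contigs inside the container. Calling print_docker_run patchwork /home/yourname/data /home/patchwork/data biocontainers/patchwork:0.5.0_cv1 bash prints a filled-in version of the command template that you can review before running it.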

The docker run command above will create and start the new Docker Container. If you later want to restart the container called patchwork and attach to it, run:

$ sudo docker start patchwork
$ sudo docker attach patchwork

More options for using Docker can be viewed in the help menu:

$ docker --help

2.5 Run Patchwork using Apptainer/Singularity

Unfortunately, Docker is generally not compatible with shared HPC environments. In such environments, Patchwork can be run using Apptainer instead.

Note: In late 2021, Singularity became a Linux Foundation supported project and was renamed Apptainer; the first Apptainer release followed in March 2022. The instructions here also apply to an older Singularity installation if you replace apptainer with singularity when running the commands.

Installing Apptainer

To install Apptainer, follow the official installation instructions. If you are trying to run Apptainer from within a shared HPC environment, it is likely that you already have Apptainer/Singularity installed.

List available modules:

module avail

Assuming that the module apptainer is available, activate it by running:

module load apptainer

Building the Patchwork image

Build the Apptainer image patchwork.sif from the Apptainer definition file patchwork.def:

apptainer build --force patchwork.sif patchwork.def

Running the Patchwork image

To run Patchwork from within the container, run:

apptainer run patchwork.sif --help

To enter a shell within the container instead, run:

apptainer shell patchwork.sif
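
Since the same commands work with either apptainer or singularity, a script can pick whichever one is installed. The following is a minimal sketch for a POSIX-compatible shell. The function name container_runtime is hypothetical.

```shell
# Print the first command from the argument list that is available on PATH.
# With no arguments, try apptainer first and fall back to singularity.
# (container_runtime is a hypothetical helper name used for illustration.)
container_runtime() {
    [ "$#" -gt 0 ] || set -- apptainer singularity
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "error: none of the following commands were found: $*" >&2
    return 1
}
```

A possible use is runtime=$(container_runtime) && "$runtime" run patchwork.sif --help, which runs the image with Apptainer when it is available and with Singularity otherwise.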