This is the repository of the Genomics Core High Performance Computing for Genomics workshop. It contains all scripts, data and presentations.
This is a complete guide that introduces every aspect of the HPC needed for regular usage. Most examples are set in a biological context. The appendices contain a very useful cheat sheet (all commands, with a brief overview) and the exercises of the workshop, including all the answers.
This GitHub repository contains four presentations: a Linux Prelude, Part 1, Part 2 and Part 3.
The Prelude includes a small overview of the commands needed to follow the other parts. For a good, more complete tutorial, see this Linux Guide for Beginners.
Part 1 contains an introduction to the HPC:
- hardware description (cpu, storage, ...)
- credit usage
- simple job submission
Part 2 contains the "genomics" application part of the course:
- a project-driven approach in the exercises, covering assembly, mapping and variant calling
- parallel processing
Part 3 contains a few advanced topics:
- creation of temporary directories
- a how-to on installing your own tools and making them available in the module system
- a how-to on the basic management of your group's storage, access, software and resources
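To give an idea of what the temporary-directory topic in Part 3 is about, here is a minimal sketch in Python. It is not taken from the workshop material: the function name `run_in_scratch`, the `base_dir` parameter and the placeholder output file are all hypothetical, and the scripts in this repository may handle this differently. The idea is to do the work in a throwaway directory, copy the results you want to keep, and let the directory be cleaned up automatically.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(final_dir, base_dir=None):
    """Work in a temporary directory, then copy the result to final_dir.

    base_dir is a hypothetical parameter: on a cluster it could point to a
    scratch location (site-specific); None uses the system default.
    """
    final_dir = Path(final_dir)
    final_dir.mkdir(parents=True, exist_ok=True)

    # TemporaryDirectory deletes the directory and its contents on exit.
    with tempfile.TemporaryDirectory(prefix="job_", dir=base_dir) as tmp:
        tmp_path = Path(tmp)

        # Placeholder for the real work (for example, running a tool).
        result = tmp_path / "result.txt"
        result.write_text("placeholder output\n")

        # Keep only what is needed; everything else is discarded.
        kept = final_dir / result.name
        shutil.copy2(result, kept)

    return tmp_path, kept
```

Calling `run_in_scratch("results")` would leave `results/result.txt` in place, while the temporary directory itself no longer exists after the call returns.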
All data and scripts used are available in this repository. The data need no transformation or editing. The scripts have to be adapted before they will run.
The generate_modules script, needed for the installation of modules, is included as well.