
Tutorial Getting Started

Andre Kahles edited this page Jul 12, 2015 · 1 revision

Thank you for downloading and trying out SplAdder. This page gives you a brief overview of the first steps towards running the pipeline on your samples.

Installation

You can find details on the installation procedure in the README section of the repository. For Python, once you meet the dependency requirements, there is nothing to set up and you can get going right away. For Matlab, some components need to be compiled, and you can use the configure script to assist you with that.

Example Data

SplAdder comes with a small example data set that can be used both to check that your installation works as intended and as a sandbox for our first steps here. So before you start, please run ./example_run.sh to download the example data and to check that your installation is in a working state. Afterwards, you will find indexed alignment files in BAM format in the examples sub-directory. We will use these files in the next steps of the tutorial.

If the download of the example data was successful, you should be able to see the following:

$> cd examples
$> ls *.bam*
NMD_DBL1.tiny.bam  NMD_DBL1.tiny.bam.bai  NMD_DBL2.tiny.bam  NMD_DBL2.tiny.bam.bai  NMD_WT1.tiny.bam  NMD_WT1.tiny.bam.bai  NMD_WT2.tiny.bam  NMD_WT2.tiny.bam.bai
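If you prefer to check this programmatically, the following is a minimal sketch that looks for BAM files lacking an index file next to them. It assumes the naming convention visible in the listing above (the index of `X.bam` is `X.bam.bai`); the function name `find_unindexed_bams` is made up for this illustration and is not part of SplAdder.

```python
from pathlib import Path


def find_unindexed_bams(directory):
    """Return the sorted names of *.bam files that have no *.bam.bai next to them.

    Hypothetical helper for illustration only; it is not part of SplAdder.
    It assumes the index of X.bam is stored as X.bam.bai in the same directory.
    """
    directory = Path(directory)
    missing = []
    for bam in sorted(directory.glob("*.bam")):
        # Build the expected index name by appending ".bai" to the full file name.
        index = bam.with_name(bam.name + ".bai")
        if not index.exists():
            missing.append(bam.name)
    return missing
```

Calling `find_unindexed_bams("examples")` from the SplAdder directory should return an empty list if all four example alignment files were downloaded together with their indexes.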

The examples directory also contains an annotation file in GFF format. This particular annotation is for Arabidopsis thaliana, as is our example RNA-Seq data. You need to make sure that the annotation file you are using refers to the same organism and the same genome version that were used to generate the alignment files. If this is not the case, SplAdder will produce inaccurate or even wrong results. The first lines of the annotation file should look as follows:

$> head -n 10 TAIR10_GFF3_genes.tiny.gff
Chr1    TAIR10  gene    7615614 7618605 .       +       .       ID=AT1G21690;Note=protein_coding_gene;Name=AT1G21690
Chr1    TAIR10  mRNA    7615622 7618605 .       +       .       ID=AT1G21690.1;Parent=AT1G21690;Name=AT1G21690.1;Index=1
Chr1    TAIR10  protein 7615675 7618362 .       +       .       ID=AT1G21690.1-Protein;Name=AT1G21690.1;Derives_from=AT1G21690.1
Chr1    TAIR10  exon    7615622 7615718 .       +       .       Parent=AT1G21690.1
Chr1    TAIR10  five_prime_UTR  7615622 7615674 .       +       .       Parent=AT1G21690.1
Chr1    TAIR10  CDS     7615675 7615718 .       +       0       Parent=AT1G21690.1,AT1G21690.1-Protein;
Chr1    TAIR10  exon    7615805 7615883 .       +       .       Parent=AT1G21690.1
Chr1    TAIR10  CDS     7615805 7615883 .       +       1       Parent=AT1G21690.1,AT1G21690.1-Protein;
Chr1    TAIR10  exon    7616028 7616107 .       +       .       Parent=AT1G21690.1
Chr1    TAIR10  CDS     7616028 7616107 .       +       0       Parent=AT1G21690.1,AT1G21690.1-Protein;
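To make the structure of these lines more concrete, here is a minimal sketch that splits one GFF3 line into its nine tab-separated columns and collects the sequence names found in the first column. Both function names are hypothetical and chosen only for this illustration; SplAdder has its own annotation parser. The sequence names (here `Chr1`) are what has to agree with the reference names in your alignment files, which you can list, for instance, with `samtools view -H`.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict of its nine columns.

    Hypothetical helper for illustration; not SplAdder's own parser.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    seqid, source, ftype, start, end, score, strand, phase, attrs = fields

    # Column 9 holds key=value pairs separated by ';' (a trailing ';' is tolerated).
    attributes = {}
    for item in attrs.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        attributes[key] = value

    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),  # GFF coordinates are 1-based and inclusive
        "end": int(end),
        "score": score,
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }


def collect_seqids(lines):
    """Return the set of sequence names (column 1), skipping comments and blank lines."""
    seqids = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        seqids.add(parse_gff3_line(line)["seqid"])
    return seqids
```

For example, passing an open file handle for `TAIR10_GFF3_genes.tiny.gff` to `collect_seqids` yields the chromosome names used in the annotation, which you can then compare against the `@SQ` entries in the BAM headers.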

Now you should be ready to proceed to the next step and augment the annotation.

Home > Tutorial

  • [1a: Getting Started] (Tutorial-Getting-Started)
  • [1b: Augment Annotation] (Tutorial-Augment-Annotation)
  • [1c: Detect Events] (Tutorial-Event-Detection)