sgupta1524/Genome-Annotation

Gene Prediction Pipeline

Overview:

A gene prediction pipeline that predicts coding and non-coding genes from assembled genomes using ab initio and homology-based tools. Coding genes are predicted with GeneMarkS-2 and Prodigal; non-coding genes are predicted with ARAGORN, BARRNAP, RNAmmer, and Infernal. BLAST is used to validate the predicted coding genes, and the validated results are written out as true positives or false positives in FASTA (.fna) format.
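The BLAST validation step can be pictured with a short sketch. This is not the pipeline's actual code: it assumes BLAST was run with tabular output (`-outfmt 6`, where column 1 is the query ID and column 3 is percent identity), and both the function name `split_by_blast_hits` and the 90% identity cutoff are hypothetical choices made for illustration.

```python
def split_by_blast_hits(predicted_ids, blast_tabular_lines, min_identity=90.0):
    """Partition predicted gene IDs into (true_positives, false_positives).

    A prediction counts as a true positive here if it has at least one
    BLAST hit at or above min_identity (an assumed, illustrative cutoff).
    """
    supported = set()
    for line in blast_tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id = fields[0]                # outfmt 6 column 1: qseqid
        percent_identity = float(fields[2])  # outfmt 6 column 3: pident
        if percent_identity >= min_identity:
            supported.add(query_id)

    true_positives = [g for g in predicted_ids if g in supported]
    false_positives = [g for g in predicted_ids if g not in supported]
    return true_positives, false_positives
```

Predictions with no hit at all end up in the false-positive list, which matches the idea of using the organism's known CDS set as the reference for validation.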

Pipeline Requirements:

  1. Prodigal. Or: conda install -c bioconda prodigal
  2. GeneMarkS-2.

NOTE: If GeneMarkS-2 is downloaded or run on macOS, you must also download the "64 bit key" along with GeneMarkS-2. Once the files have been downloaded, run the following command: cp gm_key_64 ~/.gm_key_64

  3. BLAST. Or: conda install -c bioconda blast
  4. BEDTools. Or: conda install -c bioconda bedtools
  5. Perl. Or: conda install -c anaconda perl

NOTE: The pipeline assumes that, once installed, all tools are available on your PATH.
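Before running the pipeline, it may help to confirm that the tools can actually be found on your PATH. The sketch below is one way to check. The executable names are assumptions (for example, `gms2.pl` for GeneMarkS-2 and `blastn` for BLAST) and may differ on your system, and `find_missing_tools` is a hypothetical helper, not part of the pipeline.

```python
import shutil

# Assumed executable names for the required tools; adjust to your installation.
REQUIRED_TOOLS = ["prodigal", "gms2.pl", "blastn", "bedtools", "perl"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```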

Script execution:

python3 pipeline.py -i <assembled genome(s)> -org_cds <organism of interest's CDS file> -o <output directory name>
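To run the pipeline from another script, the command above can be assembled as an argument list. This is a minimal sketch: the helper `build_pipeline_command` and the example file names are hypothetical, and it passes a single genome to `-i` because the way multiple genomes are supplied is not specified here.

```python
import shlex


def build_pipeline_command(genome, org_cds, output_dir, script="pipeline.py"):
    """Build the argument list for one pipeline run (flags taken from the usage line)."""
    return [
        "python3", script,
        "-i", str(genome),         # assembled genome
        "-org_cds", str(org_cds),  # CDS file of the organism of interest
        "-o", str(output_dir),     # output directory name
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not files shipped with the repository.
    cmd = build_pipeline_command("sample1.fasta", "reference_cds.fna", "results")
    print(shlex.join(cmd))
    # To actually launch it: import subprocess; subprocess.run(cmd, check=True)
```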
