Group 3 - Team 2 - Functional Annotation

This repository contains a pipeline that functionally annotates predicted genes. It takes assembled genomes in FASTA format, plus predicted genes or proteins in FASTA and GFF format.

Tools and Databases Used by this Pipeline

PHASTER
eggNOG-mapper (hmmer/diamond)
VFDB/Victors Database (virulence factors)
PILER-CR
RGI
LipoP 1.0
SignalP
DOOR (BLAST)
Rfam

Usage

  The master script is located at ./final/pipeline.sh

  ./pipeline.sh
  	-d diamond              Run all tools, with eggNOG-mapper in diamond mode
  	-d hmmer -t gproNOG     Run all tools, with eggNOG-mapper in hmmer mode
  	-f                      Enable PHASTER to screen for prophage regions (SLOW MODE)
  	-p                      Path to query genomes (required for PHASTER)
  	-r                      Path to ncRNA .gff files

  Example usage (move to the final directory first):
  	./pipeline.sh -d diamond -f -p /Path/to/query/genomes/ -r /Path/to/ncRNA/gff_files
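
The flag combinations above can be assembled programmatically when the pipeline is driven from another script. The following is a minimal sketch, not part of this repository: `build_pipeline_command` and its parameter names are hypothetical, and it assumes that hmmer mode always needs a `-t` value and that `-f` always needs `-p`, as the usage text suggests.

```python
from typing import List, Optional


def build_pipeline_command(
    mode: str = "diamond",
    hmmer_db: Optional[str] = None,
    phaster: bool = False,
    genomes_dir: Optional[str] = None,
    ncrna_gff_dir: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for ./pipeline.sh from the documented flags."""
    if mode not in ("diamond", "hmmer"):
        raise ValueError("mode must be 'diamond' or 'hmmer'")
    cmd = ["./pipeline.sh", "-d", mode]
    if mode == "hmmer":
        # Assumption: hmmer mode is always paired with -t (e.g. gproNOG).
        if hmmer_db is None:
            raise ValueError("hmmer mode needs a -t database such as gproNOG")
        cmd += ["-t", hmmer_db]
    if phaster:
        # -p is documented as required when PHASTER is enabled.
        if genomes_dir is None:
            raise ValueError("-f (PHASTER) requires -p <path to query genomes>")
        cmd.append("-f")
    if genomes_dir is not None:
        cmd += ["-p", genomes_dir]
    if ncrna_gff_dir is not None:
        cmd += ["-r", ncrna_gff_dir]
    return cmd
```

The returned list could then be passed to `subprocess.run(cmd, cwd="final", check=True)`, since the script is meant to be run from the final directory.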

Inputs and Outputs

  Inputs: Automated. The pipeline takes nucleotide sequence files and translated protein files from Gene Prediction, and genome files from Genome Assembly.
  Outputs: GFF files for each tool (divided by SRR)
           Merged GFF annotation file for all tools
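
To illustrate what a merged annotation file involves, here is a minimal sketch of combining per-tool GFF files into one sorted list of feature lines. It is not the merge step used by pipeline.sh: the function name `merge_gff` is hypothetical, and the `##gff-version 3` header is an assumption about the GFF flavor.

```python
from pathlib import Path
from typing import Iterable, List


def merge_gff(paths: Iterable[str]) -> List[str]:
    """Merge feature lines from several GFF files into one sorted list of lines."""
    features = []
    for path in paths:
        for line in Path(path).read_text().splitlines():
            # Skip blank lines and comment/directive lines (e.g. ##gff-version).
            if not line.strip() or line.startswith("#"):
                continue
            cols = line.split("\t")
            if len(cols) != 9:
                continue  # not a well-formed 9-column GFF feature line
            features.append(cols)
    # Order by sequence ID, then start, then end coordinate.
    features.sort(key=lambda c: (c[0], int(c[3]), int(c[4])))
    # Assumption: the merged file is written as GFF3.
    return ["##gff-version 3"] + ["\t".join(c) for c in features]
```

Sorting by sequence ID and coordinates keeps annotations from different tools for the same region next to each other, which makes the merged file easier to inspect.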

Directory Structure

scripts (additional set of scripts)
- phaster2gff.sh
- run_phaster.sh
- pilercr2gff.pl
- pilercr.sh
- run_operon.pl
- genbankonly.sh
- prokka.sh
- rgi2gff.py
- Clustering_graph.R
- changecol_operons.pl
- changecol_vf.pl
- converttogff_virfact.py
- converttogff_operons.py
- virulenceFactors.sh
README.md
final_results
temp
test
final
- scripts (scripts that run the tools; called by pipeline.sh)
- - append_headers.pl
- - run_operon.pl
- - run_eggnog.sh
- - sort_by_length.pl~
- - reput.pl~
- - reput.pl
- - sort_by_length.pl
- - virulencemapper.pl
- - virulenceFactors.sh
- - mapper_proteins.pl~
- - mapper_proteins.pl
- - run_eggnog.sh~
- - run_rgi.py~
- - rgi2gff.py
- - cluster_meta.pl
- - eggnog_to_gff.pl~
- - eggnog_to_gff.pl
- - converttogff_operons.py
- - changecol_operons.pl
- - database
- - - finalGenes.fas
- - - finalGenes.fas.nhr
- - - finalGenes.fas.nin
- - - finalGenes.fas.nsq
- - run_rgi.py
- - changecol_vf.pl
- - converttogff_virfact.py
- clusters (files created when pipeline.sh starts)
- - assembled100_proteins_nucl.uc
- - assembled100_nucl.fna
- - assembled97_nucl.uc
- - assembled97_nucl.fna
- - assembled100_proteins.uc
- - assembled100_proteins.faa
- - assembled97_proteins.uc
- - assembled97_proteins.faa
- pipeline.sh (master script; run this to perform annotation)
- run_rgi2.py~
- run_rgi2.py
- blastp
- pipeline.sh~
- tools (dependencies for scripts)
- - usearch
- - rgi
- - blastdb_aliastool
- - blastdbcheck
- - blastdbcmd
- - blast_formatter
- - blastn
- - blastp
- - blastx
- - convert2blastmask
- - deltablast
- - dustmasker
- - legacy_blast.pl
- - makeblastdb
- - makembindex
- - makeprofiledb
- - psiblast
- - rpsblast
- - rpstblastn
- - segmasker
- - tblastn
- - tblastx
- - update_blastdb.pl
- - windowmasker
- tmp
- - placeholderfile.txt
- tool_gff (output for each individual tool, post-mapping)
- - rgi
- - phaster
- - VFDB
- - DOOR
- - rfam
- - Pilercr
- - lipoP
- - eggNOG

Installation & Dependencies

  biopython
  diamond (bundled with eggnog-mapper; make sure diamond is called only from the bin folder of eggnog-mapper and not from any other location)
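
One way to confirm which diamond binary will be picked up is to compare the executable found on PATH against the eggnog-mapper bin folder. The sketch below is an illustration only: the helper names are hypothetical, and the bin folder location must be supplied by the user because it depends on the local installation.

```python
import os
import shutil
from typing import Optional


def located_in(executable: Optional[str], expected_dir: str) -> bool:
    """Return True if the executable sits directly inside expected_dir."""
    if executable is None:
        return False  # nothing was found on PATH
    exe_dir = os.path.dirname(os.path.abspath(executable))
    return exe_dir == os.path.abspath(expected_dir)


def diamond_from_eggnog(eggnog_bin_dir: str) -> bool:
    """Check whether the `diamond` found on PATH lives in the given bin folder."""
    return located_in(shutil.which("diamond"), eggnog_bin_dir)
```

Calling `diamond_from_eggnog("/path/to/eggnog-mapper/bin")` before starting the pipeline gives an early warning when a different diamond installation comes first on PATH.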
