Team 1 - Functional Annotation

Scripts in our GitHub repo:

door2.pl -- script used to run door2
extractSequences.py -- script which extracts sequences from a fasta file based on the headers given in a second file
functionalAnnotationPipeline.sh -- the final pipeline
getFastaHeaders.sh -- script which extracts all headers from a fasta file and stores them in a new file
outputParser.py -- script which parses the output of all tools, uClust and the original GFFs to create new GFFs with annotations
parseUclustOutput.py -- script which reads in the .uc file generated from uClust and creates an index file and a sizes file
pilerCr.sh -- script used to run pilerCR (not included in final pipeline)
reformatFasta.py -- script which changes the gene names in the fasta file, reformats the file so that each sequence is on one line, and prepends the SRR ID to the gene name
reformatGff.py -- script which changes column 1 of the GFF to the gene name and prepends the SRR ID to the gene name
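
The fasta handling described above (joining each sequence onto one line, selecting records by header, and putting the SRR ID in front of the gene name) can be sketched roughly as follows. This is an illustration only, not the code in extractSequences.py or reformatFasta.py: the function names, the exact-match rule for headers, the "_" separator and the placeholder ID "SRR0000000" are all assumptions.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs, joining multi-line sequences into one line."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_sequences(fasta_lines, wanted_headers):
    """Keep only records whose header exactly matches one of wanted_headers (assumed rule)."""
    wanted = set(wanted_headers)
    return [(h, s) for h, s in read_fasta(fasta_lines) if h in wanted]


def add_prefix(records, srr_id, sep="_"):
    """Put the SRR ID in front of each gene name; the separator is an assumption."""
    return [(f"{srr_id}{sep}{h}", s) for h, s in records]


if __name__ == "__main__":
    # Tiny made-up input; real input would come from a fasta file and a headers file.
    fasta = [">gene1", "ATGC", "GGTA", ">gene2", "TTAA"]
    for header, seq in add_prefix(extract_sequences(fasta, ["gene1"]), "SRR0000000"):
        print(f">{header}")
        print(seq)
```

In practice the header list would be something like the output of getFastaHeaders.sh, read one header per line.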

Other files in our GitHub repo:

VFDB_setB_nt.fas
kleb_all.opr
kleb_gid.txt
kop_final.table
protein_fasta_protein_homolog_model.fasta

These are the default databases used for door2, CARD and VFDB.
