This repository contains scripts used by Team II's comparative genomics group.
For general info, see the project's wiki page.
The repository is organized as follows:
.
├── strain.sh            # Strain determination script
├── whole_genome         # Whole genome related scripts
│   ├── clustering       # Clustering based on whole genome similarity
│   └── gwas             # Bacterial GWAS scripts and results
└── phylogeny            # Phylogeny related scripts
    ├── ksnp             # kSNP scripts and results
    └── strain_detect    # Scripts and dependencies for strain detection
To determine the strain of an assembled genome, run:
./strain.sh -i assembly.fasta
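When several assemblies need to be typed, the single-file command above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: the `run_strain_batch` function name and the `STRAIN_SH` variable are hypothetical, and it assumes the assemblies use a `.fasta` extension. Only the `-i` flag is taken from the command above.

```shell
#!/usr/bin/env bash
# Sketch: run the strain script on every FASTA file in a directory.
# STRAIN_SH and run_strain_batch are hypothetical names used for illustration;
# only the -i flag comes from the documented command.
STRAIN_SH="${STRAIN_SH:-./strain.sh}"

run_strain_batch() {
    local dir="$1" fasta
    for fasta in "$dir"/*.fasta; do
        # Skip the unexpanded glob when the directory has no .fasta files
        [ -e "$fasta" ] || continue
        echo "== $fasta"
        # Report a failure for this file and continue with the next one
        "$STRAIN_SH" -i "$fasta" || echo "strain script failed for $fasta" >&2
    done
}
```

For example, `run_strain_batch assemblies/` would process every `.fasta` file in a hypothetical `assemblies/` directory, one at a time.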
Some of the scripts might only work on the biogenome2018b.biology.gatech.edu
server, since they rely on specific files present on that server.
- All the necessary dependencies for strain.sh are present in the phylogeny/strain_detect directory.
- The database required for strain detection can be downloaded from:
  OR
  - A Klebsiella-specific database (2.5 GB) can be accessed on the biogenome2018b.biology.gatech.edu server at /projects/data/important_data/reference_DB/seeker_genomes/kleb_DB
- The database needs to be placed in the phylogeny/strain_detect/database directory.
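Before running the strain script, it can help to confirm that the database directory is actually populated. The check below is a minimal sketch, not part of the repository: the `check_strain_db` function name and the `DB_DIR` variable are hypothetical, and it only tests that the directory exists and is non-empty. It does not validate the database contents, whose layout is not described here.

```shell
#!/usr/bin/env bash
# Sketch: confirm the strain detection database directory exists and is not empty.
# DB_DIR and check_strain_db are hypothetical names; the default path is the
# directory named in this README, relative to the repository root.
DB_DIR="${DB_DIR:-phylogeny/strain_detect/database}"

check_strain_db() {
    local dir="${1:-$DB_DIR}"
    if [ ! -d "$dir" ]; then
        echo "missing database directory: $dir" >&2
        return 1
    fi
    # ls -A lists every entry except . and .., so empty output means an empty directory
    if [ -z "$(ls -A "$dir")" ]; then
        echo "database directory is empty: $dir" >&2
        return 2
    fi
    echo "database found in $dir"
}
```

Running `check_strain_db` from the repository root returns a non-zero status when the directory is missing (1) or empty (2), so it can guard a call such as `check_strain_db && ./strain.sh -i assembly.fasta`.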