-
Notifications
You must be signed in to change notification settings - Fork 1
mpi2/crisprcon
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
######################################## # CrisprCon # # Crispr/Cas allele analysis. # # EMBL-EBI # ######################################## # PROJECT URLs: https://github.com/mpi2/crisprcon http://www.mousephenotype.org/ # CONTACT: International Mouse Phenotyping Consortium (IMPC): [email protected] Mouse Informatics Project leader: Terry Meehan ([email protected]) ############ # CONTENTS # ############ 1. General Information 2. Variant calling (crispFa) 3. gRNA consequence prediction (crispRNA) 4. References # Last updated: June 2018 ########################## # 1. General Information # ########################## # Reference genome All variant calls are relative to a reference genome, e.g: C57BL/6J (GRCm38). An example of the last major release of the reference genome can be found here: ftp://ftp.ncbi.nlm.nih.gov/genomes/archive/old_genbank/Eukaryotes/vertebrates_mammals/Mus_musculus/GRCm38/ # Annotation file A GFF file is used for the consequence prediction of the variants. An example of a GFF file can be found here: ftp://ftp.ensembl.org/pub/release-92/gff3/mus_musculus/Mus_musculus.GRCm38.92.gff3.gz # Software requirements CrisprCon requires that BWA, SAMtools, BCFtools and BEDtools are in the path. BCFtools csq runs with the option --force. Therefore, a version of BCFtools higher or equal than 1.9 is required. Usage: ./crisprcon.py [-h] {crispFa,crispRNA} crispFa Variant caller for CRISPR edited alleles crispRNA gRNA consequence predictor. It takes the coordinates of gRNA and it outputs a VCF with the predicted consequences ##################################### # 2. Variant calling (crispFa) # ##################################### Usage: ./crisprcon.py crispFa [-h] [-i INPUT_FA] [-o OUTPUT_VCF] [-f REF_GENOME] [-r COORDS] [-g GFF] [-d OUT_DIR] [-t TEMP_DIR] [-p PREFIX] [-c GRNA GRNA] Positional arguments: -i INPUT_FA, --input INPUT_FA Input FASTA. -o OUTPUT_VCF, --output OUTPUT_VCF Output VCF file. -f REF_GENOME, --fasta-ref REF_GENOME Reference file in fasta format (must be .fai indexed) -r COORDS, --region COORDS Coordinates of CRISPR/Cas target region (ie: chr:start-end) -g GFF, --gff-annot GFF gff3 annotation file Optional arguments: -d OUT_DIR, --dir OUT_DIR Optional: Directory for the output files. If not present, it will be created. Default: current directory -t TEMP_DIR, --temp TEMP_DIR Optional: Directory for temporary and intermediate files. Default: creates "temp" directory at the output directory -p PREFIX, --prefix PREFIX Optional: Prefix for temporary files. Default: input_name -c GRNA GRNA, --crispr-grna GRNA GRNA Optional: gRNA pair coordinates used for CRISPR. Advised for validation of indels and potential false positives. Use different flags for more than one pair. Eg: -c 72206978 72207000 -c 72206982 72207004 -h, --help Show this help message and exit ############################################## # 3. gRNA consequence prediction (crispRNA) # ############################################## Usage: crispCon crispRNA [-h] [-o OUTPUT_VCF] [-f REF_GENOME] [-g GFF] [-c GRNA GRNA] [-s DONOR] [-d OUT_DIR] [-t TEMP_DIR] [-p PREFIX] Positional arguments: -o OUTPUT_VCF, --output OUTPUT_VCF Output VCF file. -f REF_GENOME, --fasta-ref REF_GENOME Reference file in fasta format (must be .fai indexed) -g GFF, --gff-annot GFF gff3 annotation file -c GRNA GRNA, --crispr-grna GRNA GRNA gRNA pair coordinates and sequence used for CRISPR (sequence chr:start-end). Use different flags for more than one pair. Eg: -c GATTCTGGCATCATCTATGTGGG 1:72206978-72207000 -c GCTTCTGCCAATATCTATTTGGG 1:72206982-72207004 -s DONOR, --seq-donor DONOR Sequence of the oligo donor Optional arguments: -d OUT_DIR, --dir OUT_DIR Optional: Directory for the output files. If not present, it will be created. Default: current directory -t TEMP_DIR, --temp TEMP_DIR Optional: Directory for temporary and intermediate files. Default: creates "temp" directory at the output directory -p PREFIX, --prefix PREFIX Optional: Prefix for temporary files. Default: input_name -h, --help Show this help message and exit ################# # 5. References # ################# Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. Epub 2009 Jun 8. PubMed PMID: 19505943; PubMed Central PMCID: PMC2723002. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R; 1000 Genomes Project Analysis Group. The variant call format and VCFtools. Bioinformatics. 2011 Aug 1;27(15):2156-8. Epub 2011 Jun 7. PubMed PMID: 21653522; PubMed Central PMCID: PMC3137218. Petr Danecek, Shane A McCarthy; BCFtools/csq: haplotype-aware variant consequences, Bioinformatics, Volume 33, Issue 13, 1 July 2017, Pages 2037–2039. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul 15;25(14):1754-60. Epub 2009 May 18. PubMed PMID: 19451168; PubMed Central PMCID: PMC2705234. Aaron R. Quinlan, Ira M. Hall; BEDTools: a flexible suite of utilities for comparing genomic features, Bioinformatics, Volume 26, Issue 6, 15 March 2010, Pages 841–842.
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published