1. Create an environment with one step
git clone https://github.com/Hiscox-lab/AioMinor/AioMinor.git
cd AioMinor
conda env create -f my_environment.yml
source activate AioMinor
2. Create an environment step by step
Third party dependencies:
samtools(>=1.9)
bowtie2(=2.4.1)
minimap2(=2.24)
All of these third-party tools must be on your PATH so that AioMinor can find them.
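As a quick sanity check before running AioMinor, the sketch below reports which of the required tools are not visible on PATH. The function name check_tools and the install path in the comment are illustrative assumptions, not part of AioMinor. The sketch does not check the version constraints listed above.

```shell
# Hypothetical helper (not part of AioMinor): report tools missing from PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the tool name
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all tools found"
}

# If a tool lives in a custom directory, add that directory to PATH first, e.g.
# export PATH="/path/to/tool/bin:$PATH"   # placeholder path, adjust to your system
check_tools samtools bowtie2 minimap2 || echo "Install the missing tools or add them to PATH."
```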
Perl module dependencies:
Getopt::Long
File::Basename
Data::Dumper
IO::File
Math::CDF
List::Util
Text::NSP
Bio::DB::Sam
Bio::SeqIO
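One way to confirm that the Perl modules listed above are installed is to ask perl to load each of them in turn. The sketch below does this with the standard `perl -MModule -e 1` idiom; the function name check_perl_modules is a hypothetical helper, not something shipped with AioMinor.

```shell
# Hypothetical helper (not part of AioMinor): try to load each Perl module.
check_perl_modules() {
    status=0
    for module in "$@"; do
        # perl exits non-zero if the module cannot be loaded
        if perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            status=1
        fi
    done
    return $status
}

check_perl_modules Getopt::Long File::Basename Data::Dumper IO::File \
    Math::CDF List::Util Text::NSP Bio::DB::Sam Bio::SeqIO \
    || echo "Install the modules marked MISSING before running AioMinor."
```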
To see the details of each parameter, run AioMinor with the -h/-help option:
Required options:
-platform sequencing platform "nanopore" or "illumina".
-method method of sequencing Library Preparation "amplicon" or "cDNA".
-maxins maximum fragment length in your amplicon library with paired-end sequencing.
-ref reference genome sequence in fasta file.
-codingRegion coding regions in your reference genome.
-primerbed primer scheme in bed file.
-fq fastq(.gz) file for single-end sequencing.
-fq1 fastq(.gz) file for paired-end sequencing mate 1s.
-fq2 fastq(.gz) file for paired-end sequencing mate 2s.
Optional options:
-samplename sample name, "alignment" by default.
-t/-thread number of threads, 1 by default.
-minins minimum fragment length in your amplicon library with paired-end sequencing, "50" by default.
-o output path, "./" by default.
-v/-version Print version information.
-h/-help Print help message.
1. ARTIC Illumina sequencing data
To analyse ARTIC Illumina amplicon sequencing data with your chosen primer scheme (the ARTIC primer bed files can be found in the "Primerbeds" folder, and the reference genome and coding regions of SARS-CoV-2 NC_045512.2 can be found in the "References" folder):
perl AioMinor.pl -t 16 -platform illumina -method amplicon -maxins 500 -ref genome.fasta -codingRegion CodingRegion.txt -primerbed nCoV-2019.primer_V3.bed -fq1 example_R1.fastq.gz -fq2 example_R2.fastq.gz -o example_output
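When several paired-end samples need the same settings, the command above can be generated once per sample. The sketch below only prints the commands so they can be reviewed before running. It assumes a naming convention of <sample>_R1.fastq.gz and <sample>_R2.fastq.gz with no spaces in paths, and the function name aiominor_batch_cmds and the ./fastq directory are hypothetical. Only options documented above are used.

```shell
# Print one AioMinor command per paired-end sample found in a directory.
# Assumption (not from AioMinor): files are named <sample>_R1.fastq.gz and
# <sample>_R2.fastq.gz, and paths contain no spaces.
aiominor_batch_cmds() {
    dir=$1
    for r1 in "$dir"/*_R1.fastq.gz; do
        # If nothing matches, the glob stays literal and the file does not exist
        [ -e "$r1" ] || continue
        sample=$(basename "$r1" _R1.fastq.gz)
        r2="$dir/${sample}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "skipping $sample: mate 2 file not found" >&2
            continue
        fi
        echo "perl AioMinor.pl -t 16 -platform illumina -method amplicon -maxins 500" \
             "-ref genome.fasta -codingRegion CodingRegion.txt" \
             "-primerbed nCoV-2019.primer_V3.bed" \
             "-fq1 $r1 -fq2 $r2 -samplename $sample -o ${sample}_output"
    done
}

# Review the printed commands first; pipe the output to sh to run them.
aiominor_batch_cmds ./fastq
```

Printing instead of executing keeps the sketch safe to try; each sample gets its own -samplename and output path so results do not overwrite each other.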
2. ARTIC Nanopore sequencing data
To analyse ARTIC Nanopore amplicon sequencing data with your chosen primer scheme (the ARTIC primer bed files can be found in the "Primerbeds" folder, and the reference genome and coding regions of SARS-CoV-2 NC_045512.2 can be found in the "References" folder):
perl AioMinor.pl -t 16 -platform nanopore -method amplicon -maxins 1600 -ref genome.fasta -codingRegion CodingRegion.txt -primerbed nCoV-2019.primer_V3.bed -fq example.fastq.gz -o example_output
3. Normal Illumina sequencing data
To analyse normal Illumina sequencing data (the reference genome and coding regions of SARS-CoV-2 NC_045512.2 can be found in the "References" folder):
perl AioMinor.pl -t 16 -platform illumina -method cDNA -maxins 500 -ref genome.fasta -codingRegion CodingRegion.txt -fq1 example_R1.fastq.gz -fq2 example_R2.fastq.gz -o example_output
The results can be found under the output path.
The *_entropy.txt file in this folder provides the raw nucleotide variation frequency compared to the consensus genome, including transitions and transversions.
The *_AA.txt file in this folder provides the raw amino acid variation frequency compared to the consensus genome, including synonymous and non-synonymous substitutions.
The *_filter.txt file in this folder provides the filtered nucleotide variation frequency, including transitions and transversions.
The *_AA_all_AA_filtered.txt file in this folder provides the details of minor and major amino acids at each amino acid site after filtration in 2_Syn_NonSyn_filter.
The *_AA_all_condon_filtered.txt file in this folder provides the details of minor and major codons at each amino acid site after filtration in 2_Syn_NonSyn_filter.
The *_minor_change_filtered.txt file in this folder provides the frequency of synonymous and non-synonymous substitutions compared to the consensus genome after filtration.
The consensus.txt file in this folder provides the consensus genome sequence.
Test data obtained from ARTIC (V3) Illumina sequencing of a cell culture sample is provided in the Testdata folder. AioMinor can be tested with this data from the AioMinor directory:
perl AioMinor.pl -t 16 -platform illumina -method amplicon -maxins 500 -ref ./References/genome_NC_045512.2.fasta -codingRegion ./References/CodingRegion_NC_045512.2.txt -primerbed ./Primerbeds/nCoV-2019.primer_V3.bed -fq1 ./Testdata/cell_illumina_R1.fastq.gz -fq2 ./Testdata/cell_illumina_R2.fastq.gz -o cell_illumina_output
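After the test run finishes, a rough way to confirm that it produced the result files described earlier is to search the output path for each file name pattern. The sketch below searches recursively because the exact subfolder layout is not assumed here. The function name report_outputs is a hypothetical helper, and the patterns are copied from the file descriptions in this document.

```shell
# Hypothetical helper (not part of AioMinor): check that each expected
# result file pattern matches at least one file under the output path.
report_outputs() {
    outdir=$1
    status=0
    for pattern in '*_entropy.txt' '*_AA.txt' '*_filter.txt' \
                   '*_AA_all_AA_filtered.txt' '*_AA_all_condon_filtered.txt' \
                   '*_minor_change_filtered.txt' 'consensus.txt'; do
        # find searches subfolders too, so the folder layout does not matter
        hit=$(find "$outdir" -type f -name "$pattern" 2>/dev/null | head -n 1)
        if [ -n "$hit" ]; then
            printf 'found    %s\n' "$pattern"
        else
            printf 'MISSING  %s\n' "$pattern"
            status=1
        fi
    done
    return $status
}

report_outputs cell_illumina_output || echo "Some expected result files were not found."
```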