Chr21 simulation experiment on AWS
Glenn Hickey edited this page Jun 8, 2018 · 9 revisions
This sequence of commands constructs graphs, simulates reads, and generates mapping and calling ROC plots for chromosome 21 of sample HG00096. To use a different chromosome or sample (from 1kg or HG002), replace each occurrence of 21 and HG00096 in the commands below.
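Since the chromosome and sample names recur in several arguments, one way to keep the substitutions consistent is to define them once as shell variables. This is a minimal sketch: the variable names (`CHROM`, `SAMPLE`, `OUT_STORE`, and the derived ones) are hypothetical and not part of toil-vg, and the derived values simply follow the naming patterns visible in the commands on this page.

```shell
# Hypothetical helper variables; the names are illustrative, not part of toil-vg.
CHROM=21
SAMPLE=HG00096
OUT_STORE=my-out-store

# Values that recur in the commands on this page, rebuilt from the settings above.
# The patterns are copied from the chr21/HG00096 commands; whether they hold for
# every chromosome and sample is an assumption.
HAPLO_GRAPH="s3://${OUT_STORE}/baseline_${SAMPLE}_haplo"
TEST_GRAPH="snp1kg_${CHROM}"
TRUTH_VCF="s3://${OUT_STORE}/ALL.chr${CHROM}.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes_${SAMPLE}.vcf.gz"

echo "$HAPLO_GRAPH"
echo "$TEST_GRAPH"
echo "$TRUTH_VCF"
```

The variables can then be used in place of the literal values, for example `--chroms "$CHROM" --sample "$SAMPLE"`.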
#Create an EC2 leader node on which to run all toil-vg commands:
scripts/create-ec2-leader.sh leader my-keypair-name
#Construct thread graphs from which to simulate reads
scripts/construct-hs37d5-ec2.py my-job-store my-out-store --leader leader --chroms 21 --sample HG00096 --haplo_graph --xg --out_name baseline
#Construct test graphs to use for mapping
scripts/construct-hs37d5-ec2.py my-job-store my-out-store --leader leader --chroms 21 --control HG00096 --gcsa --xg
#Simulate ~50X coverage reads from the thread graphs
scripts/sim-ec2.py my-job-store my-out-store s3://my-out-store/baseline_HG00096_haplo 6500000 --leader leader
#Simulate ~50X coverage reads from the thread graphs with no errors
scripts/sim-ec2.py my-job-store my-out-store s3://my-out-store/baseline_HG00096_haplo 6500000 --leader leader --sim_opts "-p 570 -v 65 -S 0 -i 0 -I"
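When simulating a different chromosome, the read count needs to be scaled to keep coverage comparable. The arithmetic can be sketched as below. The function `coverage_x` is a hypothetical helper, not part of toil-vg, and the numbers in the example call are illustrative only, not chr21 values. Whether the count passed to sim-ec2.py refers to single reads or read pairs is not stated on this page, so check that before plugging in real numbers.

```shell
# coverage_x NUM_READS READ_LEN REGION_LEN
# Hypothetical helper: prints approximate fold coverage, rounded down,
# as total sequenced bases divided by the region length.
coverage_x() {
  echo $(( $1 * $2 / $3 ))
}

# Illustrative numbers only: 1,000,000 reads of 100 bp over a 10 Mb region.
coverage_x 1000000 100 10000000
```

For paired reads, pass twice the number of pairs as `NUM_READS`.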
#Run mapping evaluation on the error-free reads
scripts/mapeval-ec2.py my-job-store my-out-store snp1kg_21 baseline_HG00096_21_HG00096_haplo_sim_6.5M_trained-p570-v65-S0-i0-I.gam --leader leader --fasta s3://cgl-pipeline-inputs/vg_cgl/HS37D5/HS37D5_chr21.fa --names primary HG00096 --outname trained_no_error
#Run calling evaluation on the error-free reads
scripts/calleval-ec2.py my-job-store my-out-store snp1kg_21 --leader leader --fasta s3://cgl-pipeline-inputs/vg_cgl/HS37D5/HS37D5_chr21.fa --truth s3://my-out-store/ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes_HG00096.vcf.gz --chroms 21 --names primary HG00096 --outname trained_570_65
#Terminate the leader
toil destroy-cluster leader