hisat2
real_read_experiments
README.md
assess_mappability.py
combine_reads.py
giraffe_accuracy_kubernetes
giraffe_accuracy_primary_kubernetes
giraffe_control_accuracy_kubernetes
giraffe_gbwts_accuracy_kubernetes
giraffe_speed.sh
giraffe_yeast_experiment.sh
graphaligner_accuracy_kubernetes
graphaligner_speed.sh
hisat2_accuracy_kubernetes
hisat2_speed.sh
linear_mappers_accuracy_kubernetes
linear_mappers_speed.sh
map_accuracy_kubernetes
map_accuracy_primary_kubernetes
map_speed.sh

README.md

Graph Mapping Scripts

These scripts run Giraffe and competing mappers so that their speed and accuracy can be assessed.

To run them, you will need a machine with the aws command configured.

You will also need kubectl set up to access a Kubernetes cluster which has a service account named vg-svc and a secret named shared-s3-credentials with the contents of a ~/.aws directory that grants single-factor access to AWS. You may need to adjust the scripts to reference a different service account or secret name if you cannot provide resources with these names. The Kubernetes cluster will need to be able to fulfil requests for up to 24 cores, 250 Gi of memory, and 300 Gi of ephemeral storage.
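
Before launching anything, it can help to confirm that the cluster exposes the resources named above. The following is a minimal sketch and is not part of the original scripts: the function name check_k8s_prereqs is hypothetical, and it assumes the service account and secret live in the namespace of your current kubectl context. It does not check the cluster's core, memory, or storage capacity.

```shell
# Hypothetical preflight helper (not part of the original scripts).
# Checks that the service account and secret named in this README exist
# in the namespace of the current kubectl context.
check_k8s_prereqs() {
    local service_account="${1:-vg-svc}"
    local secret="${2:-shared-s3-credentials}"
    local status=0

    if ! kubectl get serviceaccount "$service_account" >/dev/null 2>&1; then
        echo "Missing service account: $service_account" >&2
        status=1
    fi
    if ! kubectl get secret "$secret" >/dev/null 2>&1; then
        echo "Missing secret: $secret" >&2
        status=1
    fi
    return "$status"
}

# Example: check_k8s_prereqs && echo "Cluster prerequisites look OK"
```

If you use different resource names, pass them as the first and second arguments.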

These scripts expect to write to the s3://vg-k8s bucket. You will need to have write access to this bucket, or you will need to replace these writes with writes to a bucket where you intend to store your generated artifacts.
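
If you need to redirect the writes, one possible approach is to produce edited copies of the scripts rather than changing them in place. This is only a sketch: the function name retarget_bucket is hypothetical, and it assumes the bucket appears in the scripts as the literal text s3://vg-k8s, which you should confirm (for example with grep) before relying on the copies.

```shell
# Hypothetical helper (not part of the original scripts).
# Writes a copy of a script with every literal "s3://vg-k8s" replaced by
# another bucket. Assumes the bucket appears in exactly that literal form.
# Note: a longer bucket name that starts with "vg-k8s" would also match.
retarget_bucket() {
    local src="$1"         # script to copy
    local dst="$2"         # where to write the edited copy
    local new_bucket="$3"  # bucket name, without the s3:// prefix

    sed "s|s3://vg-k8s|s3://${new_bucket}|g" "$src" > "$dst"

    # Keep the copy executable if the original was.
    if [ -x "$src" ]; then
        chmod +x "$dst"
    fi
}

# Example: retarget_bucket giraffe_speed.sh giraffe_speed.mine.sh my-bucket
```

Here my-bucket is a placeholder for a bucket you can write to.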

The yeast graph experiment script expects to read files from the mount point /public/groups/cgl/users/daheller/yeast_graph. You will need to provide this mount point, with the relevant files from the data archive, or else edit the script to reference a different local path for these files.

The folder hisat2/ contains scripts for creating indexes, mapping reads and evaluating alignments for HISAT2.

The folder real_read_experiments/ contains scripts for running linear and graph mappers against HG002 and HG003 real readsets.

Running the scripts

All the Kubernetes scripts can be run at the same time:

./*_kubernetes

Results will be available when the jobs they create finish.
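
Note that in common shells a glob used as a command expands into a single command line, so the first matching script is executed with the remaining matches passed as its arguments. One way to start each script separately is a loop. This is a sketch, not part of the original scripts: the function name run_all_kubernetes_scripts is hypothetical, and it assumes the scripts are executable files in the given directory.

```shell
# Hypothetical helper (not part of the original scripts).
# Starts every *_kubernetes script in a directory as a background job,
# then waits for all of the scripts to return.
run_all_kubernetes_scripts() {
    local dir="${1:-.}"
    local script
    for script in "$dir"/*_kubernetes; do
        # Skip the unexpanded pattern when nothing matches, and any
        # match that is not an executable regular file.
        [ -f "$script" ] && [ -x "$script" ] || continue
        echo "Starting $script" >&2
        "$script" &
    done
    wait
}

# Example: run_all_kubernetes_scripts .
```

The wait only covers the scripts themselves. The Kubernetes jobs they create may still be running after the function returns.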

The speed scripts expect to run on standard-sized AWS instances, as described in the paper, so that results are comparable between them. To run them, transfer each script to an AWS instance of that type that has aws, the version of vg under test, and the other mappers under test installed, and that has access to the appropriate output data bucket. Then run the script from a directory with sufficient fast storage.

These scripts test the speed of map and giraffe in vg:

./map_speed.sh

./giraffe_speed.sh

This script tests the speed of GraphAligner:

./graphaligner_speed.sh

This script tests the speed of hisat2:

./hisat2_speed.sh

This script tests the linear genome mappers bowtie2, minimap2, and bwa mem, all of which must be installed:

./linear_mappers_speed.sh
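
Since every mapper must be installed before the script starts, a quick check of PATH can save a failed run. The following is a minimal sketch and is not part of the original scripts: the function name require_commands is hypothetical, and it only checks that each command can be found, not which version is installed.

```shell
# Hypothetical helper (not part of the original scripts).
# Reports every named command that cannot be found on PATH and
# returns non-zero if any are missing.
require_commands() {
    local missing=0
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "Not found on PATH: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: require_commands aws bowtie2 minimap2 bwa && ./linear_mappers_speed.sh
```

The same check can be used before the other speed scripts by listing the commands each one needs.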

The yeast mapping experiment script doesn't depend on any of the others. If it is run after the yeast graph build script, it re-uses the constructed graphs stored in local files; otherwise it constructs them itself. Run it like this:

./giraffe_yeast_experiment.sh

None of the .py scripts need to be run; they are used by other scripts or not at all.
