Scripts and data for reproducing MLA's evaluation

Software used

Running MLA

Simulating reads

Software for parsing read simulation outputs and alignment results

  • bbmap (v38.86)
  • samtools (experiments were run with v1.17)

Sequence-to-graph alignment tools for comparison

Additional software for evaluating alignment results

Data used for evaluations

Setting up the environment

  1. To install most of the required software for the evaluation, set up a conda environment using the provided environment.yml file: conda env create -f environment.yml
  2. Activate the environment: conda activate mla
  3. Download the pre-compiled binaries of metagraph, PLAST, kmc, and parse_plast from here and place them in your working directory.
  4. Download the genomes from here to a directory named references. The list of accessions is in accession_list.
  5. Download the random entropy source file seed2 (used to generate the query sets).
  6. Download the accession-ID augmented taxonomic tree from the augmented directory here.
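
Before moving on, it can help to confirm that the downloads from the steps above ended up where the later scripts expect them. The following is only a minimal sketch: the function name check_setup is hypothetical, and the file names are assumed to match the names used in the list above (the downloaded binaries may be named differently on disk).

```shell
#!/usr/bin/env bash
# Sketch: report which setup prerequisites are missing from a working directory.
# check_setup is a hypothetical helper, not part of the provided scripts.
# The file names are assumptions taken from the setup steps; adjust as needed.

check_setup() {
    local workdir="${1:-.}"
    local missing=0
    local item

    # Pre-compiled binaries expected in the working directory (step 3).
    for item in metagraph PLAST kmc parse_plast; do
        if [ ! -x "$workdir/$item" ]; then
            echo "missing executable: $item"
            missing=1
        fi
    done

    # Directory holding the downloaded genomes (step 4).
    if [ ! -d "$workdir/references" ]; then
        echo "missing directory: references"
        missing=1
    fi

    # Entropy source used to generate the query sets (step 5).
    if [ ! -f "$workdir/seed2" ]; then
        echo "missing file: seed2"
        missing=1
    fi

    # Return 0 when everything was found, 1 otherwise.
    return "$missing"
}
```

Calling check_setup . from the working directory prints one line per missing item and returns a non-zero status if anything is absent.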

Constructing a simulated joint assembly graph and the query sets

  1. Simulate reads for each genome:
    for a in references/*.fa; do ./make_sample.sh "$a"; done
    for a in illumina hifi clr ont; do ./make_subset.sh "$a"; done
    
  2. Build the MetaGraph, PLAST, and GFA indexes by running ./make_graph.sh. The number of threads is set by the variable $NTHREADS inside this script. We provide precomputed indexes for MetaGraph here and for PLAST here.
  3. Generate query reads by running ./make_subset.sh
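
When choosing a value for NTHREADS, one option is to default to the number of available cores. The helper below is a hypothetical sketch, not part of the provided scripts: it only prints a suggested thread count, which you would then write into make_graph.sh. Whether make_graph.sh picks up NTHREADS from the environment is not stated here, so that is not assumed.

```shell
#!/usr/bin/env bash
# Sketch: suggest a thread count for the NTHREADS variable in make_graph.sh.
# default_threads is a hypothetical helper name.

default_threads() {
    if [ -n "${NTHREADS:-}" ]; then
        # Respect a value the user has already chosen.
        echo "$NTHREADS"
    elif command -v nproc >/dev/null 2>&1; then
        # nproc (GNU coreutils) prints the number of available processing units.
        nproc
    else
        # Conservative fallback when nproc is unavailable.
        echo 1
    fi
}
```

For example, echo "NTHREADS=$(default_threads)" prints a line that can be copied into the script.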

Run the alignments and classify the reads

for a in query_reads/*.fa; do
    ./map_query_sca.sh "$a"
    ./map_query_mla.sh "$a"
    ./map_query_plast.sh "$a"
    ./map_query_ga.sh "$a"
done
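
The loop above continues silently if one of the mapping scripts fails. As a hedged variant, the sketch below runs the same four scripts per query file but reports every failure and returns a non-zero status if any occurred. The function name run_alignments and its two optional arguments (script directory and query directory) are illustrative assumptions; the script names are the ones used above.

```shell
#!/usr/bin/env bash
# Sketch: run all four mapping scripts on every query file and report failures.
# run_alignments is a hypothetical wrapper, not part of the provided scripts.

run_alignments() {
    local script_dir="${1:-.}"            # where the map_query_*.sh scripts live
    local query_dir="${2:-query_reads}"   # directory containing the query .fa files
    local failed=0
    local query tool

    for query in "$query_dir"/*.fa; do
        # Skip the unexpanded pattern when no .fa files are present.
        [ -e "$query" ] || continue
        for tool in sca mla plast ga; do
            if ! "$script_dir/map_query_${tool}.sh" "$query"; then
                echo "FAILED: map_query_${tool}.sh $query" >&2
                failed=1
            fi
        done
    done

    # Return 0 only if every script succeeded on every query file.
    return "$failed"
}
```

Running run_alignments with no arguments from the working directory is equivalent to the loop above, with failures listed on standard error.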

Notes

  • To compile some of the software from source yourself:
    • parse_plast.cpp: we have provided a Makefile.
    • PLAST: we have provided a template CMakeLists.txt file in plast_cmake. Please edit it to point to the include directory and static library of Bifrost.
    • MetaGraph (with packaged kmc): follow the instructions here. Remember to check out the commit listed above from the hm/aln_alt_label_change branch before compiling.
  • Run metagraph align to view the help menu listing alignment parameters. To list a more advanced set of parameters, run metagraph align --advanced.