Currently available Phanta databases

Each database should include the following files:

Kraken2 database:

  1. hash.k2d
  2. taxo.k2d
  3. opts.k2d
  4. seqid2taxid.map

Bracken databases (built for use with various read lengths N):

  1. databaseNmers.kmer_distrib

Additional files required for the pipeline to run:

  1. inspect.out
  2. taxonomy/nodes.dmp
  3. taxonomy/names.dmp
  4. library/species_genome_size.txt

For use with post-processing scripts:

  1. host_prediction_to_genus.tsv
  2. species_name_to_vir_score.txt
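
The file lists above can be turned into a quick sanity check before running the pipeline. The following is a minimal sketch, not part of Phanta itself: the function name `missing_database_files` and its `read_length` parameter are hypothetical, and it assumes that every listed file sits at the path shown, relative to the top-level database directory.

```python
from pathlib import Path

# File names copied from the lists above. Paths are assumed to be
# relative to the top-level database directory (an assumption).
REQUIRED_FILES = [
    # Kraken2 database
    "hash.k2d",
    "taxo.k2d",
    "opts.k2d",
    "seqid2taxid.map",
    # Additional files required for the pipeline to run
    "inspect.out",
    "taxonomy/nodes.dmp",
    "taxonomy/names.dmp",
    "library/species_genome_size.txt",
    # Files used by the post-processing scripts
    "host_prediction_to_genus.tsv",
    "species_name_to_vir_score.txt",
]


def missing_database_files(db_dir, read_length=None):
    """Return the expected files that are absent from db_dir (hypothetical helper)."""
    root = Path(db_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]

    if read_length is None:
        # No read length given: accept any Bracken distribution file.
        if not any(root.glob("database*mers.kmer_distrib")):
            missing.append("databaseNmers.kmer_distrib")
    else:
        # A specific read length N must have its own distribution file.
        name = f"database{read_length}mers.kmer_distrib"
        if not (root / name).is_file():
            missing.append(name)

    return missing
```

For example, `missing_database_files("/path/to/database", read_length=150)` (the path is a placeholder) returns an empty list when everything needed for 150 bp reads is present, and otherwise lists the missing entries.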

Note: Phanta was developed with human gut metagenomes in mind, and its default database was built from human-gut viral and bacterial genomes. If you wish to apply Phanta to metagenomes from other environments, you will probably need to supply a custom database. In that case, please open a new discussion so we can work out the best way to help or collaborate.

The complete tar.gz archive should be about 20-25 GB, depending on the exact version.

Version 1