Currently available Phanta databases

Each database should include the following files:

Kraken2 database:

hash.k2d
taxo.k2d
opts.k2d
seqid2taxid.map

Bracken databases (built for use with various read lengths N):

databaseNmers.kmer_distrib

Additional files required for pipeline to run:

inspect.out
taxonomy/nodes.dmp
taxonomy/names.dmp
library/species_genome_size.txt

For use with post-processing scripts:

host_prediction_to_genus.tsv
species_name_to_vir_score.txt
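
As a quick sanity check after unpacking a database, the file list above can be turned into a small script. This is only a sketch and not part of Phanta itself: the function name `missing_database_files` is hypothetical, and it assumes the listed files sit directly under the database directory with the relative paths shown above.

```python
from pathlib import Path

# File names copied from the list above; paths are assumed to be
# relative to the top of the unpacked database directory.
REQUIRED_FILES = [
    "hash.k2d",
    "taxo.k2d",
    "opts.k2d",
    "seqid2taxid.map",
    "inspect.out",
    "taxonomy/nodes.dmp",
    "taxonomy/names.dmp",
    "library/species_genome_size.txt",
    "host_prediction_to_genus.tsv",
    "species_name_to_vir_score.txt",
]


def missing_database_files(db_dir, read_length=None):
    """Return the expected files that are absent from db_dir (hypothetical helper)."""
    db_dir = Path(db_dir)
    missing = [name for name in REQUIRED_FILES if not (db_dir / name).is_file()]
    if read_length is None:
        # No read length given: accept any Bracken k-mer distribution file.
        if not any(db_dir.glob("database*mers.kmer_distrib")):
            missing.append("database*mers.kmer_distrib")
    else:
        # Read length N given: require databaseNmers.kmer_distrib specifically.
        name = f"database{read_length}mers.kmer_distrib"
        if not (db_dir / name).is_file():
            missing.append(name)
    return missing
```

For example, `missing_database_files("/path/to/db", read_length=150)` returns an empty list when everything needed for 150 bp reads is present; the path here is a placeholder.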

Note: Phanta was developed with human gut metagenomes in mind, and its default database was built from human-gut viral and bacterial genomes. If you wish to apply Phanta to non-human-gut metagenomes, you will probably need to supply a custom database. In that case, please open a new discussion so we can work out the best way to help or collaborate.

The tar.gz file should be about 20-25 GB in total, depending on the exact version.

Version 1

Default database (as described in our manuscript): http://ab_phanta.os.scg.stanford.edu/Phanta_DBs/database_V1.tar.gz
Prophage-masked database (as described in our manuscript): http://ab_phanta.os.scg.stanford.edu/Phanta_DBs/masked_db_v1.tar.gz
Default database that uses the GTDB taxonomy for bacteria and archaea (instead of the NCBI taxonomy): http://ab_phanta.os.scg.stanford.edu/Phanta_DBs/unmasked_db_v1_gtdb.tar.gz This taxonomy is equivalent to that provided by HumGut, except that taxonomic IDs for GTDB nodes start with 5,000,000 rather than 4,000,000. See the HumGut documentation.
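
One possible way to fetch and unpack one of the archives above using only the Python standard library is sketched below. The helper names `download_archive` and `extract_archive` are hypothetical, and the layout inside the archive is not assumed: the sketch simply reports the top-level entries it finds. Since the archive is about 20-25 GB, the download is streamed to disk rather than read into memory.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# URL of the default Version 1 database, as listed above.
DEFAULT_DB_URL = "http://ab_phanta.os.scg.stanford.edu/Phanta_DBs/database_V1.tar.gz"


def download_archive(url, archive_path):
    """Stream url to archive_path in chunks (hypothetical helper)."""
    archive_path = Path(archive_path)
    with urllib.request.urlopen(url) as response, open(archive_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return archive_path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the entry names found there."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
    return sorted(p.name for p in dest_dir.iterdir())
```

A typical call would be `extract_archive(download_archive(DEFAULT_DB_URL, "database_V1.tar.gz"), "phanta_db")`, where the local file and directory names are placeholders. Make sure the destination has enough free space for both the archive and its unpacked contents.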
