Commit

add more explanation on open tree of life
AquaAuma committed Jul 20, 2023
1 parent d935bd8 commit cec730c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion data_processing/README.md
@@ -10,7 +10,7 @@ COL is a biodiversity data aggregator with the more recent aim of proposing one
The National Center for Biotechnology Information (NCBI, https://www.ncbi.nlm.nih.gov/) hosts the most extensive public genetic information across taxonomic groups worldwide. The NCBI Taxonomy comprises names and ranks for all organisms represented by genetic sequence data within the NCBI database. We downloaded the relevant taxonomies in November 2021. For each of our nine groups, we downloaded the entire taxonomy under the taxonomic ID of the highest taxonomic rank encompassing that group.

## Phylogenetics
The phylogenetic tree hosted by OneZoom on their Tree of Life Explorer (http://www.onezoom.org/) covers all taxa selected for the analysis. We downloaded v.3.3 of this taxonomy (Open Tree of Life reference taxonomy version 3.3).
The phylogenetic tree hosted by OneZoom on their Tree of Life Explorer (http://www.onezoom.org/) covers all taxa selected for the analysis. We downloaded v.3.3 of this taxonomy (Open Tree of Life reference taxonomy version 3.3). Because of the way the data are structured, we had to select, for each group of interest, the highest taxonomic rank that includes all relevant names. Unfortunately, some of these high-rank names are not formally recognized in the field, which made interoperability more challenging. In those cases we had to search for lower-ranked names to cover the entire group, which may have led to higher rates of mismatch with this database than with the others.
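
The selection step described above can be illustrated with a minimal Python sketch. It is not the code used for this project: the function name `subtree_names`, the `(uid, parent_uid, name)` row layout, and the toy taxa are all assumptions made for illustration. It collects every name under one or more root names and reports root names that are absent from the taxonomy, which is the situation where lower-ranked names have to be searched instead.

```python
from collections import defaultdict, deque


def subtree_names(rows, root_names):
    """Return (names found under any root, root names not in the taxonomy).

    rows: iterable of (uid, parent_uid, name) tuples (hypothetical layout).
    root_names: a single high-rank name, or several lower-rank names that
    together cover a group when no single recognized name exists.
    """
    children = defaultdict(list)      # parent uid -> list of child uids
    name_of = {}                      # uid -> name
    uids_by_name = defaultdict(list)  # name -> uids (names may be homonyms)
    for uid, parent_uid, name in rows:
        children[parent_uid].append(uid)
        name_of[uid] = name
        uids_by_name[name].append(uid)

    found, missing = set(), []
    for root in root_names:
        if root not in uids_by_name:
            missing.append(root)      # name is not recognized in this taxonomy
            continue
        queue = deque(uids_by_name[root])
        while queue:                  # breadth-first walk of the subtree
            uid = queue.popleft()
            found.add(name_of[uid])
            queue.extend(children[uid])
    return found, missing


# Toy taxonomy, simplified and purely illustrative.
toy = [
    (1, None, "Chordata"),
    (2, 1, "Actinopterygii"),
    (3, 1, "Chondrichthyes"),
    (4, 2, "Gadus morhua"),
    (5, 3, "Squalus acanthias"),
    (6, 1, "Mammalia"),
    (7, 6, "Balaena mysticetus"),
]

# "Pisces" stands in for an informal group name with no node of its own,
# so the group is covered by two lower-ranked names instead.
names, missing = subtree_names(toy, ["Pisces", "Actinopterygii", "Chondrichthyes"])
print(sorted(names))
print(missing)
```

Passing several lower-ranked roots in one call keeps the fallback explicit, and the list of missing names gives a simple record of which group labels could not be matched directly.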

## Global spatial data
The Global Biodiversity Information Facility (GBIF, https://www.gbif.org/) was selected as the source of spatial occurrence point data. For each taxonomic group, we queried the list of species names that have occurrence data. This search excluded fossil species. [add date]
