From cec730c439d8e90eb12528c4db4539b605a45657 Mon Sep 17 00:00:00 2001
From: Aurore Maureaud
Date: Thu, 20 Jul 2023 08:59:38 -0400
Subject: [PATCH] add more explanation on open tree of life

---
 data_processing/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data_processing/README.md b/data_processing/README.md
index 0a6f593..a3b0128 100644
--- a/data_processing/README.md
+++ b/data_processing/README.md
@@ -10,7 +10,7 @@ COL is a biodiversity data aggregator with the more recent aim of proposing one
 The genetic data hosted by the National Center for Biotechnology Information (NCBI, https://www.ncbi.nlm.nih.gov/) offers the most public genetic information across taxonomic groups worldwide. The NCBI Taxonomy comprises names and ranks for all organisms represented by genetic sequence data within the NCBI database. We downloaded the relevant taxonomies in November 2021. To do this, we downloaded the entire taxonomy for the taxonomic id number of the highest taxonomic rank encompassing each of our nine groups.
 
 ## Phylogenetics
-The phylogenetic tree hosted by OneZoom on their Tree of Life Explorer (http://www.onezoom.org/) covers all taxa selected for the analysis. We downloaded v.3.3 of this taxonomy (Open Tree of Life reference taxonomy version 3.3).
+The phylogenetic tree hosted by OneZoom on their Tree of Life Explorer (http://www.onezoom.org/) covers all taxa selected for the analysis. We downloaded v.3.3 of this taxonomy (Open Tree of Life reference taxonomy version 3.3). Because of the way the data are structured, we had to select, for each of our groups of interest, the highest taxonomic rank that includes all relevant names. Unfortunately, these highest-ranked names are not always formally recognized in the field, which made interoperability more challenging. In some cases we had to search lower-ranked names to cover the entire group, which may have led to higher rates of mismatch with this database than with others.
 
 ## Global spatial data
 The Global Biodiversity Information Facility was selected to reflect spatial occurrence point data (GBIF, https://www.gbif.org/). For each taxonomic group, we queried a list of species names with occurrence data. This search excluded fossil species. [add date]