Biodiversity Hackathon: OBIS Resources

Binder Open In Colab

See the Instructions for Binder/Colab section below to open the notebooks on Binder or Colab.

This repository contains materials and instructions for participants of the Biodiversity Hackathon. Here we outline the various tools, demos, and resources that can be used to access and use the biological and biogeographical data in OBIS.

Access OBIS data

OBIS data can be downloaded in several ways, including:

robis

The robis R package connects to the OBIS API from R. The package can be installed from CRAN or from GitHub (latest development version).

# install from CRAN
install.packages("robis")

# latest development version
remotes::install_github("iobis/robis")

You can use the package to obtain a list of datasets, a taxon checklist, or raw occurrence data by supplying e.g. a taxon name or WoRMS AphiaID. You can also specify whether to include absence records when obtaining occurrence data. To save the data locally, export the resulting R objects with the write.csv function. For example, to obtain Mollusca data from OBIS:

library(robis)

# obtain occurrence data
moll <- occurrence("Mollusca")
moll_abs <- occurrence("Mollusca", absence = "include") # include absence records
write.csv(moll, "mollusca-obis.csv") # save the data to csv

# obtain a list of datasets for a taxon
molldata <- dataset(scientificname = "Mollusca")

# obtain a checklist of Mollusca species in a certain area
mollcheck <- checklist(scientificname = "Mollusca", geometry = "POLYGON ((2.3 51.8, 2.3 51.6, 2.6 51.6, 2.6 51.8, 2.3 51.8))")
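The geometry argument above is a WKT polygon, whose ring is closed by repeating the first vertex. For the Python notebooks, a minimal sketch of building such a string from a bounding box could look like this. The function name bbox_to_wkt and its parameters are hypothetical, introduced only for illustration, and are not part of robis or any OBIS package.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon for a lon/lat bounding box.

    Hypothetical helper for illustration; not part of any OBIS package.
    """
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("minimum values must be smaller than maximum values")
    # WKT lists each vertex as "longitude latitude"; the ring must end
    # on the vertex it started from.
    corners = [
        (min_lon, max_lat),
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON (({ring}))"


# Same area as the checklist example above
wkt = bbox_to_wkt(2.3, 51.6, 2.6, 51.8)
print(wkt)
```

The resulting string can be pasted into any call that accepts a WKT geometry, such as the checklist example above.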

Filter datasets by keyword

You can use robis to obtain all datasets and then filter them by keywords in the title and/or abstract. The example below finds datasets related to seamounts. To search for several keywords at once, separate them with |, e.g. "seamount|deepsea|benthos".

search_terms <- "seamount" # define your search terms

datasets <- robis::dataset() # obtain datasets from OBIS

seamount_datasets <- datasets[
  grepl(paste(search_terms, collapse = "|"), datasets$title, ignore.case = TRUE) |
  grepl(paste(search_terms, collapse = "|"), datasets$abstract, ignore.case = TRUE),]
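For participants working in the Python notebooks, the same filtering logic can be sketched with the standard library alone. This is an illustration under stated assumptions: it assumes the dataset records are already loaded as a list of dictionaries with title and abstract keys. The function name filter_datasets and the sample records are made up for the example. Unlike the R version, each search term is matched literally rather than as a regular expression.

```python
import re


def filter_datasets(datasets, search_terms):
    """Keep datasets whose title or abstract mentions any search term.

    Hypothetical helper for illustration; not part of any OBIS package.
    """
    # Join the terms with "|" so one regex matches any of them;
    # re.escape keeps each term literal.
    pattern = re.compile(
        "|".join(re.escape(term) for term in search_terms), re.IGNORECASE
    )
    return [
        d for d in datasets
        # "or ''" guards against records with a missing title or abstract
        if pattern.search(d.get("title") or "")
        or pattern.search(d.get("abstract") or "")
    ]


# Made-up placeholder records, not real OBIS datasets
sample = [
    {"title": "Seamount fauna survey", "abstract": "Placeholder text."},
    {"title": "Coastal bird counts", "abstract": "Placeholder text."},
    {"title": "Trawl samples", "abstract": "Benthos from the continental shelf."},
    {"title": "Untitled", "abstract": None},
]

matches = filter_datasets(sample, ["seamount", "deepsea", "benthos"])
print([d["title"] for d in matches])
```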

Full data exports

A full data export of OBIS data is available for download as a Parquet file, here. Note the following:

  • These exports do not include measurement data, dropped records, or absence records
  • The exported file will be a single, flattened Occurrence table
  • The table includes all provided Event and Occurrence data, as well as 68 fields added by the OBIS Quality Control Pipeline, including taxonomic information obtained from WoRMS

OBIS homepage

From the OBIS homepage, you can search for data in the search bar in the middle of the page. You can search by particular taxonomic groups, common names, dataset names, OBIS nodes, institute name, areas (e.g., Exclusive Economic Zone (EEZ)), or by the data provider’s country. See here for more details.

OBIS Mapper

The OBIS Mapper lets you visualize and filter OBIS data by taxonomy, location, time, and data quality, with options to combine layers and download them as CSV. For more details, see the OBIS manual.

speciesgrids

speciesgrids is a Python package for building WoRMS-aligned species distribution datasets that combine OBIS and GBIF data. The resulting dataset is available in a few resolutions on AWS S3. It can be downloaded locally for best performance, or queried directly from the S3 bucket. For more details about downloading and using the dataset, see the speciesgrids README or the notebook.

Notebook demos

We have prepared several JupyterHub Notebooks that can be used for reference; see https://github.com/iobis/hackathon/tree/master/notebooks. The notebooks cover several topics, including OBIS data access, data cleaning, environmental information extraction, and data visualization.

You can also access the notebooks through the Binder link.

Instructions for Binder/Colab

Binder already has the requirements installed and comes with RStudio, but it is slower. On Colab you need to install the required packages yourself, but it is faster and has other nice features. The easiest way to install the requirements on Colab is to add a code cell to the notebook and run the following:

For Python

!pip install -r https://raw.githubusercontent.com/iobis/hackathon/refs/heads/master/requirements.txt

For R

source('https://raw.githubusercontent.com/iobis/hackathon/refs/heads/master/requirements-r-colab.txt')

Other Resources

Here is a list of other OBIS-relevant resources: