Cleaning and compiling various work on trait correlations across species of UK flora.

seancanobrien/UK_Flora

UK_Flora

The pipeline script is in src/. The script main.R scrapes all necessary data from the ecoflora, saves that data, determines the fertilization mode for that data (written to a new column, myFert3), and finally saves the data frame, including myFert3, to a separate file.
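
The four stages described above (scrape, save, classify, save again) can be sketched as follows. This is a minimal illustration in Python, not the repository's R code: the record fields, the placeholder classification rule, and the output file names are all hypothetical, and the scraping step is replaced by a stub.

```python
import csv
import tempfile
from pathlib import Path


def scrape_records():
    """Hypothetical stand-in for the scraping stage (field names are invented)."""
    return [
        {"species": "Species A", "raw_fertilization": "normally self"},
        {"species": "Species B", "raw_fertilization": ""},
    ]


def determine_my_fert3(record):
    """Placeholder rule for illustration only, NOT the repository's logic."""
    return "selfing" if "self" in record["raw_fertilization"] else None


def write_csv(path, rows):
    """Write a list of dicts to CSV; None values are written as empty cells."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def run_pipeline(out_dir):
    out_dir = Path(out_dir)
    raw = scrape_records()                        # 1. scrape
    write_csv(out_dir / "scrape_raw.csv", raw)    # 2. save the raw scrape
    processed = [dict(r, myFert3=determine_my_fert3(r)) for r in raw]  # 3. classify
    write_csv(out_dir / "scrape_with_myFert3.csv", processed)          # 4. save again
    return processed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in run_pipeline(tmp):
            print(row)
```

Saving the raw scrape before classification means the classification step can be rerun later without scraping again.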

Datasets are in data/. These include older data:

  • Old raw data scraped from the ecoflora in June 2020.
  • The same June 2020 scrape data, processed with an old Python script to determine myFert3.

I am trying to reproduce the functionality of old Python scripts, written in the summer of 2020, that scraped the data and determined characteristics of that data. I cannot test the scraping itself, since the website from which the data was scraped has changed in the intervening time. However, I can test that the behaviour of the Python scripts that determine myFert3 is reproduced. In sketch_test/check_old_vs_new_data_and_methods.R I test two things:

  • The new R script for determining myFert3 acts identically to the old Python script on the old scrape data.
  • The new R script also acts identically to the old Python script on the new scrape data.
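
A check of this kind can be sketched as below. This is an illustrative Python sketch, not the contents of sketch_test/check_old_vs_new_data_and_methods.R: the function names are hypothetical, and it assumes that missing values may appear as None, an empty string, or the string "NA", which should all count as the same missing label when comparing two implementations.

```python
def is_missing(value):
    """Assumption: None, "" and "NA" all denote a missing myFert3 value."""
    return value is None or value == "" or value == "NA"


def labels_match(a, b):
    """Two labels agree if both are missing or if they are equal."""
    if is_missing(a) and is_missing(b):
        return True
    return a == b


def find_mismatches(records, old_classifier, new_classifier):
    """Return (row index, old label, new label) for every disagreement."""
    mismatches = []
    for index, record in enumerate(records):
        old_label = old_classifier(record)
        new_label = new_classifier(record)
        if not labels_match(old_label, new_label):
            mismatches.append((index, old_label, new_label))
    return mismatches
```

Running find_mismatches once on the old scrape and once on the new scrape, and requiring an empty list both times, corresponds to the two tests listed above. Returning the disagreeing rows rather than a single True or False makes any divergence easier to trace.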

These tests passed, but running them revealed a couple of bugs and obvious areas for improvement in the old Python code. At the moment, these have been reproduced in the R scripts.

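The distribution of myFert3 has changed substantially between the two scrapes. Every classified group shrank, while the number of NA entries grew:

| Fert Mode   | n (June 2020) | n (June 2023) |
|-------------|---------------|---------------|
| selfing     | 60            | 57            |
| mixed       | 787           | 643           |
| outcrossing | 235           | 185           |
| NA          | 2192          | 2390          |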
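
To put the counts above in perspective, the short Python sketch below computes the totals, the per-group change, and the share of NA entries in each scrape. It uses only the numbers in the table; the variable names are illustrative.

```python
# Counts copied from the table: (June 2020, June 2023).
counts = {
    "selfing": (60, 57),
    "mixed": (787, 643),
    "outcrossing": (235, 185),
    "NA": (2192, 2390),
}

total_2020 = sum(old for old, _ in counts.values())
total_2023 = sum(new for _, new in counts.values())

# Entries that received one of the three fertilization modes.
classified_2020 = total_2020 - counts["NA"][0]
classified_2023 = total_2023 - counts["NA"][1]

# Change in each group between the two scrapes.
changes = {mode: new - old for mode, (old, new) in counts.items()}

# Fraction of entries with no myFert3 value.
na_share_2020 = counts["NA"][0] / total_2020
na_share_2023 = counts["NA"][1] / total_2023

if __name__ == "__main__":
    print(total_2020, total_2023)
    print(classified_2020, classified_2023)
    print(changes)
    print(f"{na_share_2020:.1%} {na_share_2023:.1%}")
```

The totals are almost unchanged (3274 entries in 2020, 3275 in 2023), but the number of classified entries falls from 1082 to 885 and the NA share rises from roughly 67% to roughly 73%.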

The core logic for determining myFert3 is in the file src/determine_fertilization_characteristics.R. The main function in this file is assign_3_group_fertilization(). Each characteristic of a species is determined by its own small function, and assign_3_group_fertilization() combines these characteristics to determine myFert3.
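
The pattern of small single-purpose functions feeding one combining function can be sketched as below. This is a Python illustration of the structure only: the record fields (self_fertilization, cross_fertilization), the two predicate functions, and the combination rules are invented for this example and are not the rules used in src/determine_fertilization_characteristics.R.

```python
from typing import Optional


def can_self(record: dict) -> Optional[bool]:
    """Hypothetical characteristic: does the record report self-fertilization?"""
    value = record.get("self_fertilization")  # invented field name
    return None if value is None else value == "yes"


def can_outcross(record: dict) -> Optional[bool]:
    """Hypothetical characteristic: does the record report cross-fertilization?"""
    value = record.get("cross_fertilization")  # invented field name
    return None if value is None else value == "yes"


def assign_3_group_sketch(record: dict) -> Optional[str]:
    """Combine the characteristics into one of three groups, or None (NA).

    The rules here are illustrative assumptions, not the repository's logic.
    """
    selfs = can_self(record)
    crosses = can_outcross(record)
    if selfs is None or crosses is None:
        return None  # not enough information to classify
    if selfs and crosses:
        return "mixed"
    if selfs:
        return "selfing"
    if crosses:
        return "outcrossing"
    return None
```

Keeping each characteristic in its own function means it can be tested in isolation, and a change to one rule does not require touching the others.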
