This repository contains machine-learning-based pedotransfer function (PTF) models that predict saturated hydraulic conductivity (Ks), along with the training data and all R scripts used to build the models. A detailed description of this work is available in our paper: Araya and Ghezzehei (2019).
A summary of the data preparation procedures applied to the USKSAT data before machine learning is found here. A summary report of our analysis of the partial effect of bulk density and organic carbon concentration on Ks is found here.
I have developed a GUI app based on Shiny. Running the app locally is easy with the RStudio editor:
- Download the ptfapp folder and all its contents. Download the models from this Google Drive link.
- Open the RStudio project file ptfapp.Rproj on your machine.
- Open the ui.R script in your RStudio editor; RStudio will recognize the Shiny script and provide a Run App button (at the top of the editor).
- Click the Run App button.

Before running the app for the first time, you may need to install the required R packages by running the following code in the R console.
# Shiny packages
install.packages("shiny")
install.packages("shinyjs")
install.packages("htmltools")
# Machine learning related packages
install.packages("caret")
install.packages("gbm")
# Table manipulation packages
install.packages("DT")
install.packages("dplyr")
install.packages("readr")
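
If some of these packages are already installed, reinstalling all of them is unnecessary. The following is a minimal sketch, not part of the repository, that lists only the packages still missing from your library. The helper name missing_packages is hypothetical, and the install and launch calls are left commented out so that nothing runs unintentionally. The launch line assumes the working directory contains the ptfapp folder.

```r
# Hypothetical helper (not part of this repository): return the packages
# from `pkgs` that are not yet installed in the current R library.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages listed above for the Shiny app
required <- c("shiny", "shinyjs", "htmltools",
              "caret", "gbm",
              "DT", "dplyr", "readr")

to_install <- missing_packages(required)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
} else {
  message("All required packages are already installed.")
}

# Once everything is installed, the app can also be started from the console
# instead of the Run App button (assumes ptfapp/ is in the working directory):
# shiny::runApp("ptfapp")
```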
- R Scripts
  - USKsat_Data_Prep.R (Formatted file here): Tidies up and pre-processes the USKSAT database to prepare it for model building.
  - Structure_Partial_Effect.R (Formatted file here): Script for analyzing the partial effect of soil structural variables.
  - Predict_Ksat.R: Script to predict Ksat using a selected model. See the 'How To Use Models' section below for details.
  - Other_PTF_Test.R: Script to run predictions using the nine alternative PTF models we compared.
  - Tune_Models_HPC: Sample scripts used to tune models on a multi-core cluster.
- Data
  - Aggregated and partially tidied USKSAT database (USKSAT_OpenRefined.csv).
  - Cleaned USKSAT database (USKsat_tidydata5.rds, USKsat_tidydata5.csv).
  - Metadata for USKsat_Tidy (USKsat_Tidydata_METADATA.xlsx).
  - Training and testing subsets of the cleaned USKSAT database (USKsatTrain and USKsatTest files; the files with the _dim suffix also include the sample dimension variable).
  - Pre-processing data used to center and scale the data, created by the preProcess function from the caret package (USKsat_preProc.rds).
- Models
  - All hierarchy models and the pre-processing data. Download the models from UC Merced Dash or from this Google Drive folder. The models were too large for GitHub.
- Functions
  - Functions_TextureRelated.R: Set of functions to assign textural class and to calculate percentile sizes and complexed organic carbon.
  - Function_OtherPTF.R: Set of functions to calculate Ksat using nine other PTF models.
  - ModelPerformanceFunction.R: Function to calculate model performance.
  - Function_Predict.R: Function to predict Ksat using our PTF models.
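
The pre-processing data (USKsat_preProc.rds) stores centering and scaling information created from the data, so new soil samples are transformed with stored statistics rather than their own. The sketch below illustrates that idea in base R on made-up numbers. It uses neither the repository's files nor caret itself, and the names train, new_soil, and the column names are hypothetical.

```r
# Made-up training data standing in for soil predictors (not from USKSAT)
train <- data.frame(clay = c(10, 20, 30, 40),
                    bulk_density = c(1.2, 1.4, 1.5, 1.7))

# Centering and scaling parameters are learned from the training data only
centers <- colMeans(train)
scales  <- apply(train, 2, sd)

# New samples are transformed with the *training* means and standard deviations
new_soil <- data.frame(clay = c(25, 35),
                       bulk_density = c(1.45, 1.60))
new_scaled <- scale(new_soil, center = centers, scale = scales)

print(new_scaled)
```

With caret, the corresponding step is usually a call to predict() on the stored preProcess object with the new data as the second argument.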
How To Use Models

You can run the models to predict the saturated hydraulic conductivity of soils using the Predict_Ksat.R script (see a sample run of Predict_Ksat.R here). To run the models on your machine:
- Download at least these five items (save them in the same directory, and check the scripts to fix file locations for your machine):
  - a model of your choice and the USKsat_preProc.rds file from UC Merced Dash or from this Google Drive folder,
  - the Soil_Variable_Template.csv file,
  - the Predict_Ksat.R script, and
  - the Function_Predict.R file from the Functions folder.
- Fill in and save the Soil_Variables_Template.csv table with your soil variables.
- Modify lines 22 to 27 in the Predict_Ksat.R script as needed.
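
Before running the prediction script, it can help to confirm that the downloaded files are all in one directory and that the template parses as a table. The following is a rough sketch using only base R. The helper missing_files, the demo directory, and the stand-in template columns are all hypothetical, and the model file is left out of the list because its name depends on the model you choose.

```r
# Hypothetical helper (not part of this repository): report which of the
# required files are missing from a directory.
missing_files <- function(dir, files) {
  files[!file.exists(file.path(dir, files))]
}

# Demonstration in a temporary directory with a stand-in template file.
# The column names below are made up; use the real template's columns.
demo_dir <- file.path(tempdir(), "ksat_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sample_id = 1:2, clay = c(12, 30)),
          file.path(demo_dir, "Soil_Variable_Template.csv"),
          row.names = FALSE)

needed <- c("USKsat_preProc.rds",
            "Soil_Variable_Template.csv",
            "Predict_Ksat.R",
            "Function_Predict.R")

# Files that still need to be downloaded into demo_dir
not_found <- missing_files(demo_dir, needed)
print(not_found)

# Read the template to confirm it parses as a table
soil <- read.csv(file.path(demo_dir, "Soil_Variable_Template.csv"))
str(soil)
```

In real use, replace demo_dir with the directory where you saved the downloads and skip the write.csv() step.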
For the prediction to run on your machine, you must have the caret package and either the gbm or the randomForest package installed, depending on whether you are using the BRT or the RF models. You should be able to install the packages prior to running Predict_Ksat.R as follows.
install.packages('caret', repos = 'https://cran.r-project.org')
install.packages('gbm', repos = 'https://cran.r-project.org')
install.packages('randomForest', repos = 'https://cran.r-project.org')
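
Since only one of gbm or randomForest is needed for a given model, a small check can report what is required for your choice. This is a minimal sketch, assuming the model type is given as "BRT" or "RF"; the helper required_packages is hypothetical and not part of the repository.

```r
# Hypothetical helper (not part of this repository): packages needed for a
# given model type, "BRT" (boosted regression trees) or "RF" (random forest).
required_packages <- function(model_type = c("BRT", "RF")) {
  model_type <- match.arg(model_type)
  c("caret", if (model_type == "BRT") "gbm" else "randomForest")
}

pkgs <- required_packages("RF")

# requireNamespace() checks availability without attaching the package
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

if (!all(available)) {
  message("Install before running Predict_Ksat.R: ",
          paste(pkgs[!available], collapse = ", "))
}
```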
This work is licensed under a Creative Commons Attribution 4.0 International License; see the LICENSE.md file for details.