Commit

smroecker committed Mar 4, 2024
2 parents ef1225e + b00a312 commit a411358
Showing 54 changed files with 5,385 additions and 5,301 deletions.
9 changes: 6 additions & 3 deletions Part2/003-numerical-taxonomy.Rmd
Original file line number Diff line number Diff line change
@@ -196,7 +196,7 @@ Missing data are a fact of life. Soil scientists are quite familiar with missing
* Estimate the missing data values from known relationships to other properties or a group-wise mean or median, or
* Remove records containing any missing data.
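
The two strategies in the list above can be sketched in base R. This is a minimal illustration, not code from the book: the data frame `h`, its columns `genhz` and `clay`, and the values in it are all hypothetical.

```r
# Hypothetical horizon data with missing clay values (illustration only)
h <- data.frame(
  genhz = c("A", "A", "A", "Bt", "Bt", "Bt"),
  clay  = c(12, NA, 16, 28, 31, NA)
)

# Strategy 1: fill each gap with the median of its group (here, generalized horizon)
h$clay_filled <- ave(
  h$clay, h$genhz,
  FUN = function(v) ifelse(is.na(v), median(v, na.rm = TRUE), v)
)

# Strategy 2: remove every record that contains a missing value
h_complete <- h[complete.cases(h[, c("genhz", "clay")]), ]

h
h_complete
```

Group-wise filling keeps every record but shrinks the apparent variability within each group; dropping records keeps only observed values at the cost of sample size.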


[Missing Data vignette](https://ncss-tech.github.io/aqp/articles/missing-data.html)

### Visualizing Pair-Wise Distances: The Dendrogram

@@ -542,6 +542,8 @@ Additional examples / elaboration:
* [Competing Soil Series](http://ncss-tech.github.io/AQP/soilDB/competing-series.html)
* [AQP Paper, 2013](http://dx.doi.org/10.1016/j.cageo.2012.10.020)
* [Maynard et al., 2020](https://acsess.onlinelibrary.wiley.com/doi/full/10.1002/saj2.20119)
* [Beaudette et al., 2023](https://ncss-tech.github.io/AQP/presentations/2023-NCSS-NCSP-poster.pdf)
* [Hydrologic Ordering of Geomorphic Proportions](https://ncss-tech.github.io/AQP/sharpshootR/geomorphic-summaries-and-ordering.html)


### Final Discussion
@@ -554,8 +556,7 @@ Additional examples / elaboration:
* Application to soil survey and ESD


## Excercises
This is the fun part.
## Elaboration with Examples

### Set Up the R Session
Install R packages as needed. Open a new R script file to use as you follow along.
@@ -601,6 +602,8 @@ Tinker with some `SoilProfileCollection` objects.

The `aqp` package provides two functions for checking the fraction of missing data within a `SoilProfileCollection` object. The first function (`evalMissingData`) generates an index for each profile that ranges from 0 (all missing) to 1 (all present). This index can be used to subset or rank profiles for further investigation. The second function (`missingDataGrid`) creates a visualization of the *fraction* of data missing within each horizon. Both functions can optionally filter out horizons that typically lack data, for example Cr, Cd, and R horizons.

The new [Missing Data vignette](https://ncss-tech.github.io/aqp/articles/missing-data.html) covers recent changes to the `evalMissingData()` function in aqp.

The following examples are based on the `gopheridge` sample dataset.

**evalMissingData**
31 changes: 31 additions & 0 deletions Part2/local-functions.R
@@ -10,6 +10,37 @@

## TODO (DEB): consider placing this in sharpshootR

# from 2023 NCSS poster on NCSP
makeCP <- function(.dist, new.order = NULL, .cp = hcl.colors(n = 25, palette = 'Zissou 1'), mar = c(0.1, 0, 0.5, 0.8), order = 'original', ...) {

  # convert reduced distance matrix to matrix
  .m <- as.matrix(.dist)

  # optionally re-order
  if(!is.null(new.order)) {
    .m <- .m[new.order, new.order]
  }

  .res <- corrplot(
    .m,
    col = .cp,
    is.corr = FALSE,
    diag = FALSE,
    col.lim = c(0, 1),
    method = "color",
    type = "upper",
    tl.pos = "td",
    tl.cex = 0.8,
    tl.col = 'black',
    # cl.pos = "t",
    order = order,
    ...
  )

  invisible(.res)
}


# compare pair-wise distances between 3 individuals
distPlot <- function(ex, vars, individuals, id, scale.data=FALSE, show.distances=TRUE, ...) {
par(mar=c(5,5,1,1))
6 changes: 2 additions & 4 deletions Part2/packages.bib
@@ -11,8 +11,7 @@ @Manual{R-bookdown
title = {bookdown: Authoring Books and Technical Documents with R Markdown},
author = {Yihui Xie},
year = {2023},
note = {R package version 0.37,
https://pkgs.rstudio.com/bookdown/},
note = {R package version 0.37},
url = {https://github.com/rstudio/bookdown},
}

@@ -28,8 +27,7 @@ @Manual{R-rmarkdown
title = {rmarkdown: Dynamic Documents for R},
author = {JJ Allaire and Yihui Xie and Christophe Dervieux and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone},
year = {2023},
note = {R package version 2.25,
https://pkgs.rstudio.com/rmarkdown/},
note = {R package version 2.25},
url = {https://github.com/rstudio/rmarkdown},
}

Binary file modified Part2/s4ssbook.rds
Binary file not shown.
66 changes: 33 additions & 33 deletions book2/002-uncertainty.md
@@ -50,14 +50,14 @@ Below is a simulated example demonstrating the affect of sample size and standar
## # Groups: sd [2]
## sd n med_min med_mean med_max
## <chr> <fct> <dbl> <dbl> <dbl>
## 1 sd = 1 n = 10 6.12 6.92 7.74
## 2 sd = 1 n = 30 6.67 7.02 7.41
## 3 sd = 1 n = 60 6.67 6.96 7.37
## 4 sd = 1 n = 100 6.75 7.00 7.17
## 5 sd = 2 n = 10 4.66 6.95 8.07
## 6 sd = 2 n = 30 5.63 6.79 7.67
## 7 sd = 2 n = 60 6.44 7.03 7.67
## 8 sd = 2 n = 100 6.54 7.02 7.63
## 1 sd = 1 n = 10 5.56 7.00 8.04
## 2 sd = 1 n = 30 6.58 7.02 7.50
## 3 sd = 1 n = 60 6.60 6.97 7.40
## 4 sd = 1 n = 100 6.75 6.99 7.21
## 5 sd = 2 n = 10 5.90 7.13 8.60
## 6 sd = 2 n = 30 5.97 7.03 7.99
## 7 sd = 2 n = 60 5.96 6.98 7.61
## 8 sd = 2 n = 100 6.71 7.07 7.59
```

<img src="002-uncertainty_files/figure-html/unnamed-chunk-2-1.png" width="768" />
@@ -84,7 +84,7 @@ sqrt(SS / (length(test$pH) - 1))
```

```
## [1] 1.569779
## [1] 1.581714
```
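
The expression `sqrt(SS / (length(test$pH) - 1))` in the hunk header is the sample standard deviation computed from a sum of squares. The sketch below reproduces that calculation on a simulated vector and checks it against `sd()`. The seed, sample size, and the object `pH` are assumptions made for illustration; they are not the book's `test` data, so the value will differ from the output shown above.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
pH <- rnorm(30, mean = 7, sd = 2)    # hypothetical sample, not the book's data

# sum of squared deviations from the sample mean
SS <- sum((pH - mean(pH))^2)

# sample standard deviation: divide by n - 1, then take the square root
sd_manual <- sqrt(SS / (length(pH) - 1))

# the manual result should agree with the built-in function
all.equal(sd_manual, sd(pH))

# repeating the draw shows how much the estimate moves between small samples
sd_est <- replicate(100, sd(rnorm(10, mean = 7, sd = 2)))
range(sd_est)
```

Dividing by `n - 1` rather than `n` is what makes the variance estimate unbiased for a sample.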

Note below how our estimate of the variance can vary widely, particularly for simulated datasets with an inherent standard deviation of 2.
@@ -95,14 +95,14 @@ Note below how our estimate of the variance can vary widely, particularly for si
## # Groups: sd [2]
## sd n sd2_min sd2_mean sd2_max
## <chr> <fct> <dbl> <dbl> <dbl>
## 1 sd = 1 n = 10 0.604 1.00 1.59
## 2 sd = 1 n = 30 0.736 0.970 1.34
## 3 sd = 1 n = 60 0.763 1.00 1.17
## 4 sd = 1 n = 100 0.876 0.994 1.09
## 5 sd = 2 n = 10 1.19 1.94 2.98
## 6 sd = 2 n = 30 1.61 2.06 2.58
## 7 sd = 2 n = 60 1.52 1.93 2.21
## 8 sd = 2 n = 100 1.75 1.98 2.38
## 1 sd = 1 n = 10 0.526 1.04 1.64
## 2 sd = 1 n = 30 0.669 0.992 1.27
## 3 sd = 1 n = 60 0.856 1.01 1.17
## 4 sd = 1 n = 100 0.857 0.994 1.15
## 5 sd = 2 n = 10 0.938 1.85 3.31
## 6 sd = 2 n = 30 1.50 1.95 2.41
## 7 sd = 2 n = 60 1.41 1.98 2.33
## 8 sd = 2 n = 100 1.76 2.01 2.34
```

Now let's look at the standard error (standard deviation / square root of n) below. The results show how our estimates become more precise as the sample size increases.
@@ -113,14 +113,14 @@ Now let's see Standard Error (standard deviation / square root of n) below. The
## # Groups: sd [2]
## sd n SE_min SE_mean SE_max
## <chr> <fct> <dbl> <dbl> <dbl>
## 1 sd = 1 n = 10 0.191 0.317 0.502
## 2 sd = 1 n = 30 0.134 0.177 0.244
## 3 sd = 1 n = 60 0.0985 0.129 0.151
## 4 sd = 1 n = 100 0.0876 0.0994 0.109
## 5 sd = 2 n = 10 0.376 0.614 0.942
## 6 sd = 2 n = 30 0.294 0.376 0.471
## 7 sd = 2 n = 60 0.197 0.249 0.285
## 8 sd = 2 n = 100 0.175 0.198 0.238
## 1 sd = 1 n = 10 0.166 0.329 0.520
## 2 sd = 1 n = 30 0.122 0.181 0.231
## 3 sd = 1 n = 60 0.111 0.131 0.151
## 4 sd = 1 n = 100 0.0857 0.0994 0.115
## 5 sd = 2 n = 10 0.297 0.584 1.05
## 6 sd = 2 n = 30 0.273 0.357 0.439
## 7 sd = 2 n = 60 0.182 0.255 0.301
## 8 sd = 2 n = 100 0.176 0.201 0.234
```
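
As a rough sketch of the relationship summarized in the table, the standard error can be computed directly as `sd(x) / sqrt(n)`. The sample sizes mirror those in the table; the seed, the mean of 7, and the standard deviation of 1 are assumptions for illustration, so the simulated values will not match the table exactly.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
n_sizes <- c(10, 30, 60, 100)     # same sample sizes as the table above

# standard error of the mean for one simulated sample of each size
se <- sapply(n_sizes, function(n) {
  x <- rnorm(n, mean = 7, sd = 1) # hypothetical pH-like values
  sd(x) / sqrt(n)
})
names(se) <- paste("n =", n_sizes)
se

# expected standard error when the true standard deviation is 1
se_theory <- 1 / sqrt(n_sizes)
se_theory
```

Because the denominator is the square root of n, quadrupling the sample size only halves the standard error.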

## Theory of Uncertainty
@@ -186,7 +186,7 @@ quantile(boot_stats$vars)

```
## 0% 25% 50% 75% 100%
## 0.8815688 1.0457885 1.1422593 1.2261761 1.5571980
## 0.8651723 1.0760254 1.1567047 1.2473041 1.5686374
```
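
The object `boot_stats` is built outside this hunk, so its construction is not visible here. A minimal bootstrap sketch, assuming simple resampling with replacement from a single numeric vector, could look like the following. The sample `pH`, the seed, and the number of replicates are hypothetical choices, not values from the book.

```r
set.seed(123)                         # arbitrary seed, for reproducibility only
pH <- rnorm(50, mean = 6, sd = 1)     # hypothetical sample

n_boot <- 1000                        # number of bootstrap replicates (assumed)

# resample the data with replacement and recompute each statistic
boot_means <- replicate(n_boot, mean(sample(pH, replace = TRUE)))
boot_vars  <- replicate(n_boot, var(sample(pH, replace = TRUE)))

# spread of the bootstrapped variances
quantile(boot_vars)

# percentile-based 95% interval for the mean
ci <- quantile(boot_means, c(0.025, 0.975))
ci
```

The percentile interval is the simplest bootstrap interval: it reads the 2.5% and 97.5% quantiles straight from the resampled statistics.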

```r
@@ -210,7 +210,7 @@ quantile(boot_stats$means, c(0.025, 0.975))

```
## 2.5% 97.5%
## 5.792807 6.178015
## 5.817445 6.227386
```

```r
@@ -864,12 +864,12 @@ summary(lm_cv)

```
## RMSE R2
## Min. :0.4551 Min. :0.8453
## 1st Qu.:0.4630 1st Qu.:0.8485
## Median :0.4662 Median :0.8531
## Mean :0.4690 Mean :0.8527
## 3rd Qu.:0.4753 3rd Qu.:0.8556
## Max. :0.4847 Max. :0.8620
## Min. :0.4530 Min. :0.8407
## 1st Qu.:0.4643 1st Qu.:0.8487
## Median :0.4702 Median :0.8510
## Mean :0.4690 Mean :0.8526
## 3rd Qu.:0.4753 3rd Qu.:0.8552
## Max. :0.4795 Max. :0.8667
```

#### Subsample (Resampling or sample simulation)
Binary file modified book2/002-uncertainty_files/figure-html/unnamed-chunk-10-1.png
Binary file modified book2/002-uncertainty_files/figure-html/unnamed-chunk-11-1.png
Binary file modified book2/002-uncertainty_files/figure-html/unnamed-chunk-12-1.png
Binary file modified book2/002-uncertainty_files/figure-html/unnamed-chunk-16-1.png
Binary file modified book2/002-uncertainty_files/figure-html/unnamed-chunk-2-1.png
Binary file modified book2/002-uncertainty_files/figure-html/unnamed-chunk-6-1.png