-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
segment needs tests #180
Comments
OK, I took a look at number 3 in your list and found 3 more things that need addressing.
Here is a reprex, which is also stored in Once the correct behavior is ironed out, they can easily become new tests. library(aqp, warn.conflicts = FALSE)
#> This is aqp 1.27
library(testthat)
#> Warning: package 'testthat' was built under R version 4.0.3
data(jacobs2000)
# use glom to create some fake missing data (REMOVE 50-100cm zone)
input.spc <- glom(jacobs2000[4,], 50, 100,
truncate = TRUE, invert = TRUE)
##################################
### .:. # 5 cannot segment a single profile if the interval contains missing data (gaps)
##################################
# Error: the 50:100 zone is missing; and only one profile
(output.spc <- segment(input.spc, 50:100))
#> Error in data.frame(..., check.names = FALSE): arguments imply differing number of rows: 0, 1
# Error: even with data above and below 50/150 -- single profile
(output.spc <- segment(input.spc, 25:150))
#> Error in data.frame(..., check.names = FALSE): arguments imply differing number of rows: 0, 1
# Error: just 1 cm of missing data in single profile case breaks it
(output.spc <- segment(input.spc, 0:51))
#> Error in data.frame(..., check.names = FALSE): arguments imply differing number of rows: 0, 1
# Works
segment(input.spc, 0:50)
#> SoilProfileCollection with 1 profiles and 50 horizons
#> profile ID: id | horizon ID: hzID
#> Depth range: 50 - 50 cm
#>
#> ----- Horizons (6 / 50 rows | 10 / 19 columns) -----
#> id hzID top bottom name sand clay matrix_color time_saturated
#> 92-4 1 0 1 Ap 88 2 #83796FFF 0
#> 92-4 2 1 2 Ap 88 2 #83796FFF 0
#> 92-4 3 2 3 Ap 88 2 #83796FFF 0
#> 92-4 4 3 4 Ap 88 2 #83796FFF 0
#> 92-4 5 4 5 Ap 88 2 #83796FFF 0
#> 92-4 6 5 6 Ap 88 2 #83796FFF 0
#> matrix_color_munsell
#> 10YR 5/1
#> 10YR 5/1
#> 10YR 5/1
#> 10YR 5/1
#> 10YR 5/1
#> 10YR 5/1
#> [... more horizons ...]
#>
#> ----- Sites (1 / 1 rows | 1 / 1 columns) -----
#> id
#> 92-4
#>
#> Spatial Data:
#> [,1]
#> [1,] NA
#> [1] NA
# Add another profile
input.spc <- combine(input.spc, jacobs2000[7,])
# Works
output.spc <- segment(input.spc, 50:100)
##################################
### .:. #6 output ID order does not match input order or input profiles
##################################
# _NEEDS_ FIX: not only does it have same IDs... it appears to change the order of inputs
(all(profile_id(input.spc) == profile_id(output.spc))) # FALSE
#> [1] FALSE
##################################
### .:. #7 does not provide new IDs for re-sampled profiles
##################################
# Error: Cannot combine because the transformed SPC has the same ID
# (this is a feature not a bug, resampled horizons != original data)
(show.spc <- combine(input.spc, output.spc))
#> Error in pbindlist(objects): non-unique profile IDs detected
# Inspect the default segment ID output -- this is wrong
par(mfrow=c(1,2),mar=c(0,0,2,1))
plot(input.spc, max.depth=200, color="matrix_color", plot.depth.axis=FALSE)
plot(output.spc, max.depth=200, color="matrix_color", name=NA) # # segment should assign new profile IDs automatically -- like slice, permute_profile, etc
# # the correct IDs are something like thisthis:
output.spc2 <- output.spc
fix.idx <- match(profile_id(output.spc), profile_id(input.spc))
profile_id(output.spc2) <- paste0(profile_id(input.spc)[fix.idx], "segmented")
par(mfrow=c(1,2),mar=c(0,0,2,1))
plot(input.spc, max.depth=200, color="matrix_color", plot.depth.axis=FALSE)
plot(output.spc2, max.depth=200, color="matrix_color", name=NA) Note: that I am deliberately accounting for reversed order w/ match /
Suggestions: Under normal circumstances, it is impossible to make an SPC like the segment output from the base data
Temporary solutions: output.spc3 <- rebuildSPC(output.spc)
#> Warning: Horizon top depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> Warning: Horizon bottom depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> using `hzID` as a unique horizon ID
par(mfrow=c(1,2),mar=c(0,0,2,1))
plot(input.spc, max.depth=200, color="matrix_color")
plot(output.spc3, max.depth=200, color="matrix_color", divide.hz=FALSE, name=NA) Long term solution:
Opinion: |
I wanted to use segment for an example I was working on -- and I came up with this example which is a more concise demo of the phenomenon I showed above, and shows why library(aqp, warn.conflicts=FALSE)
#> This is aqp 1.27
data(sp4)
depths(sp4) <- id ~ top + bottom
# does not work with unit-length segment; needs informative error
segment(sp4, 20)
#> Error in `$<-.data.frame`(`*tmp*`, "id", value = "-"): replacement has 1 row, data has 0
# this appears to work correctly, and rebuildSPC warns about the fact that NAPA and others have no data / NA depths
segment(sp4, 20:21)
#> Warning: Horizon top depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> Warning: Horizon bottom depths contain NA! Check depth logic with
#> aqp::checkHzDepthLogic()
#> SoilProfileCollection with 10 profiles and 10 horizons
#> profile ID: id | horizon ID: hzID
#> Depth range: 21 - 21 cm
#>
#> ----- Horizons (6 / 10 rows | 10 / 15 columns) -----
#> id hzID top bottom name K Mg Ca CEC_7 ex_Ca_to_Mg
#> colusa 1 20 21 Bt1 0.1 23.2 1.9 23.7 0.08
#> glenn 2 20 21 Bt 0.3 18.9 4.5 27.5 0.20
#> kings 3 20 21 Bt2 0.8 17.7 4.4 20.0 0.25
#> mariposa 4 20 21 Bt2 0.3 44.3 6.2 34.1 0.14
#> mendocino 5 20 21 Bt2 0.2 30.5 3.7 22.9 0.12
#> napa 8 NA NA <NA> NA NA NA NA NA
#> [... more horizons ...]
#>
#> ----- Sites (6 / 10 rows | 1 / 1 columns) -----
#> id
#> colusa
#> glenn
#> kings
#> mariposa
#> mendocino
#> napa
#> [... more sites ...]
#>
#> Spatial Data: [EMPTY]
par(mar=c(0,0,2,1))
plot(sp4, color="clay")
abline(h = 20, lty=2, lwd=2) checkHzDepthLogic(sp4)
#> id valid depthLogic sameDepth missingDepth overlapOrGap
#> 1 colusa TRUE FALSE FALSE FALSE FALSE
#> 2 glenn TRUE FALSE FALSE FALSE FALSE
#> 3 kings TRUE FALSE FALSE FALSE FALSE
#> 4 mariposa TRUE FALSE FALSE FALSE FALSE
#> 5 mendocino TRUE FALSE FALSE FALSE FALSE
#> 6 napa TRUE FALSE FALSE FALSE FALSE
#> 7 san benito TRUE FALSE FALSE FALSE FALSE
#> 8 shasta TRUE FALSE FALSE FALSE FALSE
#> 9 shasta-trinity TRUE FALSE FALSE FALSE FALSE
#> 10 tehama TRUE FALSE FALSE FALSE FALSE napa, san benito and tehama do not have data in 20 - 21cm segment
|
Updated example, was curious if the various horizon subsetting approaches give the same results. This is not an extensive test and includes none of the common "problem" soil profiles. The issue name wasn't well specified--there are a lot of questions, updates, fixes, and unresolved issues such as "what should be returned when asking for a depth range without data?". I'm not sure if Anyway, I'm going to start a vignette that explores the various ways one might subset and aggregate soil profiles. Perhaps by outlining several use cases and examples, we can better help ourselves and others understand the various approaches. library(aqp)
# example SPC from data.frame
data(sp4)
depths(sp4) <- id ~ top + bottom
hzdesgnname(sp4) <- 'name'
# test effect of potential sorting alpha vs. numeric
# all seem to work
# profile_id(sp4) <- as.character(1:length(sp4))
profile_id(sp4) <- sprintf("%0.2d", 1:length(sp4))
# profile_id(sp4) <- letters[1:length(sp4)]
testIt <- function(spc, top, bottom) {
# keep old ID
spc$.oldID <- profile_id(spc)
# various approaches
# dice() fills missing depth intervals with NA
# default when given a formula
.fm <- as.formula(sprintf("%s:%s ~ .", top, bottom - 1))
d <- dice(spc, fm = .fm)
# force filling missing depth intervals with NA
g <- glom(spc, z1 = top, z2 = bottom, fill = TRUE)
gt <- glom(spc, z1 = top, z2 = bottom, truncate = TRUE, fill = TRUE)
# single NA horizon, with NA depths
st <- hz_segment(spc, intervals = c(top, bottom))
s <- hz_segment(spc, intervals = c(top, bottom), trim = FALSE)
# normalize profile IDs
# so that all can be combined / viewed together in a single SPC
profile_id(d) <- sprintf("%s\nD", profile_id(d))
profile_id(g) <- sprintf("%s\nG", profile_id(g))
profile_id(gt) <- sprintf("%s\nGT", profile_id(gt))
profile_id(s) <- sprintf("%s\nS", profile_id(s))
profile_id(st) <- sprintf("%s\nST", profile_id(st))
profile_id(spc) <- sprintf("%s\n", profile_id(spc))
x <- combine(spc, d, g, gt, s, st)
par(mar = c(0, 0, 3, 0))
plotSPC(x, color = 'CEC_7', name.style = 'center-center', width = 0.4, id.style = 'top', col.palette = hcl.colors(25, palette = 'viridis'), depth.axis = list(line = -5))
segments(x0 = 0, x1 = length(x) + 1, y0 = c(top, bottom), y1 = c(top, bottom), lwd = 2, lty = 3, col = 'red')
invisible(x)
}
testIt(sp4, top = 0, bottom = 25)
# check for all-NA horizons, and equal number of sites / original ID
table(a$.oldID)
a <- testIt(sp4, top = 15, bottom = 35)
# check for all-NA horizons, and equal number of sites / original ID
table(a$.oldID)
a <- testIt(sp4, top = 20, bottom = 21)
# check for all-NA horizons, and equal number of sites / original ID
table(a$.oldID)
a <- testIt(sp4, top = 50, bottom = 60)
# check for all-NA horizons, and equal number of sites / original ID
table(a$.oldID) |
- failed to insert filled horizons when specified interval intersected no horizons
You are right, that is a bug in the case where no profiles have horizons in the specified interval, should be fixed in f25246d and added test in 3684b12
Sounds good! |
Cool thanks. Do we have a sample dataset with each of the typical hz logic errors? That might be really handy for tests. |
This functions should have tests for expected output given:
data.frame
inputSoilProfileCollection
inputslab()
The text was updated successfully, but these errors were encountered: