A couple of numbers are wrong in https://www.tmwr.org/performance.html#performance-metrics-and-inference #327
Comments
I currently get 73.1% accuracy, using the code we have in Ch 9:

library(tidymodels)
data(ad_data)
set.seed(245)
ad_folds <- vfold_cv(ad_data, repeats = 5)
logistic_reg() %>%
fit_resamples(Class ~ (Genotype + male + age)^2, ad_folds) %>%
collect_metrics()
#> ! Fold01, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold01, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold02, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold02, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold03, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold03, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold04, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold05, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold05, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold06, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold06, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold07, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold07, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold08, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold08, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold09, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold09, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold10, Repeat1: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold10, Repeat1: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold01, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold01, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold02, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold02, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold03, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold03, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold04, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold04, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold05, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold05, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold06, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold07, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold07, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold08, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold08, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold09, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold09, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold10, Repeat2: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold10, Repeat2: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold01, Repeat3: preprocessor 1/1, model 1/1: glm.fit: algorithm did not converge, glm.fit: fitted probabilities numer...
#> ! Fold01, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold02, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold02, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold03, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold03, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold04, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold04, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold05, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold05, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold06, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold06, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold07, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold07, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold08, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold08, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold09, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold10, Repeat3: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold10, Repeat3: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold01, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold01, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold02, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold02, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold03, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold03, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold04, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold04, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold05, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold05, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold06, Repeat4: preprocessor 1/1, model 1/1: glm.fit: algorithm did not converge, glm.fit: fitted probabilities numer...
#> ! Fold06, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold07, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold08, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold08, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold09, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold09, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold10, Repeat4: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold10, Repeat4: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold01, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold01, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold02, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold02, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold03, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold03, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold04, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold04, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold05, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold05, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold06, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold06, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold07, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold07, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold08, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold08, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold09, Repeat5: preprocessor 1/1, model 1/1: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> ! Fold09, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> ! Fold10, Repeat5: preprocessor 1/1, model 1/1 (predictions): prediction from a rank-deficient fit may be misleading
#> # A tibble: 2 × 6
#>   .metric  .estimator  mean     n std_err .config
#>   <chr>    <chr>      <dbl> <int>   <dbl> <chr>
#> 1 accuracy binary     0.731    50 0.00978 Preprocessor1_Model1
#> 2 roc_auc  binary     0.691    50 0.0159  Preprocessor1_Model1

Created on 2022-09-01 with reprex v2.0.2

I wonder if we are missing a seed somewhere? Or is this a cache issue?
Given that the wrong numbers are both 72.7% (which is the correct baseline number), I am guessing that the inline code chunk used to render the online version of the book was just looking up the wrong value. In theory, somebody could check the blame/history to see if that was the root of the problem. I think that if the book is re-rendered using the current code, it should fix itself. 🤞
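For what it's worth, here is a minimal sketch of how an inline value could be looked up by metric name so that it cannot silently pick up the baseline instead. This is not the book's actual chunk; the `metrics` data frame is a hand-built stand-in for the `collect_metrics()` output above, and `resampled_acc` and `acc_label` are hypothetical names.

```r
# Hypothetical stand-in for the collect_metrics() output shown in the reprex
metrics <- data.frame(
  .metric = c("accuracy", "roc_auc"),
  .estimator = c("binary", "binary"),
  mean = c(0.731, 0.691),
  stringsAsFactors = FALSE
)

# Select the value by metric name rather than by row position or a reused variable
resampled_acc <- metrics$mean[metrics$.metric == "accuracy"]

# Fail loudly if the lookup does not return exactly one value
stopifnot(length(resampled_acc) == 1)

# Format for inline use in an R Markdown paragraph (assumed name: acc_label)
acc_label <- sprintf("%.1f%%", 100 * resampled_acc)
acc_label
```

With a lookup like this, the rendered text would follow whatever the resampling code produces at render time, which is why a fresh render seems likely to fix the page.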
The numbers in the paragraph before the note in Section 9.1 of the online rendered version of TMWR (https://www.tmwr.org/performance.html#performance-metrics-and-inference) do not match the printed book (the second-to-last paragraph on page 113) or the output produced by the code in the repo (line 110 of https://github.com/tidymodels/TMwR/blob/main/09-judging-model-effectiveness.Rmd).
The physical book says, and the repo produces (when I spot-checked it):
The book website currently shows:
I didn't pull the repo to test, but I think you just need to re-render the current files to fix the online version of the book.