use new k threshold #235

Merged on Feb 15, 2024 (48 commits)
Commits
09acf21
use new k threshold
avehtari Dec 28, 2023
51249bf
fix some of the tests (still some failures)
jgabry Jan 22, 2024
9778955
more updates
avehtari Jan 24, 2024
a54d577
Merge branch 'new-pareto-k-threshold' of github.com:stan-dev/loo into…
avehtari Jan 24, 2024
26e8a7b
fix some tests
avehtari Jan 24, 2024
5abf6e6
Apply suggestions from Noa's scode review
avehtari Jan 24, 2024
61c0725
explain 2200
avehtari Jan 24, 2024
6565f45
fix threshold in print
avehtari Jan 26, 2024
12373da
simplify use of ps_khat_threshold
avehtari Jan 26, 2024
7ffd67c
fix threshold argument handling
avehtari Jan 26, 2024
eaef505
fix tests
avehtari Jan 26, 2024
955b4d5
Apply suggestions from code review by jgabry
avehtari Jan 26, 2024
b36bc13
reduce r_eff warnings
avehtari Jan 28, 2024
a378f7f
doc and refs updates
avehtari Jan 28, 2024
57e9061
fix tests
avehtari Jan 28, 2024
77793cd
Merge branch 'new-pareto-k-threshold' of github.com:stan-dev/loo into…
avehtari Jan 28, 2024
0e03606
fixes suggested by Jonah
avehtari Jan 28, 2024
a329a13
regenerate doc
jgabry Jan 29, 2024
3366ad5
Apply suggestions from code review by n-kall
avehtari Jan 31, 2024
4013c15
another doc threshold update
avehtari Jan 31, 2024
2fb87a3
Merge branch 'new-pareto-k-threshold' of github.com:stan-dev/loo into…
avehtari Jan 31, 2024
e6fdfda
replace r_eff warning with informative message in print
avehtari Jan 31, 2024
eb68807
regenerate doc
jgabry Jan 31, 2024
8c0d449
a few minor doc edits
jgabry Jan 31, 2024
2af2fbf
update Rd files
jgabry Jan 31, 2024
35ca220
Merge branch 'master' into new-pareto-k-threshold
jgabry Jan 31, 2024
cf50f9f
save RDS files in a way compatible with older versions of R
jgabry Jan 31, 2024
be9dee1
update glossary and FAQ
avehtari Feb 1, 2024
448eb3e
fix r_eff default and doc in tis
avehtari Feb 1, 2024
52ce79e
print r_eff summary as part of mcse summary
avehtari Feb 1, 2024
bb7d3e2
Monte Carlo SE -> MCSE
avehtari Feb 1, 2024
4b6fdea
regenerate doc
jgabry Feb 1, 2024
900adb1
Merge branch 'new-pareto-k-threshold' of github.com:stan-dev/loo into…
avehtari Feb 1, 2024
be22253
Merge branch 'master' into new-pareto-k-threshold
jgabry Feb 2, 2024
90ab04a
fix print_reff_summary to work well with old objects
avehtari Feb 2, 2024
cfe1f74
4/9 vignettes fixed
avehtari Feb 2, 2024
122aa10
fixedd loo2-mixis vignette
avehtari Feb 2, 2024
df325d6
fixed loo2-non-factorized vignette
avehtari Feb 2, 2024
43632a8
fixed loo2-weights vignette
avehtari Feb 2, 2024
8cd84b0
fixed loo2-with-rstan vignette
avehtari Feb 2, 2024
133ea10
fixed loo2-lfo vignette
avehtari Feb 2, 2024
1a818f3
Remove forgotten testing line in vignettes/loo2-lfo.Rmd
avehtari Feb 5, 2024
2afe583
Merge branch 'master' into new-pareto-k-threshold
jgabry Feb 7, 2024
8d73b37
typo fix
avehtari Feb 7, 2024
5f78648
Add Aki's news items to NEWS.md with a few minor edits
jgabry Feb 7, 2024
372cb92
diagnostics.R: a few minor doc edits
jgabry Feb 7, 2024
bd33010
Merge branch 'master' into new-pareto-k-threshold
jgabry Feb 12, 2024
3c87b7f
Merge branch 'master' into new-pareto-k-threshold
jgabry Feb 13, 2024
40 changes: 35 additions & 5 deletions NEWS.md
@@ -1,13 +1,43 @@
# loo 2.6.0.9000

-### New features
+### Major changes

* Use a new sample-size-specific diagnostic threshold for Pareto `k`. The pre-2022 version
of the [PSIS paper](https://arxiv.org/abs/1507.02646) recommended diagnostic
thresholds of
`k < 0.5 "good"`, `0.5 <= k < 0.7 "ok"`,
`0.7 <= k < 1 "bad"`, `k >= 1 "very bad"`.
The 2022 revision of the PSIS paper now recommends
`k < min(1 - 1/log10(S), 0.7) "good"`, `min(1 - 1/log10(S), 0.7) <= k < 1 "bad"`,
`k >= 1 "very bad"`, where `S` is the sample size.
There is now one fewer diagnostic threshold (the `"ok"` category has been
removed), and the most important threshold now depends on the sample size `S`.
With sample sizes `100`, `320`, `1000`, `2200`, `10000`, the sample-size-specific
part `1 - 1/log10(S)` is approximately `0.5`, `0.6`, `0.67`, `0.7`, `0.75`.
Even as the sample size grows, the bias in the PSIS estimate dominates if
`0.7 <= k < 1`, and thus the diagnostic threshold for "good" is capped at
`0.7` (if `k >= 1`, the mean does not exist and bias is not a valid measure).
The new recommended thresholds come from a more careful bias-variance analysis
of PSIS that builds on the theory of truncated Pareto sums. For those who use
the Stan default of 4000 posterior draws, the threshold remains `0.7`, but
there will be fewer warnings because there is no longer a diagnostic message for
`0.5 <= k < 0.7`. Those who use smaller sample sizes may see diagnostic messages
with a threshold below `0.7`, and they can increase the sample size to about
`2200` to raise the threshold to `0.7`.

* There are no more warnings if the `r_eff` argument is not provided, and the
default is now `r_eff = 1`. The summary print output showing MCSE and ESS now
shows diagnostic information on the range of `r_eff`. The change was made to
reduce unnecessary warnings. The use of `r_eff` does not change the expected
value of `elpd_loo`, `p_loo`, and Pareto `k`, and is needed only to estimate
MCSE and ESS. Thus it is better to show the diagnostic information about `r_eff`
only when MCSE and ESS values are shown.
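
As a rough illustration of the threshold rule in the first item above, the
numbers can be reproduced in a few lines of base R. This is a minimal sketch:
`khat_threshold_sketch()` is a hypothetical helper written for this note and is
not taken from the package code.

```r
# Hypothetical helper (an assumption for illustration, not the package's API):
# sample-size-specific Pareto k threshold, capped at 0.7.
khat_threshold_sketch <- function(S) {
  stopifnot(is.numeric(S), all(S > 1))
  pmin(1 - 1 / log10(S), 0.7)
}

S <- c(100, 320, 1000, 2200, 10000)

# Uncapped sample-size-specific part, rounded to two decimals
round(1 - 1 / log10(S), 2)
# [1] 0.50 0.60 0.67 0.70 0.75

# With 4000 draws the uncapped value is about 0.72, so the cap applies
khat_threshold_sketch(4000)
# [1] 0.7
```

The cap is applied with `pmin()` so that the helper works on a vector of sample
sizes as well as on a single value.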

### Other changes

* `E_loo` now allows `type="sd"`.


### Bug fixes

* Fix bug in `E_loo` when `type="variance"`.


# loo 2.6.0

6 changes: 3 additions & 3 deletions R/crps.R
@@ -112,7 +112,7 @@ loo_crps.matrix <-
log_lik,
...,
permutations = 1,
- r_eff = NULL,
+ r_eff = 1,
cores = getOption("mc.cores", 1)) {
validate_crps_input(x, x2, y, log_lik)
repeats <- replicate(permutations,
@@ -154,7 +154,7 @@ loo_scrps.matrix <-
log_lik,
...,
permutations = 1,
- r_eff = NULL,
+ r_eff = 1,
cores = getOption("mc.cores", 1)) {
validate_crps_input(x, x2, y, log_lik)
repeats <- replicate(permutations,
@@ -175,7 +175,7 @@ EXX_compute <- function(x, x2) {
}


- EXX_loo_compute <- function(x, x2, log_lik, r_eff = NULL, ...) {
+ EXX_loo_compute <- function(x, x2, log_lik, r_eff = 1, ...) {
S <- nrow(x)
shuffle <- sample (1:S)
x2 <- x2[shuffle,]