Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leap time handling #445

Open
wants to merge 3 commits into
base: climatological
Choose a base branch
from
Open

Leap time handling #445

wants to merge 3 commits into from

Conversation

dsweber2
Copy link
Contributor

Checklist

Please:

  • Make sure this PR is against "dev", not "main".
  • Request a review from one of the current epipredict main reviewers:
    dajmcdon.
  • Make sure to bump the version number in DESCRIPTION and NEWS.md.
    Always increment the patch version number (the third number), unless you are
    making a release PR from dev to main, in which case increment the minor
    version number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    0.7.2, then write your changes under the 0.8 heading).
  • Consider pinning the epiprocess version in the DESCRIPTION file if
    • You anticipate breaking changes in epiprocess soon
    • You want to co-develop features in epipredict and epiprocess

Change explanations for reviewer

So weeks and days both have to deal with inconsistent numbers of days. What this does:

  1. separate out the "weird" day/week into a separate bin, labeled 999 and only use up to the "normal" number of days/weeks (so 365/52)
  2. if the window goes over the boundary (so includes both week 52 and week 1, or both day 59 and day 60), then add in the data from
  3. if the window is around the leap day/week, include the window days before and the window days after (so days 57, 58, 59, leap_day, 60, 61, 62)

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

  • Resolves #{issue number}

@dsweber2 dsweber2 requested a review from dajmcdon as a code owner February 27, 2025 23:28
@dsweber2 dsweber2 self-assigned this Feb 27, 2025
Copy link
Contributor

@dajmcdon dajmcdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions, but I think there's a logic error when rolling with the iter. (Probably was there before as well)

.idx = x$time_aggr(time_value), .weights = wts,
.idx = (.idx - x$forecast_ahead) %% modulus,
.idx = dplyr::case_when(.idx == 0 ~ modulus, TRUE ~ .idx)
.idx = time_value %m-% ahead_period,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.idx = time_value %m-% ahead_period,
# subtracts a month w/o rollover (usual behavior on weeks/days)
.idx = time_value %m-% ahead_period,

@@ -341,12 +350,63 @@ roll_modular_multivec <- function(col, .idx, weights, aggr, window_size, modulus
tidyr::nest(data = c(col, weights), .by = .idx)
out <- double(nrow(tib))
for (iter in seq_along(out)) {
entries <- (iter - window_size):(iter + window_size) %% modulus
# +1 from 1-indexing
entries <- ((iter - window_size):(iter + window_size) %% modulus)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
entries <- ((iter - window_size):(iter + window_size) %% modulus)
entries <- (iter - window_size):(iter + window_size) %% modulus

Don't need the outer paren do we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm concerned this logic fails if given partial years. Same with keying on the iteration below. Suppose that I give you only data for January-March for 25 years (and ask for daily). Then max(iter)=91 so the leap branch will never happen. The same would be true if I removed the summer for every year and asked for epiweek climate. I think it needs to key on the .idx rather than iter.

The tests all presume you have every possible .idx covered and arranged.

} else if (modulus == 52) {
# we need to grab just the window around the leap week on the leap week
if (iter == 53) {
entries <- ((53 - window_size):(53 + window_size - 1) %% 52)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
entries <- ((53 - window_size):(53 + window_size - 1) %% 52)
entries <- (53 - window_size):(53 + window_size - 1) %% 52

# we need to grab just the window around the leap day on the leap day
if (iter == 366) {
entries <- ((59 - window_size):(59 + window_size - 1) %% modulus)
entries <- c(entries, 366)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a

entries[entries == 0] <- 365

here?

Comment on lines +392 to +402
yday_leap <- function(time_value) {
ifelse(
!lubridate::leap_year(time_value),
lubridate::yday(time_value),
ifelse(
(lubridate::month(time_value) == 2) & lubridate::day(time_value) == 29,
999,
lubridate::yday(time_value) - (lubridate::month(time_value) > 2)
)
)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
yday_leap <- function(time_value) {
ifelse(
!lubridate::leap_year(time_value),
lubridate::yday(time_value),
ifelse(
(lubridate::month(time_value) == 2) & lubridate::day(time_value) == 29,
999,
lubridate::yday(time_value) - (lubridate::month(time_value) > 2)
)
)
}
leap_day <- function(x) lubridate::month(x) == 2 & lubridate::day(x) == 29
yday_leap <- function(time_value) {
dplyr::case_when(
!lubridate::leap_year(time_value) ~ lubridate::yday(time_value),
leap_day(time_value) ~ 999,
TRUE ~ lubridate::yday(time_value) - as.numeric(lubridate::month(time_value) > 2L)
)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought for clarity, avoiding nested ifelse() calls. ifelse() can be brittle. See the warning in ?ifelse().

#' epiweek, but it assigns week 53 the value of -1 instead so it mirrors the assignments in yday_leap
#' @keywords internal
epiweek_leap <- function(time_value) {
lubridate::epiweek(time_value) %>% ifelse(. == 53, 999, .)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Piping and . can be hard to discern. Same below.

Suggested change
lubridate::epiweek(time_value) %>% ifelse(. == 53, 999, .)
w <- lubridate::epiweek(time_value)
w[w == 53] <- 999
w

#' isoweek, but it assigns week 53 the value of -1 instead so it mirrors the assignments in yday_leap
#' @keywords internal
isoweek_leap <- function(time_value) {
lubridate::isoweek(time_value) %>% ifelse(. == 53, 999, .)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lubridate::isoweek(time_value) %>% ifelse(. == 53, 999, .)
w <- lubridate::isoweek(time_value)
w[w == 53] <- 999
w

)
)
}
#' epiweek, but it assigns week 53 the value of -1 instead so it mirrors the assignments in yday_leap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean this to say -1 rather than 999?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants