add %Y M %M and %M M %Y formats #221

Open
wants to merge 7 commits into
base: main

23 changes: 21 additions & 2 deletions R/yearmonth.R
@@ -65,8 +65,27 @@ yearmonth.Date <- function(x) {

#' @export
yearmonth.character <- function(x) {
assertDate(x)
new_yearmonth(anydate(x))
key_words <- regmatches(x, gregexpr("[[:alpha:]]+", x))
if (all(grepl("^[[:digit:]]{1,4}[[:space:]]*(m|mon|month)[[:space:]]*[[:digit:]]{1,4}$",
Member

Can you assign the regular expression to a variable?
Is it too strict to include [[:space:]] here?

Contributor Author

The reason to include [[:space:]]* is to allow the possibility, since I've seen both 2018M01 and 2018 M01. I haven't seen 2018M 1 before, but it's easy enough to allow for, and I can imagine 2018 month 1 in a spreadsheet (probably created by concatenation).

Contributor Author

The initial reason I decided to go with such a strict format compared to the quarterly thing, honestly, is that m is a far more common letter than q. I'm far more worried about something that isn't a date but looks like one for the first few lines with m than q.

Contributor Author

I've assigned the regular expression to a variable in a subsequent commit.

Member

Including [[:space:]]* doesn't seem to make any difference between 2018M01 and 2018 M01. Can you remove [[:space:]]*?

Contributor Author

Done and done.
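
For readers following the thread above, here is a minimal base R sketch of what the optional [[:space:]]* changes. It is an illustration only: the names pat_strict and pat_spaced are hypothetical, the quantifier is assumed to be {1,4}, and the patterns are applied to the whole input string, which may differ from how the PR applies them.

```r
# Hypothetical pattern names; base R only, nothing from tsibble.
pat_strict <- "^[[:digit:]]{1,4}(m|mon|month)[[:digit:]]{1,4}$"
pat_spaced <- "^[[:digit:]]{1,4}[[:space:]]*(m|mon|month)[[:space:]]*[[:digit:]]{1,4}$"

x <- c("2018M01", "2018 M01", "2018 month 1", "01 mon 2018", "2018Q1")

# Case-insensitive match against the full string.
strict_ok <- grepl(pat_strict, x, ignore.case = TRUE)
spaced_ok <- grepl(pat_spaced, x, ignore.case = TRUE)

print(data.frame(x, strict_ok, spaced_ok))
```

With these inputs only 2018M01 passes the strict pattern, while the spaced pattern also accepts the three whitespace-separated variants; 2018Q1 fails both.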

key_words, ignore.case = TRUE))) {
yr_mon <- regmatches(x, gregexpr("[[:digit:]]+", x))
digits_lgl <- map_lgl(yr_mon, ~ !has_length(.x, 2))
digits_len <- map_int(yr_mon, ~ sum(nchar(.x)))
digits_ind <- nchar(flatten_chr(yr_mon))
Member

What exactly does this line do? Why can't it be a length of 3?

Contributor Author

If we allow years before 1000AD using this format, there is no way to tell the difference between %y and %Y for 2-digit years. Also, it seems more likely that a 3-digit year specification is an error than that it refers to a year before 1000AD (especially using this format).

Member

I think it's okay to throw an error for 3-digit years for now. If there's a compelling use case using 3-digit years, we'll accommodate this later.
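
The digit checks discussed in this thread can be sketched in base R as follows. This is a hedged illustration, not the PR's code: has_valid_digits is a hypothetical helper that mirrors the stated rules (exactly two digit runs, 5 or 6 digits in total, no 3-digit run) using vapply instead of the purrr-style helpers in the diff.

```r
# Hypothetical helper: TRUE when a string has exactly two digit runs,
# none of them three digits long, with 5 or 6 digits in total.
has_valid_digits <- function(x) {
  runs <- regmatches(x, gregexpr("[[:digit:]]+", x))
  vapply(runs, function(r) {
    n <- nchar(r)
    length(r) == 2 && !any(n == 3) && sum(n) >= 5 && sum(n) <= 6
  }, logical(1))
}

valid <- has_valid_digits(c("2018M01", "2018 M1", "201M8", "18M01", "2018M123"))
print(valid)
```

Under these rules a 2-digit year such as 18M01 is rejected because the total digit count is below 5, and any 3-digit run is rejected outright, which matches the reasoning given for not guessing between %y and %Y.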

if (any(digits_lgl) || any(digits_len < 5) || any(digits_len > 6) || any(digits_ind == 3)) {
abort("Character strings are not in a standard unambiguous format.")
}
yr_lgl <- map(yr_mon, ~ grepl("[[:digit:]]{4}", .x))
yr <- as.integer(map2_chr(yr_mon, yr_lgl, ~ .x[.y]))
mon <- as.integer(map2_chr(yr_mon, yr_lgl, ~ .x[!.y]))
if (any(mon > 12)) {
abort("Months can't be greater than 12.")
}
yearmonth(12 * (yr - 1970) + mon - 1)
} else {
assertDate(x)
new_yearmonth(anydate(x))
}
Comment on lines +81 to +88
Member

Suggested change
if (any(mon > 12)) {
abort("Months can't be greater than 12.")
}
yearmonth(12 * (yr - 1970) + mon - 1)
} else {
assertDate(x)
new_yearmonth(anydate(x))
}
x <- paste(yr, mon)
}
assertDate(x)
new_yearmonth(anydate(x))

}
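
As a side note on the expression 12 * (yr - 1970) + mon - 1 that appears in the lines covered by the suggestion, the sketch below reads it as a count of months since January 1970. That reading is an assumption made here for illustration, and months_since_epoch is a hypothetical helper, not a tsibble function.

```r
# Hypothetical helper, not part of the PR: months elapsed since 1970-01.
months_since_epoch <- function(yr, mon) {
  stopifnot(all(mon >= 1L), all(mon <= 12L))
  12L * (yr - 1970L) + mon - 1L
}

offset <- months_since_epoch(c(1970L, 2018L), c(1L, 1L))

# Round-trip through base R dates to check the offsets.
dates <- as.Date(sprintf("%d-%02d-01",
                         1970L + offset %/% 12L,
                         offset %% 12L + 1L))
print(offset)
print(dates)
```

The round trip recovers 1970-01-01 and 2018-01-01 from offsets 0 and 576, which is one way to sanity-check the arithmetic independently of the date parser.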

#' @export