More clarity about expected lunisolar calendar behavior for large dates #2869
Comments
Temporal gives 3 ways of representing a particular
Being able to convert between all three representations without ambiguity is I think the most important invariant. I will call this the equivalence relation. Temporal also defines the following invariants for the scalar properties:
Following from these definitions and the equivalence relation are the following arithmetic invariants that I coded into ICU4X in unicode-org/icu4x#4904: The following operations must be equivalent: adding or subtracting 1 day to ISO, adding or subtracting 1 day to Codes, and adding or subtracting 1 day to Scalars. One can write a proof that these invariants must be true for the above definitions to hold. Calendars that seem like they don't obey these invariants should be modified to do so. For example, For reasons the champions have discussed previously, I think it is wise for Temporal to enforce these invariants. It allows careful developers to craft calendar-independent logic: no matter which calendar is in use, there are certain operations that are always sound, operations derived from the above invariants. |
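The day-step invariant described above can be spot-checked without Temporal by walking consecutive ISO days and reading the calendar fields through `Intl.DateTimeFormat`. This is only a rough sketch: `makeFieldReader` and `findDayStepViolations` are hypothetical names, not part of Temporal or ICU4X, and the fields returned for non-Gregorian calendars depend on the ICU data shipped with the engine.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns a function that reads { month, day } for a UTC
// instant in the given calendar. Output depends on the engine's ICU data.
function makeFieldReader(calendar) {
  const fmt = new Intl.DateTimeFormat("en", {
    calendar,
    numberingSystem: "latn",
    timeZone: "UTC",
    month: "numeric",
    day: "numeric",
  });
  return (epochMs) => {
    const fields = {};
    for (const part of fmt.formatToParts(new Date(epochMs))) {
      // Keep the month as a string so leap-month markers are not lost.
      if (part.type === "month") fields.month = part.value;
      if (part.type === "day") fields.day = parseInt(part.value, 10);
    }
    return fields;
  };
}

// Walks dayCount one-day steps from startMs and reports every step where the
// result is neither "next day in the same month" nor "day 1 of another month".
function findDayStepViolations(readFields, startMs, dayCount) {
  const violations = [];
  let prev = readFields(startMs);
  for (let i = 1; i <= dayCount; i++) {
    const epochMs = startMs + i * MS_PER_DAY;
    const cur = readFields(epochMs);
    const nextInMonth = cur.month === prev.month && cur.day === prev.day + 1;
    const firstOfNext = cur.month !== prev.month && cur.day === 1;
    if (!nextInMonth && !firstOfNext) violations.push({ epochMs, prev, cur });
    prev = cur;
  }
  return violations;
}
```

A call such as `findDayStepViolations(makeFieldReader("islamic"), Date.UTC(2024, 0, 1), 365)` runs the same check against another calendar; no expected output is given here because it varies with the ICU version.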
We have finally finalized a proposal named Hijri week calendar (HWC) that is a counterpart of the ISO calendar for Hijri calendars. We have a working Temporal implementation for it. When doing so we realised that some Hijri calendars like the I am mentioning this as to consider a fix for these calendars to make them compatible with the HWC as we are exploring to port the HWC to CLDR. |
I'll try to answer the above questions to the best of my knowledge, others please chime in if you feel that I missed the mark:
Yes.
I don't have any reason to pick a particular range so I'll arbitrarily say "1000 years before and after the present." That said, we assume the Gregorian calendar is proleptic (extended arbitrarily far into the past and the future), so we should do so for other calendars. Note, we also assume that existing time zone DST rules continue arbitrarily far into the future, until defined to be otherwise; so That said I think this is always going to be on a best-effort basis, at least to some degree. I don't have any good ideas on how to make sure that is a cross-browser best effort, not differing between browsers, especially when it's sometimes not even clear what past dates in a calendar were.
I'd say that depends on the calendar and the cultural expectations of its users.
I'd say no, we should not recommend this if we wouldn't recommend it for the Gregorian calendar.
Extremely important, otherwise one of our fundamental assumptions breaks down.
Here also I'd say we should not recommend breaking these invariants if we wouldn't recommend it for the Gregorian calendar. E.g. we would not fix all far-future years to have 365 days and all far-future Februaries to have 28.
These I'd put in the same category as ISO roundtripping: extremely important. |
These answers are incompatible with each other: something has to give here. We have found examples of dates where the invariants start falling apart for both the islamic and chinese calendars. Out of the following three things, one must go:
And, stepping back, I don't even think it's accurate for me to describe these calendars as having a "formula" in the first place; it doesn't work that way, as I've sketched out in the issue.
I don't think that assumption is as easy to make for lunisolar calendars, since the Gregorian calendar is rather mathematical and they are not (except for Hebrew). The Gregorian calendar is one where it actually makes some sense to have a "proleptic" version; for these there is no clear answer as to what that means. I gave three differing answers as to what a proleptic version might mean in the issue above. Only one of them is actually implementable in computers, and even then there can be multiple formulae that are equally valid but give different answers. Part of the point of this issue is to tease out what we mean when we are trying to construct a proleptic version of these calendars.
I don't think that justification holds. "Fall back to a simplified mathematical model after X years" works perfectly well for the Gregorian calendar: it is a simplified mathematical model.
This isn't the same thing: the time zone is defined in a way that lets us very easily make a proleptic version. That is not true for lunisolar calendars. |
Also, stepping back a bit, it appears that some of the justifications here are because we don't want Gregorian to be "special", which is a reasonable concern in an i18n context. But I'd argue that forcing other calendars to conform to the expectations of the Gregorian calendar — that they be well defined for arbitrary periods of time — is treating Gregorian as special in a far worse way. |
If we must prioritize, then I'd suggest that ISO roundtripping is the most important, because all of Temporal stores dates using the ISO calendar. You'd end up with really unexpected behavior if you couldn't guarantee that the input to APIs like For the next priority, I think it's "Stay faithful to general calendrical invariants" because if you can't depend on behavior like "adding one day always produces the next day in the month (or the first day in the next month)", then it will also break apps. This is the kind of issue that would, for example, break fuzz-testing of apps that expect calendrical invariants to hold, although the breaks wouldn't be nearly as common or obvious as breaks in ISO roundtripping. So I think that leaves faithfulness to the formula as the one that gets the short straw. That said, I don't exactly know what this means. Does it mean that you just define a new formula that is similar to the old one for near dates but diverges somewhat in order to retain calendar invariants when the year is far from today? If so, honestly, that sounds fine, because formulas (like all calendar calculations since the dawn of time!) have always been approximations of celestial behavior, and those approximations have a habit of being revised and improved from time to time. Perfection is not a reasonable expectation for the far past or far future. I definitely agree with @ptomato that falling back to Gregorian would be unexpected and confusing for users, so I'd suggest an imprecise formula is better than precise (but clearly wrong culturally) Gregorian dates.
Note that this is *not* an invariant, even in Gregorian, if there's a month or year in the duration, because of the complexity of handling arithmetic around variable-length months (and, in lunisolar calendars, variable-length years too). The only cross-calendar invariant is that durations made up solely of fixed-length date units, like days or weeks, should roundtrip. (And even weeks might get a bit dicey when rounding is involved, although I'd have to think it through more to know if it's OK or not.) |
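To make the roundtrip point concrete, here is a small sketch using plain ISO (proleptic Gregorian) dates. `addMonthsConstrain` and `addDays` are hypothetical helpers written for this illustration; the clamping of the day-of-month is meant to resemble a "constrain"-style overflow policy, not to reproduce any particular implementation.

```javascript
// Days in an ISO month (month is 1-based), proleptic Gregorian leap rule.
function isoDaysInMonth(year, month) {
  const leap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  return [31, leap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1];
}

// Adds a number of months, clamping the day to the length of the target month.
function addMonthsConstrain(date, months) {
  const monthIndex = date.year * 12 + (date.month - 1) + months;
  const year = Math.floor(monthIndex / 12);
  const month = monthIndex - year * 12 + 1;
  const day = Math.min(date.day, isoDaysInMonth(year, month));
  return { year, month, day };
}

// Adds a number of days using UTC Date arithmetic (a fixed-length unit).
function addDays(date, days) {
  const d = new Date(0);
  d.setUTCFullYear(date.year, date.month - 1, date.day + days);
  return { year: d.getUTCFullYear(), month: d.getUTCMonth() + 1, day: d.getUTCDate() };
}

const jan31 = { year: 2024, month: 1, day: 31 };

// Month arithmetic: the day is clamped to Feb 29 and never restored.
const viaMonths = addMonthsConstrain(addMonthsConstrain(jan31, 1), -1);
// viaMonths is { year: 2024, month: 1, day: 29 }

// Day arithmetic: adding and subtracting the same count returns to the start.
const viaDays = addDays(addDays(jan31, 30), -30);
// viaDays is { year: 2024, month: 1, day: 31 }
```

The same asymmetry applies in any calendar with variable-length months, which is why only day-based (and possibly week-based) durations can be treated as a cross-calendar roundtrip guarantee.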
Calendars which are subject to rounding and observational errors involving the moon, sun, stars, and planets (such as Chinese and Islamic) are only well-defined from the point at which they were created to a point perhaps a few decades into the future, or as long as published almanacs reach. Since Temporal requires all calendars to be well-defined over the entire Temporal range, which is far greater than these calendars are actually well-defined, my preference is that these calendars fall back to a somewhat-reasonable formula outside of a certain range, a formula which obeys the roundtrip and arithmetic invariants and keeps the year lengths roughly consistent over the whole Temporal range. I am also okay with a solution where out-of-range dates fall back to the proleptic Gregorian calendar. |
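As an illustration of what "fall back to a somewhat-reasonable formula outside of a certain range" could look like, the sketch below uses almanac-style data inside a trusted range and a purely arithmetic rule elsewhere. Everything here is an assumption made for illustration: the `trusted` object and its `lookup` function are hypothetical, and the fallback is one common variant of the tabular Islamic rule (leap year when `(14 + 11 * year) mod 30 < 11`), chosen only because it always yields 29- or 30-day months. It is not what ICU4X or any browser is stated to do.

```javascript
// One common variant of the tabular Islamic leap-year rule (11 leap years per
// 30-year cycle). Used here only as an example of an arithmetic fallback.
function isTabularLeapYear(year) {
  const r = (((14 + 11 * year) % 30) + 30) % 30; // non-negative remainder
  return r < 11;
}

// Tabular month lengths: odd months have 30 days, even months 29, and the
// twelfth month has 30 days in leap years.
function tabularMonthLength(year, month) {
  if (month === 12) return isTabularLeapYear(year) ? 30 : 29;
  return month % 2 === 1 ? 30 : 29;
}

// Hypothetical dispatcher: prefer known data inside the trusted year range,
// otherwise use the arithmetic fallback.
function monthLength(year, month, trusted) {
  const inRange = year >= trusted.firstYear && year <= trusted.lastYear;
  const known = inRange ? trusted.lookup(year, month) : undefined;
  return known !== undefined ? known : tabularMonthLength(year, month);
}
```

This only covers month lengths. A real implementation would also have to anchor the fallback to the last trusted date so that the mapping to ISO days stays continuous and one-to-one across the boundary.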
This is helpful information, thanks! I didn't get that out of the original post. I was assuming "proleptic" meant "the math truth", as you phrased it, since that is the only computable one. I didn't realize you were saying that the math truth already leads to contradictions. Out of curiosity, what happens in these cases? Is this what you were talking about with Islamic years and Chinese months with the wrong number of days? |
In Islamic, the number of days isn't always 29 or 30. In Chinese, the new year drifts around. The Chinese new year is supposed to be between January 20 and February 21, I think, but according to the formulas we're using, it starts drifting to like January 19, January 18, ..., and eventually it even gets into December or earlier, for dates about 10,000 years into the future, which is within the Temporal range. I also discovered that Pope Gregory's calendar also still has a drift, just a much longer one (it loses a day every ~3000 years instead of every ~300 years), so I'm not sure who is mainly at fault: Gregorian drift, imprecise solstice calculations, or sidereal drift (movement of the north star, which impacts the solstices). |
I guess I misspoke here, there are two kinds of calendar invariants at play. I talk about them in the issue but didn't list both here: There are "general" calendrical invariants like "adding a day gives the next day in the month or the first day in the next month" and "calendar internal" invariants, like "Islamic months are always 29 or 30 days". Do we prioritize these differently? Generally I think that we're going to have to sacrifice faithfulness anyway so maintaining some of these invariants seems to not be too costly. It is worth noting that "adding a day gives you the next day in the month" isn't quite a general calendrical invariant: The Hindu calendar follows a lunar notion of "day" that for everyday use is mapped to the solar day. That mapping is not one-to-one: in common reckoning you can have "merged" days and "double" days, which work about how you'd expect for religious observances. There are some details on the mapping at https://books.google.com/books?id=Fb9Zc0yPVUUC&pg=PA20#v=onepage&q&f=false: basically the solar day takes its name from the most recent lunar day that started before its sunrise. Due to variation in the length of the lunar day (it's defined in terms of angular displacement, but the moon's orbit is elliptical), the lunar day can be either shorter or longer than a sunrise-to-sunrise solar day, so you can have sunrise-to-sunrise periods with no new lunar day (leading to a double/extra/leap day), and sunrise-to-sunrise periods with two lunar days (leading to a deleted/merged day). Festivities seem to more commonly follow a similar rule but around noon instead. It's complicated. |
This issue is a potential blocker for us enabling Temporal on Nightly builds of Firefox; as it stands, the ICU4X code we use will hit debug assertions with large dates that might prevent us from fuzz testing our implementation properly. We can work around this, but having a resolution here would be preferable :) |
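The naming rule described above (each solar day takes the name of the most recent lunar day that began before its sunrise) can be sketched as follows. The timestamps are synthetic hours invented for illustration, not astronomical data, and `nameSolarDays` and `findDoubledAndSkipped` are hypothetical helper names.

```javascript
// Both inputs are ascending timestamps in the same unit (hours here).
// Returns, for each sunrise, the index of the latest lunar day that had
// already started, or -1 if none had.
function nameSolarDays(sunrises, lunarDayStarts) {
  const names = [];
  let idx = -1;
  for (const sunrise of sunrises) {
    while (idx + 1 < lunarDayStarts.length && lunarDayStarts[idx + 1] <= sunrise) {
      idx++;
    }
    names.push(idx);
  }
  return names;
}

// A lunar day naming two consecutive solar days is "doubled"; a lunar day that
// never names any solar day is "skipped".
function findDoubledAndSkipped(names) {
  const doubled = [];
  const skipped = [];
  for (let i = 1; i < names.length; i++) {
    if (names[i] === names[i - 1]) doubled.push(names[i]);
    for (let k = names[i - 1] + 1; k < names[i]; k++) skipped.push(k);
  }
  return { doubled, skipped };
}

// Synthetic example: sunrises every 24 hours, lunar days of uneven length.
const sunrises = [6, 30, 54, 78, 102];
const lunarDayStarts = [5, 31, 55, 77.5, 100];
const names = nameSolarDays(sunrises, lunarDayStarts);
// names is [0, 0, 1, 3, 4]: lunar day 0 names two solar days, lunar day 2 none.
const summary = findDoubledAndSkipped(names);
// summary is { doubled: [0], skipped: [2] }
```

Because the mapping from lunar-day names to solar days is not one-to-one, day arithmetic still has to run on the solar (ISO) days, with the lunar-day name treated as derived information.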
About calendars with lunar days (my favorite is the Hawaiian calendar but it sounds like Hindu does this too?): in my opinion we let the ship sail when we said in Temporal that there is a 1-to-1 mapping to ISO-8601. So, a calendar with lunar days should use solar days for arithmetic purposes. However, a calendar with lunar days can absolutely add a new field to access this information, such as |
Yeah, the main problem is that this affects formatting: the name of the solar day derives from the lunar day(s) it ties to. Which isn't a huge deal for Temporal, but may be annoying for ICU4X. I do think Anyway, not exactly the point of this discussion. |
Let me see if I can put forth a reasonable conclusion here: We have the following inviolable invariants:
Calendar implementations, in decreasing order of priority:
|
If we need to choose, I'd like to think about deprioritizing "Dates must be supported in the full Temporal range." It seems less important to me than either of the SHOULDs in that list. |
Personally that was my hope but it doesn't seem to be the way this discussion has gone so far. |
@ptomato, how do you propose handling dates that are outside of a calendar's supported range? What I don't think we should do is make it a data-driven exception. The following code should not throw an exception for some users but not others:
let calendar = new Intl.DateTimeFormat().resolvedOptions().calendar;
myTemporalDate.withCalendar(calendar);
So, if we made this change, Temporal should normatively specify the range of dates that ICU4X needs to support. This is what it currently does, but the range is just too big a range, such that we need to have these discussions. |
If I had to choose, I'd prefer a data-driven exception above giving dates contrary to the known ground-truth in the near-present. (I'm not sure if those things are directly in opposition to each other, anyway.) |
Hm? ICU4X would be buggy if it returned "dates contrary to the known ground-truth in the near-present". This is talking about dates far away from the present. |
I'm talking about
Both of those things seem higher priority to me than supporting the entire half-million-year range. If I have to choose between getting a wrong date for 1066-09-20 or getting a data-driven exception for -271821-04-20, I'd choose the latter. |
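For a sense of scale, the date -271821-04-20 mentioned above is the lower limit of the ECMAScript `Date` time value. The sketch below derives both limits and the approximate span from the 8.64e15 millisecond bound on `Date`. Temporal's own limits are close to these, but the snippet deliberately only exercises `Date`, and the year figure uses the mean Gregorian year length as an approximation.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const MAX_TIME_VALUE_MS = 8.64e15; // largest magnitude a Date can hold

const earliest = new Date(-MAX_TIME_VALUE_MS);
const latest = new Date(MAX_TIME_VALUE_MS);
const beyond = new Date(MAX_TIME_VALUE_MS + 1); // one millisecond too far

console.log(earliest.toISOString()); // -271821-04-20T00:00:00.000Z
console.log(latest.toISOString());   // +275760-09-13T00:00:00.000Z
console.log(Number.isNaN(beyond.getTime())); // true

// The limit is exactly 100 million days on each side of the Unix epoch.
const daysEachSide = MAX_TIME_VALUE_MS / MS_PER_DAY;

// Roughly 273,790 mean Gregorian years (365.2425 days) on each side.
const approxYearsEachSide = daysEachSide / 365.2425;
```

Together the two sides add up to the "half-million-year range" referred to above.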
Personally I'm very happy with deprioritizing support for the "whole Temporal range". I think it is better UX to give users exceptions than to pretend to give meaningful answers. It appeared that people in prior discussions considered the range to be a sacred invariant, so I didn't want to poke at it too much. I do think, as Shane says, we should spec what the validity range is per-calendar rather than based on what implementations decide. |
I don't think anyone is proposing that we give the wrong date for 1066-09-20. I'm okay with the following three proposals, in order of preference:
Things I don't currently support:
|
I am also in support of all three options there. My strong preference is having some consensus here rather than leaving it up to implementors. My weak preference is probably I also agree with Shane's points 1 and 3 for "things I don't currently support". I'm okay with the date ranges being per-calendar, but I think doing across-calendar ranges is reasonably doable. My caveat for Shane Proposal 3 is that implementation should still have the freedom to switch to algorithmic approximations if they have trouble fitting the smaller range of dates, but the proposal switches the priority order in a way that makes that less necessary. |
I agree with @ptomato that it's OK if the full Temporal range is not supported for some calendars. I also think that @sffc's suggestion is reasonable that we should specify a range that does *not* throw for all built-in calendars. If we do this, then I'd suggest that a reasonable range would be -10_000 to 10_000 which I expect would cover all calendars' recorded history.
I'm not a fan of this option for most calendars where there are only 1-2 eras, so adding another would be confusing for users. Doing this in Japanese doesn't seem as bad because Japanese users already know to expect a large number of eras, so adding another seems less disruptive. |
I think that's fine, though for a range that large some calendars will still have to fall back to arithmetical approximations; I think some of the islamic/chinese weirdnesses are not that far in the future. |
The oldest epoch of a CLDR calendar is 5492 BCE, Ethiopian Amete Alem's creation of the world. So that seems like a reasonable start point. For the end point, maybe something around 5000 CE. This will still get into ranges where we have to implement some workarounds in Chinese and Islamic, though. The calendrical calculations fail as soon as a few hundred years into the future, although I have workarounds implemented in ICU4X that make them work a little bit longer than that. |
@Manishearth We discussed this in today's Temporal meeting. Everyone is happy with your proposal on the table in #2869 (comment). Those of us who advocated throwing on extreme dates have been convinced otherwise, because it'd cause unexpected (data-driven) exceptions in user applications. |
Out of curiosity, which "truth" is
|
Math truth, using icu4c algorithms, but I don't know if those algorithms are faithful to the specific calendrical invariants or not for dates that big. Probably not. |
I'm fairly certain that |
Examples where ICU4C computes months with 28 or 31 days:
js> new Date(-3828538828800000).toLocaleDateString("en-u-ca-islamic")
"12/1/-123656 AH"
js> new Date(-3828538828800000 - 24*60*60*1000).toLocaleDateString("en-u-ca-islamic")
"11/28/-123656 AH"
js> new Date(-3828541248000000).toLocaleDateString("en-u-ca-islamic")
"11/1/-123656 AH"
js> new Date(-3828541248000000 - 24*60*60*1000).toLocaleDateString("en-u-ca-islamic")
"10/31/-123656 AH" |
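Irregular month lengths like the ones above can be searched for mechanically by walking consecutive days and counting how long each month lasts. The following is a sketch under stated assumptions: `monthDayReader` and `completeMonthLengths` are hypothetical helpers, the walk uses UTC rather than the local time zone used in the REPL session above, and results for non-Gregorian calendars depend on the engine's ICU version.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: reads { month, day } for a UTC instant in a calendar.
function monthDayReader(calendar) {
  const fmt = new Intl.DateTimeFormat("en", {
    calendar,
    numberingSystem: "latn",
    timeZone: "UTC",
    month: "numeric",
    day: "numeric",
  });
  return (epochMs) => {
    const out = {};
    for (const part of fmt.formatToParts(new Date(epochMs))) {
      if (part.type === "month") out.month = part.value;
      if (part.type === "day") out.day = parseInt(part.value, 10);
    }
    return out;
  };
}

// Returns the lengths of all months that both start and end inside the walk.
// Counting only begins at a day numbered 1, so a partial first month is skipped.
function completeMonthLengths(read, startMs, dayCount) {
  const lengths = [];
  let current = null; // { month, days } once a month start has been seen
  for (let i = 0; i < dayCount; i++) {
    const { month, day } = read(startMs + i * MS_PER_DAY);
    if (current && month !== current.month) {
      lengths.push(current.days);
      current = null;
    }
    if (!current) {
      if (day !== 1) continue;
      current = { month, days: 0 };
    }
    current.days++;
  }
  return lengths;
}

// Sanity check against the Gregorian calendar for 2024 (a leap year).
const lengths2024 = completeMonthLengths(monthDayReader("gregory"), Date.UTC(2024, 0, 1), 367);
// lengths2024 is [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
```

Pointing the same walk at `monthDayReader("islamic")` with a start near the timestamps quoted above would be one way to look for months outside the 29 to 30 day range.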
Prior context: unicode-org/icu4x#4917 in ICU4X, as well as unicode-org/icu4x#4713, unicode-org/icu4x#4904, and some others.
Temporal.PlainDate has a validity range of ≈ Unix epoch ± 250,000 years. This is quite a large range, but it makes perfect sense for working with mathematically defined calendars like the Gregorian calendar: the concept of a Gregorian day 200,000 years into the future is something where there is a reasonable answer to the question.
However, when it comes to lunisolar calendars dependent on astronomical concerns [1], and even to some extent solar calendars like the Persian calendar, answering the question "what is $date in $calendar" becomes far murkier. For such calendars, there are three potential sources of answers:
- "the ground truth": what people actually believe to be the details of the calendar. This is what is printed in almanacs and generally only extends at most 100 years into the future. When there are potential ambiguities, for example when moonrise occurs extremely close to sunrise, the user community tends to make a call in some direction.
- "the space truth": what is actually going on in space, plugged in to the definition of the calendar. This can be affected by higher-order characteristics of the celestial orbits, as well as some kinds of unpredictable uncertainties in the really long run.
- "the math truth": what the algorithms say, and what computers say when they run the algorithms. This is what's actually implementable, but will diverge from the space truth due to celestial approximations, floating point error, and unpredictable higher order factors of space.
The long term intangibility of ground truth means that there is no right answer for the behavior of such a calendar beyond maybe 100 years into the future. You can make informed guesses, but their accuracy starts dwindling quickly as time passes. Of course, the usefulness of the question also dwindles over time: the precise date of the Chinese calendar exactly 10,000 years from now is not really that usable for anything other than idle curiosity.
(Similar considerations apply for the far past: there's little point debating the accuracy of a calendrical calculation for dates before the inception of the calendar.)
Given that Temporal expects implementations to support dates in a very large range, it is probably useful to provide guidance and invariants that implementations should follow when dealing with these issues.
Some questions that could be answered:
Temporal.PlainDate in the first place?
(We found that "calendar internal" invariants and "general calendrical" invariants are often in tension when attempting to patch up algorithms to behave nicely for such dates.)
cc @hsivonen @anba
Footnotes
All of them except Islamic Tabular and Hebrew. The former follows a fixed roughly-alternating cycle of short and long years, and the latter at the moment is considered to follow a purely arithmetical system where the lunation time is a known approximation expressed as an integer number of ḥalakim. This is a case where the ground truth is basically defined to deliberately ignore the space truth. This means the Hebrew calendar will slowly desynchronize from the lunar cycle but that is ultimately expected and okay. There are, of course, chances for future adjustment happening anyway. ↩