[FEEDBACK] u:locale may be problematic for ICU4X and interoperability #976

Open
sffc opened this issue Jan 13, 2025 · 8 comments
Labels
LDML47 LDML 47 Release (Stable) Preview-Feedback Feedback gathered during the technical preview

Comments

@sffc
Member

sffc commented Jan 13, 2025

An early decision we made in the ICU4X design was that we would load data for formatters ahead of time into the built-in registry.

The u:locale option makes this more challenging and less efficient. Instead of loading number formatting data one time (for the message locale), we now have to load it an arbitrary number of times for an arbitrary set of locales.
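To make the concern concrete, here is a minimal Python sketch (not ICU4X code) of what ahead-of-time loading has to account for. The regular expression, the `locales_needed` helper, and the `plain` sample message are hypothetical; the sketch only recognizes simple literal `u:locale=` values and does not implement the full MessageFormat 2 syntax.

```python
import re

# Hypothetical, simplified matcher for a literal `u:locale=` option value.
# Real MessageFormat 2 parsing is more involved than this single pattern.
U_LOCALE = re.compile(r"u:locale=([A-Za-z0-9-]+(?:,[A-Za-z0-9-]+)*)")


def locales_needed(message_locale: str, source: str) -> set[str]:
    """Return every locale whose formatting data this message could require."""
    needed = {message_locale}
    for match in U_LOCALE.finditer(source):
        # The option value may be a comma-delimited list of language tags.
        needed.update(match.group(1).split(","))
    return needed


plain = "You have {$count :number} new messages"
overridden = (
    "In French, this date would be displayed as "
    "{|2024-05-06| :date u:locale=fr}"
)
```

Without the option, the set always contains exactly one locale, so data can be loaded once. With it, the set depends on the content of each message.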

@sffc sffc added Preview-Feedback Feedback gathered during the technical preview LDML47 LDML 47 Release (Stable) labels Jan 13, 2025
@eemeli eemeli changed the title [FEEDBACK] @u:locale may be problematic for ICU4X and interoperability [FEEDBACK] u:locale may be problematic for ICU4X and interoperability Jan 13, 2025
@eemeli
Collaborator

eemeli commented Jan 13, 2025

We currently say this about the u: options:

> This section describes common **_<dfn>`u:` options</dfn>_** which each implementation SHOULD support
> for all _functions_ and _markup_.

Given that e.g. a conformant ICU4X implementation is therefore not required to implement them, are you ok with the spec still recommending them?

@sffc
Member Author

sffc commented Jan 13, 2025

Okay so this is another normative-optional thing. #977

@aphillips
Member

In addition to the text @eemeli quotes, there is also this in the description of u:locale itself:

> Implementations MAY emit a Bad Option error and MAY ignore the value of the u:locale option as a whole or any of the entries in the list of language tags.
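As a rough illustration of the latitude that sentence gives, the Python sketch below picks one permitted combination: it drops list entries the implementation has no data for, optionally reports an error, and falls back to the message locale when nothing usable remains. The `Resolution` type, the `resolve_u_locale` function, the `"bad-option"` label, and the exact-match `supported` check are all assumptions made for illustration; they are not taken from the spec or from any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Resolution:
    locales: list[str]                       # locales to use for this expression
    errors: list[str] = field(default_factory=list)


def resolve_u_locale(message_locale, option_value, supported, report_errors=True):
    """Apply a u:locale override, ignoring entries without available data."""
    requested = [tag.strip() for tag in option_value.split(",") if tag.strip()]
    # Simplification: exact string match instead of real locale matching.
    usable = [tag for tag in requested if tag in supported]
    errors = []
    if report_errors and len(usable) < len(requested):
        errors.append("bad-option")          # hypothetical error label
    # Ignoring the option as a whole means using the message locale.
    return Resolution(usable or [message_locale], errors)
```

An implementation that ignores the option entirely would correspond to calling this with an empty `supported` set and `report_errors=False`.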

@sffc noted:

> The u:locale option makes this more challenging and less efficient. Instead of loading number formatting data one time (for the message locale), we now have to load it an arbitrary number of times for an arbitrary set of locales.

Before you proceed to not implement this, note that standards are not always written for the convenience of the implementers. There are many cases in which a user wishes to override the locale of a given placeholder or expression, which is what u:locale is for. Not implementing the feature because the performance of that feature is inconvenient to implement doesn't sound like good policy. It can even be rather low performance because the frequency of such messages is relatively low compared to the single-locale case.

@sffc
Member Author

sffc commented Jan 13, 2025

> Not implementing the feature because the performance of that feature is inconvenient to implement doesn't sound like good policy.

Hm? This statement doesn't make sense. I am pushing back on the feature because I worry it may reduce performance, since we cannot take for granted that the locale is a constant.

> It can even be rather low performance because the frequency of such messages is relatively low compared to the single-locale case.

Yes, we would likely keep single-locale as the faster path. But since multi-locale can occur, there are certain assumptions we cannot make, which impacts performance for everyone.
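A minimal sketch of that trade-off, written in Python rather than ICU4X's Rust: `FormatterRegistry`, `fake_load`, and the string standing in for formatter data are all hypothetical. The point is only that once an override is possible, every lookup has to check for it, even when the message never uses one.

```python
class FormatterRegistry:
    """Hypothetical registry: preloads the message locale, lazily loads others."""

    def __init__(self, message_locale, load):
        self._load = load                      # callable: locale -> formatter data
        self._message_locale = message_locale
        self._primary = load(message_locale)   # loaded once, ahead of time
        self._extra = {}                       # filled only when an override appears

    def get(self, locale=None):
        # Fast path: no override, or an override equal to the message locale.
        if locale is None or locale == self._message_locale:
            return self._primary
        # Slow path: the locale is not a constant, so data is loaded on demand.
        if locale not in self._extra:
            self._extra[locale] = self._load(locale)
        return self._extra[locale]


loads = []  # records which locales were actually loaded


def fake_load(locale):
    loads.append(locale)
    return f"number-data[{locale}]"


registry = FormatterRegistry("en", fake_load)
```

Here `registry.get()` never triggers a load, while `registry.get("fr")` loads French data on first use and reuses it afterwards.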

> note that standards are not always written for the convenience of the implementers.

Yes. Of course. Implementers are one of several constituencies. I am very much familiar with these tradeoffs in my role as TC39-TG2 convener.

My concern is about implementation difficulty and also the performance impact.

> There are many cases in which a user wishes to override the locale of a given placeholder or expression, which is what u:locale is for.

If that's true, then that's the most important thing and we'll figure out how to implement it and eat the cost. Do you have evidence of this, though? I don't think MF1 or most of the other syntaxes I'm familiar with support this feature. Can you point me to the discussion where it was proposed and added?

@sffc
Member Author

sffc commented Jan 14, 2025

Given this:

> The 2119 keyword SHOULD is often misunderstood. It is stronger than MAY--it is, in fact, judgemental in the way you don't care for 🙈

I do not believe that the SHOULD language alleviates my concern. I would prefer for this type of feature to be MAY.

@sffc
Member Author

sffc commented Jan 16, 2025

To summarize, my position currently is:

  • I would like to see evidence that motivates this feature.
  • If the feature is well motivated, then the spec can stay as-is.
  • If the feature has questionable or niche motivation, then it should be a MAY, not a SHOULD.

Determining whether a feature has "questionable or niche motivation" is a bit subjective, but some rules of thumb should be:

  1. Is the feature required for grammatical correctness?
  2. Is the feature included in predecessor specifications?
  3. Is the feature necessary to support any of the other goals?

@eemeli
Collaborator

eemeli commented Jan 16, 2025

The option was extensively discussed in the WG through the Expression Attributes design doc, which hopefully provides some of the rationale that you're looking for. Note that its Metadata section can be expanded, and that it includes links to the six PRs that helped shape the proposal.

@sffc
Member Author

sffc commented Jan 22, 2025

Thank you. I reviewed that design doc as well as the six linked PRs. Very good record-keeping.

I did not, however, find justification for why this specific expression attribute should be included. I see a lot of discussion about how expression attributes are an important feature to include in the syntax, which is fine, but it seems like the three proposed attributes "id", "locale", and "dir" were all adopted without a lot of discussion on their individual merits, at least not that I can see from that source.

The design doc says "A common example of [a message author wanting to set an attribute] is the locale", with the following example:

    In French, this date would be displayed as {|2024-05-06| :date u:locale=fr}

which seems like a fairly synthetic example. Can you share data on how widely u:locale is needed? Basically, the above sentence, "A common example of this is the locale", currently reads as [citation needed] to me.

If a feature is optional, like this one apparently is, implementations like ICU4X really need to know what tradeoffs are involved in implementing it or not.


3 participants