Reserved code point U+11939 should not have a glyph #22

Open
dscorbett opened this issue Feb 17, 2025 · 2 comments

@dscorbett

Font

NotoSerifDivesAkuru-Regular.otf

Where the font came from, and when

Site: https://github.com/notofonts/dives-akuru/releases/tag/NotoSerifDivesAkuru-v2.000
Date: 2025-02-17

Font version

Version 2.000

Issue

The reserved code point U+11939 is mapped to the glyph ooVoweldivesakuru. A Unicode font should not assign a glyph to a reserved code point. The glyph can instead be mapped to the sequence <U+11937, U+11930>, just as U+11938 DIVES AKURU VOWEL SIGN O is equivalent to <U+11935, U+11930>.
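
For anyone who wants to reproduce the check, here is a minimal sketch using only Python's standard `unicodedata` module (Python 3.9 or later, whose database includes the Dives Akuru block). It asks the local Unicode database which code points in a cmap are unassigned and what the canonical decomposition of U+11938 is. The `cmap_excerpt` dictionary is a hand-written stand-in rather than data read from the font, and every glyph name in it except `ooVoweldivesakuru` is hypothetical.

```python
import unicodedata


def is_unassigned(cp: int) -> bool:
    """True if the local Unicode database lists cp as unassigned (category Cn)."""
    return unicodedata.category(chr(cp)) == "Cn"


def canonical_sequence(cp: int) -> list[int]:
    """Return cp's canonical decomposition mapping as code points, or [] if none."""
    fields = unicodedata.decomposition(chr(cp)).split()
    # Compatibility mappings start with a tag such as "<compat>"; ignore those.
    if not fields or fields[0].startswith("<"):
        return []
    return [int(field, 16) for field in fields]


def unassigned_mappings(cmap: dict[int, str]) -> dict[int, str]:
    """Return the cmap entries whose code point is unassigned."""
    return {cp: glyph for cp, glyph in cmap.items() if is_unassigned(cp)}


# Hand-written stand-in for the font's cmap. Only "ooVoweldivesakuru" is a
# glyph name taken from this report; the other names are hypothetical.
cmap_excerpt = {
    0x11937: "aiVowel-hypothetical",
    0x11938: "oVowel-hypothetical",
    0x11939: "ooVoweldivesakuru",
}

for cp, glyph in unassigned_mappings(cmap_excerpt).items():
    print(f"U+{cp:04X} is unassigned but maps to {glyph}")

print([f"U+{cp:04X}" for cp in canonical_sequence(0x11938)])
```

The result for U+11939 depends on the Unicode version bundled with the Python build, so a database released after the code point is assigned would no longer flag it.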

Character data

𑤹
U+11939 <reserved-11939>

Screenshot

𑤹

@simoncozens
Contributor

This is admittedly a bug, but I plan to fix it by fixing Unicode. At the time, the Dives Akuru proposal did not see evidence for an independent OO VOWEL but thought there might be a need for one in the future (which is why this code point is reserved). Fernando has since found some examples of it in his manuscript research.

@dscorbett
Author

I think this vowel sign should be encoded as the sequence <U+11937, U+11930> instead of a new character, for three reasons.

  1. The sequence is already available. A new character can take years to encode.
  2. Even if U+11939 is made canonically equivalent to the sequence, the NFC form will have to stay <U+11937, U+11930> per Unicode’s normalization stability policy. The font will have to support the sequence anyway.
  3. If U+11939 is not made canonically equivalent to the sequence, it will still look identical to the sequence. That will require a new confusable entry and a new “do not emit” entry. Also, it is inconsistent with U+11938 DIVES AKURU VOWEL SIGN O, which does have a canonical decomposition. I think the confusion and inconsistency are not worth the convenience of having a single code point.
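
To illustrate the normalization argument in point 2, here is a small sketch using Python's standard `unicodedata` module (assuming a build whose Unicode database includes Dives Akuru, i.e. Python 3.9 or later). It shows that the sequence <U+11937, U+11930> is already in NFC today, which is the state the stability policy would freeze, and that U+11938 decomposes canonically to <U+11935, U+11930>. The variable names are mine, not from any specification.

```python
import unicodedata

VOWEL_SIGN_AA = "\U00011930"
VOWEL_SIGN_E = "\U00011935"
VOWEL_SIGN_AI = "\U00011937"
VOWEL_SIGN_O = "\U00011938"

# The sequence proposed in this issue for the OO vowel sign.
oo_sequence = VOWEL_SIGN_AI + VOWEL_SIGN_AA


def is_nfc(text: str) -> bool:
    """True if text is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", text) == text


def code_points(text: str) -> list[str]:
    """Format a string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The sequence is already normalized, so it must stay normalized in the future.
print(code_points(oo_sequence), "is NFC:", is_nfc(oo_sequence))

# The existing O vowel sign decomposes to E + AA under NFD.
print(code_points(unicodedata.normalize("NFD", VOWEL_SIGN_O)))
```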

See also L2/25-021, another proposal for an Indic vowel sign that looks like a combination of two existing vowel signs, which the Script Encoding Working Group recommended against in L2/25-010.
