2.0.0-beta1 script properties not up to date for unicode 16 #6041
Comments
cc @robertbastian what's the status of icuexportdata being updated? I thought we were already on Unicode 16. |
All I can tell you is that |
It's supposed to, from the relnotes: https://unicode-org.atlassian.net/browse/ICU-21821 is not the source here; see https://github.com/unicode-org/icu/releases/tag/release-76-1 Confirmed that this reproduces on ICU4X main, and confirmed that Unicode 16 data has a whole bunch of scx values for low codepoints that are not available on Unicode 15. |
Trying to build ICU4C to see what's up |
Found the culprit: https://unicode-org.atlassian.net/browse/ICU-21821 That hardcoded table in icuexportdata needs to be updated |
New data in #6044. Confirmed that it passes the following test:

#[test]
fn expected_script_thing() {
    use crate::props::Script;
    use crate::script::ScriptWithExtensions;
    let scripts = ScriptWithExtensions::new()
        .get_script_extensions_val('\u{2bc}')
        .iter()
        .collect::<Vec<_>>();
    assert_eq!(
        scripts,
        [
            Script::Bengali,
            Script::Cyrillic,
            Script::Devanagari,
            Script::Latin,
            Script::Thai,
            Script::Lisu,
            Script::Toto
        ]
    );
} |
Linking #4602 |
It seems you have a workaround for this for now: We have fixed the ICU4C data export around this, and could do the work for a patch release, but @sffc and I would prefer to wait till ICU4X 2.0.0-beta2 which should happen in the next few weeks, instead of doing a transient patch release. |
A few weeks is fine, thanks for tracking this down! |
Per the Unicode 16 version of ScriptExtensions.txt, the following should pass:
but we end up with just Script::Common, which would have been expected for Unicode 15 and earlier.

To make this more confusing, if I look at the raw data files in the release-76-1 tag, they do appear up to date. I haven't dug much past that.
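
For anyone who wants to cross-check what ScriptExtensions.txt lists for a code point without going through ICU4X at all, here is a minimal std-only sketch. It assumes the usual UCD line layout (`<code point or range> ; <script codes> # <comment>`). The helper names `parse_scx_line` and `script_extensions_for` are made up for this example, and the sample lines in `main` are illustrative rather than copied from the real file.

```rust
use std::ops::RangeInclusive;

/// Parses one line in the assumed `ScriptExtensions.txt` layout:
/// `<code point or range> ; <script codes> # <comment>`.
/// Returns `None` for blank lines, comment-only lines, or malformed input.
fn parse_scx_line(line: &str) -> Option<(RangeInclusive<u32>, Vec<String>)> {
    // Drop the trailing comment, if any.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }
    let (range_part, scripts_part) = data.split_once(';')?;
    let range_part = range_part.trim();
    // A field is either a single hex code point or `START..END`.
    let range = match range_part.split_once("..") {
        Some((start, end)) => {
            let lo = u32::from_str_radix(start, 16).ok()?;
            let hi = u32::from_str_radix(end, 16).ok()?;
            lo..=hi
        }
        None => {
            let cp = u32::from_str_radix(range_part, 16).ok()?;
            cp..=cp
        }
    };
    let scripts = scripts_part.split_whitespace().map(str::to_owned).collect();
    Some((range, scripts))
}

/// Collects the script codes listed for `ch` across all lines of `text`,
/// sorted and de-duplicated.
fn script_extensions_for(text: &str, ch: char) -> Vec<String> {
    let mut found: Vec<String> = text
        .lines()
        .filter_map(parse_scx_line)
        .filter(|(range, _)| range.contains(&(ch as u32)))
        .flat_map(|(_, scripts)| scripts)
        .collect();
    found.sort();
    found.dedup();
    found
}

fn main() {
    // Illustrative sample lines (an assumption, not the real data file).
    let sample = "\
# sample data
02BC          ; Beng Cyrl Deva Latn Lisu Thai Toto # Lm MODIFIER LETTER APOSTROPHE
1CD0..1CD2    ; Beng Deva # illustrative range entry
";
    println!("{:?}", script_extensions_for(sample, '\u{2bc}'));
}
```

The result is sorted alphabetically by four-letter script code, so it will not match the order ICU4X returns; compare as sets. A code point with no entry comes back as an empty list, in which case the script extensions default to the plain Script value.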