Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterate through canonical timezones and metazones #5675

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions components/timezone/src/ids.rs
Original file line number Diff line number Diff line change
Expand Up @@ -333,6 +333,19 @@ impl TimeZoneIdMapperBorrowed<'_> {
Some(string)
}

/// Iterates through all canonical time zones
pub fn iter_canonical(&self) -> impl Iterator<Item = TimeZoneBcp47Id> + '_ {
self.data.bcp47_ids.iter()
}

/// Iterates through all metazones
pub fn iter_metazones(&self) -> impl Iterator<Item = TimeZoneBcp47Id> + '_ {
self.data
.metazones
.iter()
.filter_map(|i| self.data.bcp47_ids.get(i as usize))
}
Comment on lines +352 to +357
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: huh? The function returns an iterator of TimeZoneBcp47Id but it is documented as returning an iterator of metazones.


/// Queries the data for `iana_id` without recording the normalized string.
/// This is a fast, no-alloc lookup.
fn iana_lookup_quick(&self, iana_id: impl AsRef<[u8]>) -> Option<IanaTrieValue> {
Expand Down
2 changes: 2 additions & 0 deletions components/timezone/src/provider/names.rs
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,8 @@ pub struct IanaToBcp47MapV3<'data> {
// Note: this is 9739B as `ZeroVec<TimeZoneBcp47Id>` (`ZeroVec<TinyStr8>`)
// and 9335B as `VarZeroVec<str>`
pub bcp47_ids: ZeroVec<'data, TimeZoneBcp47Id>,
/// Metazones
pub metazones: ZeroVec<'data, u16>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Please explain/document this field. Previous authors of this data struct (mostly me) have gone above and beyond documenting how the pieces fit together, since it is non-obvious. I don't immediately see how a ZeroVec<u16> relates to metazones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Besides iterating over metazones, what do you plan to use this field for?

(We shouldn't add fields to the core TimeZoneIdMapper data struct unless they have a strong core use case)

/// An XxHash64 checksum of [`Self::bcp47_ids`].
pub bcp47_ids_checksum: u64,
}
Expand Down
2 changes: 2 additions & 0 deletions ffi/capi/tests/missing_apis.txt
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,8 @@ icu::timezone::CustomZonedDateTime::try_from_utf8#FnInStruct
icu::timezone::CustomZonedDateTime::try_iso_from_str#FnInStruct
icu::timezone::CustomZonedDateTime::try_iso_from_utf8#FnInStruct
icu::timezone::ParseError#Enum
icu::timezone::TimeZoneIdMapperBorrowed::iter_canonical#FnInStruct
icu::timezone::TimeZoneIdMapperBorrowed::iter_metazones#FnInStruct
icu::timezone::UnknownTimeZoneError#Struct
icu::timezone::WindowsTimeZoneMapper#Struct
icu::timezone::WindowsTimeZoneMapper::as_borrowed#FnInStruct
Expand Down

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion provider/data/timezone/fingerprints.csv
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
time_zone/bcp47_to_iana@1, <singleton>, 7643B, 7623B, d999c8ca67868c35
time_zone/iana_to_bcp47@3, <singleton>, 9579B, 9538B, 1bf6347ae2f6619d
time_zone/iana_to_bcp47@3, <singleton>, 9921B, 9858B, 192398d2ca5e487
time_zone/metazone_period@1, <singleton>, 11097B, 11009B, dcfa50fb39ebd549
time_zone/offset_period@1, <singleton>, 13374B, 13286B, 397b18d3f2345411
time_zone/windows_zones_to_bcp47@1, <singleton>, 8634B, 8591B, 90b7fe3f9cadf5bd

Large diffs are not rendered by default.

161 changes: 161 additions & 0 deletions provider/source/data/debug/time_zone/[email protected]

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions provider/source/src/cldr_serde/time_zones/meta_zones.rs
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ pub(crate) struct MapZone {
#[serde(rename = "_type")]
pub(crate) zone_type: String,
#[serde(rename = "_territory")]
pub(crate) territory: String,
pub(crate) territory: icu::locale::subtags::Region,
}

#[derive(PartialEq, Debug, Clone, Deserialize)]
Expand All @@ -85,7 +85,7 @@ pub(crate) struct Metazones {
#[serde(rename = "metazoneInfo")]
pub(crate) meta_zone_info: MetazoneInfo,
#[serde(rename = "metazones")]
pub(crate) _meta_zones_territory: MetazonesTerritory,
pub(crate) meta_zones_territory: MetazonesTerritory,
#[serde(rename = "metazoneIds")]
pub(crate) meta_zone_ids: MetazoneIds,
}
Expand Down
21 changes: 21 additions & 0 deletions provider/source/src/time_zones/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,11 @@ use crate::IterableDataProviderCached;
use crate::SourceDataProvider;
use icu::datetime::provider::time_zones::*;
use icu::timezone::provider::*;
use icu_locale_core::subtags::region;
use icu_locale_core::subtags::Region;
use icu_provider::prelude::*;
use std::collections::BTreeMap;
use std::collections::BTreeSet;
use std::collections::HashSet;
use tinystr::tinystr;

Expand Down Expand Up @@ -104,6 +106,25 @@ impl SourceDataProvider {
Ok(meta_zone_ids)
}

fn compute_meta_zones_representatives(&self) -> Result<BTreeSet<String>, DataError> {
let metazones = &self
.cldr()?
.core()
.read_and_parse::<cldr_serde::time_zones::meta_zones::Resource>(
"supplemental/metaZones.json",
)?
.supplemental
.meta_zones;

Ok(metazones
.meta_zones_territory
.0
.iter()
.filter(|t| t.map_zone.territory == region!("001"))
.map(|t| t.map_zone.zone_type.clone())
.collect())
}

fn compute_primary_zones(&self) -> Result<BTreeMap<TimeZoneBcp47Id, Region>, DataError> {
let primary_zones = self
.cldr()?
Expand Down
13 changes: 13 additions & 0 deletions provider/source/src/time_zones/names.rs
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,18 @@ impl DataProvider<IanaToBcp47MapV3Marker> for SourceDataProvider {
// the canonical order of the IANA names.
let bcp2iana = self.compute_canonical_tzids_btreemap()?;

let metazone_representatives = self.compute_meta_zones_representatives()?;

let metazones = bcp47_ids
.iter()
.enumerate()
.flat_map(|(i, bcp47)| {
metazone_representatives
.contains(bcp2iana.get(&bcp47)?)
.then_some(i as u16)
})
.collect();

// Transform the map to use BCP indices:
#[allow(clippy::unwrap_used)] // structures are derived from each other
let map: BTreeMap<Vec<u8>, usize> = iana2bcp
Expand Down Expand Up @@ -105,6 +117,7 @@ impl DataProvider<IanaToBcp47MapV3Marker> for SourceDataProvider {
})?
.convert_store(),
bcp47_ids,
metazones,
bcp47_ids_checksum,
};
Ok(DataResponse {
Expand Down