use threads to update default time zone cache asynchronously #97

1996fanrui · 2024-08-14T09:36:43Z

close #96

Background

TimeZone::system() obtains the default time zone, and it may be called frequently by users.

In particular, Zoned::now calls it. Some systems call Zoned::now frequently to obtain the current time; for example, a logging system calls it for every log entry.

How long does `TimeZone::system()` take, and why?

After running the benchmark on my Mac, I found that TimeZone::system() takes 61.665 ns. Its cost mainly consists of 3 parts:

1. TimeZone.clone[1] costs 9.3 ns. It clones the TimeZone out of the cache.
2. Acquiring the read lock[2] costs 16.88 ns. (This is the read lock of the cache.)
3. Obtaining the current time via Instant::now[3] costs 37.3 ns. The Instant is used to check whether the cache has expired.

So Instant::now (part 3) takes most of the time.
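
For readers who don't have the linked lines open, the read path described above can be sketched roughly as follows. This is a minimal illustration, not jiff's actual code: `Tz`, `Cache`, and `get_cached` are hypothetical names, and `Tz` is just a `String` wrapper standing in for `TimeZone`.

```rust
use std::sync::RwLock;
use std::time::Instant;

/// Hypothetical stand-in for jiff's `TimeZone`.
#[derive(Clone, Debug, PartialEq)]
pub struct Tz(pub String);

/// Hypothetical cache layout: the cached value plus a monotonic deadline.
pub struct Cache {
    pub tz: Option<Tz>,
    pub expires_at: Instant,
}

/// Returns the cached time zone if it has not expired yet.
pub fn get_cached(cache: &RwLock<Cache>) -> Option<Tz> {
    // Part 2: take the read lock.
    let guard = cache.read().unwrap();
    // Part 3: read the monotonic clock to check the TTL.
    if Instant::now() < guard.expires_at {
        // Part 1: clone the cached value out of the lock.
        return guard.tz.clone();
    }
    // Expired: the caller would reload the time zone and refill the cache.
    None
}
```

All three costs are paid on every call, even when the cache is fresh, which is why the clock read dominates.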

The idea of optimization

Avoid calling Instant::now inside TimeZone::system(). Instead, we could start a lightweight background thread that updates the default time zone cache periodically.

Note: if the default time zone is refreshed by a background thread, we could update it more frequently, for example by reducing the default TTL from 5 minutes to 20 seconds or less.
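
A rough sketch of this idea, using only the standard library. The names `spawn_refresher` and `system_tz` are hypothetical, a `String` stands in for `TimeZone`, and the `rounds` parameter exists only so the example terminates; a real refresher would presumably loop for the life of the process.

```rust
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawns a thread that reloads the cached value every `interval`.
/// `load` is a hypothetical hook for "ask the OS for its time zone".
pub fn spawn_refresher(
    cache: Arc<RwLock<String>>,
    interval: Duration,
    rounds: usize,
    load: fn() -> String,
) -> JoinHandle<()> {
    thread::spawn(move || {
        for _ in 0..rounds {
            // Load outside the lock so readers are blocked as briefly as possible.
            let fresh = load();
            *cache.write().unwrap() = fresh;
            thread::sleep(interval);
        }
    })
}

/// The hot path: no clock read at all, only a read lock and a clone.
pub fn system_tz(cache: &RwLock<String>) -> String {
    cache.read().unwrap().clone()
}
```

With this shape, readers never check expiry themselves; staleness is bounded by `interval` instead of by a TTL check on each call.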

[1]

jiff/src/tz/system/mod.rs

Line 119 in c659069

cache.tz = Some(tz.clone());

[2]

jiff/src/tz/system/mod.rs

Line 104 in c659069

let cache = CACHE.read().unwrap();

[3]

jiff/src/tz/system/mod.rs

Line 106 in c659069

if !cache.expiration.is_expired() {

This is a draft PR because it includes some temporary benchmark code and use cases.

I will polish this PR if the solution makes sense.

1996fanrui · 2024-08-14T09:40:07Z

bench/src/default_time_zone_benchmark.rs

+fn default_time_zone_benchmark(c: &mut Criterion) {
+    c.bench_function("Get default TimeZone::system()", |b| {
+        b.iter(|| {
+            TimeZone::system();


With this PR, this benchmark improves from 61.665 ns to 25.429 ns on my Mac.

BurntSushi · 2024-08-14T11:05:32Z

I don't think this is the way to go. We shouldn't be starting threads inside a low level library like this. Or at least, I would want a strong ecosystem precedent for it.

1996fanrui · 2024-08-14T11:52:33Z

> I don't think this is the way to go. We shouldn't be starting threads inside a low level library like this. Or at least, I would want a strong ecosystem precedent for it.

I also think it's not suitable, so I only submitted a draft PR to check with you. Thanks for your quick feedback.

Do you think Zoned::now is frequently used to obtain the current time for the users of jiff?

An alternative solution is to use SystemTime instead of Instant to check the TTL. Zoned::now already calls both SystemTime::now() and TimeZone::system(), so if we check the TTL with SystemTime and pass the result of SystemTime::now() to TimeZone::system(), then TimeZone::system() doesn't need to call Instant::now. That could save some time. WDYT?

Of course, if the user calls TimeZone::system() directly, we still need to call SystemTime::now() to check the TTL.

Also, I see the TTL is 5 minutes by default, so the time error of SystemTime::now() should be within the tolerance range.

BurntSushi · 2024-08-14T12:08:34Z

> Also, I see the TTL is 5 minutes by default, so the time error of SystemTime::now() should be within the tolerance range.

I think this is kinda missing the point of using monotonic time instead of system time on its own. It's not really about "time error." The system time can change arbitrarily and at any time. It's not that system time has some kind of error built into it necessarily. It's just that it's configurable. So if the code relies on time always increasing (or rather, being non-decreasing), then monotonic time is very useful to that end.
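
To make the distinction concrete, here is a small standard-library sketch. The helper names `wall_elapsed` and `monotonic_elapsed` are made up for illustration. `SystemTime::duration_since` returns an error when the "earlier" reading is actually later, which is what a wall clock that was set backwards looks like, while `Instant` never goes backwards.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Elapsed wall-clock time between two readings, or `None` if the wall
/// clock appears to have moved backwards between them.
pub fn wall_elapsed(earlier: SystemTime, now: SystemTime) -> Option<Duration> {
    now.duration_since(earlier).ok()
}

/// Elapsed monotonic time since `earlier`. This cannot fail because
/// `Instant` readings are non-decreasing.
pub fn monotonic_elapsed(earlier: Instant) -> Duration {
    earlier.elapsed()
}
```

Code that assumes "now is at or after the time I recorded" is only safe with the second function; with the first, it has to handle `None`.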

With that said, I do actually think this is an interesting idea. I don't think we can just get rid of the monotonic time check completely, but we might be able to manage something here because we are dealing with caching. In particular, it's always okay to invalidate the cache. The goal is to just do it infrequently. But if there's an "odd" case where we can't know if the cache is fresh or not, we can fall back to monotonic time.

- Whenever the cache is invalidated, we should record the expiration as both a monotonic time and a system time.
- If no current SystemTime is available when requesting the system time zone, use the monotonic time expiration.
- Otherwise, compare the current SystemTime with the expiration's SystemTime:
  - If the difference is greater than the TTL (in either direction), then use monotonic time.
  - If the current SystemTime is less than the expiration SystemTime, treat the cache as fresh.
  - Otherwise, use monotonic time.
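
The decision procedure above can be sketched as follows. This is one possible reading of the steps, not code from jiff: `Expiration`, `Check`, `check_with_system_time`, and `is_fresh` are hypothetical names, and the caller is assumed to pass in a `SystemTime` it already has (for example from `Zoned::now`) or `None`.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Hypothetical expiration record, stored on both clocks.
pub struct Expiration {
    pub monotonic: Instant,
    pub system: SystemTime,
}

/// Outcome of the cheap system-time check.
#[derive(Debug, PartialEq)]
pub enum Check {
    /// The cache can be treated as fresh without reading the monotonic clock.
    Fresh,
    /// The system time is missing or not trustworthy; consult `Instant`.
    UseMonotonic,
}

pub fn check_with_system_time(
    exp: &Expiration,
    now: Option<SystemTime>,
    ttl: Duration,
) -> Check {
    // No SystemTime supplied: use the monotonic expiration.
    let now = match now {
        Some(now) => now,
        None => return Check::UseMonotonic,
    };
    // Absolute difference between `now` and the expiration, in either direction.
    let diff = match now.duration_since(exp.system) {
        Ok(ahead) => ahead,
        Err(behind) => behind.duration(),
    };
    if diff > ttl {
        return Check::UseMonotonic;
    }
    if now < exp.system {
        Check::Fresh
    } else {
        Check::UseMonotonic
    }
}

/// Full check: only reads `Instant::now()` when the cheap path is inconclusive.
pub fn is_fresh(exp: &Expiration, now: Option<SystemTime>, ttl: Duration) -> bool {
    match check_with_system_time(exp, now, ttl) {
        Check::Fresh => true,
        Check::UseMonotonic => Instant::now() < exp.monotonic,
    }
}
```

The point of the split is that the common case (system time slightly before the recorded expiration) returns `Fresh` without touching the monotonic clock at all.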

I think this is not fully bulletproof. In particular, if the SystemTime is continually reset to a point before the expiration SystemTime that is within the TTL, then the cache will always be fresh. To work around this, I think we can keep a count of the number of times we make a caching decision, consecutively, without using monotonic time. If this count gets too large, then we automatically fall back to monotonic time. This will prevent us from getting stuck in cases of pathological system time behavior.
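
The consecutive-decision counter described above could look something like this. It is only a sketch: `FastPathBudget` and its methods are invented names, the limit is an arbitrary parameter, and a real implementation would also need to decide how the counter is shared across threads.

```rust
/// Hypothetical guard that limits how many caching decisions in a row may
/// be made from system time alone.
pub struct FastPathBudget {
    consecutive: u32,
    limit: u32,
}

impl FastPathBudget {
    pub fn new(limit: u32) -> Self {
        FastPathBudget { consecutive: 0, limit }
    }

    /// Returns `true` if the caller may trust a system-time-only decision.
    /// Returns `false` once `limit` such decisions have happened in a row,
    /// which forces a monotonic check and resets the streak.
    pub fn try_fast_path(&mut self) -> bool {
        if self.consecutive >= self.limit {
            self.consecutive = 0;
            false
        } else {
            self.consecutive += 1;
            true
        }
    }

    /// Call this whenever a decision used the monotonic clock for another reason.
    pub fn record_monotonic_check(&mut self) {
        self.consecutive = 0;
    }
}
```

Even if the system clock keeps being reset to just before the expiration, at most `limit` decisions pass before the monotonic clock is consulted again.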

I think that should actually do it? I don't immediately see a way for this to go wrong. (Famous last words.)

1996fanrui · 2024-08-14T12:40:39Z

Thanks @BurntSushi for the detailed and valuable feedback!

> - Whenever the cache is invalidated, we should record the expiration as both a monotonic time and a system time.
> - If no current SystemTime is available when requesting the system time zone, use the monotonic time expiration.
> - Otherwise, compare the current SystemTime with the expiration's SystemTime:
>   - If the difference is greater than the TTL (in either direction), then use monotonic time.
>   - If the current SystemTime is less than the expiration SystemTime, treat the cache as fresh.
>   - Otherwise, use monotonic time.

Overall LGTM.

> I think we can keep a count of the number of times we make a caching decision, consecutively, without using monotonic time. If this count gets too large, then we automatically fallback to monotonic time.

Keeping a count of consecutive SystemTime-based decisions and falling back to monotonic time makes sense to me.

I have a question: if we find that SystemTime is correct after falling back to monotonic time once, should we continue to use SystemTime?

BurntSushi · 2024-08-14T12:47:37Z

> I have a question: if we find that SystemTime is correct after falling back to monotonic time once, should we continue to use SystemTime?

I'm not sure. So pick whichever is simplest to implement. Probably just continue using SystemTime.

BurntSushi · 2024-08-24T13:08:06Z

I'm going to close this out because I think this particular avenue for optimization is not the right path to go down. But if you want to try the other idea I put forward, it's probably best to do that in a separate PR. I may also try it myself at some point too. Thanks for the effort!

1996fanrui added 2 commits August 14, 2024 16:30

Add the benchmark for default time_zone

d3c0519

POC the async update default time zone

4a5a73b


BurntSushi changed the title ~~96/default time zone optimization~~ use threads to update default time zone cache asynchronously Aug 24, 2024

BurntSushi closed this Aug 24, 2024
