storage: round probe timestamps to the probe interval #31072
Conversation
Force-pushed from cd4f574 to c5a5c10:
This commit ensures that probe timestamps, and therefore timestamps used for minting new bindings, are rounded down to the probe interval (typically 1s). This is to reduce the amount of distinct timestamps flowing through a dataflow joining multiple sources. Each distinct timestamp induces some amount of overhead, so forcing the timestamps of individual sources to the same values is more efficient.
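The rounding described above can be sketched as plain integer arithmetic on millisecond timestamps. This is an illustrative sketch, not the actual code from the PR; the function name `round_to_interval` is an assumption.

```rust
/// Round a millisecond timestamp down to the probe interval.
/// (Illustrative sketch; the name and signature are assumptions.)
fn round_to_interval(ts: u64, interval_ms: u64) -> u64 {
    ts - ts % interval_ms
}

fn main() {
    // With a 1s interval, all timestamps within the same second collapse
    // to one value, so multiple sources mint bindings at identical times.
    assert_eq!(round_to_interval(12_345, 1_000), 12_000);
    assert_eq!(round_to_interval(12_999, 1_000), 12_000);
    assert_eq!(round_to_interval(13_000, 1_000), 13_000);
    println!("ok");
}
```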
Force-pushed from c5a5c10 to 6856a22.
Looks reasonable!
I had some inline questions, but nothing should be blocking merging this.
```rust
/// Return the desired time of the next tick.
fn next_tick_target(&self) -> EpochMillis {
    let target = match self.last_tick {
        Some(ms) => ms + self.interval,
```
Might this over time get out of sync with the `NowFn`/wall-clock time, if we only ever keep adding an interval to a timestamp we got initially?
Ah, but we're not doing that! We call the `NowFn` in each tick and set the returned timestamp (rounded down) as `last_tick`, to which the interval is then added.
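The drift-avoidance described in this thread can be sketched as follows. This is a minimal sketch assuming a `Ticker` shaped like the snippet above; everything beyond the quoted `next_tick_target` lines (the `tick` method, the `now` field, `wall_clock`) is an assumption for illustration.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

type EpochMillis = u64;

/// Sketch of a ticker that re-reads the clock on every tick, so it cannot
/// drift from wall-clock time by only ever adding intervals.
struct Ticker {
    interval: EpochMillis,
    last_tick: Option<EpochMillis>,
    now: fn() -> EpochMillis, // stand-in for the NowFn
}

impl Ticker {
    /// Return the desired time of the next tick (mirrors the quoted snippet).
    fn next_tick_target(&self) -> EpochMillis {
        match self.last_tick {
            Some(ms) => ms + self.interval,
            None => (self.now)(),
        }
    }

    /// Sleep until the target, then record the *current* wall-clock time,
    /// rounded down to the interval, as `last_tick`.
    fn tick(&mut self) -> EpochMillis {
        let target = self.next_tick_target();
        let now = (self.now)();
        if target > now {
            std::thread::sleep(Duration::from_millis(target - now));
        }
        let now = (self.now)();
        let rounded = now - now % self.interval;
        self.last_tick = Some(rounded);
        rounded
    }
}

fn wall_clock() -> EpochMillis {
    SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_millis() as u64
}

fn main() {
    let mut t = Ticker { interval: 100, last_tick: None, now: wall_clock };
    let a = t.tick();
    let b = t.tick();
    // Ticks land on interval boundaries and are at least one interval apart.
    assert_eq!(a % 100, 0);
    assert_eq!(b % 100, 0);
    assert!(b >= a + 100);
    println!("ok");
}
```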
```diff
@@ -1737,8 +1739,6 @@ fn spawn_metadata_thread<C: ConsumerContext>(
         if tx.send((probe_ts, update)).is_err() {
             break;
         }

         thread::park_timeout(poll_interval);
```
Is the change from `park_timeout` to using `sleep` a behavior change? Or can `sleep` also be woken up the same way? Might be worth considering, but I haven't looked into the behavior of either function.
It is! The difference is that a parked thread can be woken up by another thread, while a sleeping thread cannot. I think in this case we don't need the wakeup-by-other-thread behavior, so `sleep` is the right way, but lmk if I missed something!
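The distinction can be demonstrated directly with the standard library. A small sketch (not code from the PR): `park_timeout` returns early when another thread calls `unpark`, while `sleep` always waits out its full duration.

```rust
use std::thread;
use std::time::{Duration, Instant};

fn main() {
    // A parked thread can be woken early by `unpark` from another thread.
    let start = Instant::now();
    let handle = thread::spawn(|| {
        thread::park_timeout(Duration::from_secs(5));
    });
    // Delivers the unpark token; `park_timeout` then returns well before 5s.
    handle.thread().unpark();
    handle.join().unwrap();
    assert!(start.elapsed() < Duration::from_secs(5));

    // A sleeping thread has no such mechanism: `sleep` blocks for at least
    // the full requested duration.
    let start = Instant::now();
    thread::sleep(Duration::from_millis(50));
    assert!(start.elapsed() >= Duration::from_millis(50));
    println!("ok");
}
```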
Park/unpark was introduced here: #10845, so it was meaningful. Not sure where things are currently, I think the thread now works somewhat differently. But still might be worth it to keep the immediate-cleanup-on-drop behavior.
Ah, good to know! Just changing the `sleep` to a `park_timeout` doesn't help much though. The sleep is now in the `Ticker`, which only returns from a tick when the interval is reached, regardless of whether it's woken up before or not. The unpark-on-drop code also doesn't exist anymore. We could change both of these things, but it doesn't seem worthwhile to pile on complexity just to shut down the thread slightly faster. With RLU the tick interval will be 1s, which is quite different from the 5m interval that existed when the `park_timeout` was added.
Fine for me! As I said, none of my comments were meant to block merging 🕊️
```
> ALTER CLUSTER test SET (REPLICATION FACTOR 0)
> ALTER CLUSTER test SET (REPLICATION FACTOR 1)

> SELECT name, write_frontier::text::uint8 % 1000 = 1
```
Are the write frontiers now, say, `1`, `1001`, `2001`? My I-like-rounded-numbers brain would prefer `0`, `1000`, `2000`. I could be totally misunderstanding what's happening though.
Indeed. I think what's happening here is that the source pipeline mints a binding for a time, e.g. `1000`, writes out all the data it has for that binding, and then steps the frontier forward to indicate that all future updates will be at some later time. We could maybe use our knowledge of the tick interval instead of just stepping forward, but idk what the implications of that would be.
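The arithmetic behind the `% 1000 = 1` check can be sketched as follows. This is a guess at the mechanism described above, not the actual pipeline code; the variable names are assumptions.

```rust
fn main() {
    let interval: u64 = 1_000;
    let now: u64 = 12_345;
    // The pipeline mints a binding at the rounded-down probe timestamp...
    let binding_ts = now - now % interval;
    assert_eq!(binding_ts, 12_000);
    // ...writes all data for that binding, then steps the frontier forward
    // by one, which is why the observed write frontiers are 1, 1001, 2001,
    // ... rather than the round numbers 0, 1000, 2000, ...
    let write_frontier = binding_ts + 1;
    assert_eq!(write_frontier, 12_001);
    assert_eq!(write_frontier % interval, 1);
    println!("ok");
}
```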
Looks like that got accidentally removed in MaterializeInc#30817.
TFTRs!
This PR ensures that probe timestamps, and therefore timestamps used for minting new bindings, are rounded down to the probe interval (typically 1s). This is to reduce the amount of distinct timestamps flowing through a dataflow joining multiple sources. Each distinct timestamp induces some amount of overhead, so forcing the timestamps of individual sources to the same values is more efficient.
Motivation
Closes https://github.com/MaterializeInc/database-issues/issues/8885
Checklist
If this PR evolves an existing `$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.