feat(python,rust): Support engine callback for LazyFrame.profile
#21534
base: main
Conversation
This is an alternative to #20693 that tries to avoid exposing too many internals.
Force-pushed the LazyFrame.profile branch from c9a8794 to bb518af
So that we can support the GPU engine in profiled collection of a LazyFrame, plumb through a mechanism for recording raw timings for nodes that were executed through the PythonScan node. This necessitates some small changes to the internals of the NodeTimer, since `Instant`s are opaque. We instead directly store durations (as nanoseconds since the query start) and, when calling into the IR post-optimization callback, provide a duration that is the number of nanoseconds since the query was started. On the Python side we can then track and record durations independently, offset by this optimisation duration.

As a side effect, `profile` now correctly measures the optimisation time of the logical plan, rather than, as previously, just the time to produce the physical plan from the optimised logical plan.
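The offsetting described above can be sketched roughly as follows. This is an illustration only, not the actual polars internals: the function name, the `(start_ns, end_ns)` tuple layout, and the idea of receiving engine timings as a flat slice are all assumptions. The point is that timings reported by an external engine, measured from the engine's own start, are shifted onto the overall query timeline by the optimisation duration passed to the callback.

```rust
use std::time::Duration;

// Hypothetical helper: shift raw (start_ns, end_ns) timings reported by
// an external engine so they become offsets from the overall query start,
// using the optimisation duration provided to the post-opt callback.
fn offset_timings(raw: &[(u64, u64)], optimisation: Duration) -> Vec<(u64, u64)> {
    let shift = optimisation.as_nanos() as u64;
    raw.iter().map(|&(s, e)| (s + shift, e + shift)).collect()
}
```

Storing plain nanosecond counts rather than `Instant`s is what makes this workable across the FFI boundary, since an `Instant` has no portable serialised form.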
Force-pushed the LazyFrame.profile branch from bb518af to 6f893e8
with_columns.map(|x| x.into_iter().map(|x| x.to_string()).collect::<Vec<_>>()),
predicate,
n_rows,
// Is this boolean is true, callback should return
Suggested change:
- // Is this boolean is true, callback should return
+ // If this boolean is true, callback should return
One small comment, otherwise good to go (if CI is green). Thank you @wence-
@@ -29,7 +29,19 @@ impl NodeTimer {
let nodes = &mut data.0;
nodes.push(name);
let ticks = &mut data.1;
ticks.push((start, end))
ticks.push((
Can we replace this whole inner block with:

store_raw(
    start.duration_since(self.query_start),
    end.duration_since(self.query_start),
    name,
)
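A minimal sketch of how the suggested refactor could look. The field names, the `store`/`store_raw` signatures, and the `(name, (start, end))` storage layout here are assumptions for illustration, not the actual polars NodeTimer:

```rust
use std::time::{Duration, Instant};

// Simplified stand-in for a node timer: per node name, it records a
// (start, end) pair of durations measured from the query start.
struct NodeTimer {
    query_start: Instant,
    data: Vec<(String, (Duration, Duration))>,
}

impl NodeTimer {
    fn new() -> Self {
        NodeTimer { query_start: Instant::now(), data: Vec::new() }
    }

    // `store` converts wall-clock `Instant`s into offsets from the query
    // start and delegates to `store_raw`, along the lines of the review
    // suggestion above.
    fn store(&mut self, start: Instant, end: Instant, name: String) {
        self.store_raw(
            start.duration_since(self.query_start),
            end.duration_since(self.query_start),
            name,
        )
    }

    // `store_raw` accepts pre-computed offsets directly, e.g. timings
    // reported back by an external engine through the PythonScan node.
    fn store_raw(&mut self, start: Duration, end: Duration, name: String) {
        self.data.push((name, (start, end)))
    }
}
```

Keeping both entry points lets internal executors keep passing `Instant`s, while externally reported timings (already expressed as offsets) bypass the conversion.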
# Don't run on GPU in _eager mode (but don't warn)
is_gpu = False

callback = _gpu_engine_callback(
Much cleaner. 👍
Ok(())
})
ldf._collect_post_opt(|root, lp_arena, expr_arena, _| {
post_opt_callback(&lambda, root, lp_arena, expr_arena, None)
Nice. 👍