Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python,rust): Support engine callback for LazyFrame.profile #21534

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

wence-
Copy link
Collaborator

@wence- wence- commented Feb 28, 2025

So that we can support the GPU engine in profiled collection of a lazyframe, plumb through a mechanism for recording raw timings for nodes that were executed through the PythonScan node.

This necessitates some small changes to the internals of the NodeTimer, since Instants are opaque. We instead directly store durations (as nanoseconds since the query start) and when calling into the IR post-optimization callback, provide a duration that is the number of nanoseconds since the query was started. On the Python side we can then keep track and record durations independently, offsetted by this optimisation duration.

As a side-effect, profile now correctly measures the optimisation time in the logical plan, rather than as previously just the time to produce the physical plan from the optimised logical plan.

@wence-
Copy link
Collaborator Author

wence- commented Feb 28, 2025

This is an alternative to #20693 that tries to avoid exposing too many internals.

@wence- wence- changed the title feat(rust, python): Support engine callback for LazyFrame.profile feat(rust, python): Support engine callback for LazyFrame.profile Feb 28, 2025
@wence- wence- force-pushed the wence/fea/profile-gpu branch from c9a8794 to bb518af Compare February 28, 2025 16:01
So that we can support the GPU engine in profiled collection of a
lazyframe, plumb through a mechanism for recording raw timings for
nodes that were executed through the PythonScan node.

This necessitates some small changes to the internals of the
NodeTimer, since `Instant`s are opaque. We instead directly store
durations (as nanoseconds since the query start) and when calling into
the IR post-optimization callback, provide a duration that is the
number of nanoseconds since the query was started. On the Python side
we can then keep track and record durations independently, offsetted
by this optimisation duration.

As a side-effect, `profile` now correctly measures the optimisation
time in the logical plan, rather than as previously just the time to
produce the physical plan from the optimised logical plan.
@wence- wence- force-pushed the wence/fea/profile-gpu branch from bb518af to 6f893e8 Compare February 28, 2025 17:19
@wence- wence- changed the title feat(rust, python): Support engine callback for LazyFrame.profile feat(python,rust): Support engine callback for LazyFrame.profile Feb 28, 2025
@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars and removed title needs formatting labels Feb 28, 2025
with_columns.map(|x| x.into_iter().map(|x| x.to_string()).collect::<Vec<_>>()),
predicate,
n_rows,
// Is this boolean is true, callback should return
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Is this boolean is true, callback should return
// If this boolean is true, callback should return

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 small comment, otherwise good to go (if CI is green). Thank you @wence-

@@ -29,7 +29,19 @@ impl NodeTimer {
let nodes = &mut data.0;
nodes.push(name);
let ticks = &mut data.1;
ticks.push((start, end))
ticks.push((
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace this whole inner with:

store_raw(
    start.duration_since(self.query_start),
    end.duration_since(self.query_start),
    name)

# Don't run on GPU in _eager mode (but don't warn)
is_gpu = False

callback = _gpu_engine_callback(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner. 👍

Ok(())
})
ldf._collect_post_opt(|root, lp_arena, expr_arena, _| {
post_opt_callback(&lambda, root, lp_arena, expr_arena, None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants