Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Container trait and to simplify Expr and LogicalPlan apply and map methods #13467

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Nov 18, 2024

Which issue does this PR close?

Part of #8913.

Rationale for this change

The current implementation of LogicalPlan:apply_children(), LogicalPlan::map_children(), LogicalPlan::apply_expressions(), LogicalPlan::map_expressions(), Expr::apply_children() and Expr::map_children() are confusing due the map_until_stop_and_collect macro. I think we can introduce a trait that can contain arbitrary sibling elements that functions can be applied on and mapped:

/// [`Container`] contains elements that a function can be applied on or mapped. The
/// elements of the container are siblings so the continuation rules are similar to
/// [`TreeNodeRecursion::visit_sibling`] / [`Transformed::transform_sibling`].
pub trait Container<'a, T: 'a>: Sized {
    fn apply_elements<F: FnMut(&'a T) -> Result<TreeNodeRecursion>>(
        &'a self,
        f: F,
    ) -> Result<TreeNodeRecursion>;

    fn map_elements<F: FnMut(T) -> Result<Transformed<T>>>(
        self,
        f: F,
    ) -> Result<Transformed<Self>>;
}

What changes are included in this PR?

This PR:

  • Gets rid of map_until_stop_and_collect macro and many transform and rewrite helper methods.
  • Adds the Container trait and blanket implementations for Box, Option, Vec, tuples, ...
  • Defined the Container implementation for Expr and LogicalPlan.
  • Simplifies the above mentioned apply... and map... methods.

Are these changes tested?

Yes, with exitsing UTs.

Are there any user-facing changes?

No.

…til_stop_and_collect` macro, simplify apply and map logic with `Container`s where possible
@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules common Related to common crate labels Nov 18, 2024
@findepi
Copy link
Member

findepi commented Nov 18, 2024

This PR:

  • Gets rid of map_until_stop_and_collect macro

that's great

Adds the Container trait and blanket implementations for Box, Option, Vec, tuples, ...

Do we need that?

What about calling this trait TreeNode or GraphNode and implementing it for our types only?

@peter-toth
Copy link
Contributor Author

Adds the Container trait and blanket implementations for Box, Option, Vec, tuples, ...

Do we need that?

What about calling this trait TreeNode or GraphNode and implementing it for our types only?

Yes, we do. We already have the TreeNode trait with a well defined API. It has TreeNode::apply_children() / TreeNode::map_children() to let the tree implementations define how to visit/map the children of a node.

This new Container trait with the above mentioned blankets just make the implementation of that apply_children() / map_children() of logical plan trees (Expr and LogicalPlan) simpler.
We can actually move Container and its blanket implementations to datafusion::expr if that's cleaner.

@peter-toth
Copy link
Contributor Author

peter-toth commented Nov 18, 2024

cc @alamb, @berkaysynnada

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @peter-toth - I think this looks like a really nice improvement in the code 🙏

I have some comment suggestions, but nothing that is required.

The only thing I want to do prior to approving this PR is to run the planning benchmarks and make sure we didn't introduce any regressions. I am running them now.

f: F,
) -> Result<TreeNodeRecursion>;

fn map_elements<F: FnMut(T) -> Result<Transformed<T>>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also add some documentation here about what map_elements is (perhaps just a map back to TreeNode::map_children?)

/// elements of the container are siblings so the continuation rules are similar to
/// [`TreeNodeRecursion::visit_sibling`] / [`Transformed::transform_sibling`].
pub trait Container<'a, T: 'a>: Sized {
fn apply_elements<F: FnMut(&'a T) -> Result<TreeNodeRecursion>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also add some documentation here about what apply_elements is (perhaps just a map back to TreeNode::apply?)

let mut tnr = TreeNodeRecursion::Continue;
for c in self {
tnr = c.apply_elements(&mut f)?;
match tnr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is definitely a much easier to understand formulation

}
}

impl<'a, T: 'a, C0: Container<'a, T>, C1: Container<'a, T>> Container<'a, T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a moment to realize this was the impl for (a,b) 2-tuples. Maybe we could make that clearer with some comments.

Likewise with the 3-tuple.

Though to be honest, it seems to me like it might be clearer if we didn't implement this trait for tuples, and instead made the places they are used, named structs. But that is just a personal style preference

/// [`Container`] contains elements that a function can be applied on or mapped. The
/// elements of the container are siblings so the continuation rules are similar to
/// [`TreeNodeRecursion::visit_sibling`] / [`Transformed::transform_sibling`].
pub trait Container<'a, T: 'a>: Sized {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a matter of preference, but what would you think about calling this something less generic, such as TreeNodeContainer ?

self.function_body.apply_elements(f)
}

fn map_elements<F: FnMut(Expr) -> Result<Transformed<Expr>>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is very cool

@@ -81,7 +79,7 @@ impl TreeNode for LogicalPlan {
expr,
input,
schema,
}) => rewrite_arc(input, f)?.update_data(|input| {
}) => input.map_elements(f)?.update_data(|input| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the difference between map_children and map_elements that map_elements doesn't also apply the function to input ?

// There are two part of expression for join, equijoin(on) and non-equijoin(filter).
// 1. the first part is `on.len()` equijoin expressions, and the struct of each expr is `left-on = right-on`.
// 2. the second part is non-equijoin(filter).
LogicalPlan::Join(Join { on, filter, .. }) => {
on.iter()
// TODO: why we need to create an `Expr::eq`? Cloning `Expr` is costly...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 for removing the clone

@@ -57,78 +57,50 @@ impl TreeNode for Expr {
| Expr::Negative(expr)
| Expr::Cast(Cast { expr, .. })
| Expr::TryCast(TryCast { expr, .. })
| Expr::InSubquery(InSubquery{ expr, .. }) => vec![expr.as_ref()],
| Expr::InSubquery(InSubquery { expr, .. }) => expr.apply_elements(f),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is nice to avoid creating vecs here as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate logical-expr Logical plan and expressions optimizer Optimizer rules sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants