Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax %parse-param bounds from Copy to Clone #471

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

ratmice
Copy link
Collaborator

@ratmice ratmice commented Sep 30, 2024

In the recent discussion in #404 the @ltratt brought up the idea of relaxing the Copy bounds to Clone.
Here is a quick attempt at an implementation of that, it seems a lot simpler than I expected.

Marking as draft for the following reasons:

  • At first I thought that this might be backwards compatible, but now I realize that there can be implicit copies in user-provided actions that may no longer compile without adding calls to .clone(). However none of this appears to be exercised by the testsuite currently.
  • I probably also need to go through the documentation and update any Copy bounds therein.
  • add an example using Rc<RefCell<..>>.

@ltratt
Copy link
Member

ltratt commented Sep 30, 2024

Huh, that is easier than I expected! That now makes me wonder if we can/should pass ParamT (without any bound) and then pass it through as & or &mut. Any thoughts?

@ratmice
Copy link
Collaborator Author

ratmice commented Sep 30, 2024

Passing it in as &mut is possible, but it requires us to change the ActionFn signature so that ParamT cannot be captured by actions, this is the current reason that clone/copy is required, that is each action gets a different cloned/copied instance of the param because each action can capture it.

This requires a couple of tricks higher ranked trait bounds (HRTB's) for <'a> Fn(X) -> Y and hrtb-implicit-bounds.

The major concept lifetime-wise is that the lifetime of the parameter is less than the lifetime of the function the parameter gets passed to, and thus the functions closure cannot capture it, once the compiler knows that it is implied that once the function returns the reference that was passed is no longer referenced, and thus unique ownership is retained.

On top of this, there is a lot of "pattern matching/codegen stuff" which is made hard by any needs for explicit lifetimes, and separation of the types from the references, the syntactical side here can actually be a bit of a pain too because of the differences between pattern matching in rust expressions and rust parameter lists.

It is definitely possible, I've gotten it working, but IIRC it was fairly syntactically wonky/mediocre

@ltratt
Copy link
Member

ltratt commented Sep 30, 2024

Ah, so what you're saying is that &mut works, but is so messy, it's probably better to just do Clone?

@ratmice
Copy link
Collaborator Author

ratmice commented Sep 30, 2024

I think i'm saying slightly less than that, it is tricky, but I don't think it is that messy, but it is a break in compatibility maybe & and &mut will both work IIRC.

But it is possible that syntactically the arguments to %parse-param end up kind of at the mercy of what the rust compiler accepts without very quickly needing to integrate a rust parser into grmtools (which I don't think we want to do). It is the syntactic issues here which has caused the biggest difficulty really when I've tried this in the past. I think the syntax of %parse-param has changed though since the last time I tried it. That may well have been because we were still trying to make it work with tuples and the like though.

@ltratt
Copy link
Member

ltratt commented Sep 30, 2024

Right. I don't remember what we changed here, either. Clone feels more flexible than Copy, though, so I think we should just do it.

@ratmice
Copy link
Collaborator Author

ratmice commented Sep 30, 2024

Sounds good to me, I'll try and work on examples/docs/tests soon.

@ratmice
Copy link
Collaborator Author

ratmice commented Oct 1, 2024

So, curiously I haven't been able to make a backwards incompatible test like I thought,
code such as the following modification of lrpar/cttests/parseparam.test appears to compile fine.

The theory that I am working under is that in the generated code the Clone bounds are not present (user actions are compiled using the type given in the %parse-param directive without any weakening or generic types and bounds),

So within the generated code we know that u64 is Copy and it is only within library code that we are dealing with generic types with clone bounds ParamT: Clone.

So unless I am mistaken and there is something weirder going on, it seems like you can actually rely on types being copy within the actions even though the library/parse algorithm can only rely on the weaker Clone bounds with explicit cloning. This is absolutely not what I was expecting would happen though.

name: Test %parse-param clone
yacckind: Grmtools
grammar: |
    %start S
    %parse-param p: u64
    %%
    S -> u64:
        'INT' { (move |_| {})(p); check_copy(p); p + $lexer.span_str($1.unwrap().span()).parse::<u64>().unwrap() }
    ;
    %%
    fn check_copy<T: Copy>(_: T){}
lexer: |
    %%
    [0-9]+ 'INT'

@ltratt
Copy link
Member

ltratt commented Oct 1, 2024

The theory that I am working under is that in the generated code the Clone bounds are not present

That is a good point -- I hadn't even thought about that!

So, dumb question: does that mean we need any bounds at all? Can we just pass through an arbitrary type T? I feel like this surely can't work when you have a rule that references multiple rules, but... who knows!

@ratmice
Copy link
Collaborator Author

ratmice commented Oct 1, 2024

The theory that I am working under is that in the generated code the Clone bounds are not present

That is a good point -- I hadn't even thought about that!

So, dumb question: does that mean we need any bounds at all? Can we just pass through an arbitrary type T? I feel like this surely can't work when you have a rule that references multiple rules, but... who knows!

I'm certain that this is not the case (exactly for the references multiple rules reason given).
This is a particularly interesting case, in the generated code we call parse_actions which simplifying its signature down to the relevant bits looks like parse_actions<ParamT: Clone>(actions: Vec<&dyn Fn(..., ParamT)>, param: ParamT), The generated implementation of each action though has the concrete type (in this case u64), note that &dyn Fn(...) being a trait object as well as the ParamT: Clone bounds.

So only through that mechanism can we call each individual action from parse_actions only knowing the generic bounds of Clone, but from a function called within it know the concrete type u64. So it is actually quite a few layers of trait inheritance that appears to allow this behavior.

@ratmice
Copy link
Collaborator Author

ratmice commented Oct 2, 2024

FWIW, here is kind of minimization of what is going on here, in the following g represents e.g. parse_actions,
f represents a user action. The following code compiles as long as Foo derives Copy, but fails when you remove the Copy derive, despite the fact that g which actually calls f which relies upon the Copy trait, treats ParamT generically only having Clone bounds. All that is to say, I do think that relaxing from Copy bounds Clone is even backwards compatible when it comes to user actions which rely on the value implementing the Copy trait.

#[derive(Copy, Clone)]
struct Foo;

fn f(x:Foo) {
    let bar = *&x;
    (move |_|{})(bar);
}

fn g<ParamT:Clone>(fns: Vec<&dyn Fn(ParamT)>, param: ParamT){
  for f in fns {
    f(param.clone())
  }
}

fn main() {
   let x = vec![&f as &dyn Fn(Foo), &f as &dyn Fn(Foo)];
   g(x, Foo)
}

@ltratt
Copy link
Member

ltratt commented Oct 2, 2024

So with the Clone bounds, when we pass the object to generated code, do we end up with a problem then if there are multiple references to other rules? I'm wondering if %parse-param Rc<RefCell<...>> works, as an example of something that must be cloned and can't be moved multiple times.

@ratmice
Copy link
Collaborator Author

ratmice commented Oct 2, 2024

(With the caveat that I haven't fully tested it), the answer should definitely be no we shouldn't have any problem.
The problem with multiple rules, rears its head when we try to remove both the Copy and the Clone bounds.
Such as the following modification of the minimized example:

#[derive(Copy, Clone)]
struct Foo;

fn f(x:Foo) {
    let bar = *&x;
    (move |_|{})(bar);
}


// We removed the ParamT: Clone bounds here.
fn g<ParamT>(fns: Vec<&dyn Fn(ParamT)>, param: ParamT){
  for f in fns {
   // Subsequently get an error here.
    f(param)
  }
}

fn main() {
   let x = vec![&f as &dyn Fn(Foo), &f as &dyn Fn(Foo)];
   g(x, Foo)
}

11 | f(param)
| ^^^^^ value moved here, in previous iteration of loop
|
help: if ParamT implemented Clone, you could clone the value

To actually get rid of the Clone or Copybounds, we would need to modify the signature of the action functions to remove their ability to capture or move theParamT` value.

struct Foo;

fn f(x:&mut Foo) {
    // Action can no longer move/capture x in the action.
}

fn g<ParamT>(fns: Vec<&dyn for <'a> Fn(&mut ParamT)>, mut param: ParamT){
  for f in fns {
    // Because of the funky for<'a> signature the param could not have been moved in a previous iteration of the loop
    // so now this is ok.
    f(&mut param)
  }
}

fn main() {
   let x = vec![&f as &dyn Fn(&mut Foo), &f as &dyn Fn(&mut Foo)];
   g(x, Foo)
}

Does that help answer your questions?
Anyhow I'll work on an example using Rc<RefCell<...>> now that I have a better idea of what the backwards compatibility scenario for changing the bounds to Clone looks like.

@ltratt
Copy link
Member

ltratt commented Oct 2, 2024

Thanks for the pointers!

Anyhow I'll work on an example using Rc<RefCell<...>> now that I have a better idea of what the backwards compatibility scenario for changing the bounds to Clone looks like.

Sounds good to me :)

@ratmice
Copy link
Collaborator Author

ratmice commented Oct 8, 2024

Sorry it took me a while to get the rest done, Some notes:

  • When adding the lrpar/example/mut_param example, I ran into one more place the bounds needed to be updated
    It isn't quite clear to me why this didn't get caught earlier, or even via the cttests, I think it is because the generated code uses RTParserBuilder internally, and the non-generated RTParserBuilder usage generally has a () argument there with copy bounds. So it didn't start failing until we used something that was Clone without Copy perhaps.
  • I didn't find much in the way of docs that needed to be updated, just that one place in the book, I hope I haven't missed something.
  • I wasn't totally enthusiastic about the name mut_param, for a few reasons: internal mutability, the parse-param directive vs the examples which all use _ for separators. I'd be happy to change it if we can come up with something better imperative_eval (doesn't exactly roll off the tongue)?

[package]
name = "mut_param"
version = "0.1.0"
authors = ["Laurence Tratt <http://tratt.net/laurie/>"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your name should go here!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be resolved in bf4ada3.

@@ -0,0 +1,20 @@
[package]
name = "mut_param"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call this clone_param maybe?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be resolved in bf4ada3.

@ltratt
Copy link
Member

ltratt commented Oct 8, 2024

I think this is ready for squashing?

@ratmice ratmice force-pushed the clone_parse_param branch from bf4ada3 to c15507a Compare October 8, 2024 16:31
@ratmice
Copy link
Collaborator Author

ratmice commented Oct 8, 2024

Squashed,

I had considered keeping separate commits for the example, but it all seemed short enough that just doing one commit didn't seem unjustifiable.

@ratmice ratmice marked this pull request as ready for review October 8, 2024 19:44
@ratmice
Copy link
Collaborator Author

ratmice commented Oct 8, 2024

Apologies, didn't realize it was still marked as draft.

@ltratt ltratt added this pull request to the merge queue Oct 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 8, 2024
@ltratt
Copy link
Member

ltratt commented Oct 9, 2024

Please squash.

@ratmice ratmice force-pushed the clone_parse_param branch from 207a291 to 1007f13 Compare October 9, 2024 13:05
@ratmice
Copy link
Collaborator Author

ratmice commented Oct 9, 2024

Squashed.

@ltratt ltratt added this pull request to the merge queue Oct 9, 2024
Merged via the queue into softdevteam:master with commit a0972be Oct 9, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants