Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Allow default tokens to be returned, for optional tokens #32

Open
oxcrow opened this issue Jul 20, 2024 · 6 comments
Open
Labels
enhancement New feature or request

Comments

@oxcrow
Copy link

oxcrow commented Jul 20, 2024

Hi,

Consider these statements ...

let x = 1;
let const x = 1;

fn main() {}
fn main() -> void {}

We often have such instances where the default token is assumed, if it is not given by the user.

Thus, it would be easier for us, if the default tokens were returned if optional tokens don't exist in our rules.

This would be helpful, as currently to represent the return type of the functions, I would need to use an Option<>, and modify the AST nodes after parsing, to set the function's return type as void.

// in pest file
function = { "fn" ~ id ~ "(" ~ (args)? ~ ")" ~ ("->" ~ type)? ~ "{" ~ function_item* ~ "}" }

// in rust file
#[derive(Debug, Clone, FromPest)]
#[pest_ast(rule(Rule::function))]
pub struct Function {
    pub id: Id,
    pub args: Args,
    pub r#type: Option<Type>, // <<< Issue ... created by optional rule ("->" ~ type)? above
    pub items: Vec<FunctionItem>,
}

This is rather tedious work, and complicates our compiler's AST representation.

In OCaml's Menhir parser generator, if a rule doesn't match anything, we are allowed to return a default node.

Such a feature would also be extremely helpful for pest.

type:
  | (* empty *) { Ast.Type_Void } (* <<< If nothing exists for return type token, then return a Void type  *)
  | ARROW INT { Ast.Type_Int }
  | ARROW id = ID; { Ast.Type_Derived id }
  ;
@tomtau
Copy link
Contributor

tomtau commented Jul 20, 2024

I'm not sure if this can be added in a backwards-compatible way, but it's good to consider in pest3 -- @TheVeryDarkness wdyt?

@oxcrow
Copy link
Author

oxcrow commented Jul 20, 2024

I can think of a dirty hack that might just work.

Allow a single reserved symbol (let's say #) to match immediately when called

Ex: Modify the rule I showed above as: (("->" ~ type) | #void)? where void = { "void" }.

As soon as pest's parser reads #, the ordered choice arm where it exists, should immediately be matched, and as a result, the void node should be returned as a result.

That's it.

This requires very little modification to pest's grammar and logic, and only acts to short circuit the decision arm.

The effect of # will also be localized, so it should be safe to use.

So, I think this can be added as a new feature, in a backward compatible way.

@tomtau
Copy link
Contributor

tomtau commented Jul 20, 2024

@oxcrow the changes to the grammar may result in semver-breaking changes (due to some old design choices / the lack of #[non_exhaustive] attribute annotation in older Rust versions) which can, unfortunately, cause issues, because Cargo may mix up different 2.X versions of pest crates.

It could, however, potentially go under the grammar-extras feature flag which was originally done for the "tag" feature: https://pest.rs/book/grammars/syntax.html#tags

@oxcrow
Copy link
Author

oxcrow commented Jul 20, 2024

There might be another simpler solution ...

Instead of modifying pest's grammar, what if we modify pest-ast?

If an Option<T> node exists in a struct/enum that will be parsed by pest-ast, then it can be easier to specify the default value to set for this Option<> in case it turns out to be None after being parsed.

This should work.

However, the Option<T> node will remain, which complicates the AST's representation.

To solve that, Instead of an Option<T> what if pest-ast supports default values for T?

For example:

#[derive(Debug, Clone, FromPest)]
#[pest_ast(rule(Rule::function))]
pub struct Function {
    pub id: Id,
    pub args: Args,

    #[pest_ast_default(Void)] // <<< Users can set a default value using an attribute
    pub r#type: Type, // <<< This should be just Type, not Option<Type>

    pub items: Vec<FunctionItem>,
}

This seems to be simpler than my previous hack idea, and will also be safer, since the pest grammar won't need to be modified.

@tomtau
Copy link
Contributor

tomtau commented Jul 20, 2024

Yeah, that sounds like a better solution 👍

@tomtau tomtau transferred this issue from pest-parser/pest Jul 20, 2024
@tomtau tomtau added the enhancement New feature or request label Jul 20, 2024
@oxcrow
Copy link
Author

oxcrow commented Jul 21, 2024

Thanks! I'm glad to help. 😃

Developing this feature may obviously take some time, so for users like me who will be waiting, I would recommend not to use such optional grammar right now, and instead chose to only use the base grammar.

i.e

// Don't support optional grammar
let x = 1;

// Instead, only support the base grammar
let const x = 1;

This is for two reasons:

  • Users can develop their base grammar well, and simplify their compiler.
  • When the default argument feature gets implemented, the users can immediately use it by using only a single attribute. This is important to state, because If they chose to use the optional grammar right now, then their compiler code will become little complex, and in future, they'll have to refactor lot of code to remove the complexity and use this new implemented feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants