Start avoiding parse diagnostics on error tokens #4431

chandlerc · 2024-10-20T08:40:41Z

An invalid parse due to an error token isn't likely a great diagnostic
as it will already have been diagnosed by the lexer. A common case to
start handling that is when the parser encounters an invalid token when
expecting an expression.

This removes a number of unhelpful diagnostics after the lexer has done
a good job diagnosing.

This also means that there may be parse tree errors that aren't
diagnosed when there are lexer-diagnosed errors, so track that.

Follow-up to #4430 that almost finishes addressing its diagnostic TODO.
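
For readers skimming the thread, here is a minimal sketch of the behavior the description outlines, not the actual toolchain code. The names `ParseResult`, `HandleExprStart`, and the `has_lexer_diagnosed_errors` flag are hypothetical, and the token kinds are a tiny stand-in for the real `Lex::TokenKind` set. When an error token appears where an expression is expected, the parse is marked invalid but no new diagnostic is added; the flag records that the tree has errors the parser itself did not diagnose.

```cpp
#include <string>
#include <vector>

// Hypothetical token kinds; the real lexer has many more.
enum class TokenKind { Identifier, IntLiteral, Semi, Error };

// Hypothetical stand-in for the parse state this PR touches.
struct ParseResult {
  std::vector<std::string> parser_diagnostics;
  // True if the parse tree contains any invalid node.
  bool has_errors = false;
  // True if some invalid node was left undiagnosed by the parser because the
  // lexer had already reported the underlying problem.
  bool has_lexer_diagnosed_errors = false;
};

// Handles the token found where an expression is expected.
void HandleExprStart(TokenKind kind, ParseResult& result) {
  switch (kind) {
    case TokenKind::Identifier:
    case TokenKind::IntLiteral:
      // A valid expression start; nothing to report.
      return;
    case TokenKind::Error:
      // Already diagnosed by the lexer: record the invalid parse, stay quiet.
      result.has_errors = true;
      result.has_lexer_diagnosed_errors = true;
      return;
    default:
      // A well-formed token in the wrong place still gets a parser diagnostic.
      result.has_errors = true;
      result.parser_diagnostics.push_back("expected expression");
      return;
  }
}
```

With this shape, a consumer that expects every invalid parse to come with a parser diagnostic can consult the extra flag rather than treating the silence as a bug.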

jonmeow

Seems good, but to be sure, you're doing this in a way that's specific to expressions. I think this'd miss, for example, declarations.

Had you considered adding support to Emit() in order to elide diagnostics where the location is an error token?

jonmeow · 2024-10-21T18:34:52Z

toolchain/parse/handle_expr.cpp

+      // Fallthrough to the error token case -- we don't need to diagnose those.
+      [[fallthrough]];
+    }
+    case Lex::TokenKind::Error: {


I'm not used to seeing a non-last default case (enough that I wouldn't have expected this to work), but maybe others are more versed with that structure? Had you considered writing this as an if instead of fallthrough, e.g.: default: if (token_kind != Lex::TokenKind::Error) { ...Emit... }

chandlerc

🤷 The structure didn't give me any pause, but we've established that I'm not necessarily representative there...

I had tried both ways of writing it, but the switch seemed better. Using an if is a bit awkward as you have to dig the kind back out and then re-test it. As we're already testing it for the switch, and there is a natural fallthrough structure, it seemed clean to use that instead.

jonmeow

What do you mean by digging it back out? Couldn't you add auto token_kind = in the switch statement?

chandlerc

I mean, yes, I could also store it.

But it still ends up with an awkward thing where every other value is handled by a case, but this one isn't.

If you feel strongly that using fallthrough isn't OK here, I can change it I guess? Didn't seem like a big thing either way.

jonmeow

I'm really having trouble reading it. I've asked in #style to see if it's just me.

chandlerc

No stress, this really wasn't an important one; I'll switch it to the other form. I was really just trying to understand whether it was just surprise or was causing more trouble. It's slightly awkward to use the if, but as you say, only very slightly, so that seems easily outweighed given the fallthrough form isn't working for you at all.
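
To make the two shapes discussed in this thread concrete, here is a small sketch with hypothetical names (`Outcome`, `WithFallthrough`, `WithIf`) and a cut-down token enum; it is not the code from the PR. C++ allows a `default:` label anywhere in a switch, so the first form is valid, and the second is the `if` rewrite suggested in review. Both should behave the same for every kind.

```cpp
#include <string>
#include <vector>

// Cut-down, hypothetical token kinds for illustration only.
enum class TokenKind { IntLiteral, Semi, Error };

struct Outcome {
  bool invalid = false;
  std::vector<std::string> diagnostics;
};

// Form 1: a non-last `default` that diagnoses, then falls through into the
// error token case, which only marks the parse invalid.
Outcome WithFallthrough(TokenKind kind) {
  Outcome out;
  switch (kind) {
    case TokenKind::IntLiteral:
      break;
    default: {
      out.diagnostics.push_back("expected expression");
      // Fall through to the error token case; it adds no diagnostic.
      [[fallthrough]];
    }
    case TokenKind::Error: {
      out.invalid = true;
      break;
    }
  }
  return out;
}

// Form 2: `default` stays last and re-tests the kind with an `if`.
Outcome WithIf(TokenKind kind) {
  Outcome out;
  switch (kind) {
    case TokenKind::IntLiteral:
      break;
    default:
      if (kind != TokenKind::Error) {
        out.diagnostics.push_back("expected expression");
      }
      out.invalid = true;
      break;
  }
  return out;
}
```

The trade-off is the one raised above: form 1 keeps every kind in its own case, while form 2 avoids a structure that some readers find surprising at the cost of re-testing the kind.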

chandlerc · 2024-10-24T07:25:48Z

Seems good, but to be sure, you're doing this in a way that's specific to expressions. I think this'd miss, for example, declarations.

Had you considered adding support to Emit() in order to elide diagnostics where the location is an error token?

I don't think we necessarily want to always elide a diagnostic because the location is an error token... That seems like a fairly subtle coupling between the location's kind and the diagnostic behavior. It seems clearer to explicitly control whether to emit the diagnostic based on whether there is some already-diagnosed error.

It does mean we'll need to add support in other places, but I would somewhat want to consider in each place whether the diagnostic makes sense or not, and also whether or what recovery would be best given an error token. Even if we end up making similar choices, the context of the choice seems relevant, so I wouldn't necessarily factor it until/unless we find some more underlying pattern we want to model with that factoring?

That said, some of my thinking is just initial thinking here. I've only really looked at the expression case so far, so I'm more comfortable confining the change to that. When we get to other cases, we can always revisit?
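
As a rough illustration of the two designs being weighed here, the sketch below uses a hypothetical `Emitter` class; it is not the toolchain's real diagnostic API, and `EmitUnlessErrorToken` and `DiagnoseExpectedExpr` are invented names. The first helper bakes the elision into the emitter, keyed on the location's token kind. The second leaves the decision at the call site, where the context of the choice is visible.

```cpp
#include <string>
#include <vector>

// Cut-down, hypothetical token model for illustration only.
enum class TokenKind { Identifier, Semi, Error };

struct Token {
  TokenKind kind;
};

// Hypothetical emitter; the real diagnostic emitter has a different API.
class Emitter {
 public:
  // Always records the message; callers decide whether to call this.
  void Emit(const Token& /*location*/, const std::string& message) {
    emitted_.push_back(message);
  }

  // Implicit design: silently drops any diagnostic located at an error token,
  // regardless of which call site produced it.
  void EmitUnlessErrorToken(const Token& location, const std::string& message) {
    if (location.kind == TokenKind::Error) {
      return;
    }
    Emit(location, message);
  }

  const std::vector<std::string>& emitted() const { return emitted_; }

 private:
  std::vector<std::string> emitted_;
};

// Explicit design: this one call site decides that an error token was already
// diagnosed by the lexer and stays quiet.
void DiagnoseExpectedExpr(Emitter& emitter, const Token& token) {
  if (token.kind == TokenKind::Error) {
    return;
  }
  emitter.Emit(token, "expected expression");
}
```

The two agree for the expression case. The difference is that the implicit version would also drop diagnostics from any other caller that happens to point at an error token, which is the coupling the comment above is wary of.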

jonmeow · 2024-10-24T16:30:07Z

Okay, but I'll note, my assumption would've been in the opposite direction -- do the more generic thing, and revisit if it has a problem.

chandlerc · 2024-10-24T16:33:29Z

To be clear, for me the bigger thing is sinking that commonality into Emit -- I think that making this conditionally emit based on the location token kind doesn't seem like a great API. If we want a generic thing, I think we should build something more dedicated to that, and so it seemed more like building a new generic thing rather than using an existing one.

I also asked @zygoloid to take a look though, maybe he has other thoughts here.

zygoloid · 2024-10-25T16:45:58Z

Some discussion of this PR moved to discord, and I wrote up some of my thoughts there.

jonmeow

Approving. I trust you'll adjust the case, and I assume any commonality from zygoloid's comment will be handled separately (if there's anything to do right now).

github-actions bot requested a review from jonmeow October 20, 2024 08:40

github-actions bot added the toolchain label Oct 20, 2024

jonmeow reviewed Oct 21, 2024

chandlerc force-pushed the skip-error-expr branch from 70c7987 to 144ae9c Compare October 24, 2024 23:17

jonmeow approved these changes Oct 25, 2024
