-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to recover from invalid YAML #107
Conversation
lib/src/scanner.dart
Outdated
throw YamlException("Expected ':'.", _scanner.emptySpan); | ||
if (_recover) { | ||
_tokens.insert(key.tokenNumber - _tokensParsed, | ||
Token(TokenType.key, key.location.pointSpan() as FileSpan)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to produce a diagnostic at this point, but then continue to parse the code. In order to do that we'll need to have a way of reporting diagnostics that doesn't involve throwing an exception.
We will eventually want recovery mode to never throw an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a change that starts this, although I have some questions:
- I added a new
YamlError
class though it's fairly similar toYamlException
- should the exceptions just be collected instead? - Should all errors this way (even where we don't currently recover)? For ex. replace all
throw YamlException
s with a call to a function that emits the error and then either recovers or throws (perhaps passing a function that performs the recovery, so that in future it could be made non-nullable to ensure no locations will throwYamlExceptions
when recovery is enabled).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... should the exceptions just be collected instead?
I'd probably start with that until / unless we discover a reason to change it.
Should all errors this way (even where we don't currently recover)?
Eventually the parser should recover from all errors in recovery mode. Some of the errors are probably already things that the parser could safely proceed from, so they should be easy to convert, but proceeding from some of them (without first adding recovery-specific code) would likely produce a large number of useless follow-on diagnostics.
My advice is to continue to throw everywhere that we haven't tested the recover behavior. I'd structure the recovery code like this:
if (detectedAnError) {
reportError(YamlException(...));
// recover from the error.
}
where reportError
either throws or stores the exception depending on whether we're in recovery mode. That way the recovery code will only be executed in recovery mode and is guaranteed to not slow down the parser when parsing valid input.
The code in server that calls the parser then needs to check the stored exceptions whether one was thrown or not.
... perhaps passing a function that performs the recovery ...
I've tried writing parsers with that behavior, but it turns out that it's really hard, in general, to recover well in all cases when operating at a distance like that. I'd recommend against doing it that way and recommend that we keep the recovery code in-line. (The original version of the shared Dart parser was implemented that way, and had to be converted in order to get reasonable recovery.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the exceptions just be collected instead?
I'd probably start with that until / unless we discover a reason to change it.
Presumably changing the types of errors later would be a breaking change, though I'm not sure if there would ever be reason to. I'll switch to the exceptions for now - perhaps if the errors were extended in future, it'd be useful to have any new information also on the exception anyway.
where
reportError
either throws or stores the exception depending on whether we're in recovery mode.
This is what I started originally, though (if I understand correctly) means reportError
can't be called if the error doesn't currently have recovery code. I wasn't sure if that was ok - though it sounds like it probably is, so I'll update it.
What I had been thinking about was:
if (detectedAnError) {
reportError(YamlException(), () {
// recover from the error
});
}
The recovery code is still in-line, but in the future making the callback non-null would (hopefully) make it harder to add code reporting new errors (assuming you call reportError
) without considering recovery. I'm not sure if there's a real benefit there though (given YAML is fairly static, maybe there won't be much code added for new errors in future).
Scanner(String source, {Uri? sourceUrl}) | ||
: _scanner = SpanScanner.eager(source, sourceUrl: sourceUrl); | ||
Scanner(String source, | ||
{Uri? sourceUrl, bool recover = false, ErrorListener? errorListener}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't currently see any value in passing in an errorListener
if recovery
is false
, nor any value in setting recovery
to true
if there's no errorListener
. We might consider having the value of recovery
be implied by errorListener != null
(that is, make the error listener the only parameter, with null
implying no recovery).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any value in setting
recovery
totrue
if there's noerrorListener
In the server where I'm testing this, we care about recovery but not errors (diagnostics aren't produced here, it's in response to a completion request):
I could create a NullErrorListener
and we could pass that, although it may seem a bit strange to pass a NullErrorListener
in order to trigger recovery. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that would be a bit odd. We might still want to add an assert that the error listener is always null
when recovery
it false
, because that combination doesn't really make sense. If recovery
is false
(and there's an error in the YAML) then the single exception will always be thrown and there's no value to having a listener.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
lib/yaml.dart
Outdated
/// | ||
/// If [errorListener] is supplied, its onError method will be called for each | ||
/// error. If [recover] is false, parsing will end early so only the first error | ||
/// may be emitted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this paragraph is currently true. There's currently only one thrown exception that's sent to the error listener. Which means that any thrown exception needs to be caught and added to the list of exceptions to report, unless it was already reported to the error listener. I think it would be cleaner if exceptions that are going to be thrown are not reported to the error listener so that clients don't need to remember to make the extra check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
This is progress towards #102. Initially it covers from the case where a
key: value
is expected but so far only part of thekey
has been typed. This would usually raise an error"Expected ':'"
.