Outstanding Grammar Issues #3924
Comments
Thanks for the useful and actionable list @vmg. One question...
Is this the actual upstream grammar as it stands now or the old pinned copy we're shipping with Linguist at the moment? I'm assuming the latter, but thought I'd check to be sure. |
Correct! I fucked up my submodules. Sorry about that, I'll update the list with the proper URL. |
Hey, sorry about the (really) slow response. My MacBook died last week, which means I've been painfully limited in what I'm able to do on GitHub (I'm using my work's computer for the time being, when time permits). The issues reported by my grammars are easily fixed; but the LilyPond grammar should be fixed with a PR to the upstream repository. Basically, the |
What's the maximum token length permitted by the new compiler? I was about to start fixing the issues with Emacs Lisp's grammar, but realised I don't have the actual limit to go by. Admittedly, I'm not really fond of the size limit, because the "fix" here is to simply break the pattern down into multiple rules that're bunched together under the same name. It feels terribly hacky, and the fact that the rules in question were compiled from an external source means that updating the list in future might be made more complicated... |
@Alhadis: Sorry, you caught me on Holidays. The maximum size is enforced by PCRE, not by our parser, and it's 64kb for a single regexp. I'm aware it's a bummer, but it's the way PCRE was designed. |
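A minimal sketch of the "break the pattern down into multiple rules" workaround discussed above, assuming the oversized pattern is a plain alternation of literal words. The helper name `split_alternation`, the `\b(?:...)\b` wrapper, and the `max_bytes` budget are all hypothetical; source length is only a rough proxy for what PCRE measures, so the budget should be chosen conservatively and the result re-checked with the grammar compiler.

```python
import re

def split_alternation(words, max_bytes=30000, prefix=r"\b(?:", suffix=r")\b"):
    """Group literal words into several alternation patterns.

    Each returned pattern is at most max_bytes long (UTF-8), provided
    every individual word fits in the budget on its own.
    max_bytes is an assumed budget, not PCRE's actual limit.
    """
    overhead = len((prefix + suffix).encode("utf-8"))
    patterns, current, size = [], [], overhead
    for word in words:
        escaped = re.escape(word)
        word_len = len(escaped.encode("utf-8"))
        cost = word_len + (1 if current else 0)  # +1 for the "|" separator
        if current and size + cost > max_bytes:
            # Flush the current group and start a new one with this word.
            patterns.append(prefix + "|".join(current) + suffix)
            current, size = [], overhead
            cost = word_len
        current.append(escaped)
        size += cost
    if current:
        patterns.append(prefix + "|".join(current) + suffix)
    return patterns
```

Each returned pattern would then become its own rule carrying the same scope name, which is the "bunched together under the same name" approach described in the comment above.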
That's understandable. How is whitespace treated inside expressions which use "expanded" notation...?

m/
abc
(?:
xyz
)
(?=\w+)
/x;

Because there are two different ways to represent that in CSON. One is with an ordinary quoted-string, which includes embedded newlines as part of the pattern...

pattern: "(?x)
abc
(?:
xyz
)
(?=\\w+)
"

... and the other is to use triple-quoted strings ("heredocs"):

pattern: """(?x)
abc
(?:
xyz
)
(?=\\w+)
"""

The latter will strip as much indentation as it can, leaving some (but not all) horizontal whitespace after the CSON-to-JSON conversion:

(?x)
abc
(?:
xyz
)
(?=\\w+)

Now this won't make any difference to the regex engine, but it will to my subdivision efforts... 😀 |
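As a rough illustration of the point above, the sketch below uses Python's `re` module as a stand-in for PCRE (an assumption: the two engines are not identical, and only PCRE is authoritative for the grammar compiler). The indentation in both strings is invented for the example. It shows that the two spellings differ in length but behave the same once `(?x)` is in effect.

```python
import re

# Indentation here is assumed for illustration only.
# "quoted" mimics the ordinary quoted-string form, with all whitespace kept.
quoted = "(?x)\n    abc\n    (?:\n        xyz\n    )\n    (?=\\w+)\n"
# "heredoc" mimics the triple-quoted form after common indentation is stripped.
heredoc = "(?x)\nabc\n(?:\n    xyz\n)\n(?=\\w+)"

def squeeze(pattern):
    """Naively drop all whitespace.

    This ignores escaped spaces and character classes, so it is only
    safe for patterns like the one above that contain neither.
    """
    return "".join(pattern.split())

quoted_re = re.compile(quoted)
heredoc_re = re.compile(heredoc)

print(len(quoted), len(heredoc))            # raw lengths differ
print(squeeze(quoted) == squeeze(heredoc))  # same pattern once whitespace is dropped
print(quoted_re.match("abcxyz1").group(0), heredoc_re.match("abcxyz1").group(0))
```

For size accounting, the raw string length is what differs between the two forms, so any subdivision based on source length should measure the string that actually ends up in the compiled JSON.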
@Alhadis I'm honestly not sure how exactly PCRE implements this -- you should be able to test it out by simply downloading libpcre and trying to compile the regexps. Our parser has no custom behavior here. |
Okay, that's the last of my grammars fixed. 😉 |
@Alhadis You closed this by mistake, right? 😸 |
Yeah, sorry. I didn't even notice I'd pressed the wrong button to comment. My mistake. 😓 |
@Alhadis 🙇🙇🙇 |
I've updated the sublime-mask output in the OP, as the latest grammar compiler now prefers the "compiled" .tmLanguage file over the YAML file and sublime-mask has only updated the YAML file. I have pinged the author in tenbits/sublime-mask#1 asking them to update the |
The MATLAB "Unknown keys in grammar" error is resolved in mathworks/MATLAB-Language-grammar#38 😄 along with many other improvements |
@lildude when you get the time, please ✅ the MATLAB tasks 😄 |
I'm not sure where to post this so I'm just gonna post it here. I came across a TypeScript rendering/syntax highlighting issue the other day here on GitHub. I sent in a support ticket and they redirected me to this repo. The issue can be viewed here. Thanks |
@lildude The Godot Engine grammars should have been fixed in godotengine/godot-vscode-plugin#416. I attempted to validate the fixes using the grammar compiler instructions you gave us, but I don't know how to actually run linguist to make 100% sure that they're working now. |
Running Linguist wouldn't help you as it doesn't actually do the highlighting. This is done by an internal service so you can only go based on what the validator says, or not. |
The Python grammar referenced by this repo (MagicStack/MagicPython@7d0f2b2) includes support for the new Python |
@dragoncoder047 Unfortunately, GitHub uses a specialised Tree-Sitter parser for Python; the |
I guess it's a bug in the grammar then. It's already been reported (tree-sitter/tree-sitter-python#141) so I won't bother re-filing it. |
Hello hello, I don't know how this issue works but we supposedly eliminated the \p problems from the Raku repository. I'm curious about the result... |
@2colours things look better, but it doesn't appear all have been resolved yet:
➜ git submodule update --remote vendor/grammars/atom-language-perl6
Submodule path 'vendor/grammars/atom-language-perl6': checked out '190e4b38d53548b23263f9c399cd5172421aa057'
➜ script/grammar-compiler update -f
latest: Pulling from linguist/grammar-compiler
[...]
Status: Downloaded newer image for linguist/grammar-compiler:latest
docker.io/linguist/grammar-compiler:latest
442 / 442 100.00% 8s
done! processed 442 grammars
- [ ] repository `vendor/grammars/atom-language-perl6` (from https://github.com/perl6/atom-language-perl6) (4 errors)
- Invalid regex in grammar: `source.raku` (in `grammars/raku.tmLanguage.json`) contains a malformed regex (regex "`(?x) ( [\p{Digit}\pL\pM'\-_]+ ) `...": unknown property name after \P or \p (at offset 16))
- Invalid regex in grammar: `source.raku` (in `grammars/raku.tmLanguage.json`) contains a malformed regex (regex "`[\p{Digit}\pL\pM'\-_]+`": unknown property name after \P or \p (at offset 9))
- Invalid regex in grammar: `source.raku` (in `grammars/raku.tmLanguage.json`) contains a malformed regex (regex "`(?x)(?<!\\)(\$|@|%|&)(?!\$)(`...": unknown property name after \P or \p (at offset 141))
- Invalid regex in grammar: `source.raku` (in `grammars/raku.tmLanguage.json`) contains a malformed regex (regex "`(?x)(\$|@|%|&)(\.|\*|:|!|\^|~|`...": unknown property name after \P or \p (at offset 131))
[...] |
Not gonna lie, I'm perfectly clueless how Digit could remain when I even noted that it can/should be replaced with Nd... anyway, soon to be addressed. EDIT: here goes nothing... should be good now 🤞 |
@lildude Ping? |
@2colours Pong? 🤣 All looks good now. You'll see the benefit (if there's anything noticeable) when the next release is made. Thanks for addressing these issues 🙇 |
Should I be mentioning new issues I find in this thread? |
@AdamRaichu No. This is only for issues picked up by the grammar compiler. |
The following is a detailed list of all the outstanding issues in the grammars that GitHub.com uses for syntax highlighting the code in our website.
These issues are detected by our grammars compiler (#3915) and are probably causing minor rendering bugs in the website.
Help is very much welcome! If you're seeing bugs or rendering issues in your source code in GitHub, please start by taking a look at this list to make sure we're not detecting any issues in your language's grammar.
Feel free to ask any questions about any given issue and what would be the appropriate way to fix it. I'll keep the issue up-to-date as I work through grammar fixes myself.
cc @github/linguist @pchaigno @Alhadis
Last updated: 2 Sep 2024
- repository `vendor/grammars/MATLAB-Language-grammar` (from https://github.com/mathworks/MATLAB-Language-grammar) (16 errors)
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(\)|(?<!\.{3}.*)\n)`": lookbehind assertion is not fixed length (at offset 15))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=([,;])(?![^(]*`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=[,;](?![^(]*\)`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=([,;])(?![^(]*`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=([,;])(?![^(]*`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=([,;])(?![^(]*`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=[,;](?![^(]*\)`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=([,;])(?![^(]*`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<!\.{3}.*)(?:(?=[,;](?![^(]*\)`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?=;|,|(?<!(?:\.{3}.*))\n|%)`": lookbehind assertion is not fixed length (at offset 22))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(\))?[^\S\n]*(?=;|,|(?<!(?:\.{3}`...": lookbehind assertion is not fixed length (at offset 35))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(\)|(?<!\.{3}.*)\n)`": lookbehind assertion is not fixed length (at offset 15))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(?<=\))|(?>(?<!\.{3}.*)\n)`": lookbehind assertion is not fixed length (at offset 22))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(\)|(?<!\.{3}.*)\n)`": lookbehind assertion is not fixed length (at offset 15))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(\)|(?<!\.{3}.*)\n)`": lookbehind assertion is not fixed length (at offset 15))
  - `source.matlab` (in `Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage`) contains a malformed regex (regex "`(\}|(?<!\.{3}.*)\n)`": lookbehind assertion is not fixed length (at offset 15))
- repository `vendor/grammars/TypeScript-TmLanguage` (from https://github.com/Microsoft/TypeScript-TmLanguage) (2 errors)
  - `source.ts` (in `TypeScript.tmLanguage`) contains a malformed regex (regex "`(?!(?<![_$[:alnum:]])(?:(?<=\.\.`...": lookbehind assertion is not fixed length (at offset 731))
  - `source.tsx` (in `TypeScriptReact.tmLanguage`) contains a malformed regex (regex "`(?!(?<![_$[:alnum:]])(?:(?<=\.\.`...": lookbehind assertion is not fixed length (at offset 731))
- repository `vendor/grammars/abl-tmlanguage` (from https://github.com/chriscamicas/abl-tmlanguage) (2 errors)
  - `source.abl` (in `abl.tmLanguage.json`) contains a malformed regex (regex "`(?i)(?<=\s+|^)(for|preselect)[\s`...": lookbehind assertion is not fixed length (at offset 11))
  - `source.abl` (in `abl.tmLanguage.json`) contains a malformed regex (regex "`(?i)(?<=^|\s*)(today|now)(?!\w|-`...": lookbehind assertion is not fixed length (at offset 13))
- repository `vendor/grammars/atom-language-julia` (from https://github.com/JuliaEditorSupport/atom-language-julia) (1 errors)
  - `source.julia` (in `grammars/julia.cson`) contains a malformed regex (regex "`(?<=\S\s+)\b(as)\b(?=\s+\S)`": lookbehind assertion is not fixed length (at offset 9))
- repository `vendor/grammars/c.tmbundle` (from https://github.com/textmate/c.tmbundle) (4 errors)
  - `source.c.platform` (in `Syntaxes/Platform.tmLanguage`) contains a malformed regex (regex "`\b(?:A(?:APNot(?:CreatedErr|Foun`...": definition too long (282084 bytes))
  - `source.c.platform` (in `Syntaxes/Platform.tmLanguage`) contains a malformed regex (regex "`\b(?:A(?:E(?:A(?:ddressDesc|rray`...": definition too long (52248 bytes))
  - `source.c.platform` (in `Syntaxes/Platform.tmLanguage`) contains a malformed regex (regex "`\b(?:CATransform3DIdentity|KERNE`...": definition too long (33340 bytes))
  - `source.c.platform` (in `Syntaxes/Platform.tmLanguage`) contains a malformed regex (regex "`(\s*)(\b(?:A(?:E(?:Build(?:Apple`...": definition too long (58589 bytes))
- repository `vendor/grammars/csharp-tmLanguage` (from https://github.com/dotnet/csharp-tmLanguage) (4 errors)
  - `source.cs` (in `grammars/csharp.tmLanguage`) contains a malformed regex (regex "`\G(?=(?~\*/)$)`": unrecognized character after (? or (?- (at offset 7))
  - `source.cs` (in `grammars/csharp.tmLanguage`) contains a malformed regex (regex "`^(\s*+)(\*(?!/))?(?=(?~\*/)$)`": unrecognized character after (? or (?- (at offset 22))
  - `source.cs` (in `grammars/csharp.tmLanguage`) contains a malformed regex (regex "`(?<!\.\s*)\b(await)\b`": lookbehind assertion is not fixed length (at offset 9))
  - `source.cs` (in `grammars/csharp.tmLanguage`) contains a malformed regex (regex "`(?<!\.\s*)\b(await)\b`": lookbehind assertion is not fixed length (at offset 9))
- repository `vendor/grammars/gap-tmbundle` (from https://github.com/dhowden/gap-tmbundle) (3 errors)
  - `source.gap` (in `Syntaxes/GAP.tmLanguage`) contains a malformed regex (regex "`\b(16Bits_AssocWord|16Bits_Depth`...": definition too long (65523 bytes))
  - `source.gap` (in `Syntaxes/GAP.tmLanguage`) contains a malformed regex (regex "`\b(IndicesChiefNormalSteps|Indic`...": definition too long (65529 bytes))
  - `source.gap` (in `Syntaxes/GAP.tmLanguage`) contains a malformed regex (regex "`\b(SMTX_GoodElementGModule|SMTX_`...": definition too long (42470 bytes))
- repository `vendor/grammars/godot-vscode-plugin` (from https://github.com/godotengine/godot-vscode-plugin) (1 errors)
  - `source.gdscript` (in `syntaxes/GDScript.tmLanguage.json`) contains a malformed regex (regex "`(?<!/\s*)(\$|%|\$%)([a-zA-Z_]\w*`...": lookbehind assertion is not fixed length (at offset 8))
- repository `vendor/grammars/linter-lilypond` (from https://github.com/nwhetsell/linter-lilypond) (1 errors)
  - `source.lilypond` (in `grammars/lilypond.cson`) contains a malformed regex (regex "`(?<!-)\b(!=|\*(?:(?:location|par`...": definition too long (35491 bytes))
- repository `vendor/grammars/mathematica-tmbundle` (from https://github.com/shadanan/mathematica-tmbundle) (1 errors)
  - `source.mathematica` (in `Syntaxes/Mathematica.tmLanguage`) contains a malformed regex (regex "`(\b|(?<=_))(Abort|AbortKernels|A`...": definition too long (54020 bytes))
- repository `vendor/grammars/nu-grammar` (from https://github.com/hustcer/nu-grammar.git) (1 errors)
  - `source.nushell` (in `grammars/tmLanguage.json`) contains a malformed regex (regex "`(?<=]\s*)(:)\s+(\[)`": lookbehind assertion is not fixed length (at offset 8))
- repository `vendor/grammars/objective-c.tmbundle` (from https://github.com/textmate/objective-c.tmbundle) (2 errors)
  - `source.objc.platform` (in `Syntaxes/Platform.tmLanguage`) contains a malformed regex (regex "`\b(?:AB(?:AddRecordsError|Multip`...": definition too long (32854 bytes))
  - `source.objc.platform` (in `Syntaxes/Platform.tmLanguage`) contains a malformed regex (regex "`\b(?:A(?:M(?:Action(?:A(?:pplica`...": definition too long (44404 bytes))
- repository `vendor/grammars/sublime-autoit` (from https://github.com/AutoIt/SublimeAutoItScript) (2 errors)
  - `source.autoit` (in `AutoIt.tmLanguage`) contains a malformed regex (regex "`\b(?i:_array1dtohistogram|_array`...": definition too long (39591 bytes))
  - `source.autoit` (in `AutoIt.tmLanguage`) contains a malformed regex (regex "`\b(?i:_guictrltoolbar_getbuttoni`...": definition too long (39600 bytes))
- repository `vendor/grammars/turtle.tmbundle` (from https://github.com/peta/turtle.tmbundle) (5 errors)
  - `source.turtle` (in `Syntaxes/Turtle.tmLanguage`) contains a malformed regex (regex "`(?x) (?<PN_CHARS_U>[\p{L}\p{M`...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 73))
  - `source.turtle` (in `Syntaxes/Turtle.tmLanguage`) contains a malformed regex (regex "`(?x)( (?: [\p{L}\p{M}] | [:0`...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 121))
  - `source.turtle` (in `Syntaxes/Turtle.tmLanguage`) contains a malformed regex (regex "`\[[\u20\u9\uD\uA]*\]`": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 4))
  - `source.turtle` (in `Syntaxes/Turtle.tmLanguage`) contains a malformed regex (regex "`(?x)((?<=\s|^|_)(?:[\p{L}\p{M}]`...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 57))
  - `source.turtle` (in `Syntaxes/Turtle.tmLanguage`) contains a malformed regex (regex "`(?x) (?<PNAME_NS> (?: (?: [\`...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 68))
- repository `vendor/grammars/vscode-jest` (from https://github.com/jest-community/vscode-jest) (1 errors)
  - `syntaxes/ExtSettingsSchema.json` failed to parse: Undeclared scope in grammar: `syntaxes/ExtSettingsSchema.json` has no scope name
- repository `vendor/grammars/vscode-move-syntax` (from https://github.com/damirka/vscode-move-syntax.git) (3 errors)
  - `source.move` (in `syntaxes/move.tmLanguage.json`) contains a malformed regex (regex "`(?<=\b(module|spec)\b)`": lookbehind assertion is not fixed length (at offset 21))
  - `source.move` (in `syntaxes/move.tmLanguage.json`) contains a malformed regex (regex "`(?<=\b(module|spec))`": lookbehind assertion is not fixed length (at offset 19))
  - `source.move` (in `syntaxes/move.tmLanguage.json`) contains a malformed regex (regex "`(?<=\b(module|spec))`": lookbehind assertion is not fixed length (at offset 19))
- repository `vendor/grammars/vscode-vba` (from https://github.com/serkonda7/vscode-vba.git) (1 errors)
  - `source.vba` (in `syntaxes/vba.yaml-tmlanguage`) contains a malformed regex (regex "`(?i:\b(?:(?<=(Call|Function|Sub)`...": lookbehind assertion is not fixed length (at offset 33))
- repository `vendor/grammars/vscode-yara` (from https://github.com/infosec-intern/vscode-yara.git) (1 errors)
  - `source.yara` (in `yara/syntaxes/yara.tmLanguage.json`) contains a malformed regex (regex "`(?<=(^|[\)]|\b(?:them)\b))(?:\s*`...": lookbehind assertion is not fixed length (at offset 25))

Other

- `vendor/grammars/Sublime-QML` - skozlovf/Sublime-QML - the project has been restructured and rewritten and only provides a Sublime 3 compatible grammar, which is not supported. This grammar will need to be replaced and will no longer be updated.
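For the "lookbehind assertion is not fixed length" entries above, one possible rewrite is to split a lookbehind whose alternatives differ in length into separate fixed-length lookbehinds. The sketch below uses Python's `re` as a stand-in for PCRE (an assumption: Python also rejects variable-width lookbehinds, but its rules are not identical to PCRE's, and only the grammar compiler's verdict counts). The rewritten pattern is hypothetical and drops the original capture group, so it is not a drop-in fix for the upstream grammar.

```python
import re

def compiles(pattern):
    """Return True if Python's re accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Taken from the vscode-move-syntax entry in the list above.
reported = r"(?<=\b(module|spec))"

# Hypothetical rewrite: one fixed-length lookbehind per alternative.
# Note: the (module|spec) capture group is lost in this form.
rewritten = r"(?:(?<=\bmodule)|(?<=\bspec))"

print(compiles(reported))   # Python's re rejects the variable-width form
print(compiles(rewritten))  # each lookbehind now has a fixed width

# Small usage check: capture the name that follows the keyword.
name_after_keyword = re.compile(rewritten + r"\s+(\w+)")
```

Patterns such as `(?<!\.\s*)` cannot be split this way because `\s*` has no upper bound; those need a different restructuring, for example consuming the preceding text in the match itself.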