-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: String literals in MLIR are not as permissive as those in C #468
Comments
This should be fixed in LLVM 18. @Jezurko can you check this? |
This should only happen in operations/types that use a I think it's preferable to wait until the fix gets into a release, as we would be duplicating logic while bringing more complexity to our TableGen code. *soon to be replaced by |
"Should probably do it" is not exactly a step to reproduce, please next time provide proper steps. |
Just in case I didn't understand you properly - try it with #467 which is removing the |
I will test this branch once it is merge to master. The issue I observed is in this MLIR function: /// Lex a string literal.
///
/// string-literal ::= '"' [^"\n\f\v\r]* '"'
///
/// TODO: define escaping rules.
Token Lexer::lexString(const char *tokStart) { Found here. I figured the issue was actually in the pretty printers, i.e. that when pretty printing string literals, an extra add slashes step would be needed, so that MLIR's string literal lexer would not complain. |
@lkorenc, @Jezurko I've now confirmed that my suggested example is indeed enough to reproduce this issue. I have not yet tested against #467. Again, the key is that the relevant error message (shown below) only manifests when parsing the human-readable MLIR, and is not caught by the MLIR verifier. Example error:
|
@pgoodman sorry, my fault. I now reproduced the issue on master and can also confirm that it is fixed with the PR (because the LLVM/MLIR printer uses the hex codes instead of the "special" characters - and therefore it can also parse it) |
VAST version
master
LLVM version
17
Operating system
All
Description
String literals are more permissive in C. Especially those found in things like
__attribute__((section("...")))
, which are "unevaluated" string literals. C string literals support more escape formats than those supported by MLIR's string literals. Unevaluated C string literals can contain arbitrary / made up escape strings/formats.VAST should introduce additional escape characters into the string literals where necessary.
Steps to Reproduce
const char * x = "\r";
should probably do it. Unfortunately MLIR's verifier doesn't catch this, so you need to round-trip and parse it.The text was updated successfully, but these errors were encountered: