Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PLE2510, PLE2512, PLE2513, PLE2514, and PLE2515 fixes add escape sequences to raw strings #13294

Open
dscorbett opened this issue Sep 9, 2024 · 6 comments · May be fixed by #13882
Open

PLE2510, PLE2512, PLE2513, PLE2514, and PLE2515 fixes add escape sequences to raw strings #13294

dscorbett opened this issue Sep 9, 2024 · 6 comments · May be fixed by #13882
Labels
bug Something isn't working fixes Related to suggested fixes for violations help wanted Contributions especially welcome

Comments

@dscorbett
Copy link

The fixes for PLE2510, PLE2512, PLE2513, PLE2514, and PLE2515 add escape sequences to raw strings, which is an invalid change because raw strings don’t support escape sequences. The fixes should convert the raw strings to normal strings first.

$ ruff --version
ruff 0.6.4
$ printf 'print(r"""\x08\x1a\x1b\x00\xe2\x80\x8b\n""")\n' >ple251.py
$ python3.10 ple251.py | xxd -p
081a1b0a
$ ruff check --isolated --select PLE251 ple251.py --fix
Found 5 errors (5 fixed, 0 remaining).
$ python3.10 ple251.py
\b\x1A\x1B\0\u200b

I’m using Python 3.10 for this example because literal null characters are syntax errors in later versions.

@MichaReiser
Copy link
Member

@dscorbett how do you find all these incorrect fixes. Very impressive and thanks for opening an issue!

@MichaReiser
Copy link
Member

Converting the raw string to a regular string might not be trivial. I think it would also be reasonable to remove the fix for raw strings.

@MichaReiser MichaReiser added bug Something isn't working fixes Related to suggested fixes for violations help wanted Contributions especially welcome labels Sep 10, 2024
@dscorbett
Copy link
Author

I don’t know how hard it is to implement in Ruff, but theoretically, converting raw strings to regular strings should be straightforward. Each \ becomes \\. Additionally, if it was doubling as a line continuation character in a non-triple-quoted string, the following newline becomes \n; or if it was escaping a quotation mark, the ' or " becomes \' or \". The rationale for avoiding these 5 characters (viz. it is confusing) applies to raw strings just as much as regular strings, so I think it is better to fix the fix for raw strings rather than removing it.

I find all these incorrect fixes by reading the rule descriptions and/or implementations and thinking of edge cases that might have been missed. Basically, it’s glass-box fuzz-testing done manually.

@MichaReiser
Copy link
Member

I find all these incorrect fixes by reading the rule descriptions and/or implementations and thinking of edge cases that might have been missed. Basically, it’s glass-box fuzz-testing done manually.

Wow, that's a very impressive skill!

@PedroTurik
Copy link

PedroTurik commented Sep 17, 2024

I can work on this one. If I understand this correctly, I can start by adding the replacement logic here, right? I am a first time contributor

Also, I am probably missing something here, but where should I add replacement tests?

@MichaReiser
Copy link
Member

MichaReiser commented Sep 18, 2024

I can work on this one. If I understand this correctly, I can start by adding the replacement logic here, right? I am a first time contributor

Awesome and welcome. Yes, this looks about right

Also, I am probably missing something here, but where should I add replacement tests?

You can add them to this file

https://github.com/astral-sh/ruff/blob/f70e8a7524b255c20e3dc966141b6e80d9697958/crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py#L1-L0

The tests are defined here

#[test_case(Rule::InvalidCharacterBackspace, Path::new("invalid_characters.py"))]
#[test_case(Rule::InvalidCharacterEsc, Path::new("invalid_characters.py"))]
#[test_case(Rule::InvalidCharacterNul, Path::new("invalid_characters.py"))]
#[test_case(Rule::InvalidCharacterSub, Path::new("invalid_characters.py"))]
#[test_case(
Rule::InvalidCharacterZeroWidthSpace,
Path::new("invalid_characters.py")
)]
#[test_case(
Rule::InvalidCharacterBackspace,
Path::new("invalid_characters_syntax_error.py")
)]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixes Related to suggested fixes for violations help wanted Contributions especially welcome
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants