-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement ForVariableUsedAfterBlock
(based on wemake_python_styleguide
ControlVarUsedAfterBlockViolation
WPS441
)
#11769
base: main
Are you sure you want to change the base?
Changes from all commits
c1091a1
261fbc3
8bed302
e85fe87
eecc2af
3661066
59a2e3b
0e491b3
9ca72a6
0ad0df0
eaf2a35
710d68c
a7de1cb
f2a62bb
97b165c
7ceb306
acc5b2a
f5aa633
026a48f
544bbf5
2d13a62
1d18c66
7edd539
5351952
19a0ab9
9e2ece4
d675a03
71ca5e1
8a11073
e59efa2
5a76166
106e78a
8b3e4de
59427f1
63ecbfc
27dad4f
e7653df
9bcc4fe
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
for global_var in []: | ||
_ = global_var | ||
pass | ||
|
||
# ❌ | ||
_ = global_var | ||
|
||
def foo(): | ||
# For control var used outside block | ||
for event in []: | ||
_ = event | ||
pass | ||
|
||
# ❌ | ||
_ = event | ||
|
||
for x in []: | ||
_ = x | ||
pass | ||
|
||
# Using the same control variable name in a different loop is ok | ||
for x in []: | ||
_ = x | ||
pass | ||
|
||
for y in []: | ||
# ❌ x is used outside of the loop it was defined in (meant to use y) | ||
if x == 5: | ||
pass | ||
|
||
# Assign a variable before the loop | ||
room_id = 3 | ||
_ = room_id | ||
# Use the same variable name in a loop | ||
for room_id in []: | ||
_ = room_id | ||
pass | ||
|
||
# ❌ After the loop is not ok because the value is probably not what you expect | ||
_ = room_id | ||
|
||
# ❌ Augmented assignment is not allowed because the value is probably not what you expect | ||
room_id += 1 | ||
|
||
# Assigning again after the loop is ok | ||
room_id = 5 | ||
room_id += 1 | ||
_ = room_id | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use in combination with
|
||
# Tuple destructuring | ||
for a, b, c in []: | ||
_ = a | ||
_ = b | ||
_ = c | ||
pass | ||
|
||
# ❌ | ||
_ = a | ||
_ = b | ||
_ = c | ||
|
||
# Array destructuring | ||
for [d, e, f] in []: | ||
_ = d | ||
_ = e | ||
_ = f | ||
pass | ||
|
||
# ❌ | ||
_ = d | ||
_ = e | ||
_ = f | ||
|
||
# Nested function and class definitions are fine | ||
for potential_power in []: | ||
def action(): | ||
print(potential_power) | ||
|
||
class Animal: | ||
power_level = potential_power | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
use ruff_diagnostics::{Diagnostic, Violation}; | ||
use ruff_macros::{derive_message_formats, violation}; | ||
use ruff_python_semantic::Scope; | ||
use ruff_text_size::Ranged; | ||
|
||
use crate::checkers::ast::Checker; | ||
|
||
/// ## What it does | ||
/// Checks for variables defined in `for` loops that are used outside of their | ||
/// respective blocks. | ||
/// | ||
/// ## Why is this bad? | ||
/// Usage of of a control variable outside of the block they're defined in will probably | ||
/// lead to flawed logic in a way that will likely cause bugs. The variable might not | ||
/// contain what you expect. | ||
/// | ||
/// In Python, unlike many other languages, `for` loops don't define their own scopes. | ||
/// Therefore, usage of the control variables outside of the block will be the the value | ||
/// from the last iteration until re-assigned. | ||
/// | ||
/// While this mistake is easy to spot in small examples, it can be hidden in larger | ||
/// blocks of code, where the the loop and downstream usage may not be visible at the | ||
/// same time. | ||
/// | ||
/// ## Example | ||
/// ```python | ||
/// for x in range(10): | ||
/// pass | ||
/// | ||
/// print(x) # prints 9 | ||
/// ``` | ||
#[violation] | ||
pub struct ForVariableUsedAfterBlock { | ||
control_var_name: String, | ||
} | ||
|
||
impl Violation for ForVariableUsedAfterBlock { | ||
#[derive_message_formats] | ||
fn message(&self) -> String { | ||
let ForVariableUsedAfterBlock { control_var_name } = self; | ||
|
||
format!("`for` loop variable {control_var_name} is used outside of block") | ||
} | ||
} | ||
|
||
/// Based on wemake-python-styleguide (WPS441) to forbid control variables after the block body. | ||
pub(crate) fn for_variable_used_after_block( | ||
checker: &Checker, | ||
scope: &Scope, | ||
diagnostics: &mut Vec<Diagnostic>, | ||
) { | ||
// Find for-loop variable bindings | ||
let loop_var_bindings = scope | ||
.all_bindings() | ||
.map(|(name, binding_id)| (name, checker.semantic().binding(binding_id))) | ||
.filter_map(|(name, binding)| binding.kind.is_loop_var().then_some((name, binding))); | ||
|
||
for (name, binding) in loop_var_bindings { | ||
// This is safe to unwrap because we're only dealing with loop var bindings and | ||
// this can only be `None` for built-ins. | ||
let binding_source_node_id = binding.source.unwrap(); | ||
// The node_id of the for-loop that contains the binding | ||
let binding_statement_id = checker.semantic().statement_id(binding_source_node_id); | ||
let binding_statement = checker.semantic().statement(binding_source_node_id); | ||
|
||
// Loop over the references of those bindings to see if they're in the same block-scope | ||
'references: for reference in binding.references() { | ||
let reference = checker.semantic().reference(reference); | ||
// This is safe to unwrap because a loop var binding can't be a built-in so | ||
// any reference to the loop var won't be a built-in where this could be `None`. | ||
let reference_node_id = reference.expression_id().unwrap(); | ||
MadLittleMods marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
// Skip any reference that come before the control var binding in the source | ||
// order, skip it because people can assign and use the same variable name | ||
// above the block. | ||
if reference.range().end() < binding_statement.range().start() { | ||
continue; | ||
} | ||
|
||
// Traverse the hierarchy and look for a block match | ||
let statement_hierarchy = checker.semantic().parent_statement_ids(reference_node_id); | ||
for ancestor_node_id in statement_hierarchy { | ||
if binding_statement_id == ancestor_node_id { | ||
continue 'references; | ||
} | ||
} | ||
|
||
// The reference wasn't used in the same block, report a violation/diagnostic | ||
diagnostics.push(Diagnostic::new( | ||
ForVariableUsedAfterBlock { | ||
control_var_name: name.to_owned(), | ||
}, | ||
reference.range(), | ||
)); | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
--- | ||
source: crates/ruff_linter/src/rules/ruff/mod.rs | ||
--- | ||
for_variable_used_after_block.py:6:5: RUF032 `for` loop variable global_var is used outside of block | ||
| | ||
5 | # ❌ | ||
6 | _ = global_var | ||
| ^^^^^^^^^^ RUF032 | ||
7 | | ||
8 | def foo(): | ||
| | ||
|
||
for_variable_used_after_block.py:15:9: RUF032 `for` loop variable event is used outside of block | ||
| | ||
14 | # ❌ | ||
15 | _ = event | ||
| ^^^^^ RUF032 | ||
16 | | ||
17 | for x in []: | ||
| | ||
|
||
for_variable_used_after_block.py:28:12: RUF032 `for` loop variable x is used outside of block | ||
| | ||
26 | for y in []: | ||
27 | # ❌ x is used outside of the loop it was defined in (meant to use y) | ||
28 | if x == 5: | ||
| ^ RUF032 | ||
29 | pass | ||
| | ||
|
||
for_variable_used_after_block.py:40:9: RUF032 `for` loop variable room_id is used outside of block | ||
| | ||
39 | # ❌ After the loop is not ok because the value is probably not what you expect | ||
40 | _ = room_id | ||
| ^^^^^^^ RUF032 | ||
41 | | ||
42 | # ❌ Augmented assignment is not allowed because the value is probably not what you expect | ||
| | ||
|
||
for_variable_used_after_block.py:43:5: RUF032 `for` loop variable room_id is used outside of block | ||
| | ||
42 | # ❌ Augmented assignment is not allowed because the value is probably not what you expect | ||
43 | room_id += 1 | ||
| ^^^^^^^ RUF032 | ||
44 | | ||
45 | # Assigning again after the loop is ok | ||
| | ||
|
||
for_variable_used_after_block.py:58:9: RUF032 `for` loop variable a is used outside of block | ||
| | ||
57 | # ❌ | ||
58 | _ = a | ||
| ^ RUF032 | ||
59 | _ = b | ||
60 | _ = c | ||
| | ||
|
||
for_variable_used_after_block.py:59:9: RUF032 `for` loop variable b is used outside of block | ||
| | ||
57 | # ❌ | ||
58 | _ = a | ||
59 | _ = b | ||
| ^ RUF032 | ||
60 | _ = c | ||
| | ||
|
||
for_variable_used_after_block.py:60:9: RUF032 `for` loop variable c is used outside of block | ||
| | ||
58 | _ = a | ||
59 | _ = b | ||
60 | _ = c | ||
| ^ RUF032 | ||
61 | | ||
62 | # Array destructuring | ||
| | ||
|
||
for_variable_used_after_block.py:70:9: RUF032 `for` loop variable d is used outside of block | ||
| | ||
69 | # ❌ | ||
70 | _ = d | ||
| ^ RUF032 | ||
71 | _ = e | ||
72 | _ = f | ||
| | ||
|
||
for_variable_used_after_block.py:71:9: RUF032 `for` loop variable e is used outside of block | ||
| | ||
69 | # ❌ | ||
70 | _ = d | ||
71 | _ = e | ||
| ^ RUF032 | ||
72 | _ = f | ||
| | ||
|
||
for_variable_used_after_block.py:72:9: RUF032 `for` loop variable f is used outside of block | ||
| | ||
70 | _ = d | ||
71 | _ = e | ||
72 | _ = f | ||
| ^ RUF032 | ||
73 | | ||
74 | # Nested function and class definitions are fine | ||
| |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,7 +25,8 @@ pub struct Binding<'a> { | |
pub scope: ScopeId, | ||
/// The context in which the [`Binding`] was created. | ||
pub context: ExecutionContext, | ||
/// The statement in which the [`Binding`] was defined. | ||
/// The statement in which the [`Binding`] was defined. `None` if the [`Binding`] | ||
/// comes from a built-in. | ||
Comment on lines
+28
to
+29
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please double-check the accuracy of these comments. In any case, I'd like to document the cases so we don't have to run through this uncertainty in the future. |
||
pub source: Option<NodeId>, | ||
/// The references to the [`Binding`]. | ||
pub references: Vec<ResolvedReferenceId>, | ||
|
@@ -232,7 +233,8 @@ impl<'a> Binding<'a> { | |
locator.slice(self.range) | ||
} | ||
|
||
/// Returns the statement in which the binding was defined. | ||
/// Returns the statement in which the binding was defined. `None` if the binding | ||
/// comes from a built-in. | ||
pub fn statement<'b>(&self, semantic: &SemanticModel<'b>) -> Option<&'b Stmt> { | ||
self.source | ||
.map(|statement_id| semantic.statement(statement_id)) | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable declared outside the loop
I think it's intentional that we catch the
indico
example as part of the rule. We have an example in the test fixture file for this.The other example from
pandas
could be addressed with some fancy control flow analysis (see other discussion)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I think there are two different reasons why you want to avoid using a loop variable after the loop has finished:
The conclusion I draw from this is that we in fact need two rules here:
For now I think I'd implement the first rule; we can consider the second rule as a followup.