-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: presence of trailing comma (#416) #418
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Abhishek Kumar Gupta <[email protected]>
WalkthroughWalkthroughThis change addresses the issue of trailing commas in code snippets by introducing a logic to identify and remove lines with trailing commas. The update ensures that lines containing only commas are removed during the replacement process. Additionally, it modifies the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant System
participant TestFramework
User->>System: Provides code snippet with trailing commas
System->>System: Identifies and removes trailing commas
System->>User: Returns updated code snippet
TestFramework->>System: Runs `trailing_comma_after_argument_removal()` test
System->>TestFramework: Provides updated code snippet
TestFramework->>TestFramework: Verifies the output
TestFramework->>User: Returns test results
Assessment against linked issues
Recent review detailsConfiguration used: .coderabbit.yaml Files selected for processing (2)
Files skipped from review as they are similar to previous changes (1)
Additional comments not posted (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
1 file(s) reviewed, no comment(s)
Edit PR Review Bot Settings
@morgante Pls review. All tests are passing locally for me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (1)
crates/core/src/test.rs (1)
12017-12017
: Add a newline before the comment.Adding a newline before the comment improves readability.
+ // refer https://github.com/getgrit/gritql/issues/416
Signed-off-by: Abhishek Kumar Gupta <[email protected]>
crates/core/src/inline_snippets.rs
Outdated
@@ -234,12 +234,28 @@ pub(crate) fn inline_sorted_snippets_with_offset( | |||
} | |||
} | |||
|
|||
let mut trailing_comma_lines = vec![]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a delete_hanging_comma
function that should be used - we don't want two different pieces of code dealing with orphaned commas.
Any solution should either be there or in a replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@morgante
Had tried that earlier. I mean, placing the code fix in delete_hanging_comma function or in sub functions called inside it. That resulted in failure of an existing edge test case. This happens since offsets are again calculated after calling delete_hanging_comma and then actual replacements happens.
So, I feel doing this fix during actual replacements is the best approach.
Pasting here the changes (or little variations of it, you will get the idea) I was trying with :
just manipulated the range indexes here.
fn delete_hanging_comma(
code: &str,
replacements: &mut [(EffectRange, String)],
offset: usize,
) -> Result<(String, Vec<ReplacementInfo>)> {
....
....
// Handle the case when after applying replacements, a single comma
// gets left behind on a single line
let mut temp_code = code.to_string();
for (range, snippet) in replacements.iter_mut() {
let mut adjusted_range = adjust_range(&range.effective_range(), offset, &temp_code)?;
if adjusted_range.start > temp_code.len() || adjusted_range.end > temp_code.len() {
bail!("Range {:?} is out of bounds for code:\n{}\n", adjusted_range, temp_code);
}
temp_code.replace_range(adjusted_range.clone(), snippet);
let line_start = temp_code[..adjusted_range.start].rfind('\n').map_or(0, |pos| pos + 1);
let line_end = temp_code[adjusted_range.start..].find('\n').map_or(temp_code.len(), |pos| adjusted_range.start + pos);
let line_content = temp_code[line_start..line_end].trim();
// Check for lines with a single comma after replacements
if line_content == "," {
// Ensure the next character is a newline
if line_end < temp_code.len() && temp_code.chars().nth(line_end) == Some('\n') {
adjusted_range.end = line_end + 1;
}
// Update the original range to replace the range with an empty string later.
range.range.end += adjusted_range.end - line_start; <--------------------------------------------
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to do it in one of those two locations. Please don't make our handling even more spaghetti.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@morgante hey one doubt I have :
In below test case (existing), shouldnt we remove the empty newline?
#[test]
fn only_deduplicate_rewrites6() {
run_test_expected({
TestArgExpected {
pattern: r#"
engine marzano(0.1)
language json
pair($key, $value) => . where {
$key <: `"extends"`,
}"#
.to_owned(),
source: r#"
{
"env": "bar",
"extends": "foo",
"baz": 1
}"#
.to_owned(),
expected: r#"
{
"env": "bar",
<----------------------------------
"baz": 1
}"#
.to_owned(),
}
})
.unwrap();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is not necessary for Grit to handle removing newlines. We rely on formatters to handle that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@morgante got it. This fix should be satisfactory, have a look..
…etgrit#416) Signed-off-by: Abhishek Kumar Gupta <[email protected]>
f3e105b
to
7118310
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (1)
crates/core/src/test.rs (1)
12017-12017
: Add a more descriptive comment.The comment should provide more context about the issue being addressed.
- // refer https://github.com/getgrit/gritql/issues/416 + // This test addresses issue #416 (https://github.com/getgrit/gritql/issues/416) which involves removing dangling commas after keyword arguments in Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include at least 3 test cases that demonstrate you have fully thought through your implementation.
@@ -300,6 +300,27 @@ fn delete_hanging_comma( | |||
replacements: &mut [(EffectRange, String)], | |||
offset: usize, | |||
) -> Result<(String, Vec<ReplacementInfo>)> { | |||
// Handle the case when after applying replacements, a single comma |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is completely incorrect. You should not be assuming the comma is on a single line.
The original bug is also present even if arguments are on 1 line: https://app.grit.io/studio?key=-Pm14wW6mvH0S41J_hFh8
Please think logically, don't just try to brute force the single case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@morgante
you mean when it is like this? :
bibtex_citation="",
n_samples={_EVAL_SPLIT: 1820}, avg_character_length={_EVAL_SPLIT: 78},
then my current fix prints with a extra space:
bibtex_citation="",
stats=GeneralDescriptiveStats(n_samples={_EVAL_SPLIT: 1820}, avg_character_length={_EVAL_SPLIT: 78}),
But this is due to the single space present between both the matching words (just after the comma of first argument)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find any other edge cases (except for arguments on multiple lines which works)
Since I am still new to gritql, can you provide a case where it fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your fix completely fails in cases where the arguments are on 1 line. The original bug isn't related to line handling at all and introducing that is a fundamental mistake in your fix.
/// Same as above test, but ensures the behavior doesn't depend on line breaks
#[test]
fn trailing_comma_after_argument_removal_one_line() {
run_test_expected({
TestArgExpected {
pattern: r#"
language python
`TaskMetadata($args)` where {
$args <: any {
contains `n_samples=$_` as $ns_kwarg where {
$ns_kwarg <: `n_samples = $ns_val` => .
},
contains `avg_character_length=$_` as $avg_kwarg where {
$avg_kwarg <: `avg_character_length = $avg_val` => `stats=GeneralDescriptiveStats(n_samples=$ns_val, avg_character_length=$avg_val)`
},
},
}
"#.to_owned(),
source: r#"
TaskMetadata(description="Parallel news titles from the Tbilisi City Hall website (https://tbilisi.gov.ge/).", main_score="f1", domains=["News"], text_creation="created", n_samples={_EVAL_SPLIT: 1820}, reference="https://huggingface.co/datasets/jupyterjazz/tbilisi-city-hall-titles")
"#
.to_owned(),
expected: r#"
TaskMetadata(description="Parallel news titles from the Tbilisi City Hall website (https://tbilisi.gov.ge/).", main_score="f1", domains=["News"], text_creation="created", reference="https://huggingface.co/datasets/jupyterjazz/tbilisi-city-hall-titles")
"#
.to_owned(),
}
})
.unwrap();
}
Also has nothing to do with GritQL, you don't need to change the pattern to see how this is broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't rely on line breaks.
I think you're probably working in entirely the wrong place. The bug is likely in the mini state machine / parser we have here: https://github.com/getgrit/rewriter/blob/main/vendor/gritql/crates/core/src/inline_snippets.rs#L535-L652 |
/claim #416
/closes #416
Greptile Summary
This is an auto-generated summary
crates/core/src/inline_snippets.rs
Summary by CodeRabbit
Bug Fixes
Tests