Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't normalize strings in the CLI #127

Merged
merged 5 commits into from
Jan 7, 2025
Merged

Conversation

DavisVaughan
Copy link
Collaborator

Closes #90
Closes #123

Follow up to #78

  • The CLI now never normalizes line endings, allowing --check to work correctly, and allowing us to take advantage of an optimization where we detect that no changes occurred during the formatting
  • The LSP continues to always normalizes line endings to Unix endings. Since everything is Unix line endings there, we could add the optimization there too (and pre-emptively return None in the response to the format request if we detect no changes, rather than running an expensive diff algorithm)

As mentioned in #90, our parser and Biome's formatter are happy with alternative line endings when they appear in the trivia, but if non-unix line endings appear in a token then the formatter panics. This happens in multiline strings, so we now normalize strings efficiently using a Cow::Borrowed when nothing changed.

The way this all flows through biome is (as implemented by rome/tools#1672):

  • We send a parse tree with CRLF line endings into the formatter
  • The formatter normalizes all trivia to Unix line endings
  • We normalize all tokens to Unix line endings (only multiline strings have this issue)
  • At Print time, the formatter turns a Unix line ending into the user requested LineEnding

The last step there is why the Printer doesn't allow any non-Unix line endings internally. It really just looks for \n at print time to decide when to apply LineEnding, so \r\n would make it behave incorrectly.

@DavisVaughan DavisVaughan requested a review from lionel- January 6, 2025 18:07
}
}

/// Normalize a string, returning a [`Cow::Borrowed`] if the input was already normalized
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention why using the line_ending won't work here, because it mutates strings in place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also anticipate a small chance that we might do further "normalization" here, like using a consistent quote style or something similar

…alize in the CLI

Allowing us to actually take advantage of the `Unchanged` optimization with CRLF endings, and correctly handle `--check` too
To prove we can parse these line endings, and to prove that the CRLF ends up in the `RStringValue`
@DavisVaughan DavisVaughan force-pushed the feature/no-cli-normalize branch from 985bf67 to c6bf4b8 Compare January 7, 2025 19:46
@DavisVaughan DavisVaughan merged commit f41a27e into main Jan 7, 2025
4 checks passed
@DavisVaughan DavisVaughan deleted the feature/no-cli-normalize branch January 7, 2025 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CRLF makes --check report a change while there is nothing to modify Dont normalize line endings in the cli
2 participants