[en] improve rules #11191
Walkthrough
This pull request introduces modifications to LanguageTool's English language modules. The changes involve expanding the spell-checking vocabulary by adding new words to the
Changes
Actionable comments posted: 0
🧹 Nitpick comments (3)
languagetool-language-modules/en/src/main/resources/org/languagetool/resource/en/hunspell/spelling.txt (2)
Line range hint 656-667: Consider grouping related terms together.
For better maintainability, consider grouping related terms together in the file:
- Anatomical terms: "tympanic", "tympanum", "tympanums", "tympana"
- Religious terms: "Ebionite", "Ebionites", "Ebionism", "docetic", "docetism"
- Mathematical terms: "meromorphic"
- Brand names: "VisionTek"
Line range hint 656-667: Consider adding related mathematical terms.
For completeness, consider adding related mathematical terms:
- "holomorphic" (related to meromorphic functions)
- "analytic" (in the complex analysis context)
languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml (1)
106873-106873: Consider tightening the chunk_re pattern for better precision.
The changes to SPEND_IT_FOR are good overall, with one caveat:
- The extended antipattern for reflexive pronouns is comprehensive
- However, the chunk_re=".-VP" pattern is too loose and might match unintended verb phrases
Consider using a more specific pattern like chunk_re="B-VP.*" to ensure "spend" starts a verb phrase.
Also applies to: 106877-106877
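To make the difference between the two patterns concrete, here is a minimal Python sketch. It assumes the chunk_re value is matched against the whole chunk tag (full-match semantics) and uses a hypothetical list of tags that follows the B-/I-/E- prefix convention; neither the tag list nor the helper `matching_tags` comes from the pull request.

```python
import re

# Hypothetical chunk tags, for illustration only (not taken from the PR).
CHUNK_TAGS = ["B-NP-singular", "B-VP", "I-VP", "E-VP", "B-PP"]


def matching_tags(pattern, tags=CHUNK_TAGS):
    """Return the tags that the pattern matches in full (assumed semantics)."""
    compiled = re.compile(pattern)
    return [tag for tag in tags if compiled.fullmatch(tag)]


# ".-VP" accepts any single-character prefix, so it also covers tokens
# that continue or end a verb phrase.
loose = matching_tags(".-VP")

# "B-VP.*" only accepts tags that begin a verb phrase.
strict = matching_tags("B-VP.*")

print("loose: ", loose)
print("strict:", strict)
```

Under these assumptions, the loose pattern accepts every verb-phrase tag in the list, while the stricter one accepts only the tag that begins a verb phrase.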
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
languagetool-language-modules/en/src/main/resources/org/languagetool/resource/en/hunspell/spelling.txt (1 hunks)
languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Analyze (java-kotlin)
🔇 Additional comments (8)
languagetool-language-modules/en/src/main/resources/org/languagetool/resource/en/hunspell/spelling.txt (5)
656-656: LGTM! Mathematical term correctly added.
The term "meromorphic" is a valid mathematical term related to complex analysis and function theory.
Line range hint 657-660: LGTM! Anatomical terms are complete and well-formed.
The anatomical terms "tympanic", "tympanum", "tympanums", and "tympana" form a complete set with both singular and plural forms (including the Latin plural "tympana").
Line range hint 662-664: LGTM! Religious terms are complete and well-formed.
The terms "Ebionite", "Ebionites", and "Ebionism" form a complete set covering both the adherents and the belief system.
Line range hint 665-666: LGTM! Religious terms are properly added.
The terms "docetic" and "docetism" are correctly added, representing both the adjective and the noun forms of this theological concept.
Line range hint 661-661: Verify brand name capitalization.
Please verify that "VisionTek" follows the company's official capitalization style.
✅ Verification successful
Brand name "VisionTek" appears to be correctly capitalized
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check official brand name capitalization
# Search for any variations of the brand name in documentation or official sources
rg -i "visiontek|vision-?tek"

Length of output: 259
Script:
#!/bin/bash
# Search for any documentation files containing the brand name
fd -e md -e txt -e html README -x rg -i "visiontek|vision-?tek" {}

Length of output: 70
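As a rough illustration of what the search pattern in the scripts above does and does not catch, the following Python sketch applies the same case-insensitive regular expression to a few sample spellings. The sample strings are made up for this example and are not output from the repository.

```python
import re

# Same pattern as the rg invocations above, compiled case-insensitively
# to mirror the -i flag.
BRAND_PATTERN = re.compile(r"visiontek|vision-?tek", re.IGNORECASE)

# Hypothetical spellings, chosen only to exercise the pattern.
samples = ["VisionTek", "visiontek", "Vision-Tek", "Vision Tek", "Visiontech"]

hits = [s for s in samples if BRAND_PATTERN.search(s)]
misses = [s for s in samples if not BRAND_PATTERN.search(s)]

print("matched:    ", hits)
print("not matched:", misses)
```

The second alternative, `vision-?tek`, already covers the unhyphenated form, so the first alternative is redundant. Spellings with a space or a different ending are not matched, which means the search only confirms how the name is spelled inside the repository, not the company's official style.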
languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml (3)
106817-106824: Well-structured improvements to the AT_THE_JOB rule!
The changes enhance the rule's precision by:
- Adding valid antipatterns for "shot at" and "remain/stay at"
- Using chunk_re for better noun phrase detection
- Including comprehensive examples that validate the changes
Also applies to: 106828-106828, 106834-106835
106839-106842: Good addition of the "what about" antipattern!
The antipattern effectively prevents false positives for valid question constructions while maintaining the rule's ability to catch incorrect "analysis about" usage.
106914-106914: Verify the impact of removing contraction support.
Removing regexp='yes' and the 'm' option from the token pattern might cause the rule to miss contractions like "I'm", "you're", etc.
Run this script to check for potential impact: