[crh] rules update (cyrillization) #10906
Walkthrough
The pull request introduces multiple new rule groups to enhance grammar checking for the Crimean Tatar language in both Latin and Cyrillic scripts. These include rules for colloquial typos, toponymy, adverb affixes, question particle placement, and verb case collocation. It also makes minor updates to existing rules for improved clarity and accuracy.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (6)
languagetool-language-modules/crh/src/main/resources/org/languagetool/rules/crh/grammar.xml (6)
291-320: LGTM! Consider improving comment formatting.
The new LASTING_ADVERB_AFFIX_CYR rule group is a well-implemented Cyrillic version of the existing Latin script rule. The patterns, suggestions, and examples are correctly transliterated and appropriate.
Consider adding a blank line before and after the rule group for improved readability, consistent with the formatting of other rule groups in the file.
339-355: LGTM! Consider improving comment formatting.
The new COLLOQUIAL_TYPOS_CYR rule group is a well-implemented Cyrillic version of the existing Latin script rule. The patterns, suggestions, and examples are correctly transliterated and appropriate.
Consider adding a blank line before and after the rule group for improved readability, consistent with the formatting of other rule groups in the file.
366-374: LGTM! Consider improving comment formatting.
The new TOPONOMY_TYPOS_CYR rule group is a well-implemented Cyrillic version of the existing Latin script rule. The patterns, suggestions, and examples are correctly transliterated and appropriate.
Consider adding a blank line before and after the rule group for improved readability, consistent with the formatting of other rule groups in the file.
472-501: LGTM! Consider improving comment formatting.
The new COMPLEX_NUMBER_DEFIS_MISSING_CYR rule group is a well-implemented Cyrillic version of the existing Latin script rule. The patterns, suggestions, exceptions, and examples are correctly transliterated and appropriate. The new exceptions added to the Latin script version are also correctly included here.
Consider adding a blank line before and after the rule group for improved readability, consistent with the formatting of other rule groups in the file.
531-550: LGTM! Consider improving comment formatting.
The new QUESTION_PARTICLE_SEPARATION_CYR rule group is a well-implemented Cyrillic version of the existing Latin script rule. The patterns, suggestions, and examples are correctly transliterated and appropriate.
Consider adding a blank line before and after the rule group for improved readability, consistent with the formatting of other rule groups in the file.
570-587: LGTM! Consider improving comment formatting.
The new QUESTION_PARTICLE_PERSONALIZED_MISSPOS_CYR rule group is a well-implemented Cyrillic version of the existing Latin script rule. The patterns, suggestions, and examples are correctly transliterated and appropriate.
Consider adding a blank line before and after the rule group for improved readability, consistent with the formatting of other rule groups in the file.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- languagetool-language-modules/crh/src/main/resources/org/languagetool/rules/crh/grammar.xml (12 hunks)
🔇 Additional comments (2)
languagetool-language-modules/crh/src/main/resources/org/languagetool/rules/crh/grammar.xml (2)
Line range hint 442-471: LGTM! Improved rule accuracy with new exceptions.
The modifications to the COMPLEX_NUMBER_DEFIS_MISSING rule group add valuable exceptions for specific cases (1C, numbers with apostrophes or 'k', and temperature notations). These changes improve the rule's accuracy and are consistently applied across both rules in the group.
Line range hint 1-1231: Overall, excellent additions to support Cyrillic script!
The changes in this file significantly enhance the grammar checking capabilities for Crimean Tatar by adding Cyrillic versions of existing Latin script rules. The new rule groups are well-implemented, consistent with the existing structure, and cover important aspects such as lasting adverb affixes, colloquial typos, toponymy, complex numbers, and question particle usage.
The modifications to existing rules, such as the COMPLEX_NUMBER_DEFIS_MISSING group, improve rule accuracy by adding relevant exceptions.
To further improve the file:
- Consider adding blank lines before and after each rule group for better readability.
- Ensure consistent indentation across all rule groups.
- Consider adding comments to explain the purpose of each rule group, especially for the newly added Cyrillic versions.
These changes will greatly benefit users of the Crimean Tatar language module in LanguageTool, providing more comprehensive grammar checking for both Latin and Cyrillic scripts.
Cyrillized rules and rulegroups:
EXODIVE_VERB_ADVB_COLLOCATION
POSTPOSITION_CASE_COLLOCATION
TENSE_PARTICLE_CASE_COLLOCATION
REASON_PARTICLE_CASE_COLLOCATION
SIMPLE_INFINITIVE
QUESTION_PARTICLE_PERSONALIZED_MISSPOS
QUESTION_PARTICLE_SEPARATION
COMPLEX_NUMBER_DEFIS_MISSING
TOPONOMY_TYPOS
COLLOQUIAL_TYPOS
LASTING_ADVERB_AFFIX
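A quick way to double-check the list above is to scan grammar.xml for a Cyrillic counterpart of each rule ID. The sketch below is only an illustration, not part of the PR: the function name `missing_cyrillic_counterparts` is hypothetical, and it assumes every Cyrillic version uses the `_CYR` suffix. The review confirms that suffix for six of the rule groups only, so treat it as an assumption for the rest. It also assumes the file parses with Python's standard XML parser.

```python
import xml.etree.ElementTree as ET

# Rule and rule group IDs copied from the list above.
LATIN_RULE_IDS = [
    "EXODIVE_VERB_ADVB_COLLOCATION",
    "POSTPOSITION_CASE_COLLOCATION",
    "TENSE_PARTICLE_CASE_COLLOCATION",
    "REASON_PARTICLE_CASE_COLLOCATION",
    "SIMPLE_INFINITIVE",
    "QUESTION_PARTICLE_PERSONALIZED_MISSPOS",
    "QUESTION_PARTICLE_SEPARATION",
    "COMPLEX_NUMBER_DEFIS_MISSING",
    "TOPONOMY_TYPOS",
    "COLLOQUIAL_TYPOS",
    "LASTING_ADVERB_AFFIX",
]


def missing_cyrillic_counterparts(grammar_xml, latin_ids=LATIN_RULE_IDS, suffix="_CYR"):
    """Return the IDs from latin_ids that have no ID + suffix counterpart.

    The "_CYR" suffix is an assumed naming convention (hypothetical helper).
    """
    root = ET.fromstring(grammar_xml)
    # Collect the id attribute of every <rule> and <rulegroup> element.
    found = {
        el.get("id")
        for el in root.iter()
        if el.tag in ("rule", "rulegroup") and el.get("id")
    }
    return [rid for rid in latin_ids if rid + suffix not in found]


# Tiny inline sample standing in for the real grammar.xml.
sample = """<rules lang="crh">
  <category id="GRAMMAR" name="Grammar">
    <rulegroup id="TOPONOMY_TYPOS" name="latin"/>
    <rulegroup id="TOPONOMY_TYPOS_CYR" name="cyrillic"/>
    <rule id="SIMPLE_INFINITIVE" name="latin"/>
  </category>
</rules>"""

print(missing_cyrillic_counterparts(sample, ["TOPONOMY_TYPOS", "SIMPLE_INFINITIVE"]))
# prints ['SIMPLE_INFINITIVE']
```

To run it against the real file, read grammar.xml into a string and pass it as the first argument; an empty result would mean every listed ID has a `_CYR` twin under the assumed convention.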
Summary by CodeRabbit