Commit
rules & disallowed words for Catalan (#42)
* rules & disallowed words for Catalan

* ca: more disallowed symbols

* update Catalan config

* update Catalan settings

* missing comma

* fix escape

* Catalan: update rules & disallowed words

* +disallowed symbols

* Catalan: update rules
jaumeortola authored and Gregor committed Aug 13, 2019
1 parent a23abce commit 1ffde0a
Showing 2 changed files with 1,481,997 additions and 0 deletions.
45 changes: 45 additions & 0 deletions src/rules/catalan.toml
@@ -0,0 +1,45 @@
min_trimmed_length = 6
min_word_count = 2
max_word_count = 15
min_characters = 6
may_end_with_colon = false
quote_start_with_letter = true
needs_punctuation_end = true
needs_letter_start = true
needs_uppercase_start = true
disallowed_symbols = [
'<', '>', '+', '*', '\', '#', '@', '^', '[', ']', '(', ')', '/', '=',
'Å', 'Ł', 'ś', 'ń', 'ź', 'ñ', 'š', 'ö', 'â', 'ë', 'ä', 'ê',
'α', 'β', 'Γ', 'γ', 'Δ', 'δ', 'ε', 'ζ', 'η', 'Θ', 'θ', 'ι', 'κ',
'Λ', 'λ', 'μ', 'ν', 'Ξ', 'ξ', 'Π', 'π', 'ρ', 'Σ', 'σ', 'ς', 'τ',
'υ', 'Φ', 'φ', 'χ', 'Ψ', 'ψ', 'Ω', 'ω',
'', 'בְ', 'ɛ', 'ɔ',
'$', '€',
'б', 'м', 'ы', 'л',
'"', '«', '»', '“', '”', '„', '“', '•', '‘'
]
broken_whitespace = ["  ", " ,", " .", " ?", " !", " ;"]
abbreviation_patterns = [
"[A-Z]+\\.* ?[A-Z]",
"[A-Z][A-Z]+",
"[\\s'‘][A-Z]\\b",
"aC", "dC",
" ex\\.",
" .\\.",
"\\betc.",
"\\..",
"[;,]$",
"El següent diagrama mostra les poblacions més properes\\.",
"És inofensiu per als humans\\.",
"Ha estat doblada al català\\.",
"Es troba amenaçada d'extinció per la pèrdua del seu hàbitat natural\\.",
"Està amenaçada d'extinció per la pèrdua del seu hàbitat natural\\.",
"És un peix d'aigua dolça i de clima tropical\\.",
"Viu en zones de clima tropical\\.",
"Cristal·litza en el sistema monoclínic\\.",
"És ovípar\\.",
"Cap de les famílies estaven per davall del llindar de pobresa\\.",
"Cristal·litza en el sistema ortoròmbic\\.",
"És un fragment de la llista d'asteroides completa\\.",
"Es troba a la Xina\\."
]
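
The rules above can be read as a chain of rejection filters. The following is a minimal Python sketch of that reading, not the extractor's actual implementation: the `RULES` dict copies only a subset of the values from `catalan.toml`, the function name `is_acceptable` is hypothetical, and both the order of the checks and the set of accepted end punctuation (`.`, `?`, `!`) are assumptions. It treats every entry of `abbreviation_patterns` as a regular expression that rejects a sentence when it matches anywhere.

```python
import re

# Hypothetical subset of the values in catalan.toml, copied here for illustration.
RULES = {
    "min_trimmed_length": 6,
    "min_word_count": 2,
    "max_word_count": 15,
    "needs_uppercase_start": True,
    "needs_punctuation_end": True,
    "disallowed_symbols": ["<", ">", "+", "*", "\\", "#", "@", "$", "€"],
    "broken_whitespace": ["  ", " ,", " .", " ?", " !", " ;"],
    "abbreviation_patterns": [
        r"[A-Z]+\.* ?[A-Z]",   # e.g. "U.S. A", "A B"
        r"[A-Z][A-Z]+",        # acronyms such as "ONU"
        r"aC", r"dC",          # era markers
        r" ex\.",
        r"\betc.",
        r"[;,]$",              # sentence ends in ';' or ','
        r"És ovípar\.",        # one of the boilerplate sentences listed above
    ],
}


def is_acceptable(sentence, rules=RULES):
    """Return True if the sentence passes every (assumed) check."""
    trimmed = sentence.strip()
    if len(trimmed) < rules["min_trimmed_length"]:
        return False
    words = trimmed.split()
    if not rules["min_word_count"] <= len(words) <= rules["max_word_count"]:
        return False
    if rules["needs_uppercase_start"] and not trimmed[0].isupper():
        return False
    # Assumption: "punctuation end" means one of '.', '?' or '!'.
    if rules["needs_punctuation_end"] and trimmed[-1] not in ".?!":
        return False
    if any(symbol in trimmed for symbol in rules["disallowed_symbols"]):
        return False
    if any(ws in trimmed for ws in rules["broken_whitespace"]):
        return False
    # A sentence is rejected as soon as any pattern matches anywhere in it.
    return not any(re.search(p, trimmed) for p in rules["abbreviation_patterns"])


if __name__ == "__main__":
    for s in ["El gat dorm al sofà.", "És ovípar.", "La seu de la ONU és a Nova York."]:
        print(s, is_acceptable(s))
```

Under these assumptions an ordinary sentence passes, while the fixed boilerplate sentences at the end of `abbreviation_patterns` are rejected by exact match and acronyms are rejected by the `[A-Z][A-Z]+` pattern.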