Merge SV files to main #267

NSoiffer · 2024-05-21T23:13:51Z

No description provided.

* added commented-out rules for new unicode chemistry arrows -- more work needs to be done to verify them.
* Update unicode-full.yaml
* Changed pronunciation of inverse function.
* Update unicode-full.yaml
* Update unicode-full.yaml
* Move the testing files to a tw subdir. Modify the tests to reflect the new location/name.
* Update unicode-full.yaml
* Fixes #217
  This bug had two issues:
  1. The chemistry code threw out ids on elements that were marked as `data-change="added"`. However, if they had an id already, then they had been parsed, so there is no need to reparse and lose the id.
  2. Moving into the base of a script that had parens passed in the contents of the parens instead of the parens. This caused the test that determines whether in base or script to fail.

  This fix has to be made to all languages (but doesn't involve speech).
* Additional Nemeth tests from other sources, mainly from bugs/issues
* Added translations for some provisional chemistry equilibrium characters to be added to Unicode
* fix clippy warnings
* Update unicode-full.yaml
* Update unicode-full.yaml
* Update unicode-full.yaml
* Fixed pronunciation issues concerning logarithms
* Update unicode-full.yaml
* Update unicode-full.yaml
* Update ClearSpeak_Rules.yaml
* Update unicode-full.yaml
* Update unicode-full.yaml
* Update unicode-full.yaml
* Add clippy to what runs after a push or PR
* Fix clippy warnings that only show up when run from push (!?)
* added acknowledgement for Chinese
* Added "ReplaceAll" string function
  Fixes #235
* added speech output for Roman numerals
* comment out debug
* Another look at the CSS whitespace collapse rules and adjusted code and tests appropriately.
  Whitespace is:
  * collapsed in the middle of text
  * if there are embedded elements, the code just grabs their text, so any whitespace around them should be collapsed
  * removed at the start/end of tokens

  As an example:
  ```
  <math><mn>1</mn> <mtext>a<x> <y>first</y><foo></foo><y> x second</y><y>third</y> </x>last </mtext> <mi>y</mi></math>
  ```
  should display as `1 a first x secondthird last y`. Note that the empty `foo` element doesn't add whitespace. Also note that ` x second` has a single space before "x" (the `y` contents merge into the `mtext`) and the spaces after the "x" are collapsed to a single space.
  Fixes #242
* Added `mn` to tags allowed to be marked as MAYBE_CHEMISTRY. It should have the value of 0.
* bump version number
* Work on Vietnam braille rules
  Modulo some feedback, this fixes a number of bugs:
  * Fixes #238
  * Fixes #237
  * Fixes #236

  There is now just one braille test failure (modulo feedback): test_3a which is a `.`/`,` problem/ambiguity
* Add data-number to all Roman numerals. This allows speech to speak the number as opposed to the letters if desired. This was partially done before for chemistry valences.
  Fix two bugs in the test for Roman numerals:
  * the code should have looked at preceding and following tokens (it only looked in one direction)
  * changed the list of acceptable operators to include `=`, `<`, etc. Removed `,` and `.` because otherwise intervals such as `c, d` are accepted.
* added crate to convert roman numerals to ints
* added Vietnam tests
* Fix #240 -- speaking roman numerals inside and outside of chemistry
* Added proposed unicode characters for chemistry
  Added "1" that was missing from some values.
* Added a second `definitions.yaml` file to `PreferenceManager` -- this one for braille. Currently this is supported via a hack that does all the work of reading in the *Languages* version of the def file based on what language is associated with a braille code. This system is a hack. #12 has a suggestion of a better way to do it.
  But that depends upon #11 and possibly some changes to the speech rule implementation. For now, this fixes a bug where the "Vietnam" braille code doesn't work if the speech language is not Vietnamese.
* bump version #
* Update unicode-full.yaml
* Update unicode-full.yaml
* Add another file
* add some python checking flags
* Add left/right angle bracket chars with same defs as other left/right angle bracket chars
* remove old comment
* Add accented vowels to rules to check for putting together strings of characters into a single `mi`
* "㏇" had an almost certainly bad translation, but I can't figure out what it should really be. It is in the CJK block and will never show up. I've changed it to a descriptive "cap C, o, period" reading.
* Update unicode-full.yaml
* Some Unicode fixes
  Some Unicode fixes, including e.g. arrows, fonts & greek pronunciation
* Remove duplicate proposed unicode chars.
* Fix #244: bug with spacing and prefix "~"
* Fix #245
  Change the support for '𝟙' in the unicode files to pull the G1 word indicator to the start of the word in which it resides.
* Fix #245
  Change the support for '𝟙' in the unicode files to pull the G1 word indicator to the start of the word in which it resides.
* Left this file out as part of previous commit for fixing #245.
* Some minor fixes.
* Add more cases for $Impairment -- added for English only so far. Did a little tweaking to the pauses.
  Fixes #194
  Added some tests for this and also tweaked the expected results to reflect the pausing tweaks.
  From UEB branch accidental commit.
* Update unicode-full.yaml
* Add a check for prefix `~` to make sure it is in an mrow. Add a test for it.
* Update unicode.yaml
* Changed some operators to the definitive form.
* Made fixes of pronunciations.
* Update unicode-full.yaml
* Update unicode-full.yaml
* Update unicode-full.yaml
* Update unicode-full.yaml
* Update unicode-full.yaml
* update timeline and add to acknowledgements
* timeline tweaks
* timeline tweaks
* Added pauses before fractions to reduce ambiguity.
* Bug fixing after user testing.
* Update general.yaml
* Update unicode-full.yaml
* Part one of major changes to preferences and reading files.
  This brings include files to the definitions.yaml file
  This gets rid of the max three locations for the file and depends on using includes.
  This is not working yet. I need to fix the order of calling done.
  The next big step is to move the files used out of preferences (just leaving the head) and move the timestamps into the rules.
* Finished up redoing prefs, tracking included files, and making sure all files are up to date
  It passes all the tests now.
  The definitions files were modified to allow hashmaps, along with existing vectors and hashsets. Also "include" is now allowed. I changed the format of the defs files some.
  Also, there is a separate defs file for braille -- currently they just point to the corresponding defs file in the language dir, but these can be cleaned up and some definitions moved out of the common file and into a common braille file or specific file.
  Fixes: #12
  Fixes: #13
* remove extra set_rules_dir calls
* Tweaked setting up the preferences. This wasn't quite right if initialize() was called twice because the code was trying to be too efficient.
* Adding Swedish braille.
* Two big speedups:
  1. CanonicalizeContext creates several Regexs that are based on preference values. Hence, they get created every time a context is created. Creating CanonicalizeContext used to be cheap, so it was done to check if an operator had a property such as being a fence. This was the cause of the slowdown identified in #248. The fix is to avoid the creation of CanonicalizeContext and make find_operator a static class method.
  2. The newly introduced preference and FileAndTime code had redundancies with the "update()" and "read_files()" calls. This resulted in a lot of extra system calls to check the time. This was maybe a 20% speed up.

  Fixes #248
* Caching the regex patterns that are created when creating a new CanonicalizeContext -- these are pref-based so they can't be static. The speed difference is > 70% in CanonicalizeContext::new() and maybe 30% overall for set, speech, and braille(!).
* Add a preference `CheckRuleFiles` with values `All`, `Prefs`, and `None` to tell MathCAT whether it should check the Rule files on every call (set_mathml, get speech/braille, or navigate) to see if they have changed.
  Fixes #249
* fix 'up-to-date-test' that I tweaked and broke
* Update the documentation with CheckRuleFiles and new timings.
* Changed default for `CheckRuleFiles` from "All" to "Prefs" as that makes the most sense for a default. Updated documentation to match.
* &PathBuf -> &Path for defs file
* better error messages
* Added Finnish and Swedish support into braille.rs so that work can be done on those codes using the main branch.
  Rules/prefs.yaml picks up another option which should be integrated with the UEB option of the same name (need to change Rule files for UEB).
* Changes agreed upon after user testing.
* Changed rule for reading out "bråk/slut bråk" and "division/slut division".
* tweak some grammar
* change Swedish/Finnish timeline
* Add a special case when finding definitions.yaml to not include a base case in the Rules directory. It is missing the required defs for ordinal numbers, etc. As I split the definitions.yaml file up, this check may need to change.
  Fixes #250
  I couldn't find a problem corresponding to the other issue reported in #250 (`overview.yaml`)
* Added include statements in definitions.yaml
* Wrote some code that grabs SRE's "euro" 8-dot braille maps and converts them to MathCAT's unicode.yaml format. I don't think SRE's mappings make much sense.
  I'll need to wait to hear from some people to see if this was useful or not.
* Update to work with main branch
* Added get_navigation_node_from_braille_position(). Tested for Nemeth, UEB, and CMU.
  Fixes #125
* update version number due to new interfaces
* The pseudo script code was too aggressive for something like `x*y`. The fix is not great because it is looking for a `mo` that follows `*` (ignoring `(`, `[`, and `{`). This is a weak heuristic and likely needs more thought.
* is_defined_in() needs to know if it should check the braille or speech definitions, so an arg was added. I made the xpath function accept 2 or 3 args, with 2 args meaning "use speech defs" as a temporary compatibility measure rather than changing all the occurrences in the rules -- they do need to change eventually.
* Initial, very incomplete attempt at EuroBraille.
* More work on EuroBraille
  1. There are some problems with spacing
  2. I used latex-symbols.htm to scrape LaTeX <=> Unicode symbols, but it has problems. Need to look elsewhere or do a lot of cleanup
* Added braille-to-ascii functions for EuroBraille
  More scraping of tables for latex
* Mostly complete EuroBraille implementation. There are some bugs with spacing that I need guidance on. Passes 25 of 27, with the 2 fails being spacing issues.
* Some reference starting data
* Changed DefinitionValue so that it returns the key if the key is not in the table
  This means that asking for the value of "km" returns something sensible if "km" is not defined.
* Missed two places for previous update
* Fixed the munder and mover rules.
* Fixed pausing in munder and mover rules.
* is_defined_in() needs to know if it should check the braille or speech definitions, so an arg was added. I made the xpath function accept 2 or 3 args, with 2 args meaning "use speech defs" as a temporary compatibility measure rather than changing all the occurrences in the rules -- they do need to change eventually.
* The pseudo script code was too aggressive for something like `x*y`. The fix is not great because it is looking for a `mo` that follows `*` (ignoring `(`, `[`, and `{`). This is a weak heuristic and likely needs more thought.
* After merging in a fix to pseudo scripts, testing turned up a bug. Now fixed.
* Removed the pre-check that the definitions.yaml file entries for ordinals exist. Instead, the "None" value gets propagated up to the xpath function that then returns the "raw" number. Seems like a better fallback.
* Added "LanguageAuto" to support automatic language switching. Strangely, I didn't modify MathCATForPython and the auto switching is now working(?!). MathCATForPython should really be setting the "LanguageAuto" preference when the voice changes.
* bump version
* Added code to handle unicode.yaml (which I should have used in the first place)
* Added full set of names based on unicode.xml
  Changed from emitting dots to using text based on feedback
  Renamed "EuroBraille" to "LaTeX"
  Still need to look through name list some more.
* Brought in unicode.xml (full list of all names)
  About 4,000 chars are now defined. I spot checked a bunch of char names and they all seem to be the ones that will be used.
  Still need to deal with menclose.
  Added more tests: 33 out of 35 pass. The two fails are due to spacing, and one of them is almost certainly a bad spec example.
* Brought in unicode.xml (full list of all names)
  About 4,000 chars are now defined. I spot checked a bunch of char names and they all seem to be the ones that will be used.
  Added in short name checks for the chars (LaTeX_UseShortName).
  Still need to deal with menclose.
  Added more tests: 33 out of 35 pass. The two fails are due to spacing, and one of them is almost certainly a bad spec example.
* Removed `"python.analysis.diagnosticMode": "workspace"` -- too many of my scripts have lint issues and this setting buries the warnings for the files I'm editing.
* Somewhat hacked/cleaned up version of wikipedia page
* Added comment about how the unicode files were derived.
* Changed speech for "hat" to use hat only when it is a "modified-variable" (basically, mover). Otherwise, it uses "caret" so that `x^2` (those three chars) sounds better.
* Added menclose rules and tests for them. I needed to make up latex for many of them as there don't appear to be standard names for the strikes and arrows (the closest are in the cancel package, but that is very limited).
* Remove a blank line...
* Updated readme
* Fix tests based on feedback that "spec" has typos
* Add support for ASCIIMath (treated as braille so no intent)
  There are 40 tests, but more are needed.
  LaTeX: added support for moveable limits
  src/xpath_functions.rs: added hashmaps to IsInDefinition and fixed DefinitionValue
* Add support for ASCIIMath (treated as braille so no intent)
  There are 40 tests, but more are needed.
  LaTeX: added support for moveable limits
  src/xpath_functions.rs: added hashmaps to IsInDefinition and fixed DefinitionValue
* Added CopyAs pref for NSoiffer/MathCATForPython#67
* Added `get_navigation_braille()` to support CopyAs LaTeX and ASCIIMath
* In the math alphanumeric block, there are italic chars. Changed to treating these as if they were automatic italics and hence are just regular chars. Also for a few braille codes, the math alphanumerics were part of unicode.yaml instead of unicode-full.yaml. I moved them over.
  Fixes #258
* Fix clippy warnings
* update status
* Add some more info about testing
* Unicode fixes
  More obscure unicode characters translated, plus changed "punkt" to "prick" for Newtonian derivative notation.
* Update main page:
* update current speech and braille translations
* update JAWS info
* update award donation
* fix up problems with dollar signs
* fix up problems with dollar signs, this time with
* Make sure translations are lower case
* update version #
* testing a failure
* Update unicode-full.yaml
  Lots of obscure unicode characters translated as well as currently possible.
* Fix crash when setting a language that doesn't exist
* Try again for crash fix
* add spaces around mfrac -- mixed fractions are an example where no space does the wrong thing
* add mixed fraction test
* add some speech style changes in non-en dir tests
* bump version #
* Checked for alphanumeric digits and converted them to ascii digits (simplifies the ordinal calculation). Fixes #260.
  Also checks for block separators and removes them.
* Minor Unicode fixes
* added ordinal test with non-ascii digits
* I'm not sure if #261 was a real bug -- the test I made had a bug... There is one possible fix and a new test with non-ascii digits. Fixes #261.
* remove init_logger() calls
* Fix #262 -- when resetting the speech style, the old style was being used because the prefs hadn't been updated
  The test for this is too ugly to run in general because the only way I was able to replicate this was to change the prefs.yaml file. In general, I don't want to do this. For testing, I added functionality to some zz rule files.
* remove cut/paste extra bit
* remove useless test left over while debugging
* can't comment out function used in '#[ignore]' test
* Minor unicode fixes
* Fix up lint warnings
  Add hack to deal with broken Nemeth chemistry ascii braille.
* change println! to debug!
* Change the way `set_preference` works.
  The preferences previously were partitioned into API and USER ones. `set_preference` would always write to the API prefs, thereby overriding any user preference. This caused a problem for copy/paste, which needs to change the braille code.
  I changed this so that API and USER prefs never override one another. `set_preference` now calls `set_string_pref` instead of `set_api_pref`. This function sets the API pref if the pref name is an API pref and the USER pref if the pref name is a USER pref.
  Note: user prefs, at least in NVDA, are typically changed via modifying prefs.yaml and the default is to reread that if that file changes.
* improve comment
* Fix casing of default values. The code used to change everything to lowercase but doesn't anymore. The defaults shouldn't matter, but maybe they do in some cases.
* Set the user pref time even when the user pref file isn't there. This avoids rereading the prefs, which wipes out existing prefs.
  Add some more tests of pref setting (didn't catch this problem though)
* Various fixes after QA by mathematician
  More coming
* Fixes after QA
  Some more fixes after QA by mathematician.
* Chemistry (#264)
* Adding new Nemeth chemistry tests
* add chemical element check for no subscripted indicator used
* replace_children was wrong when the parent had a fixed number of children -- the children were never changed to the replacements.
* ELEMENTS_WITH_FIXED_NUMBER_OF_CHILDREN had "mmultiscripts" and "mlongdiv", which do not have a fixed number of elements.
* remove debug statement
* Rule for bullet "•" used wrong dots. See green book p99 or 2.5.1 in revised Nemeth chemistry braille code
* Deal with chemistry case NH_2, where the N and H need to be split, but then the N needs to move to in front of the script. Added test for this case.
* Add ⋅∙• to the chars that don't require a superscript indicator. I found this with Lewis dot chemistry notation tests.
* Fix a test -- shouldn't have been msubsup
* Fix mroot rule when the index is a decimal integer. The case for that was missing.
* removed init_logger from a test
* Update (with lots of bug fixes and new chemistry cases) to match the updated BANA guidelines for Nemeth and chemistry
  Surprisingly, there was a bug/missing code for staggered scripts (I think superscript inside a subscript). That applies to non-chemistry also.
  There are lots of fixes. One of the biggest is that something like HSO_4 wasn't split properly. And if there were prescripts, they weren't handled. If the result wasn't chemistry, I had to significantly improve the code to put it all back together.
  There were some chemistry notations I hadn't seen such as a prescript lewis dot. There are some more chemistry notations to add, but I think I have the more common ones done.
  Added a "get_parent" method to replace some ugly code that shows up a lot.
  All tests pass, including 30 new Nemeth tests.
* Added use of get_parent() in place of uglier code
* removed init_logger
* Add two more binomial notations to inference rules. Add tests for them.
* Italian seed (#263)
* Seeding Italian settings
* Initial Italian translation
  Fixed all files except Unicode and Unicode Full, which will be later handled by scripts

  ---------

  Co-authored-by: NSoiffer
* fix linting problems
* bump version number
* Moved ZIPPED files references so they only matter for the webasm build
* Put back building in the zip files. The python build is failing without them. I need to think of another way to separate the Rules files from the binary (for all but the wasm/mathcatdemo project).
* nothing important -- want to make git happy
* Revert "Moved ZIPPED files references so they only matter for the webasm build"
  This reverts commit 6168866.
* Fix extra line I accidentally added
* comment out debug!
* Spotted a bug in braille highlighting when nothing is highlighted (maybe never happens which is why I didn't see it in practice?)
* Comment out debug!() statements
* Added turning off braille highlights for testing
* Spotted a bug (kind of typo) in braille_mathml() where the calculated end was the number of chars/3 instead of bytes/3. This exposed a problem in finding the highlight where the code got stuck on an invisible char that isn't in the braille. I added a case to detect this and then move on to the right.
* Fixed bug that manifested itself in doc_order() not returning the right order. The problem was that the temporary navigation tree wasn't attached to a "Root" node. I fixed that and added copy_mathml() to prevent the tree structure in the original from being wrong when a math tag is added around the navigation math.
  There was also another spot that needed to be attached to a root. I cleaned up some existing code dealing with this.
* bump version #
* bump version number
* Change ":structure" to ":literal" as per the Math WG's decision on the name.
* Update definitions.yaml
* Update definitions.yaml
* Update definitions.yaml
* Update languages.rs
* Changed language code in test files
* Update shared.rs with some translated tests
* Tests & intervals
  A few tests translated, also fixed the intervals function in ClearSpeak to better reflect the Swedish way of speaking intervals.
* More tests translated
* Update mroot.rs
* Translated SimpleSpeak:functions.rs
* Update geometry.rs
* Translated SimpleSpeak:linear_algebra.rs
* Translated SimpleSpeak:large_ops.rs
* Translated SimpleSpeak:mfrac.rs
* Translated SimpleSpeak:msup.rs
* Translated SimpleSpeak:multiline.rs
* Update functions.rs
* Update unicode.yaml
* More tests translated
* Update functions.rs
* Update functions.rs
* New test results
* Translated SimpleSpeak:sets.rs
* Bug fix functions.rs
* Bug fix linear_algebra.rs
* Final translations of shared.rs
* Translated mtable.rs
* More tests translated
* New test results file
* Bug fix sets.rs
* Final translations in mtable.rs
* More tests translated plus new test results file
* More tests
* Clean up of ClearSpeak_Rules.yaml
* Clean up of SimpleSpeak_Rules.yaml
* More clean up
* Clean up of Unicode-full.yaml
* More clean up
* Last batch of tests translated

---------

Co-authored-by: NSoiffer
Co-authored-by: Tim Arborealis
Co-authored-by: Otto Ewald
Co-authored-by: Dang Hoai Phúc
Co-authored-by: NSoiffer
Co-authored-by: Tnonis90
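
The whitespace rules listed for #242 above (collapse in the middle of text, take only the text of embedded elements, trim at token ends) can be illustrated with a small sketch. This is not MathCAT's implementation: `collapse_token_text` is a hypothetical helper, the token is modeled as a flat list of text pieces rather than a MathML tree, and Rust's `split_whitespace` uses the Unicode definition of whitespace, which is broader than the CSS one.

```rust
/// Hypothetical helper (not part of MathCAT): concatenate the text pieces of a
/// token, then collapse whitespace. Runs of whitespace become a single space,
/// and whitespace at the start/end of the token is removed.
fn collapse_token_text(pieces: &[&str]) -> String {
    // Embedded elements contribute only their text, so plain concatenation
    // models them; an empty element contributes an empty string.
    let joined: String = pieces.concat();
    // split_whitespace() skips leading/trailing whitespace and treats any run
    // of whitespace as one separator.
    joined.split_whitespace().collect::<Vec<&str>>().join(" ")
}
```

Feeding it the pieces of the `mtext` from the example (`"a"`, `" "`, `"first"`, `""`, `" x   second"`, `"third"`, `" "`, `"last "`) gives `a first x secondthird last`: the empty piece adds no whitespace, and `secondthird` stays joined because there is no whitespace between those two pieces.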

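The `braille_mathml()` fix mentioned above (chars/3 versus bytes/3) comes down to how Rust measures strings. Unicode braille patterns (U+2800 to U+28FF) each take three bytes in UTF-8, so dividing the byte length by three gives the number of cells, while dividing the char count by three does not. A minimal sketch, assuming the string contains only braille pattern characters; `braille_cell_count` is a hypothetical name, not a MathCAT function:

```rust
/// Hypothetical helper (not part of MathCAT): number of braille cells in a
/// string that is assumed to contain only Unicode braille pattern characters.
fn braille_cell_count(braille: &str) -> usize {
    // str::len() counts bytes, and each U+2800..=U+28FF char is 3 bytes in UTF-8.
    braille.len() / 3
}

/// Checks the assumption the function above relies on.
fn is_braille_only(text: &str) -> bool {
    text.chars().all(|ch| ('\u{2800}'..='\u{28FF}').contains(&ch))
}
```

For the three-cell string `"⠁⠃⠉"`, `braille_cell_count` returns 3, whereas `"⠁⠃⠉".chars().count() / 3` evaluates to 1, which is the kind of off-by-a-factor end position the commit describes.
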
NSoiffer and others added 5 commits September 29, 2023 21:00

Add Swedish rules

03eb79f

fixed problem with left/right phrase

51323aa

update to make merge to main happy

e072591

Merge branch 'main' into interface-additions

bb9c5b9

NSoiffer merged commit 6e7fecd into main May 21, 2024
4 checks passed
