
Releases: prohippo/pyelly

Improve Vocabulary Table Lookup

22 Mar 07:12

PyElly was failing to recognize the plurals of hyphenated terms like NAIL-BITERS. The fix limits a vocabulary table search key at the first hyphen when that hyphen comes before any space in the input text. The "marking" and "indexing" integration test files were changed to be consistent with the new PyElly output.
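The key-limiting rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `limit_search_key` is hypothetical, and "limiting" is read here as cutting the key at the hyphen. It is not the actual PyElly code.

```python
def limit_search_key(text):
    """Return the portion of text to use as a lookup key.

    Hypothetical helper, not part of PyElly. If a hyphen occurs before
    the first space (or there is no space at all), the key is cut at
    that hyphen; otherwise the text is returned unchanged.
    """
    hyphen = text.find('-')
    space = text.find(' ')
    if hyphen >= 0 and (space < 0 or hyphen < space):
        return text[:hyphen]
    return text
```

With a key cut this way, NAIL-BITERS and NAIL-BITER would both start their lookup from the same key, so that the plural ending can be dealt with separately.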

Extend FSA Capabilities

22 Feb 08:21

This changes the PyElly FSA algorithm to allow a token string to be split up. A problem in handling the € symbol was fixed. The "marking" example application rules were extended and cleaned up. The PyElly User's Manual was broadly updated, and Appendix G on Unicode was added.

Major Code Cleanup

18 Feb 08:21

Add the Unicode hyphen to the PyElly character set. Add definitionLine recognition of \H as the Unicode hyphen, and rework the code so that it can be used in vocabularyTable rule definitions. Revise the example application *.v.elly files to work with the new vocabularyTable. Improve macroTable commentary and debugging statements. Extend the "marking" example application. Extend and revise documentation.

Major Overhaul of Vocabulary Table Operation

13 Feb 18:54

This cleans up ellyChar conversion of Unicode to ASCII, which is required in nameRecognition and in vocabularyTable. The generation of SQLite search keys for vocabulary entries was cleaned up. A bug in generating temporary rules for vocabulary entries was fixed. More extending of "marking" rules.
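As a rough illustration of what a Unicode-to-ASCII conversion involves, here is a minimal sketch that uses Python's standard `unicodedata` module. This is a swapped-in technique for illustration only: how ellyChar actually performs its conversion is not described in these notes, and the function name `to_ascii` is hypothetical.

```python
import unicodedata

def to_ascii(text):
    """Approximate text in ASCII (hypothetical helper, not ellyChar).

    NFKD normalization splits an accented letter into its base letter
    plus combining marks; encoding with errors='ignore' then drops
    every character that has no ASCII form.
    """
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')
```

Note that this approach silently drops characters with no ASCII decomposition, which a real vocabulary lookup would probably need to handle more carefully.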

Various Bug Fixes, Extend "marking" rules

08 Feb 08:25

Continue changes to PyElly to address issues uncovered in processing "wild" text from the Web. These include changes in vocabulary lookup, macro substitution patterns, handling of non-standard representations of right double quotation marks, and English suffix recognition.

Bug Fix in Stop Exceptions, More Error Checking of Vocabulary Rules

05 Feb 19:47

This continues the upgrading of PyElly code as it is tested on more "wild" Web text. Faulty stop exception logic was replaced and unit testing was extended. Vocabulary loading now has more error checking to identify problems.

Vocabulary Bug Fix and Diagnostics for Parse Tree Overflow

30 Jan 20:55

This provides information on a parse overflow by showing the list of generated tokens, which makes it easier to diagnose the underlying problem. It also fixes a bug that occurred when both an uninflected and an inflected form of a term are in a vocabulary table. More progress was made on the MARKING example application.

Add Extractor for Time Period References

06 Jan 15:36

Time references like "early Thursday evening" are easily recognized, and extracting them can greatly help in avoiding parse tree overflows when translating long sentences from news stories. This was implemented with Python code, which is a little more readable than using the PyElly FSA.
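To give a feel for this kind of extractor, here is a minimal regular-expression sketch in Python. It is not the PyElly implementation: the word lists, the pattern, and the name `find_time_references` are all assumptions chosen for illustration, and they cover only day-of-week phrases.

```python
import re

# Hypothetical word lists; a real extractor would cover far more cases.
_PART = r'(?:early|late|mid)'
_DAY = r'(?:Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)'
_TIME = r'(?:morning|afternoon|evening|night)'

# Optional qualifier, required day name, optional time of day.
TIME_REF = re.compile(
    r'\b(?:' + _PART + r'\s+)?' + _DAY + r'(?:\s+' + _TIME + r')?\b',
    re.IGNORECASE)

def find_time_references(text):
    """Return all day-based time references found in text, in order."""
    return [m.group(0) for m in TIME_REF.finditer(text)]
```

Collapsing such a phrase into a single token before parsing is what reduces the number of parse tree nodes for a long sentence.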

Clean Up Punctuation Handling

01 Jan 20:06

Corrected problems with default punctuation rules and improved documentation. Added commentary to punctuationRecognizer.py to explain better what is going on there.

Handle M Dash, Vocabulary Table Code Cleanup

31 Dec 08:29

The change was mainly to make it possible for a vocabulary table entry to start with an m dash. It then becomes easier to describe certain semi-parenthetical expressions with m dashes. The compile() method for the VocabularyTable class was renamed build() to make the code more Pythonic. More rules were added to the "marking" language definition.