Releases: Esukhia/bordr

v0.1.4

21 Apr 09:26

Fix

  • readme: Documented latest change (2056672)
  • RDRPOSTagger: Sdict content changed according to botok segmentation (c87936b)

Bugfix

19 Nov 16:14

0.1.3 - 20191119

Fixed

  • the printout of rules being built is now written to the log

Bugfix

19 Nov 14:43

0.1.2 - 20191119

Changed

  • remove error catching in rdr()

bugfix

19 Nov 14:17

0.1.1 - 20191119

Fixed

  • bad parse didn't trigger error

initial release

19 Nov 14:14

0.1.0 - 20191119

Removed

  • all language data
  • all InitialTaggers except for a generic one adapted to Tibetan
  • ExtRDRPOSTagger is removed because it won't be used in the foreseeable future

Changed

  • InitialTagger becomes InitialTagger4Bo and incorporates support of syllable-suffixes
  • all encodings turned from utf-8 to utf-8-sig
  • all print messages are wrapped into logs or msgs that are returned by the methods/functions
  • absolute imports are turned into relative imports (the absolute-import hack is no longer needed for PyPI packages)
  • RDRPOSTagger/run() becomes rdr() and all CLI arguments/options are turned into arguments of rdr()
  • temporary files used by RDRPOSTagger are not deleted at train time
  • black is run on the whole codebase
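
For readers unfamiliar with the utf-8 to utf-8-sig change listed above, here is a minimal sketch using only the Python standard library. It is not taken from the bordr codebase, and the sample string is purely illustrative. It shows that utf-8-sig prepends a byte order mark (BOM) when encoding, strips it when decoding, and still reads BOM-less input.

```python
import codecs

# Illustrative sample text only; not data shipped with bordr.
text = "བོད་ཡིག"

# Encoding with utf-8-sig prepends the UTF-8 byte order mark (BOM).
with_bom = text.encode("utf-8-sig")
has_bom = with_bom.startswith(codecs.BOM_UTF8)

# Decoding those bytes as plain utf-8 keeps the BOM as a leading U+FEFF.
decoded_plain = with_bom.decode("utf-8")

# Decoding as utf-8-sig strips the BOM again.
decoded_sig = with_bom.decode("utf-8-sig")

# utf-8-sig also accepts input that has no BOM at all.
decoded_no_bom = text.encode("utf-8").decode("utf-8-sig")

print(has_bom, decoded_plain == text, decoded_sig == text, decoded_no_bom == text)
```

One practical consequence: files written with a BOM by other tools can be read without a stray U+FEFF attaching itself to the first token.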

Added

  • evaluate() function added to Utility/Eval.py; depending on whether the fullDictFile arg is present or absent,
    it runs either computeAccuracies() or computeAccuracy()
  • interface into RDRPOSTagger: rdr(), evaluate(), NUMBER_OF_PROCESSES and THRESHOLD are exposed
    at the root level of bordr: the two constants configure RDRPOSTagger, rdr() trains and tags data,
    and evaluate() checks the performance of the trained model
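
The dispatch described for evaluate() can be illustrated with a small, self-contained sketch. Everything below is hypothetical: the function names, the in-memory (word, tag) lists, and the known_words set are stand-ins chosen for illustration and are not bordr's actual signatures, which operate on files. The sketch also assumes that the presence of the optional argument selects the known/unknown breakdown, which the notes above do not state explicitly.

```python
def overall_accuracy(gold, tagged):
    """Hypothetical stand-in for a single overall accuracy (percent)."""
    correct = sum(g == t for g, t in zip(gold, tagged))
    return 100.0 * correct / len(gold)


def split_accuracies(known_words, gold, tagged):
    """Hypothetical stand-in returning (known, unknown, overall) accuracies."""
    pairs = list(zip(gold, tagged))
    known = [(g, t) for g, t in pairs if g[0] in known_words]
    unknown = [(g, t) for g, t in pairs if g[0] not in known_words]

    def pct(subset):
        # Avoid dividing by zero when a subset is empty.
        return 100.0 * sum(g == t for g, t in subset) / len(subset) if subset else 0.0

    return pct(known), pct(unknown), overall_accuracy(gold, tagged)


def evaluate_sketch(gold, tagged, known_words=None):
    """Dispatch on the optional argument (assumed direction, see lead-in)."""
    if known_words is None:
        return overall_accuracy(gold, tagged)
    return split_accuracies(known_words, gold, tagged)


# Toy data: one of four tokens is mistagged.
gold = [("a", "N"), ("b", "V"), ("c", "N"), ("d", "V")]
tagged = [("a", "N"), ("b", "N"), ("c", "N"), ("d", "V")]

print(evaluate_sketch(gold, tagged))
print(evaluate_sketch(gold, tagged, known_words={"a", "b"}))
```

Keeping one entry point with an optional argument means callers do not need to know which of the two underlying functions applies to their data.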