initial release

@drupchen drupchen released this 19 Nov 14:14
· 20 commits to master since this release

0.1.0 - 20191119

Removed

  • all language data
  • all InitialTaggers except for a generic one adapted to Tibetan
  • ExtRDRPOSTagger is removed because it won't be used in the foreseeable future

Changed

  • InitialTagger becomes InitialTagger4Bo and incorporates support of syllable-suffixes
  • all encodings turned from utf-8 to utf-8-sig
  • all print messages are collected into logs or messages that are returned by the methods/functions
  • absolute imports are turned into relative imports (that hack is no longer needed for PyPI packages)
  • RDRPOSTagger/run() becomes rdr() and all CLI arguments/options are turned into arguments of rdr()
  • temporary files used by RDRPOSTagger are not deleted at training time
  • black is run on the whole codebase
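
The switch from utf-8 to utf-8-sig changes what ends up at the start of each written file. The following is a minimal sketch using only the Python standard library, not code from bordr; the sample text and the file name are made up for illustration. It shows that utf-8-sig writes a byte order mark (BOM) and strips it again on reading, while a plain utf-8 reader keeps it as a leading U+FEFF character.

```python
import codecs
import os
import tempfile

# Illustrative sample text, not taken from the bordr data.
text = "བཀྲ་ཤིས་"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name used only for this demonstration.
    path = os.path.join(tmp, "sample.txt")

    # Writing with utf-8-sig prepends the UTF-8 byte order mark.
    with open(path, "w", encoding="utf-8-sig") as f:
        f.write(text)

    # Inspect the raw bytes to confirm the BOM is there.
    with open(path, "rb") as f:
        raw = f.read()

    # Reading with utf-8-sig removes the BOM transparently.
    with open(path, encoding="utf-8-sig") as f:
        read_sig = f.read()

    # Reading with plain utf-8 keeps the BOM as a leading U+FEFF character.
    with open(path, encoding="utf-8") as f:
        read_plain = f.read()

has_bom = raw.startswith(codecs.BOM_UTF8)
```

Any tool that reads these files with plain utf-8 would therefore need to strip the leading U+FEFF itself.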

Added

  • evaluate() function added to Utility/Eval.py; depending on the presence/absence of the fullDictFile arg,
    it runs either computeAccuracies() or computeAccuracy()
  • interface into RDRPOSTagger: rdr(), evaluate(), NUMBER_OF_PROCESSES and THRESHOLD are exposed
    at the root level of bordr: the two constants configure RDRPOSTagger, rdr() trains and tags data,
    and evaluate() checks the performance of the trained model.
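
The dispatch described for evaluate() can be sketched as follows. This is a hedged illustration, not the code in Utility/Eval.py: every name below (evaluate_sketch, compute_accuracy_stub, compute_accuracies_stub, full_dict) is hypothetical, the stubs work on in-memory lists of (word, tag) pairs instead of files, and it assumes that the presence of the dictionary argument selects the known/unknown-word breakdown.

```python
def _percent(correct, total):
    # Avoid dividing by zero when a category has no words.
    return 100.0 * correct / total if total else 0.0


def compute_accuracy_stub(gold, tagged):
    # Stand-in for an overall accuracy: share of positions with matching tags.
    correct = sum(g[1] == t[1] for g, t in zip(gold, tagged))
    return _percent(correct, len(gold))


def compute_accuracies_stub(full_dict, gold, tagged):
    # Stand-in for a breakdown: words in the dictionary count as known,
    # all other words count as unknown.
    counts = {"known": [0, 0], "unknown": [0, 0]}  # [correct, total]
    for (word, gold_tag), (_, predicted_tag) in zip(gold, tagged):
        key = "known" if word in full_dict else "unknown"
        counts[key][1] += 1
        if gold_tag == predicted_tag:
            counts[key][0] += 1
    return {
        "known": _percent(*counts["known"]),
        "unknown": _percent(*counts["unknown"]),
        "overall": compute_accuracy_stub(gold, tagged),
    }


def evaluate_sketch(gold, tagged, full_dict=None):
    # Assumed dispatch: no dictionary gives a single figure,
    # a dictionary gives the per-category breakdown.
    if full_dict is None:
        return compute_accuracy_stub(gold, tagged)
    return compute_accuracies_stub(full_dict, gold, tagged)


# Illustrative data: one tagging error on the word "c".
gold = [("a", "N"), ("b", "V"), ("c", "N"), ("d", "V")]
tagged = [("a", "N"), ("b", "V"), ("c", "V"), ("d", "V")]

overall = evaluate_sketch(gold, tagged)
breakdown = evaluate_sketch(gold, tagged, full_dict={"a", "b", "c"})
```

A single entry point keeps the caller's code the same whether or not a dictionary is available; only the shape of the returned value differs.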