v0.2.1

@MaxDall MaxDall released this 26 Mar 16:57
· 1373 commits to master since this release
03d4cab

This release focuses on bug fixes and quality maintenance for our parser. In addition:

  • we added two new publishers to DE (Business Insider, Braunschweiger Zeitung)
  • we had to disable WorldTruth until we get rid of the batch logic with #357
  • @addie9800 added a new attribute to Article called free_access, indicating whether an article is available for free.
  • we added a new workflow to automatically publish releases on TestPyPI and PyPI
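
As a rough illustration of how the new free_access attribute might be used downstream, here is a minimal sketch. The Article class below is a simplified stand-in written for this example, not the library's actual class. Only the attribute name free_access comes from this release, and the helper only_free is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Article:
    # Simplified stand-in for illustration; only `free_access` is named in this release.
    title: str
    free_access: bool


def only_free(articles: Iterable[Article]) -> List[Article]:
    """Keep the articles that are flagged as freely available (hypothetical helper)."""
    return [article for article in articles if article.free_access]


# Made-up sample data, not real crawl results.
articles = [
    Article(title="Open report", free_access=True),
    Article(title="Subscriber analysis", free_access=False),
]
free_titles = [article.title for article in only_free(articles)]
print(free_titles)
```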

What's new?

Bug fixing

Refactors

  • Replace asyncio with thread-based solution for WARC-path download by @MaxDall in #347
  • Refactor ExtractionFilter and Requires by @MaxDall in #360
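
The code from #347 is not reproduced here. As a generic sketch of what a thread-based download can look like in place of asyncio, the example below uses the standard library's ThreadPoolExecutor. The names fetch_all and fake_fetch and the sample identifiers are assumptions made for illustration, and the placeholder fetch function does no network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_all(
    sources: List[str],
    fetch: Callable[[str], List[str]],
    max_workers: int = 4,
) -> Dict[str, List[str]]:
    """Run `fetch` for every source in a thread pool and collect the results per source."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # `map` yields results in the same order as `sources`.
        results = pool.map(fetch, sources)
        return dict(zip(sources, results))


def fake_fetch(source: str) -> List[str]:
    # Placeholder: a real implementation would download a path listing for `source`.
    return [f"{source}/segment-0.warc.gz", f"{source}/segment-1.warc.gz"]


paths = fetch_all(["crawl-a", "crawl-b"], fake_fetch)
print(paths)
```

Threads suit this kind of work because the time is spent waiting on I/O, and the calling code stays synchronous.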

QoL

for DEVs

  • Add utility to retrieve test articles by @MaxDall in #355
  • Add URL parameter to test generation script by @MaxDall in #364

Text within Article.body is now also normalized

  • Add functionality to exclude tags from extraction and normalize space by @MaxDall in #382
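
To make the idea concrete, here is a minimal sketch of excluding tags during text extraction and normalizing whitespace. It uses only the Python standard library and is not the implementation from #382. The names TextExtractor and extract_text and the sample HTML are assumptions made for this example.

```python
import re
from html.parser import HTMLParser
from typing import List, Set


class TextExtractor(HTMLParser):
    """Collect text while skipping everything inside excluded tags.

    Assumes excluded tags always have a closing tag (no void elements).
    """

    def __init__(self, excluded_tags: Set[str]) -> None:
        super().__init__()
        self.excluded_tags = excluded_tags
        self._skip_depth = 0
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.excluded_tags:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.excluded_tags and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Collapse any run of whitespace into one space and trim the ends.
        return re.sub(r"\s+", " ", "".join(self._chunks)).strip()


def extract_text(html: str, excluded_tags: Set[str]) -> str:
    parser = TextExtractor(excluded_tags)
    parser.feed(html)
    parser.close()
    return parser.text()


sample = "<p>Hello   <script>var x = 1;</script>\n world<figcaption>Photo: n/a</figcaption>!</p>"
cleaned = extract_text(sample, {"script", "figcaption"})
print(cleaned)
```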

Publisher quality maintenance

Full Changelog: v0.2.0...v0.2.1