Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Romantic Versioning.

v2.3.7

  • feat: Add custom parser for bialystok.se.pl
  • feat: Add custom parser for lublin.se.pl
  • feat: Add custom parser for wroclaw.se.pl
  • feat: Add custom parser for lodz.se.pl
  • fix: Hide extra header content for Ars Technica by @jocmp

v2.3.6

  • bump version v2.3.5 -> v2.3.6 by @jocmp
  • fix: Update arstechnica custom parser by @jocmp in #54
  • Add more se.pl parsers by @jocmp in #53
  • feat: Add parser for portalobronny.se.pl by @jocmp
  • feat: Add custom parser for superbiz.se.pl by @jocmp
  • feat: Add parser for szczecin.se.pl by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.3.5...v2.3.6

v2.3.5

  • bump version v2.3.4 -> v2.3.5 by @jocmp
  • Add custom parsers for se.pl by @jocmp in #52
  • feat: Add custom parser for polityka.se.pl by @jocmp
  • feat: Add custom parser for www.se.pl by @jocmp
  • feat: Add custom parser for sport.se.pl by @jocmp
  • Fix bsky selectors by @jocmp
  • Update CHANGELOG by @jocmp
  • feat: Add custom parser for n-tv.de by @jocmp in #51
  • feat: Add custom parser for n-tv.de by @jocmp
  • feat: Add custom parser for bsky.app by @jocmp in #50
  • fix: Hide ads on heise.de by @jocmp
  • fix: Keep nested h3's in tldr.tech by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.3.4...v2.3.5

v2.3.4

  • bump version v2.3.3 -> v2.3.4 by @jocmp
  • feat: Add custom parser for tldr.tech by @jocmp in #49
  • feat: Add custom parser for tldr.tech by @jocmp
  • feat: Add custom parser for heise.de by @jocmp in #48

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.3.3...v2.3.4

v2.3.3

  • bump version v2.3.2 -> v2.3.3 by @jocmp
  • fix: Retain h2 for channelnewsasia.com by @jocmp in #47

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.3.2...v2.3.3

v2.3.2

  • bump version v2.3.1 -> v2.3.2 by @jocmp
  • Fix android authority parser by @jocmp in #46

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.3.1...v2.3.2

v2.3.1

  • bump version v2.3.0 -> v2.3.1 by @jocmp
  • feat: Add custom parser - wccftech.com by @jocmp in #45
  • feat: Add custom parser - channelnewsasia.com by @jocmp in #44
  • chore: Fix references to Postlight Parser in README by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.3.0...v2.3.1

v2.3.0

  • bump version v2.2.10 -> v2.3.0 by @jocmp
  • Fix changelog by @jocmp
  • chore: Fix bumpver config by @jocmp
  • chore: Update changelog by @jocmp
  • fix: androidauthority.com - Retain heading tags by @jocmp in #41
  • fix: Update versants.com to parse figures by @jocmp in #42

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.10...v2.3.0

v2.2.10

  • bump version v2.2.9 -> v2.2.10 by @jocmp
  • feat: Add custom parser for mobilesyrup.com by @jocmp in #39

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.9...v2.2.10

v2.2.9

  • bump version v2.2.8 -> v2.2.9 by @jocmp
  • [skip ci] Update changelog by @jocmp
  • feat: Add custom parser - spiegel.de by @jocmp in #38

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.8...v2.2.9

v2.2.8

  • bump version v2.2.7 -> v2.2.8 by @jocmp
  • [skip ci] Update changelog by @jocmp
  • feat: Add custom parser - hardwarezone.com.sg by @jocmp in #37
  • [skip ci] Update changelog by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.7...v2.2.8

v2.2.7

  • bump version v2.2.6 -> v2.2.7 by @jocmp
  • fix: Hide author thumbnail in techcrunch feed by @jocmp
  • feat: Add custom parser - techcrunch.com by @jocmp in #34
  • Allow any article in preview by @jocmp
  • Update changelog by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.6...v2.2.7

v2.2.6

  • bump version v2.2.5 -> v2.2.6 by @jocmp
  • feat: Update phoronix.com custom extractor by @jocmp in #29
  • chore(deps-dev): Bump brfs from 2.0.1 to 2.0.2 by @dependabot[bot] in #26
  • chore(deps): Bump iconv-lite from 0.5.0 to 0.6.3 by @dependabot[bot] in #24
  • chore(deps-dev): Bump watchify from 3.11.1 to 4.0.0 by @dependabot[bot] in #23
  • chore: Update changelog by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.5...v2.2.6

v2.2.5

  • bump version v2.2.4 -> v2.2.5 by @jocmp
  • chore: Update bump-version.yml by @jocmp
  • feat: support headers in Android Authority parser by @jocmp in #28
  • feat: android authority extractor by @jocmp in #27
  • Prefix version in bumpver config by @jocmp
  • Update README.md by @jocmp
  • Update publish step by @jocmp

Full Changelog: https://github.com/jocmp/mercury-parser/compare/v2.2.4...v2.2.5

v2.2.4

  • bump version 2.2.3 -> 2.2.4 by @jocmp
  • Revert "bump version 2.2.3 -> 2.2.4" by @jocmp
  • Use yarn in publish by @jocmp
  • bump version 2.2.3 -> 2.2.4 by @jocmp
  • Revert "bump version 2.2.3 -> 2.2.4" by @jocmp
  • Publish on tag by @jocmp
  • bump version 2.2.3 -> 2.2.4 by @jocmp
  • Update release workflow by @jocmp
  • Split out release step by @jocmp
  • Use release ssh key by @jocmp
  • Commit on bumpver by @jocmp
  • Simplify build step by @jocmp
  • Run yarn install in create-release by @jocmp
  • Fix workflow paths by @jocmp
  • Create FUNDING.yml by @jocmp
  • Add release actions by @jocmp in #22
  • chore(deps): Bump postman-request from 2.88.1-postman.31 to 2.88.1-postman.40 by @dependabot[bot] in #20
  • chore(deps-dev): Bump eslint-config-prettier from 6.11.0 to 6.15.0 by @dependabot[bot] in #19
  • chore(deps-dev): Bump remark-preset-lint-recommended from 3.0.2 to 7.0.0 by @dependabot[bot] in #17
  • chore(deps-dev): Bump karma-chrome-launcher from 3.1.0 to 3.2.0 by @dependabot[bot] in #21
  • Add custom parser - 1pezeshk.com by @jocmp in #15
  • Update dist by @jocmp
  • Add custom parser - www.versants.com by @jocmp in #13
  • Clean up docs by @jocmp
  • chore(deps): Bump moment-timezone from 0.5.37 to 0.5.46 by @dependabot[bot] in #8
  • chore(deps-dev): Bump rollup-plugin-uglify from 6.0.1 to 6.0.4 by @dependabot[bot] in #7
  • chore(deps): Bump @postlight/ci-failed-test-reporter from 1.0.22 to 1.0.26 by @dependabot[bot] in #11
  • chore(deps-dev): Bump rollup-plugin-terser from 6.1.0 to 7.0.2 by @dependabot[bot] in #12
  • chore(deps-dev): Bump changelog-maker from 2.3.0 to 4.3.1 by @dependabot[bot] in #10
  • chore(deps-dev): Bump babel-eslint from 10.0.1 to 10.1.0 by @dependabot[bot] in #5
  • Clean up README, add CI badge by @jocmp
  • Bump deps by @jocmp in #1
  • fix: select extended types before content by @touchRED
  • fix: update gif to match rebrand by @touchRED
  • feat: update all fixtures and custom parsers to match by @sdoire
  • feat: remove obsolete custom extractors by @sdoire
  • fixed and improved extraction for latest layout of politico.com by @zhemaituk
  • custom parser for www.investmentexecutive.com by @zhemaituk
  • custom parser for cbc.ca by @zhemaituk
  • fix: postlight parser test by @sdoire
  • adjust postlight insights custom selectors by @austinmbrown
  • release: 2.2.3 by @johnholdun
  • fix: handle sec & ms timestamps properly by @austinmbrown
  • maintenance update - october 2022 by @mtashley
  • feat: add postlight.com custom extractor by @sdoire
  • release: 2.2.2 by @johnholdun
  • Update README.md by @johnholdun
  • Change Name by @johnholdun
  • Update more dependencies by @johnholdun
  • chore: Inline test fixtures by @johnholdun
  • chore: Update builds by @johnholdun
  • Added custom extractor for www.spektrum.de by @Shepard
  • feat: Add figcaption to list of non-convertible span parents by @johnholdun
  • Add li to the list of non-convertible parents for spans by @Wevah
  • feat: Add a custom extractor for www.ndtv.com. by @jbrayton
  • feat: arstechnica.com extractor by @jbrayton
  • feat: Add a custom extractor for www.engadget.com. by @jbrayton
  • Custom extractor for www.gruene.de by @svenwiegand
  • chore(deps): Bump ws from 5.2.2 to 5.2.3 by @dependabot[bot]
  • chore(deps): Bump moment from 2.29.2 to 2.29.4 by @dependabot[bot]
  • chore(deps): Bump terser from 4.8.0 to 4.8.1 by @dependabot[bot]
  • chore: Update CircleCI config by @johnholdun
  • modifies check-build to differentiate between test env by @jaehanley
  • chore: Update jQuery to 3.5.0 by @johnholdun
  • chore(deps): Bump shell-quote from 1.6.1 to 1.7.3 by @dependabot[bot]
  • Update CHANGELOG.md by @samuelclay
  • support build of es modules by @jimniels
  • Add a new custom extractor for www.abendblatt.de by @mwiedemeyer
  • feat: Add a custom extractor for pastebin.com by @Canejo
  • feat: ma.ttias.be extractor by @jbrayton
  • Feat: update qz.com selectors and tests by @jshakes
  • fix: updating generate-parser dist by @mtashley
  • fix: don't try to re-decode prepared response by @ejucovy
  • chore: update node version in .nvmrc & CONTRIBUTING.md by @PeterDaveHello
  • Bugfix new yorker wired extractors by @sodiumjoe
  • Add --version CLI flag by @pirate
  • chore(deps-dev): bump karma from 3.1.4 to 6.3.16 by @dependabot[bot]
  • chore(deps): bump moment from 2.23.0 to 2.29.2 by @dependabot[bot]
  • feat: Add date formats to two extractors by @johnholdun
  • chore(deps): bump jquery from 3.4.1 to 3.5.0 by @dependabot[bot]
  • chore(deps): bump cached-path-relative from 1.0.2 to 1.1.0 by @dependabot[bot]
  • chore(deps): bump async from 2.6.1 to 2.6.4 by @dependabot[bot]
  • chore(deps): bump tmpl from 1.0.4 to 1.0.5 by @dependabot[bot]
  • chore(deps): bump tar from 4.4.8 to 4.4.19 by @dependabot[bot]
  • chore(deps): bump path-parse from 1.0.5 to 1.0.7 by @dependabot[bot]
  • chore(deps): bump y18n from 3.2.1 to 3.2.2 by @dependabot[bot]
  • chore(deps): bump mixin-deep from 1.3.1 to 1.3.2 by @dependabot[bot]
  • chore(deps): bump browserslist from 4.4.0 to 4.20.3 by @dependabot[bot]
  • chore(deps): bump ajv from 6.7.0 to 6.12.6 by @dependabot[bot]
  • chore(deps): bump pathval from 1.1.0 to 1.1.1 by @dependabot[bot]
  • chore(deps): bump node-fetch from 2.3.0 to 2.6.7 by @dependabot[bot]
  • chore(deps): bump hosted-git-info from 2.1.5 to 2.8.9 by @dependabot[bot]
  • chore(deps): bump ini from 1.3.4 to 1.3.8 by @dependabot[bot]
  • chore(deps): bump handlebars from 4.7.6 to 4.7.7 by @dependabot[bot]
  • chore(deps): bump elliptic from 6.3.2 to 6.5.4 by @dependabot[bot]
  • chore(deps): bump http-proxy from 1.15.2 to 1.18.1 by @dependabot[bot]
  • chore(deps): bump eslint-utils from 1.3.1 to 1.4.3 by @dependabot[bot]
  • chore(deps): bump yargs-parser from 14.0.0 to 15.0.1 by @dependabot[bot]
  • chore(deps): bump static-eval from 2.0.0 to 2.1.0 by @dependabot[bot]
  • release: 2.2.1 by @JadTermsani
  • feat: Ladbible.com extractor by @nitinthewiz
  • feat: Times of India extractor by @nitinthewiz
  • chore(deps): bump lodash from 4.17.2 to 4.17.21 by @dependabot[bot]
  • chore(deps): bump handlebars from 4.1.2 to 4.7.6 by @dependabot[bot]
  • chore: remove greenkeeper configs by @JadTermsani
  • chore: update node version by @JadTermsani
  • feat: update nytimes extractor by @WajeehZantout
  • chore(package): update ora to version 4.0.0 by @greenkeeper[bot]
  • fix(package): update yargs-parser to version 14.0.0 by @greenkeeper[bot]
  • release: 2.2.0 by @mtashley
  • feat: ability to add custom extractors via api by @mtashley
  • Implemented custom extractor epaper.zeit.de by @svenwiegand
  • fix: incorrect parsing on medium.com by @mtashley
  • chore(package): update inquirer to version 7.0.0 by @greenkeeper[bot]
  • chore(package): update karma-chrome-launcher to version 3.0.0 by @greenkeeper[bot]
  • chore(package): update eslint-config-prettier to version 6.1.0 by @greenkeeper[bot]
  • deps: Update wuzzy to fix vulnerability by @malob
  • doc: correct link that points to wrong line by @jfix
  • fix: incorrect parsing on theatlantic.com by @mtashley
  • chore: minifying biorxiv.com fixture by @mtashley
  • Add custom extractor for biorxiv.org by @kennyle3377
  • doc: correct internal page links by @jfix
  • chore(deps): bump lodash.merge from 4.6.1 to 4.6.2 by @dependabot[bot]
  • chore(deps): bump cached-path-relative from 1.0.0 to 1.0.2 by @dependabot[bot]
  • chore(deps): bump merge from 1.2.0 to 1.2.1 by @dependabot[bot]
  • chore(package): update brfs-babel to version 2.0.0 by @greenkeeper[bot]
  • Update moment-timezone to the latest version 🚀 by @greenkeeper[bot]
  • chore(package): update remark-cli to version 7.0.0 by @greenkeeper[bot]
  • deps: update husky to the latest version 🚀 by @greenkeeper[bot]
  • deps: update iconv-lite to the latest version 🚀 by @greenkeeper[bot]
  • tests: remove a duplicate test by @kirillDanshin
  • release: 2.1.1 by @adampash
  • deps: update eslint-config-prettier to version 5.0.0 by @greenkeeper[bot]
  • chore: prevent adding phantomjs-prebuilt as a dependency in CI.
  • fix: support query strings in lazy-loaded srcsets by @toufic-m
  • feat: custom parser for phoronix.com. by @benubois
  • feat: pitchfork extractor by @mgeraci
  • deps: Update moment-timezone to the latest version 🚀 by @greenkeeper[bot]
  • deps: bump handlebars from 4.0.6 to 4.1.2 by @dependabot[bot]
  • chore(deps): bump sshpk from 1.10.1 to 1.16.1 by @dependabot[bot]
  • Custom Extractor for clinicaltrials.gov by @kennyle3377
  • chore: update husky to version 2.3.0 by @toufic-m
  • docs: Add links to README by @ginatrapani
  • chore: update jquery to version 3.4.1 by @toufic-m
  • fix: new yorker extractor by @WajeehZantout
  • feat: add le monde extractor by @WajeehZantout
  • feat: add rbbtoday.com custom parser by @kik0220
  • feat: add japan.zdnet.com custom parser by @kik0220
  • feat: add wired.jp custom parser by @kik0220
  • feat: add techlog.iij.ad.jp custom parser by @kik0220
  • feat: add weekly.ascii.jp custom parser by @kik0220
  • feat: add www.ipa.go.jp custom parser by @kik0220
  • feat: add www.oreilly.co.jp custom parser by @kik0220
  • feat: add sect.iij.ad.jp custom parser by @kik0220
  • feat: add www.lifehacker.jp custom parser by @kik0220
  • feat: add getnews.jp custom parser by @kik0220
  • feat: add www.gizmodo.jp custom parser by @kik0220
  • feat: add deadline.com custom parser by @kik0220
  • feat: add japan.cnet.com custom parser by @kik0220
  • feat: add www.yomiuri.co.jp custom parser by @kik0220
  • fix: skip absolutizing invalid srcsets by @toufic-m
  • fix: add date_published selector in www.sanwa.co.jp extractor by @kik0220
  • fix: add date_published selector in www.elecom.co.jp extractor by @kik0220
  • fix: add date_published selector in www.ossnews.jp extractor by @kik0220
  • fix: add date_published selector in jvndb.jvn.jp extractor by @kik0220
  • feat: add bookwalker.jp custom parser by @kik0220
  • feat: add takagi-hiromitsu.jp custom parser by @kik0220
  • feat: add www.publickey1.jp custom parser by @kik0220
  • feat: add www.itmedia.co.jp custom parser by @kik0220
  • feat: add www.moongift.jp custom parser by @kik0220
  • feat: add www.infoq.com custom parser by @kik0220
  • feat: add phpspot.org custom parser by @kik0220
  • release: 2.1.0 by @adampash
  • fix: skip absolutizing empty hrefs by @toufic-m
  • feat: add www.jnsa.org custom parser by @kik0220
  • feat: custom genius parser. by @adampash
  • feat: add jvndb.jvn.jp custom parser by @kik0220
  • feat: add scan.netsecurity.ne.jp custom parser by @kik0220
  • feat: add www.elecom.co.jp custom parser by @kik0220
  • feat: add www.sanwa.co.jp custom parser by @kik0220
  • feat: add www.asahi.com custom parser by @kik0220
  • feat: add buzzap.jp custom parser by @kik0220
  • feat: add www.ossnews.jp custom parser by @kik0220
  • feat: add otrs.com custom parser by @kik0220
  • Include "src/shims" for webpack builds for web by @a2
  • chore: small CoC typofix by @fdsimms
  • fix: Initialize Content-Type as empty string if not present by @johnholdun
  • chore: remove unneeded import by @fdsimms
  • chore: set up ciftr for failed test reports by @fdsimms
  • fix: explicitly reject non-200 status codes by @toufic-m
  • doc: fix extend typo in README by @droob
  • feat: Support passing custom headers in requests by @toufic-m
  • fix: Adapt CNBC extractor to article redesign by @toufic-m
  • docs: Add parsing custom HTML to README.md by @toufic-m
  • feat: extract custom types with extend option by @droob
  • feat: Return specific errors on failed parse attempts by @toufic-m
  • fix: Preserve whitespace in certain HTML elements by @toufic-m
  • fix: run parser preview by @adampash
  • Extract content from GitHub repos. by @benubois
  • docs: add content formats to README.md
  • fix: better handling for responsive images by @toufic-m
  • feat: switch from forked request to postman-request by @droob
  • feat: Add custom parser for Reddit by @toufic-m
  • feat: upgrade watchify to remove vulnerable hoek dep by @droob
  • fix: update parse signature in tests by @droob
  • docs: add usage gif by @adampash
  • feat: Use Deadspin parser for all Kinja websites by @toufic-m
  • feat: add custom extractor for blisterreview.com by @jhotmann
  • feat: add news.mynavi.jp custom parser by @kik0220
  • docs: typofix by @ollisulopuisto
  • fix: ci artifact paths by @adampash
  • dx: comment on custom parser pr fix by @adampash
  • fix: return early if creating the resource failed. by @benubois
  • Update mocha to the latest version 🚀 by @greenkeeper[bot]
  • release: 2.0.0 by @adampash
  • fix: jquery doesn't like the case insensitive selector by @adampash
  • chore: refactor format output adjustments by @adampash
  • chore: add files to package.json by @xavdid
  • fix: custom parser generator by @adampash
  • feat: Various Character Encoding Improvements by @benubois
  • docs: delete extra semicolon by @Madisonkanna
  • fix: parse signature in cli by @adampash
  • dx: add .prettierignore by @adampash
  • feat: add content format output options by @adampash
  • release: 1.1.1 by @adampash
  • chore: remove all-contributors-cli deps and script since no longer used by @george-haddad
  • docs: add instructions for cli to README by @adampash
  • feat: handle cli errors/timeout by @adampash
  • docs: added gitter badge
  • docs: add custom parsers to README by @ftrain
  • chore: remove appveyor yml and badge by @adampash
  • fix: ci config by @adampash
  • release: 1.1.0 by @adampash
  • feat: add mercury-parser cli by @adampash
  • deps: update dependencies to enable Greenkeeper 🌴 by @greenkeeper[bot]
  • docs: add npm install instructions by @adampash
  • docs: add hero to README by @ginatrapani
  • a more explicit .prettierrc by @adampash
  • docs: cleanup and update docs by @adampash
  • docs: remove contributors (github already has this covered) by @adampash
  • docs: add gitter room text and link by @george-haddad
  • docs: change text to include AMP and Reader by @george-haddad
  • docs: add mit license badge by @george-haddad
  • feat: hook up ci to publish to npm by @george-haddad
  • fresh run of prettier; remove NOTES.md by @adampash
  • fix: proxy browser in build tests by @adampash
  • docs: add instructions for browser usage to parse current page by @toufic-m
  • chore: update node rollup config by @JadTermsani
  • feat: add fortinet custom parser by @WajeehZantout
  • feat: add fastcompany custom parser by @WajeehZantout
  • Docs contributors by @RalphJbeily
  • docs: update mercury parser installation by @RalphJbeily
  • dx: include test results in comment by @adampash
  • fix: Transform relative URLs in srcset attributes to absolute URLs by @toufic-m
  • fix: womansay.net image urls by @JadTermsani
  • fix: non-forked packages breaking web build by @adampash
  • fix: author and date published selectors by @RalphJbeily
  • docs: add code of conduct path by @RalphJbeily
  • fix: Create CI-specific script commands to allow for cross-platform linting by @toufic-m
  • chore: remove forked packages by @JadTermsani
  • fix: timezone comparison by @JadTermsani
  • docs: add license files by @e55o
  • feat: update package.json scripts to work on windows by @RalphJbeily
  • docs: add install build and test guide by @RalphJbeily
  • feat: add remarklint for md docs by @RalphJbeily
  • docs: add contributing.md by @RalphJbeily
  • docs: PR and Issue templates by @e55o
  • deps: upgrade by @adampash
  • docs: add code of conduct by @JadTermsani
  • dx: remove unnec comments in source by @george-haddad
  • fix: pre-commit hook on js by @adampash
  • chore: update node and some deps by @adampash
  • fix: auto-pr by @adampash
  • dx: automate fixture updates by @adampash
  • dx: one-line comment links by @adampash
  • dx: add image to preview and link to original article by @adampash
  • dx: test/finish bot preview by @adampash
  • dx: comment on PRs when fixtures have been added/changed by @adampash
  • fix: failing fetchResource test by @adampash
  • docs: document release process by @adampash
  • dx: add nvmrc file by @adampash
  • docs: Update README.md by @adampash
  • release: 1.0.13 by @adampash
  • chore: update circle config.yml to 2.0 by @adampash
  • fix: nytimes custom parser title selector by @adampash
  • release: 1.0.12 by @adampash
  • fix: PARSING_NODE undefined by @mutewinter
  • release: 1.0.11 by @adampash
  • fix: viewport tags leaking to parent page by @mutewinter
  • release: 1.0.10 by @adampash
  • feat: improve wh parser by @adampash
  • release: 1.0.9 by @adampash
  • fix: kept elements being removed by @adampash
  • docs: update changelog by @adampash
  • release: 1.0.8 by @adampash
  • feat: improve wh.gov parser by @adampash
  • release: 1.0.7 by @adampash
  • feat: prospect magazine parser by @janetleekim
  • feat: fool.com parser by @kev5873
  • feat: forward.com parser by @janetleekim
  • feat: qdaily parser by @janetleekim
  • feat: newrepublic parser shows image on page by @silasburton
  • Feat: Slate extractor by @silasburton
  • feat: ici.radio-canada.ca extractor by @silasburton
  • feat: better cleanup of atlantic articles by @silasburton
  • Fixes an issue with encoding by @kev5873
  • Feat: gothamist extractor by @silasburton
  • Fix Encoding on Body by @kev5873
  • release: 1.0.6 by @adampash
  • feat: news.natgeo parser by @janetleekim
  • feat: natgeo parser by @janetleekim
  • feat: allow parser to define custom date formats by @adampash
  • feat: latimes parser by @janetleekim
  • feat: macrumors parser by @kev5873
  • feat: androidcentral parser by @kev5873
  • feat: pagesix parser by @janetleekim
  • feat: si parser by @janetleekim
  • feat: rawstory parser by @janetleekim
  • feat: thefederalistpapers parser by @janetleekim
  • feat: cnet parser by @janetleekim
  • feat: cbs sports parser by @janetleekim
  • feat: msnbc parser by @janetleekim
  • feat: howtogeek extractor by @janetleekim
  • feat: opposing views parser by @janetleekim
  • feat: today parser by @janetleekim
  • feat: cinema blend parser by @janetleekim
  • feat: the political insider parser by @janetleekim
  • feat: al.com parser by @janetleekim
  • feat: westernjournalism parser by @janetleekim
  • feat: mental floss parser by @janetleekim
  • feat: thepennyhoarder parser by @janetleekim
  • feat: abcnewsgo parser by @janetleekim
  • feat: support cleaning and transforms for all fields by @adampash
  • feat: america now parser by @janetleekim
  • Merge pull request #115 from postlight/feat-fusion-extractor by @dviramontes
  • feat: adds selector for lead image by @dviramontes
  • feat: adds video embed transform by @dviramontes
  • fix: author selector, less brittle by @dviramontes
  • feat: fusion parser by @janetleekim
  • Merge pull request #137 from postlight/feat-the-verge-polygon-supported-domain by @dviramontes
  • Merge branch 'master' into feat-the-verge-polygon-supported-domain by @dviramontes
  • feat: ny daily news parser by @janetleekim
  • feat: adds www.polygon.com to list of www.theverge.com supportedDomains by @dviramontes
  • feat: sciencefly extractor by @janetleekim
  • release: 1.0.5 by @adampash
  • feat: custom parser for wh blog by @adampash
  • fix: medium bug by @adampash
  • fix: i put a bad comment in .gitattributes by @adampash
  • chore: marking html fixtures as "vendored" by @adampash
  • Feat: LinkedIn parser by @adampash
  • release: 1.0.4 by @adampash
  • feat: changed user agent to latest chrome by @adampash
  • feat: npr parser by @janetleekim
  • feat: recode parser by @janetleekim
  • feat: fortune parser by @janetleekim
  • feat: qz parser by @janetleekim
  • feat: dmagazine parser by @janetleekim
  • feat: reuters parser by @janetleekim
  • feat: mashable parser by @janetleekim
  • feat: chicago tribune parser by @janetleekim
  • feat: hellogiggles parser by @janetleekim
  • feat: thought catalog parser by @janetleekim
  • feat: cnbc parser by @janetleekim
  • feat: popsugar parser by @janetleekim
  • feat: observer parser by @janetleekim
  • feat: nbc news parser by @janetleekim
  • feat: nj.com parser by @janetleekim
  • feat: inquisitor parser by @janetleekim
  • feat: refinery29 parser by @janetleekim
  • feat: miami herald parser by @janetleekim
  • feat: eonline parser by @janetleekim
  • uproxx extractor by @janetleekim
  • feat: 247sports.com extractor by @janetleekim
  • feat: rolling stone extractor by @janetleekim
  • feat: usmagazine extractor by @janetleekim
  • feat: people extractor by @janetleekim
  • feat: vox custom parser by @janetleekim
  • release: 1.0.3 by @adampash
  • feat: bustle extractor by @janetleekim
  • feat: browser-friendly selector for medium by @adampash
  • feat: bloomberg extractor by @adampash
  • feat: sbnation extractor by @janetleekim
  • test: streamlined guardian tests w/new single-extraction by @adampash
  • feat: more cleaning for wired by @adampash
  • feat: the guardian custom extractor by @janetleekim
  • release: 1.0.2 by @adampash
  • feat: youtube custom extractor by @adampash
  • Feat: detect platforms by @adampash
  • fix: preserve whitespace by @adampash
  • Refactor: running tests more efficiently by @adampash
  • release: 1.0.1 by @adampash
  • Fix: extension bugs by @adampash
  • feat: improved nyt parser by @adampash
  • feat: improvements for nyer magazine articles by @adampash
  • fix: cleaning up deks by @adampash
  • feat: aol custom extractor by @janetleekim
  • feat: remove footer links by @mattq
  • release: 1.0.0 so we can start doing proper releases by @adampash
  • feat: new cleaner for wapo by @adampash
  • fix: browser cleanup by @adampash
  • feat: preview with optional rebuild by @adampash
  • feat: ci speedup by @adampash
  • Feat cnn extractor by @silasburton
  • feat: extractor for the verge by @silasburton
  • fix: added timezone to new republic date by @adampash
  • fix: normalizing spaces for authors/dek/title by @adampash
  • feat: adjustment for huffpo. skipping overly aggressive default cleaners by @adampash
  • Feat: huffington post extractor by @silasburton
  • feat: new republic custom extractor by @adampash
  • feat: add money.cnn custom parser by @janetleekim
  • Feat: custom timezones by @adampash
  • feat: test builds are created for preview purposes so we aren't committing dist every time by @adampash
  • Fix extension bugs by @adampash
  • feat: added tmz custom parser by @adampash
  • fix: changed overly liberal regex for removing transparent images by @adampash
  • feat: encoding response body based on content-type charset by @adampash
  • chore: package upgrades by @adampash
  • chore: updated readme by @adampash
  • Feat: browser support by @adampash
  • fix: servers returning bad headers was breaking request. temporarily by @adampash
  • feat: recording/playing back network requests with nock by @adampash
  • feat: making yarn-friendly for package manager by @adampash
  • Feat: improving ci by @adampash
  • chore: added repo by @adampash
  • fix: circle test passing badge by @adampash
  • Feat: adding circle ci by @adampash
  • feat: parser auto-generates name; lint is more specific by @adampash
  • feat: enforcing line break rules in linter by @adampash
  • updated generator templates for new style of import/export. also some by @adampash
  • making all.js export a generic function to decrease possibility of error by @adampash
  • feat: allowing extractors to support multiple domains by @adampash
  • feat: custom medium extractor by @adampash
  • feat: allowing iframes from src domain by @adampash
  • feat: supporting all GMG sites using DeadspinExtractor by @adampash
  • feat: quicker lint by being more specific by @adampash
  • fix: increased avatar size by @adampash
  • feat: added all-contributors by @adampash
  • Add @mutewinter as a contributor by @adampash
  • Add @droob as a contributor by @adampash
  • Add @spiffytoy as a contributor by @adampash
  • Update @adampash as a contributor by @adampash
  • Add @adampash as a contributor by @adampash
  • fix: bug that stopped proper attr cleaning in certain cases by @adampash
  • feat: support lazy loading video on deadspin by @adampash
  • fix: removeEmpty shouldn't remove elements with images or iframes inside by @adampash
  • fix: narrowed selector to fix blogspot title selector by @adampash
  • feat: keeping youtube and vimeo iframe embeds by @adampash
  • fix: better selector for nytimes authors by @adampash
  • feat: pulling score from whitelist by @adampash
  • Merge pull request #13 from postlight/feat-apartmenttherapy-parser by @adampash
  • feat: Add custom extractor for Apartment Therapy
  • Merge pull request #12 from postlight/feat-broadwayworld-extractor by @adampash
  • feat: Add custom parser for broadwayworld.com
  • feat: added deadspin custom parser by @adampash
  • feat: generator generates potential selectors for all custom selectable fields by @adampash
  • feat: dek returns null if it's basically the same as the excerpt by @adampash
  • fix: babel-polyfill mess (I think) by @adampash
  • feat: some small tweaks to toy's excellent parsers ☺️ by @adampash
  • Merge pull request #11 from postlight/feat-politico-extractor by @spiffytoy
  • feat: added politico extractor by @spiffytoy
  • Merge pull request #10 from postlight/feat-littlethings-extractor by @spiffytoy
  • feat: added littlethings extractor by @spiffytoy
  • Merge remote-tracking branch 'origin/master' by @spiffytoy
  • Merge pull request #9 from postlight/feat-wikia-extractor by @spiffytoy
  • feat: added wikia extractor by @spiffytoy
  • Merge pull request #8 from postlight/feat-buzzfeed-extractor by @spiffytoy
  • feat: added incomplete buzzfeed extractor by @spiffytoy
  • Merge pull request #7 from postlight/feat-yahoo-extractor by @spiffytoy
  • feat: added incomplete yahoo extractor by @spiffytoy
  • Merge pull request #6 from postlight/feat-msn-extractor by @spiffytoy
  • Merge branch 'feat-msn-extractor' by @spiffytoy
  • feat: added incomplete msn extractor by @spiffytoy
  • chore: small doc fixes by @adampash
  • Merge pull request #5 from postlight/feat-wired-extractor by @adampash
  • feat: added wired custom extractor by @spiffytoy
  • chore: fix a few typos/links by @adampash
  • feat: custom parser + generator + detailed readme instructions by @adampash
  • chore: readme improvement by @adampash
  • feat: content cleaner still runs, but can disable some cleaners by @adampash
  • chore: cleaned up unused files, slight reorg by @adampash
  • feat: switched test framework to jest by @adampash
  • feat: generator for custom parsers and some documentation by @adampash
  • fix: .babelrc was still referencing iris by @adampash
  • fix: including babel-runtime as a bandaid for polyfill error by @adampash
  • fix: using transform-runtime to avoid babel-polyfill conflicts when used by @adampash
  • chore: barebones readme by @adampash
  • refactor: slightly better preview by @adampash
  • feat: improve wikipedia parser by @adampash
  • feat: added preview script to test urls on-the-fly by @adampash
  • chore: renamed iris to mercury by @adampash
  • fix: wikipedia transform only grabs one image from .infobox by @adampash
  • fix: added dist back to git by @adampash
  • build for comparisons by @adampash
  • feat: test runner takes args for wildcard search on individual test for easier testing by @adampash
  • chore: cleaned up python and other unneeded comments by @adampash
  • feat: some basic error handling for bad urls by @adampash
  • Merge pull request #3 from postlight/fix-date-not-local by @adampash
  • fix: some improvements to date parsing. punting on localization issues by @adampash
  • feat: added twitter custom extractor by @adampash
  • feat: added text direction to response by @adampash
  • feat: add option to allow custom extractors to skip default cleaners by @adampash
  • test: added sanity test for get-extractor by @adampash
  • chore: cleanup by @adampash
  • fix: encodeURI before fetching by @adampash
  • fix: explicit/better decoding of gzipped content by @adampash
  • push new build for testing by @adampash
  • refactor: renamed child to sibling for clarity by @adampash
  • fix: handling case where node.get(0) returns null by @adampash
  • chore: disable camelcase for linting by @adampash
  • chore: change result keys to match python api by @adampash
  • fix: wordcount calling excerpt by @adampash
  • checking in dist by @adampash
  • updated name in package.json by @adampash
  • chore: removed TODO.md by @adampash
  • feat: generic extractor for word count by @adampash
  • chore: cleanup by @adampash
  • feat: generic excerpt extraction by @adampash
  • fix: selection should not be empty by @adampash
  • feat: improve nymag.com extractor to grab deks from features by @adampash
  • feat: added page counts by @adampash
  • feat: added domain and url extractor (using same extractor) by @adampash
  • refactor: page collection by @adampash
  • chore: clean up junk tests by @adampash
  • Merge pull request #1 from postlight/test-fix-fixture-locations by @adampash
  • test: fix fixture locations by @mutewinter
  • fix: bug in scoring and converting to paragraphs by @adampash
  • chore: improve linter/babelrc by @adampash
  • chore: refactored and linted by @adampash
  • chore: moved content scoring out of utils, removed no-longer-necessary utils by @adampash
  • feat: nextPageUrl handles multi-page articles by @adampash
  • feat: small improvement to author selectors by @adampash
  • fix: scorePs parent scoring was overwriting child scoring by @adampash
  • fix: accepting cookies with request (required for sites like by @adampash
  • debugging: cheerio isn't always consistent in setting scores by @adampash
  • refactor: limiting calls to $ function by @adampash
  • feat: whitelisting attrs to keep by @adampash
  • chore: remove logic for fetching meta tags with custom attrs (resource by @adampash
  • chore: code reorganization by @adampash
  • improved wiki extractor by @adampash
  • fix: cleaning embed and object nodes by @adampash
  • feat: links are rewritten to absolute in cleaner by @adampash
  • feat: can now fetch attrs in RootExtractor's select method by @adampash
  • feat: Improved dateString parsing to handle more; first trying to parse without cleaning by @adampash
  • refactor: cleaners now run on custom extractors by @adampash
  • feat: basic wikipedia custom extractor by @adampash
  • feat: blogspot.com custom extractor by @adampash
  • fix: duplicate key bug by @adampash
  • fix: dek and leadImg should not be html by @adampash
  • fix: brought .html fixtures into project dir by @adampash
  • feat: RootExtractor performs extraction using custom and generic by @adampash
  • refactor: improve extractor args; passing as object by @adampash
  • Some good basic restructuring by @adampash
  • basic merging of extracting sources by @adampash
  • refactor: preparing for extraction merging by @adampash
  • feat: getExtractor returns generic extractor by @adampash
  • clean formatting by @adampash
  • fix: encoding request response as null by @adampash
  • updated constants by @adampash
  • cleanup by @adampash
  • fix: pre-loading html in resource by @adampash
  • cleanup by @adampash
  • feat: can pass in raw html if already fetched by @adampash
  • feat: resource fetches content from a URL and prepares for parsing by @adampash
  • fix: better scoring for image extensions by @adampash
  • notes, cleanup by @adampash
  • feat: bundling with rollup by @adampash
  • feat: GenericExtractLeadImageUrl by @adampash
  • feat: extract dek stubbed (not currently functional) by @adampash
  • fix: title wasn't cleaning html tags by @adampash
  • feat: GenericDatePublishedExtractor by @adampash
  • feat: extract author by @adampash
  • chore: plumbing by @adampash
  • feat: title extraction and scaffolding for more by @adampash
  • refactor: restructuring for metadata extraction by @adampash
  • ignore npm-debug.log by @adampash
  • chore: cleanup by @adampash
  • fix: added babel-polyfill for bug in Reflect by @adampash
  • feat: implemented extractBestNode functionality by @adampash
  • feat: find top candidate function by @adampash
  • feat: added linkDensity function by @adampash
  • fix: changed parseInt to parseFloat by @adampash
  • feat: added scoreContent function by @adampash
  • Lots of progress on score-content by @adampash
  • chore: cleaned up repetitive testing for dom by @adampash
  • chore: refactored tests by @adampash
  • feat: ported scoring methods with unit tests by @adampash
  • chore: refactored to slightly cleaner file structure (more to do here) by @adampash
  • feat: convertToParagraphs function working by @adampash
  • Converting multiple line breaks to p by @adampash
  • simple logic in place for brsToPs by @adampash
  • updated todo by @adampash
  • Stripping unlikely candidates from DOM by @adampash
  • getWeight with tests by @adampash
  • Functions in need of porting by @adampash
  • Basic testing in place by @adampash
  • bringing in cheerio by @adampash
  • basic structure by @adampash
  • add gitignore by @adampash
  • using rollup by @adampash
  • Quick port of constants file by @adampash

New Contributors

  • @jocmp made their first contribution
  • @dependabot[bot] made their first contribution in #20
  • @touchRED made their first contribution
  • @sdoire made their first contribution
  • @zhemaituk made their first contribution
  • @austinmbrown made their first contribution
  • @johnholdun made their first contribution
  • @mtashley made their first contribution
  • @Shepard made their first contribution
  • @Wevah made their first contribution
  • @jbrayton made their first contribution
  • @svenwiegand made their first contribution
  • @jaehanley made their first contribution
  • @samuelclay made their first contribution
  • @jimniels made their first contribution
  • @mwiedemeyer made their first contribution
  • @Canejo made their first contribution
  • @jshakes made their first contribution
  • @ejucovy made their first contribution
  • @PeterDaveHello made their first contribution
  • @sodiumjoe made their first contribution
  • @pirate made their first contribution
  • @JadTermsani made their first contribution
  • @nitinthewiz made their first contribution
  • @WajeehZantout made their first contribution
  • @greenkeeper[bot] made their first contribution
  • @malob made their first contribution
  • @jfix made their first contribution
  • @kennyle3377 made their first contribution
  • @kirillDanshin made their first contribution
  • @adampash made their first contribution
  • @toufic-m made their first contribution
  • @benubois made their first contribution
  • @mgeraci made their first contribution
  • @ginatrapani made their first contribution
  • @kik0220 made their first contribution
  • @a2 made their first contribution
  • @fdsimms made their first contribution
  • @droob made their first contribution
  • @jhotmann made their first contribution
  • @ollisulopuisto made their first contribution
  • @xavdid made their first contribution
  • @Madisonkanna made their first contribution
  • @george-haddad made their first contribution
  • @ftrain made their first contribution
  • @RalphJbeily made their first contribution
  • @e55o made their first contribution
  • @mutewinter made their first contribution
  • @janetleekim made their first contribution
  • @kev5873 made their first contribution
  • @silasburton made their first contribution
  • @dviramontes made their first contribution
  • @mattq made their first contribution
  • @spiffytoy made their first contribution