Generate a reliability score for a given article #65

Open

simonb83 opened this issue Feb 16, 2017 · 2 comments
@simonb83 (Collaborator)

In some contexts, information about IDPs is highly politicized, which could be problematic if you're drawing from media reports. You'd want to be very careful in selecting which sources you used for info about the Rohingya in Myanmar, for example.

It would be good to be able to score an article for reliability in order to help analysts as they analyze and interpret the extracted data.
In some cases, news sources may be government-run, 'fake news', or have poor sourcing or a poor track record, so any data reported by and extracted from these sources should be identifiable as having potential issues.

On the front end, this could include a filter for analysts whereby they can select all articles, or only those with a reliability score above a certain threshold.
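
A minimal sketch of that filter, assuming a score normalised to [0, 1]; the `Article` fields and function name are hypothetical, not an existing API:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass
class Article:
    url: str
    reliability: float  # assumed to be normalised to [0, 1]

def filter_by_reliability(articles: Iterable[Article],
                          threshold: Optional[float] = None) -> List[Article]:
    """Return all articles, or only those at/above the chosen threshold."""
    if threshold is None:
        return list(articles)
    return [a for a in articles if a.reliability >= threshold]
```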

Some thoughts for implementation (a rough sketch in code follows the list):

  1. A maintainable list of known problematic sources
  2. Measuring similarity of reported facts between sources
  3. A maintainable list of highly trusted and common 'core' news sources and anything from these sources automatically gets a high reliability rating.
  4. New or unknown sources automatically get a lower rating unless their facts are similar enough to a report from a highly trusted source etc.
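
A minimal sketch of how points 1-4 might combine into a single score. Every name, list entry, weight and cutoff here is a placeholder to be tuned, not a committed design:

```python
TRUSTED_SOURCES = {"bbc.co.uk", "reuters.com"}      # point 3: 'core' sources (placeholders)
PROBLEMATIC_SOURCES = {"bad-source.example"}        # point 1: known problematic (placeholder)

def fact_similarity(facts_a, facts_b):
    """Point 2: crude Jaccard overlap between two sets of extracted facts."""
    a, b = set(facts_a), set(facts_b)
    return len(a & b) / len(a | b) if (a or b) else 0.0

def preliminary_score(domain, best_similarity_to_trusted):
    """Combine points 1, 3 and 4 into a 0-1 reliability score."""
    if domain in TRUSTED_SOURCES:
        return 1.0
    if domain in PROBLEMATIC_SOURCES:
        return 0.1
    # Point 4: unknown sources start low and are lifted if their facts
    # corroborate a report from a trusted source.
    return min(1.0, 0.3 + 0.5 * best_similarity_to_trusted)
```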
georgerichardson added this to the interpreter v0.1 milestone Feb 17, 2017
@georgerichardson

We could also look at links in the text to see what other sources they cite.
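
A rough sketch of that link check, assuming the article HTML is available and reusing the placeholder `TRUSTED_SOURCES` set from the sketch above; BeautifulSoup is one option for the parsing, not a requirement:

```python
from urllib.parse import urlparse
from bs4 import BeautifulSoup

def cited_domains(html):
    """Collect the domains of all outbound links in the article body."""
    soup = BeautifulSoup(html, "html.parser")
    return {urlparse(a["href"]).netloc for a in soup.find_all("a", href=True)}

def cites_trusted_source(html):
    """True if the article links out to at least one trusted domain."""
    return any(d.endswith(tuple(TRUSTED_SOURCES)) for d in cited_domains(html))
```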

Are you envisioning that this kind of score would be hard-coded, or that there would also be an element of learning from an analyst who verifies the sources?

@simonb83 (Collaborator, Author)

I'm not really sure yet, but likely some sort of combination.

Probably initially some hard coded rules to generate a preliminary score that can then be verified by an analyst and updated if need be.

If the 'rules' include some sort of whitelist or blacklist for certain sources, then this could definitely be updated automatically as analysts verify the sources.
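
A sketch of what that feedback loop might look like, again reusing the placeholder lists from the earlier sketch; the verdict counts and promotion thresholds are assumptions about a review workflow that doesn't exist yet:

```python
from collections import Counter

verdicts = Counter()  # (domain, 'reliable' | 'unreliable') -> count

def record_verdict(domain, verdict):
    """Update the white/blacklists once enough consistent analyst reviews accumulate."""
    verdicts[(domain, verdict)] += 1
    reliable = verdicts[(domain, "reliable")]
    unreliable = verdicts[(domain, "unreliable")]
    if reliable >= 5 and unreliable == 0:     # thresholds are placeholders
        TRUSTED_SOURCES.add(domain)
    elif unreliable >= 3:
        PROBLEMATIC_SOURCES.add(domain)
        TRUSTED_SOURCES.discard(domain)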

Later down the line, with enough hand-reviewed articles, it would definitely be interesting to try applying ML to see what sorts of features might help distinguish reliable articles from unreliable ones.
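
For example, once analyst labels exist, a simple supervised baseline could surface useful features; scikit-learn and the TF-IDF features here are illustrative assumptions, not a committed choice:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_baseline(texts, labels):
    """texts: article bodies; labels: analyst verdicts (1 = reliable, 0 = not)."""
    model = make_pipeline(TfidfVectorizer(min_df=2),
                          LogisticRegression(max_iter=1000))
    model.fit(texts, labels)
    return model
```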
