Generate a reliability score for a given article #65

Open

simonb83 opened this issue Feb 16, 2017 · 2 comments
@simonb83 (Collaborator)

In some contexts, information about IDPs is highly politicized, which could be problematic if you're drawing from media reports. You'd want to be very careful in selecting which sources you used for info about the Rohingya in Myanmar, for example.

It would be good to be able to score an article for reliability in order to help analysts as they analyze and interpret the extracted data.
In some cases, news sources may be government-run, 'fake news', or have poor sourcing or a poor track record, so any data reported by and extracted from these sources should be identifiable as having potential issues.

On the front end, this could include a filter for analysts whereby they can select all articles, or only those with a reliability score above a certain threshold.
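
A minimal sketch of that filter, assuming a score normalised to [0, 1]; the `Article` fields and function name are hypothetical, not an existing API:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass
class Article:
    url: str
    reliability: float  # assumed to be normalised to [0, 1]

def filter_by_reliability(articles: Iterable[Article],
                          threshold: Optional[float] = None) -> List[Article]:
    """Return all articles, or only those at/above the chosen threshold."""
    if threshold is None:
        return list(articles)
    return [a for a in articles if a.reliability >= threshold]
```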

Some thoughts for implementation (a rough sketch in code follows the list):

  1. A maintainable list of known problematic sources
  2. Measuring similarity of reported facts between sources
  3. A maintainable list of highly trusted and common 'core' news sources and anything from these sources automatically gets a high reliability rating.
  4. New or unknown sources automatically get a lower rating unless their facts are similar enough to a report from a highly trusted source etc.
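
A minimal sketch of how points 1-4 might combine into a single score. Every name, list entry, weight and cutoff here is a placeholder to be tuned, not a committed design:

```python
TRUSTED_SOURCES = {"bbc.co.uk", "reuters.com"}      # point 3: 'core' sources (placeholders)
PROBLEMATIC_SOURCES = {"bad-source.example"}        # point 1: known problematic (placeholder)

def fact_similarity(facts_a, facts_b):
    """Point 2: crude Jaccard overlap between two sets of extracted facts."""
    a, b = set(facts_a), set(facts_b)
    return len(a & b) / len(a | b) if (a or b) else 0.0

def preliminary_score(domain, best_similarity_to_trusted):
    """Combine points 1, 3 and 4 into a 0-1 reliability score."""
    if domain in TRUSTED_SOURCES:
        return 1.0
    if domain in PROBLEMATIC_SOURCES:
        return 0.1
    # Point 4: unknown sources start low and are lifted if their facts
    # corroborate a report from a trusted source.
    return min(1.0, 0.3 + 0.5 * best_similarity_to_trusted)
```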
georgerichardson added this to the interpreter v0.1 milestone Feb 17, 2017
@georgerichardson

We could also look at links in the text to see what other sources they cite.
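
A rough sketch of that link check, assuming the article HTML is available and reusing the placeholder `TRUSTED_SOURCES` set from the sketch above; BeautifulSoup is one option for the parsing, not a requirement:

```python
from urllib.parse import urlparse
from bs4 import BeautifulSoup

def cited_domains(html):
    """Collect the domains of all outbound links in the article body."""
    soup = BeautifulSoup(html, "html.parser")
    return {urlparse(a["href"]).netloc for a in soup.find_all("a", href=True)}

def cites_trusted_source(html):
    """True if the article links out to at least one trusted domain."""
    return any(d.endswith(tuple(TRUSTED_SOURCES)) for d in cited_domains(html))
```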

Are you envisioning that this kind of score would be hard-coded, or that there would also be an element of learning from an analyst who verifies the sources?

@simonb83 (Collaborator, Author)

I'm not really sure yet, but likely some sort of combination.

Probably initially some hard coded rules to generate a preliminary score that can then be verified by an analyst and updated if need be.

If the 'rules' include some sort of whitelist or blacklist for certain sources, then this could definitely be updated automatically as analysts verify the sources.
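
A sketch of what that feedback loop might look like, again reusing the placeholder lists from the earlier sketch; the verdict counts and promotion thresholds are assumptions about a review workflow that doesn't exist yet:

```python
from collections import Counter

verdicts = Counter()  # (domain, 'reliable' | 'unreliable') -> count

def record_verdict(domain, verdict):
    """Update the white/blacklists once enough consistent analyst reviews accumulate."""
    verdicts[(domain, verdict)] += 1
    reliable = verdicts[(domain, "reliable")]
    unreliable = verdicts[(domain, "unreliable")]
    if reliable >= 5 and unreliable == 0:     # thresholds are placeholders
        TRUSTED_SOURCES.add(domain)
    elif unreliable >= 3:
        PROBLEMATIC_SOURCES.add(domain)
        TRUSTED_SOURCES.discard(domain)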

Later down the line, with enough hand-reviewed articles, it would definitely be interesting to try applying ML to see what sorts of features might help distinguish reliable articles from unreliable ones.
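
For example, once analyst labels exist, a simple supervised baseline could surface useful features; scikit-learn and the TF-IDF features here are illustrative assumptions, not a committed choice:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_baseline(texts, labels):
    """texts: article bodies; labels: analyst verdicts (1 = reliable, 0 = not)."""
    model = make_pipeline(TfidfVectorizer(min_df=2),
                          LogisticRegression(max_iter=1000))
    model.fit(texts, labels)
    return model
```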
