Commit 344f7ca ("various updates"), committed by MM on Aug 12, 2024, 1 parent (1b5b67e).
1 changed file, index.md, with 21 additions and 27 deletions.

---
title: "Home"
layout: home
---

# {{ site.title }}
### {{ site.description }}



<main class="container">
<div class="bg-light p-5 rounded">
<p class="lead">Paper Picnic 🧺 - A weekly roundup of the latest published research in Political Science and Economics retrieved from the Crossref API. The web page updates every Friday at about 3 AM UTC with research published in the previous 7 days.</p>
</div>
</main>
<p></p>

### Motivation

Available tools for keeping up with newly published research are frustrating. Email alerts from publishers clutter the inbox, arrive at seemingly random intervals, and include a ton of tracking code instead of abstracts. Publisher RSS feeds are similarly frustrating to use, as available RSS readers are either clunky or come with expensive subscription models. Signing up for email alerts or finding the RSS feeds of even a handful of publishers easily takes an entire afternoon. Twitter - uhm.

This web page is meant to make it all a bit easier. It relies on three key ideas:

1. Updates once a week at a known time.
2. Displays all new research on a single web page without clutter.
3. No registration, no ads, and no tracking.

All data comes from the Crossref API. _"Crossref runs open infrastructure to link research objects, entities, and actions, creating a lasting and reusable scholarly record. As a not-for-profit with over 20,000 members in 160 countries, we drive metadata exchange and support nearly 2 billion monthly API queries, facilitating global research communication."_ Learn more about Crossref [here](https://www.crossref.org/).

### How does it work?

The backend is a crawler living in a GitHub repository. Every Friday, GitHub Actions executes the crawler. Once the crawler finishes, the crawled data is written to a JSON file and rendered into an HTML file using GitHub Pages. The crawler lives in the branch `main`, while the Jekyll templates for GitHub Pages live in the branch `gh-pages`.

For each journal, the crawler retrieves all articles added to a journal in the previous week. To that end, it requests all articles for which the field "created" or "published" in the Crossref database is within the last seven days.
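As a rough sketch of that request, assuming the public Crossref REST API's `from-created-date` and `from-pub-date` filters (the crawler's actual code is not shown on this page, and the ISSN in the usage example is only illustrative):

```python
from datetime import date, timedelta

CROSSREF = "https://api.crossref.org"

def weekly_query_url(issn: str, date_field: str, today: date) -> str:
    """Build a Crossref works query for one journal, restricted to records
    whose date falls within the last seven days.

    `date_field` is "created" or "pub", matching the Crossref filter names
    `from-created-date` and `from-pub-date`.
    """
    since = (today - timedelta(days=7)).isoformat()
    return f"{CROSSREF}/journals/{issn}/works?filter=from-{date_field}-date:{since}&rows=200"
```

A run on Friday, Aug 9, 2024 would, for example, request `weekly_query_url("0003-0554", "created", date(2024, 8, 9))`, covering everything created since Aug 2.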

Since journals typically have two ISSNs (one for print and one for electronic, see [here](https://en.wikipedia.org/wiki/ISSN)), the crawler retrieves articles for both ISSNs and deduplicates the results. The ISSNs used by the crawler come from the Crossref lookup [tool](https://www.crossref.org/titleList/).
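A minimal deduplication sketch (the record structure is assumed to carry a `DOI` key, as Crossref works records do):

```python
def dedupe_by_doi(*result_sets):
    """Merge article records fetched for a journal's print and electronic
    ISSNs, keeping only the first record seen for each DOI.

    DOIs are matched case-insensitively, since DOI names are
    case-insensitive by specification.
    """
    seen, merged = set(), []
    for results in result_sets:
        for record in results:
            key = record["DOI"].lower()
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged
```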

Once an article has been crawled, its unique identifier, the digital object identifier (DOI), is added to a list. The crawler checks this list on every run; only articles it has not seen before are included in a data update. This ensures that articles appearing first online and later in print are included only once, when they first appear online.
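A sketch of that seen-DOI check (the file name and record schema are assumptions, not the crawler's actual layout):

```python
import json
from pathlib import Path

def filter_new_articles(articles, seen_path: Path):
    """Drop articles whose DOI is already in the seen list, then persist
    the updated list so future runs skip them too."""
    seen = set(json.loads(seen_path.read_text())) if seen_path.exists() else set()
    new = [a for a in articles if a["DOI"].lower() not in seen]
    seen.update(a["DOI"].lower() for a in new)
    seen_path.write_text(json.dumps(sorted(seen)))
    return new
```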

The crawler filters out articles that include the terms "ERRATUM", "Corrigendum", "Frontmatter" or "Backmatter" in the title field. It also removes articles when the title is "Front Matter", "Issue Information", "Forthcoming Papers" or if the title field is empty (which is the case for some journals publishing their Table of Contents).
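Those filters can be sketched as a small predicate (the plain substring and exact-match tests are an assumption about the implementation):

```python
SKIP_SUBSTRINGS = ("ERRATUM", "Corrigendum", "Frontmatter", "Backmatter")
SKIP_TITLES = {"Front Matter", "Issue Information", "Forthcoming Papers"}

def keep_article(title):
    """Return True if an article title passes the filters described above."""
    if not title:                      # empty title fields (e.g. ToC entries)
        return False
    if title.strip() in SKIP_TITLES:   # exact-title filters
        return False
    # substring filters for errata, corrigenda, and front/back matter
    return not any(term in title for term in SKIP_SUBSTRINGS)
```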

### Which journals are included?

The web page only displays published research in peer-reviewed journals. If you are looking for tools to keep up with unpublished research, this web page is not for you. For Economics, you might want to check [NEP: New Economics Papers](https://nep.repec.org/).

__Political Science Journals__

If you wish to use browser search to find abstracts that include particular keywords …

### Limitations

- The Crossref API is populated by publishers, and not all publishers add abstracts to their metadata. Examples include the publishers Elsevier and Taylor & Francis, which never include abstracts for any of their journals (see [this](https://www.crossref.org/blog/i4oa-hall-of-fame-2023-edition/) Crossref blog post for details). The crawler retrieves abstracts whenever they are available on Crossref.

- Non-systematic observations suggest that Crossref updates tend to come before email alerts are sent by publishers and that all new content from crawled journals is reported to Crossref.


### How can I contribute?

1. Build your own customized email alert or Slack bot: the crawled data are available in public JSON files, [./json/politics.json](./json/politics.json) and [./json/economics.json](./json/economics.json). The lists of journals crawled are also available: [./json/politics_journals.json](./json/politics_journals.json) and [./json/economics_journals.json](./json/economics_journals.json).

2. Find and fix bugs or add new features to the crawler/web page. All code is available in the GitHub repository.

3. Build a better (and equally open-source) version of this web page.

4. Support [The Initiative for Open Abstracts](https://i4oa.org/).

5. <script type="text/javascript" src="https://cdnjs.buymeacoffee.com/1.0.0/button.prod.min.js" data-name="bmc-button" data-slug="mmarbach" data-color="#FFDD00" data-emoji="" data-font="Cookie" data-text="Buy me a coffee" data-outline-color="#000000" data-font-color="#000000" data-coffee-color="#ffffff" ></script>
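As a starting point for point 1 above, a minimal digest formatter might look like this (a sketch only; the field names `title`, `journal`, and `DOI` are assumptions about the JSON schema, so check the files before relying on them):

```python
def format_digest(articles):
    """Render a list of article records as a plain-text digest suitable
    for an email body or a Slack message.

    Assumes each record carries 'title', 'journal', and 'DOI' keys.
    """
    return "\n".join(
        f"- {a['title']} ({a['journal']}) https://doi.org/{a['DOI']}"
        for a in articles
    )
```

Fetch one of the JSON files above, parse it, and pass the resulting list to `format_digest` from a weekly cron job or GitHub Action of your own.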


