This tool is completely customizable for converting the arxiv email you receive daily to a list of relevant papers, and then reading the abstracts out loud using Google Text-to-speech!! It is for academics / paper-readers who work better with audio.
Your arxiv email usually looks like this:
But what if it can now look like this?
Complete with keyword specification? That sure sounds nice!
This is a video of me rambling about how to use the current system. When building this I focused on what worked so nothing is too polished. But I have made it into a workflow that works for me and I hope this can benefit you as well!
Here are the steps you need to get the parser kicking and working locally. I am not yet hosting it on a public site, so you would need to set up Google text-to-speech yourself, which should just be about setting up Google Cloud Platform, then creating a project and adding Google Text-to-speech to it.
- If you don't have daily arxiv emails and want to sign up for one, please follow this link
- You should also sign up for Google Cloud Platform if you haven't already, and then make a project specifically for arxivParser, and then add Google Text-to-Speech; the Text-to-Speech service is quite cheap, the Standard voice pricing is US$0.000004 per character (US$4 per 1 million characters) (see here), plus new users get $300 free credit!
- Create a conda environment using the
environment.yml
file attached
conda env create -f environment.yml
- Run the flask app; there will likely be some error if this is your first time. If you haven't yet, you should run
gcloud auth login
to log in as your GCP user. Make sure your default project is the project for arxivParser.
python3 app.py
- Download your .eml file. You can download this by downloading your arxiv email message directly.
- Specify the keyword / key phrase you want to look up papers about; SentenceTransformer would then be used to match individual articles to the keyword you specify.
- Upload the .eml file. The keyword you just specified in the text field would disappear, but don't worry, it has been remembered! I just don't know how to persist it because website design is not my passion.
- Convert the relevant papers to audio; this may take a hot second.
- Play individual audios or play all audio clips (right now you can't pause once you hit play all, unfortunately, will fix that hopefully soon and if there is popular demand!)! These audio clips will be stored in
static/audio
folder in case you want to go find them. You can specify the voice used by Google text-to-speech inhelpers/read_to_audio.py
.
If you just want something read out loud to you, you can copy and paste the content directly into the "Raw Text" text box and continue with the workflow.