DariusTorabian/lyrics-classifier

Lyrics Classifier 9000

A lyrics scraper and classifier with a CLI.

About The Project

In this project, I've built a lyrics scraper that automatically scrapes all song texts of a given artist from their lyrics.com artist page. The data is then written to a .json file. The .json files can then be used to create a Multinomial Naive Bayes model using spaCy's industrial-strength natural language processing. You can then enter any line of text, and the model will predict which artist would most likely have sung it, even if that artist never did.
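
To illustrate the idea behind the classifier, here is a minimal, dependency-free sketch of Multinomial Naive Bayes with add-one smoothing. It is not the code used in this repo: the function names (`tokenize`, `train`, `predict`), the whitespace tokenizer, and the toy input format are assumptions made for the example, and the real project relies on spaCy for language processing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer (assumption; the project uses spaCy instead).
    return text.lower().split()


def train(lyrics_by_artist):
    """Count words per artist. `lyrics_by_artist` maps artist -> list of song texts."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocabulary = set()
    for artist, songs in lyrics_by_artist.items():
        for song in songs:
            tokens = tokenize(song)
            word_counts[artist].update(tokens)
            vocabulary.update(tokens)
            doc_counts[artist] += 1
    return word_counts, doc_counts, vocabulary


def predict(model, line):
    """Return the artist with the highest log-probability for `line`."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for artist in doc_counts:
        # Log prior: share of training songs that belong to this artist.
        score = math.log(doc_counts[artist] / total_docs)
        total_words = sum(word_counts[artist].values())
        for token in tokenize(line):
            if token not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[artist][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        scores[artist] = score
    return max(scores, key=scores.get)
```

Given a dictionary of made-up lyrics per artist, `predict(train(corpus), "some line")` returns the name of the artist whose word distribution fits the line best.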

All modules have their own command-line interface for easy use. All credit for the song texts belongs to lyrics.com for their amazing work. Please use common sense while scraping and don't DDoS them.
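
One simple way to stay polite while scraping is to pause between requests. The sketch below shows that pattern only; `fetch_all`, its parameters, and the one-second default are hypothetical and not taken from lyrics_scraper.py. The `fetch` callable is passed in so the pacing logic does not depend on any particular HTTP library.

```python
import time


def fetch_all(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between consecutive requests.

    `fetch` is any callable that takes a URL and returns the page content.
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages.append(fetch(url))
    return pages
```

A longer delay is friendlier to the site; the right value is a judgment call.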

Getting Started

To get a local copy up and running, follow these simple steps.

Prerequisites

I'd advise you to create a dedicated virtual environment for this project. I'm using Anaconda.

Installation & Usage

  1. Clone the repo
     git clone https://github.com/dariustorabian/lyrics-classifier.git
  2. Install the dependencies from requirements.txt
     conda create --name <NameOfEnvironment> --file requirements.txt
  3. Run lyrics_scraper.py from the command line with an artist page URL from lyrics.com and a filename as arguments. For help, run python lyrics_scraper.py -h. The lyrics of this artist will then be scraped and saved under /data/FILENAME.json. Duplicates are skipped automatically. Repeat this step for as many artists as you'd like to use.
  4. Run model_creater.py from the command line. You will be asked to enter the .json files containing the song texts scraped in the previous step, along with the corresponding artist names. The Multinomial Naive Bayes model will then be created and saved locally.
  5. Run lyrics_classifier.py from the command line. It automatically loads the model created in the previous step. You will be asked to enter a line of text and will get predictions on which of the artists in your model most likely sang it. Feel free to use my model.p, which is trained on The Kooks, Mumford & Sons and Eminem.
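
The scraping step saves one .json file per artist and skips duplicates. The following sketch shows one way this could work. The JSON layout (normalized song title mapped to lyric text), the title-based duplicate check, and the helper names `merge_songs`, `save_songs`, and `load_songs` are assumptions for illustration and may differ from what lyrics_scraper.py actually does.

```python
import json
from pathlib import Path


def merge_songs(existing, scraped):
    """Add (title, text) pairs to `existing`, skipping titles already present."""
    merged = dict(existing)
    for title, text in scraped:
        key = title.strip().lower()  # normalize so "Song" and "song " match
        if key in merged:
            continue  # duplicate title: keep the first version
        merged[key] = text
    return merged


def save_songs(songs, filename, data_dir="data"):
    """Write the songs to <data_dir>/<filename>.json and return the path."""
    folder = Path(data_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{filename}.json"
    path.write_text(json.dumps(songs, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def load_songs(path):
    """Read a song file written by save_songs back into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Files written this way could then be loaded one per artist when building the model.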

Roadmap

No new features are currently planned. This could change, though, so feel free to check back again.

You can also always take a look at the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Darius Torabian

Project Link: https://github.com/dariustorabian/lyrics-classifier
