Natural Language Processing of Podcast Data

This repository contains research into podcasts using natural language processing (NLP).

Podcast data is vast and growing daily. There are many data sources for podcast research; the main ones are the audio files themselves, transcripts of the audio, podcast descriptions, and other metadata obtained from a podcast's RSS feed.

Phase 1: Named entity recognition of podcast and episode text descriptions

The first phase of this research deals with textual data: the descriptions of a podcast and its episodes, obtained from RSS feeds. Named entities are extracted from the descriptions and attached to the resulting podcast file.
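
As a rough illustration of the extract-and-attach flow described above, here is a minimal TypeScript sketch. It is not the repository's implementation: the `Podcast` and `Episode` shapes, the `entities` field, and both function names are hypothetical, and a naive capitalized-phrase heuristic stands in for a real NER model.

```typescript
// Hypothetical data shapes; the repository's actual podcast file format is not shown in this README.
interface Episode {
  title: string;
  description: string;
}

interface Podcast {
  title: string;
  description: string;
  episodes: Episode[];
  entities?: string[]; // assumed field name for the attached entities
}

// Naive stand-in for a real NER model: a run of two or more
// capitalized words is treated as a candidate entity.
function extractCandidateEntities(text: string): string[] {
  const pattern = /\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b/g;
  return text.match(pattern) || [];
}

// Collects candidates from the podcast description and every episode
// description, removes duplicates, and returns a copy of the podcast
// with the sorted entity list attached.
function attachEntities(podcast: Podcast): Podcast {
  const texts = [podcast.description].concat(
    podcast.episodes.map((episode) => episode.description)
  );
  const unique: string[] = [];
  for (const text of texts) {
    for (const entity of extractCandidateEntities(text)) {
      if (unique.indexOf(entity) === -1) {
        unique.push(entity);
      }
    }
  }
  unique.sort();
  return { ...podcast, entities: unique };
}

const example: Podcast = {
  title: "Example Show",
  description: "An interview with Ada Lovelace about computing.",
  episodes: [
    { title: "Episode 1", description: "Recorded in New York with Ada Lovelace." },
  ],
};

console.log(attachEntities(example).entities); // [ 'Ada Lovelace', 'New York' ]
```

The heuristic misses single-word entities and will pick up any capitalized phrase, so it only shows where a trained recognizer would plug in: swapping out `extractCandidateEntities` leaves the attach step unchanged.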

Research Notes

Downloading the latest version from git
JSON-Java - for parsing JSON.

podcast-data-lab/core-nlp-research