Soundification

Sound classification using PyTorch

The master branch works with PyTorch 1.1 to 1.3.1.

Introduction

This repository is an open-source toolbox based on PyTorch that can be used for sound classification tasks such as emotion recognition, UrbanSound8K, and FSD.

Features

It is fast because it uses PyTorch and GPUs.
Dataset management is easy.
It can run on Colab.
It is platform friendly.
Multi-feature data extraction, such as MFCC, Chroma, and STFT.
Log files and TensorBoard support.
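
To illustrate what one of the listed features (STFT) computes, here is a minimal standard-library sketch. It is not the repository's extraction code: the function names `hann` and `stft_magnitude`, the frame length, the hop size, and the test tone are all assumptions made for this example, and a real pipeline would normally use an FFT-based library for speed.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (hypothetical helper for this sketch)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a list of magnitude spectra, one per overlapping frame.

    Uses a naive DFT per frame, which is slow but easy to follow.
    Only the non-negative frequency bins (0 .. frame_len // 2) are kept.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Apply the window to the current frame.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Assumed test signal: a 100 Hz sine sampled at 800 Hz (256 samples).
sample_rate = 800
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(256)]

spec = stft_magnitude(tone)
# Each bin spans sample_rate / frame_len = 12.5 Hz, so 100 Hz falls in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

MFCC and Chroma features are typically derived from a spectrogram like this one by applying further filter banks and transforms.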

TO-DO:

Train the model on multiple datasets.
Make model.yaml.
Distributed training on multiple GPUs.
Write code to compute MFCC faster.

Getting Started

To install the repo and train it on your own custom sound-classification dataset, use

Emotion Detection

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation using PyTorch.

Dataset

The Multimodal EmotionLines Dataset (MELD) was created by enhancing and extending the EmotionLines dataset. MELD contains the same dialogue instances as EmotionLines, but adds audio and visual modalities alongside the text. MELD has more than 1400 dialogues and 13000 utterances from the Friends TV series, with multiple speakers participating in the dialogues. Each utterance in a dialogue is labeled with one of seven emotions: Anger, Disgust, Sadness, Joy, Neutral, Surprise, or Fear. MELD also provides a sentiment annotation (positive, negative, or neutral) for each utterance.

Example Dialogue

Dataset Statistics

Statistics	Train	Dev	Test
# of modality	{a,v,t}	{a,v,t}	{a,v,t}
# of unique words	10,643	2,384	4,361
Avg. utterance length	8.03	7.99	8.28
Max. utterance length	69	37	45
Avg. # of emotions per dialogue	3.30	3.35	3.24
# of dialogues	1039	114	280
# of utterances	9989	1109	2610
# of speakers	260	47	100
# of emotion shifts	4003	427	1003
Avg. duration of an utterance	3.59s	3.59s	3.58s

Please visit https://affective-meld.github.io for more details.

Dataset Distribution

Emotion	Train	Dev	Test
Anger	1109	153	345
Disgust	271	22	68
Fear	268	40	50
Joy	1743	163	402
Neutral	4710	470	1256
Sadness	683	111	208
Surprise	1205	150	281
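
The distribution above is heavily skewed toward Neutral, so a classifier trained on it may benefit from class weighting. The sketch below computes inverse-frequency weights from the Train column. This is one common heuristic and is not confirmed to be what this repository does; the function name `inverse_frequency_weights` is an assumption made for this example.

```python
# Train-split counts copied from the distribution table above.
train_counts = {
    "Anger": 1109,
    "Disgust": 271,
    "Fear": 268,
    "Joy": 1743,
    "Neutral": 4710,
    "Sadness": 683,
    "Surprise": 1205,
}


def inverse_frequency_weights(counts):
    """Weight each class by total / (n_classes * class_count).

    A perfectly balanced dataset would give every class a weight of 1.0;
    rare classes get weights above 1.0 and frequent classes below 1.0.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


weights = inverse_frequency_weights(train_counts)
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label:<9}{weight:.3f}")
```

Weights like these could, for example, be converted to a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`, with the order matching the model's class indices.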

The repo is maintained by RAVISING-H
