# Multi Speaker Classification

In this project I have used supervised learning to classify multi-speaker and single-speaker segments from an audio file. The audio file chosen is a Republic TV debate available on YouTube.

The attributes used for building the supervised learning models are (a minimal extraction sketch follows the list):
👉 Mean and standard deviation of the audio signal over 2-second intervals
👉 Number and density of peaks
👉 Sub-band energy ratio
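
The snippet below is a minimal sketch of how such window-level features could be computed. It assumes the debate audio has been exported to a mono WAV file named `debate.wav`, takes peak density as peaks per sample, and uses a 1 kHz cutoff for the sub-band energy ratio; the filename, thresholds, and cutoff are illustrative assumptions, not values taken from this repository.

```python
# Sketch of per-window feature extraction (assumes a mono WAV file "debate.wav").
import numpy as np
from scipy.io import wavfile
from scipy.signal import find_peaks

WINDOW_SEC = 2  # features are computed over 2-second intervals

rate, signal = wavfile.read("debate.wav")
signal = signal.astype(np.float64)
window = WINDOW_SEC * rate

features = []
for start in range(0, len(signal) - window, window):
    seg = signal[start:start + window]

    # Mean and standard deviation of the raw signal in this window
    mean, std = np.mean(seg), np.std(seg)

    # Number and density of peaks (density taken here as peaks per sample)
    peaks, _ = find_peaks(np.abs(seg), height=std)
    n_peaks = len(peaks)
    peak_density = n_peaks / len(seg)

    # Sub-band energy ratio: spectral energy below 1 kHz vs. total energy
    spectrum = np.abs(np.fft.rfft(seg)) ** 2
    freqs = np.fft.rfftfreq(len(seg), d=1.0 / rate)
    sub_band_ratio = spectrum[freqs < 1000].sum() / spectrum.sum()

    features.append([mean, std, n_peaks, peak_density, sub_band_ratio])

X = np.array(features)  # one feature row per 2-second window
```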

The following algorithms are used for classification:
▶️ SGDClassifier
▶️ KNN
▶️ XGBoost
▶️ Neural Networks
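
A hedged sketch of comparing these classifiers on the extracted features is shown below. It assumes the 2-second windows have been labelled by hand (`y`, with 0 = single speaker, 1 = multiple speakers) and uses scikit-learn's MLPClassifier as a stand-in for the neural network; the label array, split, and model hyperparameters are assumptions for illustration.

```python
# Train and score the four classifier families on held-out windows.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import SGDClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier

# X: feature matrix from the extraction step; y: hand-labelled window classes
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "SGDClassifier": make_pipeline(StandardScaler(), SGDClassifier()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "XGBoost": XGBClassifier(),
    "Neural Network": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000)),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: accuracy on unseen data = {model.score(X_test, y_test):.2f}")
```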

The best accuracy on unseen data came out to be 81%, using Neural Networks. The performance can be further improved by adding more features, such as the Mel spectrum and other audio features, using the PyAudio library.
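
As one possible way to add Mel-spectrum features, the sketch below uses librosa (a swap-in here; the text above names the PyAudio library) to append mean Mel-band energies per 2-second window. The filename, number of Mel bands, and window length mirror the assumptions above and are not taken from this repository.

```python
# Illustrative Mel-band features per 2-second window, using librosa.
import numpy as np
import librosa

audio, sr = librosa.load("debate.wav", sr=None, mono=True)
window = 2 * sr

mel_features = []
for start in range(0, len(audio) - window, window):
    seg = audio[start:start + window]
    # 40 Mel bands averaged over time -> 40 extra features per window
    mel = librosa.feature.melspectrogram(y=seg, sr=sr, n_mels=40)
    mel_features.append(mel.mean(axis=1))

X_extra = np.array(mel_features)  # concatenate with X via np.hstack([X, X_extra])
```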

Where can such analysis be used? ⏳ If you have spent 1 hour watching the news, how much of it was really informational?