Acoustic Representation Toolbox

Introduction

What is the best representation of audio for machine learning and neural networks?

Applying machine learning techniques to acoustic data often requires extensive feature engineering, and every representation of audio has benefits and drawbacks. For example, raw audio minimizes the number of preprocessing assumptions and lets the neural nets get as close to the data as possible, but it can be challenging to work with because understanding audio requires spanning many orders of magnitude in time. Understanding the difference between "fast" and "vast" requires millisecond nuance, but understanding "former" in "Of whales and humans, whose voice more powerfully carries? With conversations over hundreds of kilometers, it is the former," requires context spanning thousands of milliseconds. The conventional approach is to compute an STFT-based spectrogram, yielding 2D "image" outputs that can be fed into a convolutional neural network (CNN). However, spectrogram-based approaches force a trade-off between time resolution and frequency resolution: a longer analysis window separates nearby frequencies better but blurs when events occur, and a shorter window does the opposite.
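The trade-off described above can be made concrete with a small NumPy-only sketch. It is an illustration under stated assumptions, not code from the toolbox notebooks: the helper name `stft_magnitude`, the 16 kHz sample rate, the 440 Hz test tone, and the two window lengths are all chosen here for demonstration.

```python
import numpy as np


def stft_magnitude(signal, window_length, hop_length):
    """Magnitude STFT with a Hann window; returns shape (frames, frequency bins).

    Hypothetical helper written for this example only.
    """
    window = np.hanning(window_length)
    n_frames = 1 + (len(signal) - window_length) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + window_length] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


sample_rate = 16_000                        # Hz (assumed for illustration)
t = np.arange(sample_rate) / sample_rate    # one second of sample times
signal = np.sin(2 * np.pi * 440 * t)        # 440 Hz test tone (assumed)

results = {}
for window_length in (256, 4096):           # short vs. long analysis window
    spec = stft_magnitude(signal, window_length, hop_length=window_length // 2)
    results[window_length] = {
        "shape": spec.shape,
        "freq_resolution_hz": sample_rate / window_length,  # spacing between bins
        "time_resolution_s": window_length / sample_rate,   # duration of one window
    }
    print(window_length, results[window_length])
```

With these assumed settings, the short window spaces its frequency bins tens of hertz apart while covering only a few milliseconds per frame, whereas the long window narrows the bin spacing to a few hertz at the cost of each frame spanning a quarter of a second. Neither setting gives fine resolution in both dimensions at once.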

This is why we made the Acoustic Representation Toolbox. There are numerous ways to represent audio, and here we explore a few. The toolbox is a live work in progress: we are constantly adding and refining representations.

Time and frequency domain representations
Spectrograms
Hilbert-Huang transforms
Continuous wavelet transforms
Spectral hyperresolution representations
Miscellaneous representations
Deep-learned representations (constructed via generative models)

Each of these methods is introduced and detailed in a tutorial-style module that pairs theoretical background with practical application. Our mission for this toolbox is to build a relatively comprehensive set of representation tools for optimizing acoustic feature engineering before a machine learning model is trained.

07_Animations
07_GAIA
07_GON
08_Extras
data
.gitignore
01_Introduction_to_Acoustic_Signal_Representation.ipynb
02_Spectrograms.ipynb
03_Hilbert_Huang_Transforms.ipynb
04_Continuous_Wavelet_Transforms.ipynb
05_Spectral_Hyperresolution_Representation.ipynb
06_Miscellaneous_Representations.ipynb
07a_Deep_Learned_Representations_GAIA.ipynb
07b_Deep_Learned_Representations_GON.ipynb
08_Wavelet_Scattering_Transform.ipynb
High_Level_Overview.ipynb
README.md
TFR-Spectrogram.png
venv-toolbox.ipynb

earthspecies/acoustic-representation-toolbox
