Skip to content

MaxHolmberg96/WaveGAN

Repository files navigation

WaveGAN

Implementation of the paper https://arxiv.org/pdf/1802.04208.pdf

Authors:

  • Max Holmberg
  • Joel Lidin

Sound samples

Piano sounds (several 1 second sound files stitched togheter) which was trained for ~100k update steps.

SC09 (0-9 digits) which was trained for ~320k update steps.

Kitten meows which was trained for ~100k update steps.

Dependencies

tensorflow=2.1.0
numpy=1.18.4
matplotlib=3.2.1
scipy=1.4.1
librosa=0.7.2
tqdm

In order to generate the dataset files required for training run

python dataset.py -create_piano_wav -path "dataset/piano/train" -output_path "piano.wav"
python dataset.py -create_piano_npy -path "piano.wav" -output_path "piano.npy"
python dataset.py -create_sc09_npy -path "dataset/sc09-spoken-numbers/sc09/train" -output_path "sc09.npy"

To train the model (on for example the piano dataset)

python run.py -train -dataset piano.npy -epochs 100

To continue the training and specify which logging step it should start from in tensorboard (logs to tensorboard every 10th update step, can be changed in hyperparams)

python run.py -train -continue -initial_log_step 5 -dataset piano.npy -epochs 100

To generate samples with weights, run

python run.py -generate -weights piano -n 1000 -output_path "..."

Spectrogram (9 random samples)

Real (Kittens) WaveGAN (Kittens)
Real (Piano) WaveGAN (Piano)
Real (sc09) WaveGAN (sc09)

Releases

No releases published

Packages

No packages published

Languages