DNN (nnet2) functionality (#65)
* Update for 0.8.0

* Fix bug in verbose

* Fix API for command line

* Move GP reorganization to dedicated repository

* Fix bug where some files were being skipped

* Add new silence phone

* Add debug flag to tree pdf generation

* Add another debug flag

* Fix TextGrid writing issues

Fixed a bug where rounding issues would cause TextGrids to fail to write
Fixed an issue where a small empty interval was added due to rounding

* Fix pyinstaller develop url (#12)

* fix error in #11 problem with url not found

* update requirements.txt to remove the =

* Add support for finding transcription issues

* Fix DummyArgs

* Fix failing tests

* Add checking for duplicate utterance names

Resolves #14

* Add models for non-speaker adapted triphones to output models

* Add poster citation to docs

* Fix typos

* Fix backwards compat issue

* Updates for paper

* Fix input in tests

* Make sure check_dependencies works

* Fix bug in nodict training

* Split models (#24)

Resolves #15 
Resolves #16 
Resolves #18 
Resolves #19 
Resolves #20 
Resolves #21 
Resolves #23

* Update kaldibinaries.py

* Update kaldibinaries.py

* Build updates

* Add new binary collection scripts

* Update open ngram binary script

* Fix binary script

* Update freeze script

* Catch spaces in speakers and filenames

* Update version

* Fix KeyError issue

* Fix dictionary/acoustic model compat check

Also ignore standalone '-' in text
Fix bug where not all phones in the model were being generated for
phone mapping files

* Fix tests for dictionary/acoustic model validation

* Re enable xfail for debug test

Need more binaries for Travis to test

* Fix missing path manipulations

* Update mac kaldi binary finding

* Fix kaldibinaries

* Fix binary collection scripts

* Fix for dylib loaders

* Debugging

* Only output speaker-adapted model

Unknown benefit to aligning with regular triphone model, same size as
speaker-adapted

* Re enable debug testing

* Ignore dot for drawing

* Add window size for g2p

* Fix G2PArgs

* Reorganize documentation

* Update precompiled binaries location

* Update download script for caching

* Update download script

* Update binaries script

* Add validation flag for train_g2p

* Updated kaldibinaries for mac

* Update libraries

* Add accuracy return to G2P validation

* Add logging of TG errors

* Update location of dlls

* Different way of packaging thirdparty bins

* Make bins executable

* Fix issue with dictionaries using numbers as phones

* Update hosting for precompiled thirdparty

* Documentation update

* Fix table formatting

* Fix for some bracket detection issue

* Update reference

* Create requirements.txt

* Create .travis.yml

* Update CH example docs

* Update dictionary_generating.rst

* Add warning for output directories

* Update dictionary

* Update Mac distribution note

* Add Spanish link

* Fix generate dictionary bin name

* DOCS: Mock yaml

* Better output of error messages for TextGrid errors

* Fix encoding bug for acoustic model metadata

* FIX: Missing phone error message not outputting correctly

* Possible FIX: PATH variable missing from os.environ on Windows 7

* DOCS: Update Linux installation instructions

Add external atlas package dependency information

* DOCS: Fix typo

* DOCS: Fix typos

* FIX: Raise exception when word does not have a pronunciation

* FIX: Add OOV check before MFCC gen

Resolves #31

* ENH: Add flag to not stop check (-q)

Resolves #32

* TESTS: Update aligner tests

* TESTS: Change default for skip_input

* nnet2 cleaned changes

Does not include docs

* command line update

* Tests

Commented out dummy asserts to make print

* No matplotlib

* Kaldi binaries updated

* Kaldi binaries in bin

* tests passing locally

* git ignore update

* Initial code review changes

* All tests passing locally

* Externally hosted binaries link changed

* Pointing to new binaries

* updated Travis's needed glib packages

* Travis library updates

* Another Travis library change

* apt-get update to True

* Travis updating

* auto-confirm flags

* no dist-upgrade

* no explicit upgrade

* Trying gcc 5

* gcc-5 name change

* just ppa

* dummy commit to restart Travis build

* no implicit update; back to explicit update

* toolchain moved to apt declaration

* readding dist-upgrade

* reinstalling potentially half-configured packages

* dummy commit for reconnection to servers

* removal of postgresql and upgrade-dist

* Perl replacements made more pythonic, stdout cleaned

* cleaning

* Updated linux binaries to 14.04

* Correcting ivector extractor path (to be cleaned yet more soon)

* Dummy commit to initiate Travis build

* Travis build

* Docs update & precompiled binaries to static

* Dummy commit to restart travis build

* Dummy commit for travis build

* Lowering # of iterations for testing non-timeout
a-coles authored and mmcauliffe committed Jun 14, 2018
1 parent 3f548a8 commit 2b75641
Showing 76 changed files with 3,655 additions and 179 deletions.
5 changes: 1 addition & 4 deletions .gitignore
@@ -10,11 +10,7 @@ __pycache__/
tests/data/generated
docs/source/generated

thirdparty
!thirdparty/*.md
!thirdparty/*.py

*.fst

# C extensions
*.so
@@ -36,6 +32,7 @@ var/
*.egg-info/
.installed.cfg
*.egg
thirdparty/bin

# PyInstaller
# Usually these files are written by a python script from a template
4 changes: 4 additions & 0 deletions .pytest_cache/v/cache/lastfailed
@@ -0,0 +1,4 @@
{
"tests/test_aligner.py::test_sick_ivectors": true,
"tests/test_aligner.py::test_sick_nnet_basic": true
}
3 changes: 3 additions & 0 deletions .pytest_cache/v/cache/nodeids
@@ -0,0 +1,3 @@
[
"tests/test_aligner.py::test_sick_nnet_basic"
]
6 changes: 5 additions & 1 deletion .travis.yml
@@ -12,11 +12,15 @@ dist: trusty

addons:
  apt:
    sources:
      - ubuntu-toolchain-r-test
    packages:
      - libatlas3-base
      - libstdc++6

before_install:
  - pwd
  - sudo apt-get update
  - pwd
  - sudo rm -rf /dev/shm
  - sudo ln -s /run/shm /dev/shm
  - sudo ln -s -f bash /bin/sh
68 changes: 68 additions & 0 deletions aligner/accuracy_graph.py
@@ -0,0 +1,68 @@
# Simple script to plot accuracy and log probability against training
# iteration for the nnet2 implementation in MFA.
# Should eventually be integrated with MFA (number of jobs, other parameters).

import sys
import os
import glob
import numpy as np


def get_accuracy_graph(log_dir, export_dir):
    # Imported here because the module-level matplotlib import was disabled;
    # matplotlib is only required when a graph is actually generated.
    from matplotlib import pyplot as plt

    # Build full paths instead of calling os.chdir, so relative directories
    # keep working when log_dir and export_dir are the same.
    compute_prob_logs = glob.glob(os.path.join(log_dir, 'compute_prob_train.*.log'))
    results = []  # (iteration label, log prob, accuracy)
    for log in compute_prob_logs:
        print(log)
        # File names look like compute_prob_train.<iteration>.log
        iteration = os.path.basename(log).split('.')[1]
        with open(log, 'r') as fp:
            lines = fp.readlines()
        # Walk backwards to the last line that reports the accuracy
        index = len(lines) - 1
        while index >= 0 and 'and accuracy is' not in lines[index]:
            index -= 1
        if index < 0:
            continue  # no accuracy line in this log
        fields = lines[index].split(' ')
        results.append((iteration, float(fields[8]), float(fields[12])))

    # The 'final' log is placed one step after the last numbered iteration.
    # This is resolved after the loop because glob does not guarantee order.
    numbered = [int(it) for it, _, _ in results if it != 'final']
    final_iteration = max(numbered, default=0) + 1
    acc_pairs = []
    prob_pairs = []
    for it, prob, accuracy in results:
        iteration = final_iteration if it == 'final' else int(it)
        acc_pairs.append([iteration, accuracy])
        prob_pairs.append([iteration, prob])
    acc_pairs.sort(key=lambda x: x[0])
    prob_pairs.sort(key=lambda x: x[0])

    plt.gcf().clear()
    x, y = np.array(acc_pairs).T
    plt.scatter(x, y)
    plt.title("# Iterations vs. Accuracy")
    plt.xlabel("# iterations")
    plt.ylabel("accuracy")
    plt.savefig(os.path.join(export_dir, 'accuracy.png'))

    plt.gcf().clear()
    x, y = np.array(prob_pairs).T
    plt.scatter(x, y)
    plt.title("# Iterations vs. Log Prob")
    plt.xlabel("# iterations")
    plt.ylabel("log_prob")
    plt.savefig(os.path.join(export_dir, 'log_prob.png'))


if __name__ == '__main__':
    log_dir = sys.argv[1]
    get_accuracy_graph(log_dir, log_dir)
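
Reviewer note: the script above reads the log probability and accuracy from fixed positions (fields 8 and 12) of a space-split line, which breaks if the log prefix ever gains or loses a token. A minimal sketch of a more tolerant alternative follows. The helper name `parse_prob_and_accuracy`, the regular expression, and the sample log line are illustrative assumptions and are not part of this commit; the sample line is only modeled on the phrase "and accuracy is" and the field positions the script relies on, so the real log wording may differ.

```python
import re

# Hypothetical pattern: assumes the line contains
# "probability is <number> and accuracy is <number>".
_NUMBER = r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?"
_PROB_ACC = re.compile(
    r"probability is (?P<prob>" + _NUMBER + r") and accuracy is (?P<acc>" + _NUMBER + r")"
)


def parse_prob_and_accuracy(lines):
    """Return (log_prob, accuracy) from the last matching line, or None."""
    for line in reversed(lines):
        match = _PROB_ACC.search(line)
        if match:
            return float(match.group("prob")), float(match.group("acc"))
    return None


# Assumed sample line, not copied from a real log.
sample = [
    "LOG (nnet-compute-prob:main():nnet-compute-prob.cc:91) Saw 4000 examples, "
    "average probability is -1.23 and accuracy is 0.65 with total weight 4000",
    "some trailing line without the values",
]
print(parse_prob_and_accuracy(sample))
```

Returning `None` when no line matches lets the caller skip an incomplete log instead of raising an `IndexError` partway through the loop.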