Add aligner #756

Open
wants to merge 39 commits into
base: master

vadimdddd
Contributor

@vadimdddd vadimdddd commented Nov 9, 2021

The aligner is a program that aligns the words of a transcript in time against an audio file. The Gentle project used m3.cc and k3.cc as language and acoustic models for alignment; these approaches were reworked into the aligner, which makes it possible to use different language models and speeds up the alignment process. setup.py was also changed so that the aligner can be run from any directory, not only from the folder it lives in.

How to use:

  1. Download any language model.
  2. Prepare the .wav and .txt files.
  3. When starting the program, specify the required arguments:
    a) path to the wav file; b) path to the text file; c) path to the language model.

Example (how to run):
python3 vosk_align.py example/glorious.wav example/glorious.txt example/model
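
As a rough illustration of the three required arguments, a command-line entry point could parse and check its inputs along these lines. This is only a sketch: parse_args, missing_inputs and the argument names wavfile, textfile and model are assumptions made for illustration, not taken from vosk_align.py in this PR.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the three positional arguments (hypothetical names)."""
    parser = argparse.ArgumentParser(
        description="Align the words of a transcript with an audio file.")
    parser.add_argument("wavfile", type=Path, help="path to the .wav file")
    parser.add_argument("textfile", type=Path, help="path to the .txt transcript")
    parser.add_argument("model", type=Path, help="path to the language model")
    return parser.parse_args(argv)


def missing_inputs(args):
    """Return the names of arguments whose paths do not exist on disk."""
    return [name for name in ("wavfile", "textfile", "model")
            if not getattr(args, name).exists()]


# Same arguments as in the example command above, passed explicitly.
args = parse_args(["example/glorious.wav", "example/glorious.txt", "example/model"])
print("missing inputs:", missing_inputs(args))
```

Checking the paths before loading the model gives a clear error early instead of a failure deep inside the alignment code.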

Resolved review threads (outdated):
python/vosk/aligner/language_model.py
python/aligner/example/lucier.txt
python/vosk/aligner/multipass.py (5 threads)
python/vosk/aligner/transcriber.py
python/vosk/aligner/transcription.py
@nshmyrev
Collaborator

nshmyrev commented Nov 9, 2021

I'm also waiting for the tests for the aligner so we can automatically verify the code

…duration as a parameter (input/output), set the audio file start position in the realign case via tell(), split lines for a more readable script
…ithm for obtaining a chunk's start/end indices, deleted the condition for non-existing start/end edges of a chunk, added adjustment values shift_start/end for tuning the left/right edges of a chunk
…stakes inside, also a test_align.py script with 5 tests that use pytest; vosk_align.py was modified so it can be tested from outside
…mber tokens in txt and wav files; forced_aligner.py: wavfile was added as an argument for multipass; multipass.py: now takes wavfile as an argument, and handles the case where the first or last token in the txt file is not found in the transcript or audio, as well as the case where start_pos is less than 0 because shift_start can shift it to a negative number
@nshmyrev changed the title from "add aligner(merging gentle and vosk projects)" to "Add aligner" on Dec 7, 2021
Resolved review threads (outdated):
python/vosk/aligner/full_transcriber.py
python/vosk/aligner/multipass.py (2 threads)
python/aligner/test_align.py
python/vosk/aligner/recognizer.py
…ords was added as a variable for the left/right words around NFIA or NFIT words; property names were added for non-success cases instead of numbers
… recognizer to process_text, also for forced_aligner.py and multipass.py; mistakes were added to cats.txt, dagon.txt, glorious.txt and polar.txt for testing; fixed a bug in polar.wav; deleted the unused wendy example; added log files for cats, dagon, glorious and polar for tests; fixed mistakes in test_align.py and added asserts; vosk_align.py: added logging for messages and the ability to call vosk_align.py from test_align.py

@CodeFusionFX CodeFusionFX left a comment

I'm rooting for you and your work! Please keep up the great work!

options = {
'sort_keys': True,
'indent': 4,
'separators': (',', ': '),

Also 'ensure_ascii': False,

There is also a lot of trailing whitespace in the code.

(Nice PR)
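
For context on the ensure_ascii suggestion, here is a small sketch of how these options behave when passed to json.dumps. The sample data is made up for illustration; only the option names come from the quoted snippet and the review comment.

```python
import json

# Hypothetical alignment result containing a non-ASCII word.
result = {"words": [{"word": "café", "start": 0.5, "end": 0.9}]}

options = {
    'sort_keys': True,
    'indent': 4,
    'separators': (',', ': '),
}

# By default json.dumps escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(result, **options)
# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(result, ensure_ascii=False, **options)

print("\\u00e9" in escaped)   # True: "é" was escaped
print("café" in readable)     # True: "é" was kept
```

Both strings decode to the same data; the difference only affects how readable the output is for transcripts in non-English languages.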

Contributor Author

done


for op, a, b in word_diff(hypothesis, reference):

try:

double indented (8 instead of 4 spaces)

Contributor Author

done

amount, length = unalign(words)
logging.info("%d unaligned words (of %d)", amount, length)

if amount != 0:

@dynodino dynodino Apr 15, 2022

amount is unassigned if logging is None

Also, amount != 0 is duplicated

Also, progress_cb could be None.
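
The three points in this comment could be addressed roughly as follows. This is a sketch, not the code from forced_aligner.py: unalign here is a stand-in that counts words whose case is not "success", and report_unaligned, the word dictionaries and the progress_cb payload are assumptions made for illustration.

```python
import logging


def unalign(words):
    """Stand-in helper: count words not marked as successfully aligned."""
    amount = sum(1 for w in words if w.get("case") != "success")
    return amount, len(words)


def report_unaligned(words, log=logging, progress_cb=None):
    # Compute the counts unconditionally so `amount` is always assigned,
    # even when logging is disabled (log is None).
    amount, length = unalign(words)

    if log is not None:
        log.info("%d unaligned words (of %d)", amount, length)

    # Call the callback only if one was supplied.
    if progress_cb is not None:
        progress_cb({"unaligned": amount, "total": length})

    # A single check of `amount`, instead of repeating the condition.
    return amount != 0


words = [
    {"word": "glorious", "case": "success"},
    {"word": "day", "case": "not-found-in-audio"},
]
print(report_unaligned(words, log=None))  # True: one word is unaligned
```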

Contributor Author

Hello, thanks for your help, I will fix it :)

Hello, thanks for your help, I will fix it :)

Hello, did you have any success fixing it?

Contributor Author

Hello, not for a while, but I hope to start on it after I finish my current project

Contributor Author

I have returned to the aligner project; I need to rework the code a bit

Wonderful 👍 So excited, can't wait to try it. @vadimdddd Do you have an email or another way I can contact you to collaborate? I would love to share some thoughts and ideas.

Contributor Author

@vadimdddd vadimdddd Jun 3, 2022

vadimdddd and others added 5 commits April 18, 2022 17:50
… spaces; transcription.py: added a parameter to options; forced_aligner.py: deleted the duplicated amount condition
…ed_aligner.py; vosk_align.py: get_result(args) was extracted from main() for testing; test_align.py: passing of args for testing has been changed to get_result(args)
@ryanfb

ryanfb commented Jun 30, 2022

In testing this locally, there needs to be an empty __init__.py file at python/vosk/aligner/__init__.py, otherwise I would get ModuleNotFoundError: No module named 'vosk.aligner' when trying to run vosk-aligner or vosk_align.py.
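
A quick way to spot this kind of packaging gap is to list directories that contain Python files but no __init__.py. The helper below is a sketch using only the standard library; dirs_missing_init is a hypothetical name, and the python/vosk path is taken from the comment above.

```python
from pathlib import Path


def dirs_missing_init(package_root):
    """Return directories that hold .py files but no __init__.py, sorted."""
    root = Path(package_root)
    candidates = [root, *(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for directory in candidates:
        has_py = any(f.is_file() and f.suffix == ".py" for f in directory.iterdir())
        if has_py and not (directory / "__init__.py").exists():
            missing.append(directory)
    return sorted(missing)


# Example: report gaps under python/vosk, if that directory exists here.
if Path("python/vosk").is_dir():
    for directory in dirs_missing_init("python/vosk"):
        print("missing __init__.py in", directory)
```

Creating the empty file, for example with Path("python/vosk/aligner/__init__.py").touch(), would then turn the directory into a regular package.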

@vadimdddd
Contributor Author

@ryanfb thx for the info. I will fix it.

@Laurian

Laurian commented Jan 23, 2023

I'm really interested in this PR, is there anything I can do to help?

@nshmyrev
Collaborator

@Laurian just ping me if I forget please, I'll try to merge it

@CodeFusionFX

@Laurian just ping me if I forget please, I'll try to merge it

I am very interested in this as well. What can I do to help?

@Laurian

Laurian commented Mar 30, 2023

@nshmyrev ping 🙏

@CodeFusionFX

@nshmyrev
Ping, hoping this gets merged. I have been anxiously awaiting this merge since last year. What can the community do to help?

@finnnnnnnnnnnnnnnnn

@nshmyrev
Really hoping this can get merged.

@CodeFusionFX

Is this pull request dead?

7 participants