Preserve previous result as context for next segment #1335

vsd-vector · 2024-09-10T15:16:03Z

In the online transducer recognizer, sherpa-onnx preserves the output of the decoder network (decoder_output) during reset(), but resets the model context to a sequence of blanks.

decoder_->UpdateDecoderOut(&s->GetResult());
Ort::Value decoder_out = std::move(s->GetResult().decoder_out);

auto r = decoder_->GetEmptyResult();  // initialized with an empty hyp containing blanks
...
s->SetResult(r);
s->GetResult().decoder_out = std::move(decoder_out);

After the reset, at t0 of the new segment, the beam-search decoder reuses decoder_output.

if (t == 0) {
    UseCachedDecoderOut(hyps_row_splits, *result, &decoder_out);
}

Let's assume it outputs some token Z, because Z is the most probable token given the cached decoder output (calculated for the previous context "X Y", before the reset).
Then, at step t1, the beam-search decoder calculates a new decoder_output from the current, reset context, which is now "<blank> Z".

It may happen that Z is no longer the most probable hypothesis, so the beam search switches to another path. The user sees this as "Z" flickering and then being deleted. Sometimes the switch happens only after 2-3 tokens have been output, and probably even more. Besides the user discomfort, the deleted words frequently contain the correct transcript (at least in my subjective experience).
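To make the mismatch concrete, here is a minimal sketch of how the token context slides forward when a token is emitted. It assumes a context size of 2 and placeholder token ids (0 for blank, 1, 2, 3 for X, Y, Z). SlideContext is a hypothetical helper for illustration only, not a sherpa-onnx function.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of sherpa-onnx): returns the context the
// decoder network would see after `token` has been emitted, i.e. the oldest
// token is dropped and the new one is appended.
std::vector<int64_t> SlideContext(std::vector<int64_t> context,
                                  int64_t token) {
  if (context.empty()) {
    return context;  // nothing to slide for a zero-length context
  }
  context.erase(context.begin());
  context.push_back(token);
  return context;
}
```

With these assumptions, emitting Z after a reset to blanks gives the context {0, 3} ("<blank> Z"), while the cached decoder output that produced Z was computed from {1, 2} ("X Y"). Had the context been kept, step t1 would see {2, 3} ("Y Z") instead.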

This PR fixes this by using the tokens of the previous result as "context" for the next segment, instead of caching the decoder output for one timestep.
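The idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: MakeInitialContext, its parameters, and the padding with blanks when fewer tokens are available are all hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of sherpa-onnx): builds the initial token
// context for a new segment from the tokens of the previous result.
std::vector<int64_t> MakeInitialContext(
    const std::vector<int64_t> &prev_tokens, int32_t context_size,
    int64_t blank_id) {
  // Start from an all-blank context, as a reset without history would.
  std::vector<int64_t> context(context_size, blank_id);

  // Copy at most the last `context_size` tokens, right-aligned, so the
  // most recent token ends up in the last position.
  int32_t n = std::min<int32_t>(context_size,
                                static_cast<int32_t>(prev_tokens.size()));
  std::copy(prev_tokens.end() - n, prev_tokens.end(), context.end() - n);
  return context;
}
```

With a context size of 2, a previous result of {5, 7, 9} yields {7, 9}, and a previous result with a single token {5} yields {0, 5} when blank_id is 0. The decoder output at t0 of the new segment is then computed from the same kind of context that later steps will extend, rather than from a cached value tied to a context that no longer exists.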

csukuangfj · 2024-09-11T02:44:07Z

Thank you for your contribution!

Preserve previous result as context for next segment

4467fbe

csukuangfj merged commit fa20ae1 into k2-fsa:master Sep 11, 2024
190 of 203 checks passed
