
Performance regression and Context leak in examples/zipformer.rs since v0.5.0 #45

Open
solaoi opened this issue Oct 31, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@solaoi

solaoi commented Oct 31, 2024

Description

After upgrading from v0.4.1 to v0.5.0, we've observed two significant issues in examples/zipformer.rs:

  1. Significant performance degradation: Response times have increased substantially
  2. Context leak: The application now shows Context leak detected, msgtracer returned -1 messages in logs

Previous Behavior (v0.4.1)

  • The example worked normally with expected response times
  • No Context leaks were detected

Current Behavior (v0.5.0)

  • Significantly slower response times
  • Context leaks are being detected as shown in the logs below

Logs

cargo run --example zipformer -- "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/test_wavs/1.wav" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/encoder-epoch-99-avg-1.onnx" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/decoder-epoch-99-avg-1.onnx" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/joiner-epoch-99-avg-1.onnx" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/tokens.txt" 
   Compiling sherpa-rs-sys v0.5.0 (/Users/koaoyama/Projects/solaoi/sherpa-rs/sys)
   Compiling sherpa-rs v0.5.0 (/Users/koaoyama/Projects/solaoi/sherpa-rs)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.70s
     Running `target/debug/examples/zipformer sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/test_wavs/1.wav sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/encoder-epoch-99-avg-1.onnx sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/decoder-epoch-99-avg-1.onnx sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/joiner-epoch-99-avg-1.onnx sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/tokens.txt`
Context leak detected, msgtracer returned -1
Context leak detected, msgtracer returned -1
Context leak detected, msgtracer returned -1
✅ Text: 気象庁は雪や路面の凍結による交通への影響暴風雪や高波に警戒するとともに雪崩や屋根からの落雪にも十分注意するよう呼びかけています
@thewh1teagle
Owner

Hi @solaoi
What OS and architecture are you using?
Are you using the example exactly as in this repository, with the same model?
If it's macOS, can you compare the CPU vs CoreML provider?

@solaoi
Author

solaoi commented Oct 31, 2024

@thewh1teagle
Thanks for the quick response!

I'm using an M2 Mac, and yes, I'm using exactly the same model and example from this repository.
I've found that the issue occurs specifically with the CoreML provider, while using the CPU provider works fine.

Here are my exact steps to reproduce:

# Get the code and switch to v0.5.0
git fetch
git checkout -b v0.5.0 refs/tags/v0.5.0
git submodule update --init --recursive

# Download and extract the model
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2
tar xvf sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2
rm sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2

# Run the example
cargo run --example zipformer -- "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/test_wavs/1.wav" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/encoder-epoch-99-avg-1.onnx" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/decoder-epoch-99-avg-1.onnx" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/joiner-epoch-99-avg-1.onnx" "sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/tokens.txt"

@ghjdegithub
Contributor

@solaoi
Try examples/non_streaming_decode_files.rs from https://github.com/ghjdegithub/sherpa-rs/tree/dev and see if the same problem happens.

@solaoi
Author

solaoi commented Nov 4, 2024

@ghjdegithub
Thank you for the suggestion.

I tried running your examples/non_streaming_decode_files.rs with the following command using the CPU, and it worked correctly:

cargo run --example streaming_decode_files -- \
  --paraformer ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \
  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \
  --model-type paraformer \
  --rule-fsts ./itn_zh_number.fst \
  --debug 0 \
  ./itn-zh-number.wav \
  --provider cpu

However, when I enabled CoreML with the command below:

cargo run --example streaming_decode_files -- \
  --paraformer ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \
  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \
  --model-type paraformer \
  --rule-fsts ./itn_zh_number.fst \
  --debug 0 \
  ./itn-zh-number.wav \
  --provider coreml

I encountered the following error:

Reading ./itn-zh-number.wav
Initializing recognizer (may take several seconds)
Recognizer created!
Start decoding!
2024-11-04 22:14:37.433 non_streaming_decode_files[26702:7208661] 2024-11-04 22:14:37.433412 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running CoreML_15675115799816541534_7 node. Name:'CoreMLExecutionProvider_CoreML_15675115799816541534_7_7' Status Message: Error executing model: Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h:DecodeStreams:191 

Caught exception:

Non-zero status code returned while running CoreML_15675115799816541534_7 node. Name:'CoreMLExecutionProvider_CoreML_15675115799816541534_7_7' Status Message: Error executing model: Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).

Return an empty result
Decoding done!
thread 'main' panicked at examples/non_streaming_decode_files.rs:231:38:
called `Option::unwrap()` on a `None` value

As a side note, the example command for examples/non_streaming_decode_files.rs seems to contain a typo. The documentation lists it as:

cargo run --example streaming_decode_files -- \
  --paraformer ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \
  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \
  --model-type paraformer \
  --rule-fsts ./itn_zh_number.fst \
  --debug 0 \
  ./itn-zh-number.wav

However, it should be --example non_streaming_decode_files instead.

@ghjdegithub
Contributor

@solaoi It does seem to be a mistake; the `non_` prefix is missing. The model you used in the CoreML error above is not supported; try another model. This example accepts a full set of parameter inputs.

@solaoi
Author

solaoi commented Nov 4, 2024

@ghjdegithub
I see. I tried running it with the model I mentioned earlier:

cargo run --example non_streaming_decode_files -- \
  --encoder ./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/encoder-epoch-99-avg-1.onnx \
  --decoder ./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/decoder-epoch-99-avg-1.onnx \
  --joiner ./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/joiner-epoch-99-avg-1.onnx \
  --tokens ./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/tokens.txt \
  --debug 0 \
  --provider coreml \
  ./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/test_wavs/1.wav

However, I still encountered the same error (`Context leak detected, msgtracer returned -1`):

Initializing recognizer (may take several seconds)
Recognizer created!
Start decoding!
Context leak detected, msgtracer returned -1
Context leak detected, msgtracer returned -1
Context leak detected, msgtracer returned -1
Decoding done!
Text: 気象庁は雪や路面の凍結による交通への影響暴風雪や高波に警戒するとともに雪崩や屋根からの落雪にも十分注意するよう呼びかけています
Emotion: 
Lang: 
Event: 
Timestamp: 0.0
Timestamp: 0.48
Timestamp: 0.64
Timestamp: 0.88
Timestamp: 1.24
Timestamp: 1.4399999
Timestamp: 1.8
Timestamp: 2.0
Timestamp: 2.12
Timestamp: 2.3999999
Timestamp: 2.56
Timestamp: 2.8
Timestamp: 2.96
Timestamp: 3.04
Timestamp: 3.4399998
Timestamp: 3.6
Timestamp: 3.8799999
Timestamp: 4.0
Timestamp: 4.2799997
Timestamp: 4.4
Timestamp: 4.7599998
Timestamp: 4.96
Timestamp: 5.2
Timestamp: 5.4
Timestamp: 5.72
Timestamp: 5.92
Timestamp: 6.16
Timestamp: 6.48
Timestamp: 6.64
Timestamp: 6.8799996
Timestamp: 6.96
Timestamp: 7.08
Timestamp: 7.2799997
Timestamp: 7.48
Timestamp: 7.64
Timestamp: 8.0
Timestamp: 8.16
Timestamp: 8.36
Timestamp: 8.679999
Timestamp: 8.8
Timestamp: 9.04
Timestamp: 9.12
Timestamp: 9.28
Timestamp: 9.639999
Timestamp: 9.8
Timestamp: 10.0
Timestamp: 10.16
Timestamp: 10.44
Timestamp: 10.639999
Timestamp: 10.92
Timestamp: 11.04
Timestamp: 11.24
Timestamp: 11.36
Timestamp: 11.5199995
Timestamp: 11.639999
Timestamp: 11.88
Timestamp: 11.92
Timestamp: 12.16
Timestamp: 12.28
Timestamp: 12.44
Timestamp: 12.639999
Timestamp: 13.16
Timestamp: 13.2
Token: 気
Token: 象
Token: 庁
Token: は
Token: 雪
Token: や
Token: 路
Token: 面
Token: の
Token: 凍
Token: 結
Token: に
Token: よ
Token: る
Token: 交
Token: 通
Token: へ
Token: の
Token: 影
Token: 響
Token: 暴
Token: 風
Token: 雪
Token: や
Token: 高
Token: 波
Token: に
Token: 警
Token: 戒
Token: す
Token: る
Token: と
Token: と
Token: も
Token: に
Token: 雪
Token: 崩
Token: や
Token: 屋
Token: 根
Token: か
Token: ら
Token: の
Token: 落
Token: 雪
Token: に
Token: も
Token: 十
Token: 分
Token: 注
Token: 意
Token: す
Token: る
Token: よ
Token: う
Token: 呼
Token: び
Token: か
Token: け
Token: て
Token: い
Token: ま
Token: す
Wave duration: 13.433 seconds

I also tried running it with the CPU, and that worked without any issues.
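
For anyone wanting to quantify the difference between the two providers, they can be timed back to back. Below is a dry-run sketch: it only prints the commands (remove the leading `echo` to actually execute them), and it assumes the model directory layout from the reproduction steps earlier in the thread. The `--provider` flag is the one used in the commands above.

```shell
#!/bin/sh
# Dry-run sketch: print a timed decode command for each provider.
# Remove the leading "echo" to actually run the comparison.
# Assumes the model was extracted as in the reproduction steps above.
MODEL_DIR=sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01
for provider in cpu coreml; do
  echo time cargo run --release --example non_streaming_decode_files -- \
    --encoder "$MODEL_DIR/encoder-epoch-99-avg-1.onnx" \
    --decoder "$MODEL_DIR/decoder-epoch-99-avg-1.onnx" \
    --joiner "$MODEL_DIR/joiner-epoch-99-avg-1.onnx" \
    --tokens "$MODEL_DIR/tokens.txt" \
    --provider "$provider" \
    "$MODEL_DIR/test_wavs/1.wav"
done
```

Using `--release` avoids conflating any CoreML-specific slowdown with unoptimized debug-build times (the original log above shows a `dev` profile build).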

Additionally, I realize I didn’t clarify my point well in the previous message.
I was referring specifically to this part: ghjdegithub#1

@thewh1teagle thewh1teagle added the bug Something isn't working label Nov 23, 2024