-
Notifications
You must be signed in to change notification settings - Fork 58
DataRecognitionClient is slow with results (6 secs or greater) #61
Comments
here are the logcats: of particular interest is the section of the logcat in the slow file: 10-24 16:47:59.463 Samsung SM-T713 Info 17248 NLPLAT ConnectionStateChanged |
How long is the audio? What's the delay to the first partial result? The speech service is designed and optimized for the recognition of audio spoken by humans “in real time” (“in real time” = recognizing audio that is streamed to the service at the rate that a human can speak that audio). The fastest it can process the bulk audio is ca. 2x the playback speed. If audio size if ~ 12 seconds, the final results will not be fired sooner than 6 seconds after the audio has been uploaded. |
Looking closer, it takes 6 seconds to receive the Upgrade from the service. Weird. The code snippet above does not look like valid java. Could you try the unmodified sample and verify that you still experience the delay? |
The audio is approx 5 secs long. Previously I was able to get sub second responses from the service with no problem. That code snipit was not meant to be valid java. Rather it was an example of how I am implementing the SDK. I have downloaded the sample and ran it with the same results. Can you do the same and verify that there is an issue? Do you have any idea why the upgrade request would take so long? I am using the speechrecognition-1.2.2.aar just like the demo. I was reading the web sockets doc and a couple things stood out. Does the client SDK handle the requirements laid out for web sockets in this doc? Specifically Is there some way to crank up the logging coming from the NPLAT in logcat? I am trying to verify that all of the above criteria is met to zero in on the bug. If the sdk supports cookies, how is it doing so. Do I need some other permission in my app in order to store the cookie? I'm pretty sure that second faster attempt mentioned in my initial post is being considered a retry from the web sockets point of view and is using the first, slower initializing connection. |
Documented requirements apply to the new protocol. The SDK is built on top of a legacy protocol. Second retry is faster, because it uses the same WS connection (which expires in 30 seconds). I cannot repro the issue. Looking at the service logs. |
Please keep in mind that while the demo app works when downloaded and the keys are replaced, the issue we are attempting to solve is the slow connection/response/translation. IOW, pulling down the android demo app, and entering my app key then running it will get you a very slow response when you click the speech to text button |
@raaaar any progress on this issue? |
Nothing yet... It's definitely a service issue, will escalate. |
Thank you for the update. Once its fixed, let me know and I will be happy to test. ;) Thanks Again |
@BeanStalka Could you please try this JavaScript sample and see if you can reproduce the issue? If you do, the log from the browser console would help a lot to get to the bottom of this. |
The javascript version of the web sockets is much faster. that looks and feels like a 400ms round trip. The speeds on my android tablet have gotten better. they have gone from 6 seconds every time to around 4 seconds. Attached please find the latest logcats for a fast and slow response |
Turns out, there's a bug in how the non-windows sdk code converts guids to strings and prints them out. In place of all the 'S' symbols in the logcat output ("NLPLAT ImpressionId: S", "Response request id: 'S'", etc.) there should've been proper guids. As a result, the Android doesn't log metadata it receives from the server and whatever it logs is not logged on the server side (at least, I wasn't able to get anywhere based on the logcat info). I know that this is getting beyond ridiculous, but can you try the windows sample (https://github.com/Azure-Samples/Cognitive-Speech-STT-Windows), which does log server replies and if you see a delay there, post the Response Impression Id? |
I'm confused. The newslow.txt I attached contains this: 10-27 12:09:41.957 Samsung SM-T713 Info 7679 NLPLAT Create ImpressionId: b6221717220a668d61182ca5d4ed1087 It would seem you have an impression id to investigate the issues I am seeing with the Android SDK. P.S. - I downloaded and tried the windows demo with no issues. The translation was snappy and there was hardly any wait. Please advise |
This is a request impression id, (guid is generated locally by the client). For some obscure historical reason, this is not logged by the server side (or, not in a way to make the lookup straight-forward). The whole thing is, unfortunately, confusing. At any rate, raised a ticket, hoping that the service team will be able to triage based on the available info. What timezone are you based in? |
Raaaar, I am in the Eastern United States time zone. When you get a ticket going, how can I track its progress? Thanks, Bean |
Raaaar, I am seeing this issue again. Any update to the SDK? |
Hi, |
When using the DataRecognitionClient it is taking quite a long time to get any results. On Average 6 seconds.
The first call is always slow.
If I make another call in rapid succession it will return in around a second.
If I wait more than 30 seconds between calls they will both average 6 seconds from when I call
SendAudio to when OnFinalResponseReceived is called.
Steps to reproduce:
1.) Instanatiate a DataRecognitionClient.
dataRecognitionClient = SpeechRecognitionServiceFactory.CreateDataClient(SpeechRecognitionMode.ShortPhrase, Locale, this, ApiKey);
2.) Supply it an audio format through the DataRecognitionClient via SendAudioFormat
var audioFormat = SpeechAudioFormat.Create16BitPCMFormat(16000);
dataRecognitionClient.SendAudioFormat(audioFormat);
3.) Send raw PCM data through the DataRecognitionClient via SendAudio
dataRecognitionClient.SendAudio(rawData, rawData.Length);
4.) Time how long it takes to hit the OnFinalResponseReceived call back.
If I had to guess its an issues opening and closing web sockets. This is just a guess though.
The text was updated successfully, but these errors were encountered: