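In WebRTC P2P mode, why does the digital human still speak the tail end of the previous question after a refresh? (I have already cleared the inference queue) #374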

Open
kkkwjr opened this issue Feb 11, 2025 · 5 comments

Comments

@kkkwjr

kkkwjr commented Feb 11, 2025

No description provided.

@chentiejin1

Same question here, I'm seeing the same problem. Is there a way to fix it?

@lipku
Owner

lipku commented Feb 13, 2025

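Are you using the latest code? Does this happen on the second connection? It should no longer occur, because the object is now recreated.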

@xing-xing448

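It comes from the warm-up step: the circular buffer that maintains the audio features still contains the last few audio frames of the previous clip. Zeroing that buffer fixes it.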
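
To illustrate the effect described above, here is a minimal sketch of a fixed-size sliding window over per-frame audio features. `FeatureWindow`, its size of 8, and the 4-dimensional features are hypothetical names and values chosen for illustration; this is not the project's actual buffer code.

```python
import numpy as np


class FeatureWindow:
    """Hypothetical fixed-size sliding window over per-frame audio features."""

    def __init__(self, size: int, dim: int):
        self.buf = np.zeros((size, dim), dtype=np.float32)

    def push(self, feat: np.ndarray) -> None:
        # Shift every row up by one, then overwrite the last row with the newest feature.
        self.buf = np.roll(self.buf, -1, axis=0)
        self.buf[-1] = feat

    def reset(self) -> None:
        # Zero every row so nothing from the previous utterance survives.
        self.buf[:] = 0.0


window = FeatureWindow(size=8, dim=4)
for i in range(1, 21):  # 20 frames standing in for the previous utterance
    window.push(np.full(4, i, dtype=np.float32))

stale = window.buf.copy()  # snapshot: the window still holds the last 8 old frames
window.reset()             # after this, the next utterance starts from zeros
```

Without the `reset()` call, the first outputs computed for the next utterance would still see rows left over from the previous one, which is the kind of leftover "tail" reported in this issue.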

@lipku
Owner

lipku commented Feb 16, 2025

Which part of the code is it, exactly? Could you open a PR?

@xing-xing448

xing-xing448 commented Feb 16, 2025

> Which part of the code is it, exactly? Could you open a PR?

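I just took a look. The state maintenance of self.feat_queue in nerfasr.py needs to change: after each sentence has been inferred, the 8 most recent features should be zeroed (I haven't looked at the other ASR implementations, so I don't know whether they have the same problem). That change is a bit of a hassle, though, so as a workaround you can append a 0.56 s silent segment to the audio of each inferred sentence: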

def stream_tts(self,audio_stream,msg):
    text,textevent = msg
    first = True
    for chunk in audio_stream:
        if chunk is not None and len(chunk)>0:          
            #stream = np.frombuffer(chunk, dtype=np.int16).astype(np.float32) / 32767
            #stream = resampy.resample(x=stream, sr_orig=32000, sr_new=self.sample_rate)
            byte_stream=BytesIO(chunk)
            stream = self.__create_bytes_stream(byte_stream)
            streamlen = stream.shape[0]
            idx=0
            while streamlen >= self.chunk:
                eventpoint=None
                if first:
                    eventpoint={'status':'start','text':text,'msgenvent':textevent}
                    first = False
                self.parent.put_audio_frame(stream[idx:idx+self.chunk],eventpoint)
                streamlen -= self.chunk
                idx += self.chunk

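            # Added lines: pad with 28 silent chunks (about 0.56 s in total) so the
            # most recent features end on silence instead of the sentence's last words.
            # float32 matches the dtype of the final end-of-sentence frame below.
            for _ in range(28):
                self.parent.put_audio_frame(np.zeros(self.chunk, np.float32), None)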

    eventpoint={'status':'end','text':text,'msgenvent':textevent}
    self.parent.put_audio_frame(np.zeros(self.chunk,np.float32),eventpoint)
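
The figure of 28 chunks follows from the chunk length. Below is a minimal sketch of that arithmetic, assuming a 16 kHz sample rate and 320-sample (20 ms) chunks; those two values are assumptions picked because they agree with 28 chunks being 0.56 s, and `silence_frames` is a hypothetical helper, not part of the project.

```python
import numpy as np


def silence_frames(duration_ms: int, chunk_samples: int, sample_rate: int) -> list:
    """Return enough all-zero float32 chunks to cover duration_ms of silence."""
    total_samples_x1000 = duration_ms * sample_rate  # sample count, scaled by 1000
    # Integer ceiling division avoids floating-point rounding surprises.
    n_chunks = -(-total_samples_x1000 // (1000 * chunk_samples))
    return [np.zeros(chunk_samples, dtype=np.float32) for _ in range(n_chunks)]


# Assumed values: 16 kHz audio split into 320-sample chunks (20 ms each).
frames = silence_frames(duration_ms=560, chunk_samples=320, sample_rate=16000)
print(len(frames))
```

If the real chunk size or sample rate differs, the chunk count needed for the same 0.56 s changes accordingly, so deriving it from the duration may be safer than hard-coding 28.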

4 participants