-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ChannelVolume usage causes crackling/distortion #680
Comments
This occurs in both release / non-release mode (although more evident in release). Particularly evident during 13 to 20s section. |
can you try out PR #669? That might fix it. |
I'm wondering is this simply a miss on the stereo->mono conversion algorithm? The way it works at the moment is the samples are all summed together (https://github.com/RustAudio/rodio/blob/master/src/source/channel_volume.rs#L36-L45) A simple algorithm for stereo->mono conversion is to average all of the samples so perhaps (for stereo) we are just missing a division by two or for another number of channels a division by the number of channels. |
As far as my ears can hear it that seems to fix it or at least come very close to fixing it. |
@dvdsk Thanks for the input - what's your thinking on the timelines for the next release of rodio? We're planning on moving to production next week with our app and so will be running off a Rodio fork with a few of these changes (ChannelVolume temporary fix or ChannelRouter and this f32 change). Happy to understand how we can support at all on the development - we're really interested in the changes to the resampler (e.g. Rubato) and ChannelRouter. |
I agree this does not sound correct. I am still kinda surprised it gives crackling issues. A f32 (channel volume works with f32) can represent integers up tot 24bit. So an i16 (current decoder type) sample doubled takes up a maximum of 17 bits. So the volume would be louder then intended but not crackling. It could be that the sample is converted back to an i16 if the output does not support taking floats but then the conversion should scale it back down.
Nice, though I am not fully sure how it helps :) We are probably changing the whole audio pipelines to f32 soon, that might help further. There is also work being done on ChannelRouter which might deprecate ChannelVolume if channel volume gives no clear perf advantage.
Ill be honest that will take a while. The next release will be the first breaking in a long time, we try to put all the breaking changes in one release that way users only have to upgrade their code once. Though given the sheer number of changes I am starting to doubt that plan. In any case we do keep the main (master) branch pretty stable, you should be able to use that especially if you pin to a specific commit. Running a fork with cherry picks is probably your best bet.
For now the rubato effort is paused till we decide whether to go with f32's or not, because that change would massively simplify the work. I'm not sure about the rest. Thanks for the support it would help if you could figure out what is going wrong regarding this issue. It might not be easy, you can ask me for help if you run into a wall. |
From what I can see from ChannelVolume is that it will return samples of the same type that they are provided in and not f32. Given that the mp3 decoder provides samples in i16 could this be the issue? If we are summing two i16 samples then we will be reaching the maximum value the resulting i16 can hold. I'm still working through this to be sure but this seems to be the case. |
Adding in some debug logs and after Mono conversion there are quite a few samples that do reach the bounds of maximum i16 so I think that is what is happening. |
I'm certain this is the issue here, in Rodio ChannelVolume uses In comparison awedio's ChannelCountConverter (https://github.com/10buttons/awedio/blob/56c7ac3810d6593a823bba4cd28ae67bcf9852bb/src/sounds/wrappers/channel_count_converter.rs#L128) which operates on i16s (and only Stereo->Mono) first casts both samples to i32 and then averages them (sums together and divides by 2) which will prevent a clip. Moving to f32 will help but I believe can still overflow/clip as we are not averaging but instead summing. I would suggest we also adjust this algorithm to divide by the channel count to ensure we are averaging and we don't overflow. If you're happy with this I will make a PR. |
correct, however by using the new decoders pr the ChannelVolume got f32 as input.
f32 should have plenty headroom for the sum right?
even if we do not overflow it still throws off the volume
Yes please! :) |
The I think channel mixer from #653 does proper mixing. |
We are using ChannelVolume to output Sources in mono on multichannel devices.
We have noticed that when using ChannelVolume the audio is distorted and crackly versus when not using ChannelVolume.
Code sample for good audio:
Code sample for distortion:
Sample mp3: https://github.com/will3942/test-rodio/raw/refs/heads/main/assets/sample.mp3
The text was updated successfully, but these errors were encountered: