Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stuffing always audible #89

Closed
joerg-krause opened this issue Jun 18, 2015 · 97 comments
Closed

Stuffing always audible #89

joerg-krause opened this issue Jun 18, 2015 · 97 comments

Comments

@joerg-krause
Copy link

When streaming from iTunes to my custom device I always hear small cracks when shairport-sync does basic stuffing. I changed several parameters but nothing helped. The cracks are also audible when I change the volume. You write in the README that

The number of frames inserted or deleted is so small as to be almost inaudible.

The cracks are very good to hear when playing a sinus tone.

How can I influence the number of frames inserted/deleted? Or do you have another idea what might be wrong here?

@mikebrady
Copy link
Owner

You are right that you can hear small crackles when Shairport Sync does basic stuffing on a sine wave. I can't really hear it in normal use, TBH. Using soxr stuffing helps a lot, as you might imagine.

To minimise it, I suggest that you try to minimise the amount of stuffing necessary. If you enable the generation of statistics, you can see some figures, including "net corrections" and "corrections" in parts per million. A correction is a stuffing action – an insertion or deletion of a single audio frame in basic stuffing mode. If the number of "corrections" is greater than the number of "net corrections", then the system is over-correcting and doing more stuffing than is necessary. To reduce overcorrection, increase the level of "drift" permitted – this essentially allows the synchronisation to drift for longer before attempting to correct it using a stuffing action.

@mikebrady
Copy link
Owner

When you say the "cracks are also audible when I change the volume", I suspect that some other phenomenon might be responsible for that. Is it possible that it is a kind of electrical noise? I notice that DACs and audio cables can easily pick up electrical interference from the power supply or nearby WiFi transceivers, and volume control changes are accompanied by extra network and processor activity, possibly generating electrical noise at the instant of volume control change. You could possibly test this by making the drift value insanely large, to stop stuffing, and then see if the volume-control-associated crackles remain...

@mikebrady
Copy link
Owner

Just FYI, here is a link to a thread that, among other stuff, discusses the crackle: abrasive#296 (comment)

@joerg-krause
Copy link
Author

Thanks for linking the thread - very interesting, although I do not understand everything.

I changed the drift to 65535 and the crackles occur exactly when changing the volume.

@joerg-krause
Copy link
Author

I plotted the spectrum of the output for the sine wave with a spectrum analyzer. This is a time-based graph were you can see that the amplitude gets much higher when changing the volume:

https://www.dropbox.com/s/wf9x9ujw14bcdtb/wat1_017.jpg?dl=0

@joerg-krause
Copy link
Author

When playing the sine wave with UPnP and changing the volume I can also here the crackles, though not so clearly.

@mikebrady
Copy link
Owner

Hmm. I'm afraid I can't interpret the graph you sent. I don't really know what can be done about the crackle due to volume control changing.

Were you able to reduce the amount of crackle due to stuffing by increasing the drift value? Did it have any useful effect?

@joerg-krause
Copy link
Author

The x axes shows the frequency, the y axes the time. The red line is 1000 Hz. The light-blue parts in the upper half of the graph shows an increased ampiltude when changing the volume. However, I am not sure how it connects to the audible crackles.

When increasing the drift value the crackles start later. For the default value it starts immediately, if I set it to 800 frames it starts about 40 - 60 seconds and does not stop after that.

@mikebrady
Copy link
Owner

That sounds right regarding the drift setting. Tell me, can you hear the crackles due to stuffing in normal programme material?

Regarding the volume control crackles, I'm not able to come up with anything much, except that it might be a function of whatever DSP is being done – after all, a sudden volume change is a bit like applying a step function, which could result in lots of harmonics, etc.

@joerg-krause
Copy link
Author

Yeah! After changing my ALSA configuration it works much better now! The sine wave plays without crackles now.

I added a dmixer (software mixer) and changed period_size and buffer_size. Surpisingly, no corrections are done now by shairport-sync (version 2.2.4, running with default options):

Sync error: 7524.9 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 222; too late packets 0; resend requests 85; min DAC queue size 364, min and max buffer occupancy 208 and 269.

No stuffing, but underrun errors occur which are audible as small crackles:

ALSA lib pcm.c:7905:(snd_pcm_recover) underrun occurred
Error -32 in delay(): Broken pipe. Delay reported is 0 frames.
Error getting dac_delay at start of loop.
ALSA lib pcm.c:7905:(snd_pcm_recover) underrun occurred
Error -32 in delay(): Broken pipe. Delay reported is 0 frames.
Error getting dac_delay at start of loop.

Furthermore, changing the volume produces almost no audible crackles now.

@joerg-krause
Copy link
Author

Okay, obviously no stuffing is done now because I set the buffer_size for the dmixer to 4096 wich is smaller then DAC_BUFFER_QUEUE_MINIMUM_LENGTH (5000). Where did you get this value from?

@mikebrady
Copy link
Owner

I picked DAC_BUFFER_QUEUE_MINIMUM_LENGTH to be 5000 to stop low-powered devices from completely choking if they were stuffing or stripping.

The sync error on your statistics looks extremely large, and the number of late packets and resend requests are high too. Here are some stats from some of my setups:

iMac on ethernet to base station to Raspberry PI on WiFi with a Topping TP30 amp and with 880 frame "drift":

Jun 21 15:23:39 kitchen shairport-sync[2319]: Play connection from user agent "iTunes/12.1.2 (Macintosh; OS X 10.10.4)".
Jun 21 15:23:39 kitchen shairport-sync[2319]: Active-Remote string seen: "337530437".
Jun 21 15:23:39 kitchen shairport-sync[2319]: DACP-ID string seen: "F18FE780B35CFD8D".
Jun 21 15:23:39 kitchen shairport-sync[2319]: Connection from IPv4: 192.168.2.22:58901
Jun 21 15:24:11 kitchen shairport-sync[2319]: Sync error: 1.3 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 4568, min and max buffer occupancy 218 and 269.
Jun 21 15:24:41 kitchen shairport-sync[2319]: Sync error: 15.1 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 21573, min and max buffer occupancy 218 and 220.
...
Jun 21 15:59:41 kitchen shairport-sync[2319]: Sync error: 52.9 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 1; too late packets 0; resend requests 1; min DAC queue size 21612, min and max buffer occupancy 216 and 220.
Jun 21 16:00:01 kitchen shairport-sync[2319]: RTSP shutdown requested.

That's right – no corrections whatever, 1 missing packet.

Here's another:

iPhone 6 on 5GHz WiFi to base station to Raspberry Pi on 2.4 GHz WiFi with 880 frame "drift":

Jun 21 16:41:30 kitchen shairport-sync[2319]: new RTSP connection.
Jun 21 16:41:31 kitchen shairport-sync[2319]: Encrypted session requested
Jun 21 16:41:31 kitchen shairport-sync[2319]: Play connection from device named "Mike Brady's iPhone 6".
Jun 21 16:41:31 kitchen shairport-sync[2319]: Play connection from user agent "AirPlay/220.68".
Jun 21 16:41:31 kitchen shairport-sync[2319]: Active-Remote string seen: "3432179597".
Jun 21 16:41:31 kitchen shairport-sync[2319]: DACP-ID string seen: "1D417F22EDBF20D3".
Jun 21 16:41:31 kitchen shairport-sync[2319]: Connection from IPv4: 192.168.2.21:62799
Jun 21 16:42:03 kitchen shairport-sync[2319]: Sync error: -527.9 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 4617, min and max buffer occupancy 170 and 234.
Jun 21 16:42:33 kitchen shairport-sync[2319]: Sync error: -379.0 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 21044, min and max buffer occupancy 174 and 190.
Jun 21 16:43:03 kitchen shairport-sync[2319]: Sync error: -239.6 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 21256, min and max buffer occupancy 183 and 189.
Jun 21 16:43:33 kitchen shairport-sync[2319]: Sync error: -103.4 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 21404, min and max buffer occupancy 171 and 190.
Jun 21 16:44:03 kitchen shairport-sync[2319]: Sync error: 47.4 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 0; too late packets 0; resend requests 0; min DAC queue size 21500, min and max buffer occupancy 181 and 189.
Jun 21 16:44:32 kitchen kernel: [ 5390.758017] r8712u 1-1.2:1.0 wlan0: r8712u: nRemain_Length is 1020 and nSubframe_Length is: 23510
...
Jun 21 16:55:02 kitchen shairport-sync[2319]: Sync error: 866.9 (frames); net correction: -102.8 (ppm); corrections: 102.8 (ppm); missing packets 0; late packets 4; too late packets 0; resend requests 4; min DAC queue size 22101, min and max buffer occupancy 181 and 191.
Jun 21 16:55:32 kitchen shairport-sync[2319]: Sync error: 866.2 (frames); net correction: -116.4 (ppm); corrections: 116.4 (ppm); missing packets 0; late packets 4; too late packets 0; resend requests 4; min DAC queue size 22097, min and max buffer occupancy 172 and 191.
Jun 21 16:56:02 kitchen shairport-sync[2319]: Sync error: 865.5 (frames); net correction: -104.3 (ppm); corrections: 104.3 (ppm); missing packets 0; late packets 36; too late packets 0; resend requests 13; min DAC queue size 22139, min and max buffer occupancy 182 and 206.

Finally:
iMac on ethernet to base station to TP-Link TL-WR710N with "3D Sound" USB DAC, running trunk OpenWrt ( CHAOS CALMER (Bleeding Edge, r45907)) and Shairport Sync 2.3.3 on 2.4 GHz WiFi.

Sun Jun 21 16:59:02 2015 daemon.info shairport-sync[969]: Sync error: -25.7 (frames); net correction: 12.1 (ppm); corrections: 12.1 (ppm); missing packets 0; late packets 40; too late packets 0; resend requests 22; min DAC queue size 4696, min and max buffer occupancy 205 and 268.
Sun Jun 21 16:59:32 2015 daemon.info shairport-sync[969]: Sync error: -70.3 (frames); net correction: 80.9 (ppm); corrections: 80.9 (ppm); missing packets 0; late packets 41; too late packets 0; resend requests 23; min DAC queue size 21267, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:00:02 2015 daemon.info shairport-sync[969]: Sync error: -69.3 (frames); net correction: 82.4 (ppm); corrections: 82.4 (ppm); missing packets 0; late packets 64; too late packets 0; resend requests 46; min DAC queue size 21367, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:00:32 2015 daemon.info shairport-sync[969]: Sync error: -69.4 (frames); net correction: 85.4 (ppm); corrections: 85.4 (ppm); missing packets 0; late packets 88; too late packets 0; resend requests 71; min DAC queue size 21280, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:01:02 2015 daemon.info shairport-sync[969]: Sync error: -71.2 (frames); net correction: 84.7 (ppm); corrections: 84.7 (ppm); missing packets 0; late packets 118; too late packets 0; resend requests 101; min DAC queue size 21264, min and max buffer occupancy 204 and 221.
Sun Jun 21 17:01:32 2015 daemon.info shairport-sync[969]: Sync error: -70.9 (frames); net correction: 78.6 (ppm); corrections: 78.6 (ppm); missing packets 0; late packets 142; too late packets 0; resend requests 121; min DAC queue size 21177, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:02:02 2015 daemon.info shairport-sync[969]: Sync error: -69.0 (frames); net correction: 83.9 (ppm); corrections: 83.9 (ppm); missing packets 0; late packets 161; too late packets 0; resend requests 140; min DAC queue size 21187, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:02:32 2015 daemon.info shairport-sync[969]: Sync error: -69.2 (frames); net correction: 82.4 (ppm); corrections: 83.9 (ppm); missing packets 0; late packets 175; too late packets 0; resend requests 154; min DAC queue size 21422, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:03:02 2015 daemon.info shairport-sync[969]: Sync error: -70.7 (frames); net correction: 87.7 (ppm); corrections: 87.7 (ppm); missing packets 0; late packets 201; too late packets 0; resend requests 180; min DAC queue size 21172, min and max buffer occupancy 205 and 220.
...
Sun Jun 21 17:10:02 2015 daemon.info shairport-sync[969]: Sync error: -70.7 (frames); net correction: 83.9 (ppm); corrections: 83.9 (ppm); missing packets 0; late packets 446; too late packets 0; resend requests 421; min DAC queue size 21225, min and max buffer occupancy 206 and 220.
Sun Jun 21 17:10:32 2015 daemon.info shairport-sync[969]: Sync error: -70.0 (frames); net correction: 80.9 (ppm); corrections: 80.9 (ppm); missing packets 0; late packets 447; too late packets 0; resend requests 422; min DAC queue size 21479, min and max buffer occupancy 205 and 220.
Sun Jun 21 17:11:02 2015 daemon.info shairport-sync[969]: Sync error: -70.3 (frames); net correction: 81.6 (ppm); corrections: 81.6 (ppm); missing packets 0; late packets 451; too late packets 0; resend requests 426; min DAC queue size 21491, min and max buffer occupancy 205 and 220.

The TP-Link is in a marginal enough location for WiFi, but isn't great even in a good location. Also, BTW, it doesn't have the horsepower to do soxr-based stuffing.

So I think these are pretty typical. What kind of device etc. are you using?

@joerg-krause
Copy link
Author

Many thanks for running the tests!

I'm using an iPhone 4s to base station to an ARMv5 device (450 MHz) with an on-board DAC. Maybe the device is to low-powered for stuffing?

I played with the buffer_size parameter. If I set it lower then 5000, as expected, no stuffing is done at all, so no crackles are audible (of course). On the downside the drift increases and I'm loosing synchronization resulting in a large sync error after some minutes.

If I set the buffer_size a little bit higher then 5000, some stuffing is done but the drift still increases over time. Some crackles, but almost not audible. After two minutes the devices get out of sync and shairport does resync.

If I set the buffer_size higher then 5600, enough stuffing is done to keep the devices synchronized, but the crackles are audible very clearly for a sine wave and also for some music tracks with quiet parts.

Increasing the tolerance does not really avoids the necessity of stuffing. It just delays it and the stuffing is done somehow later. Increasing the tolerance would only make sense for me if there is the possibility that the drift would decrease at some point of time instead of always increasing.

I wonder if the audible crackles can be avoided in my case and how to debug the system to find the bottleneck. Do you hear crackles when streaming a sine wave with 0dB from your iPhone to the Pi and stuffing is done?

And I do not understand why inserting or deleting a frame into the buffer can cause a crackle. Or the volume change? If read in the thread you linked that @yenchee1970 had the same problem: abrasive#296 (comment).

He also wrote that soxr helped, if set to low quality for the Pi: abrasive#296 (comment). I tested soxr stuffing, but it does not help for me. I guess my device is not powerful enough. What is the setting of soxr use by shairport-sync. Can the quality by adjusted?

@mikebrady
Copy link
Owner

Can you share a few lines of the statistics output? If a resynchronisation is being triggered in two minutes, it implies a net correction of about 400 ppm, which is very large, and would cause a good deal of stuffing or stripping and hence, on a 1 kHz sine wave, an audible hiss or crackle. It would also imply that the device's clock is really inaccurate.

The noise seems to be unavoidable (see the graphs on that thread) – hence the move to soxr-based resampling, but with normal material and under normal listening conditions the noise is inaudible; that's what they seem to conclude in that thread, and is certainly my own experience.

Offhand, I'd guess that the device is powerful enough for basic stuffing or stripping. For soxr-based operation, a lot depends on whether it has floating point hardware and whether it's supported by the compiler. My guess is that it's very unlikely. The processor in the TP-Link 710N above is a 400MHz device and is incapable of soxr-based operation.

I did not explore any of the 'simpler' modes of operation of the soxr library.

@joerg-krause
Copy link
Author

The is a statistics about two and a half minute from iPhone 4s to base station to shairport device with tolerance set to 882:

Jun 23 08:10:15 kitchen daemon.info shairport-sync[562]: Sync error: -23.9 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 9; too late packets 0; resend requests 5; min DAC queue size 4143, min and max buffer occupancy 178 and
Jun 23 08:10:45 kitchen daemon.info shairport-sync[562]: Sync error: 536.6 (frames); net correction: 0.0 (ppm); corrections: 0.0 (ppm); missing packets 0; late packets 17; too late packets 0; resend requests 13; min DAC queue size 5867, min and max buffer occupancy 177 a
Jun 23 08:11:15 kitchen daemon.info shairport-sync[562]: Sync error: 874.4 (frames); net correction: -395.4 (ppm); corrections: 395.4 (ppm); missing packets 0; late packets 26; too late packets 0; resend requests 19; min DAC queue size 6455, min and max buffer occupancy 
Jun 23 08:11:45 kitchen daemon.info shairport-sync[562]: Sync error: 880.3 (frames); net correction: -438.5 (ppm); corrections: 438.5 (ppm); missing packets 0; late packets 35; too late packets 0; resend requests 25; min DAC queue size 6235, min and max buffer occupancy 
Jun 23 08:12:15 kitchen daemon.info shairport-sync[562]: Sync error: 878.1 (frames); net correction: -413.5 (ppm); corrections: 413.5 (ppm); missing packets 0; late packets 45; too late packets 0; resend requests 31; min DAC queue size 6247, min and max buffer occupancy 
Jun 23 08:12:45 kitchen daemon.info shairport-sync[562]: Sync error: 873.8 (frames); net correction: -437.7 (ppm); corrections: 437.7 (ppm); missing packets 0; late packets 65; too late packets 0; resend requests 39; min DAC queue size 6320, min and max buffer occupancy

I will certainly check if the system clock can be improved. Many thanks for the hint! Maybe the tool adjtimex will help...

@yenchee1970 did some work on adding an option to set the soxr quality: yenchee1970@6d97953. The default value used by soxr is 'Low quality'. Five standard qualities are defined in soxr: http://sourceforge.net/p/soxr/code/ci/master/tree/src/soxr.h#l281. Maybe we can add such an option to shairport-sync. Certainly, it will not help me on my low-powered device, but it maybe useful for high-powered device to improve resampling.

I've also learned that the alsa parameter period_size and buffer_size are important for AirPlay.
@yenchee1970 did some work on this subject, too: yenchee1970@5fb2d97. However, I'm not sure how to determine the best values for AirPlay. Maybe it's worth to take a look closer?

@mikebrady
Copy link
Owner

Thanks. So the clock is a long way off alright. I don't think adjtimex will alter the clock frequency.

I'll have a look at the quality settings, but not too soon :)

Regarding period size and buffer size, I did experiment with these early on, and came to the conclusion that they shouldn't be messed with, especially as the likely optimal parameters will change from device to device. In practice, this seems to have been the right thing to do.

@joerg-krause
Copy link
Author

You're right with adjtimex - it's for another use case.

Can you explain the calculation of the actual delay, please?
https://github.com/mikebrady/shairport-sync/blob/development/player.c#L986.

I'm still looking for the cause why the delay increases over time (without stuffing and resync). The systems crytal clock should not drift so much.

@joerg-krause
Copy link
Author

I did some further investigations. I transposed the terms of calculating the delay to

int64_t delay = td_in_frames + current_delay + rt - nt;

and plotted the three termes td_in_frames, current_delay and rt - nt to see which of these is statically increasing over time. This is the plot (sorry, it's very large-sized):
https://www.dropbox.com/s/yha6p1yx14qqnpn/shairport-delay.pdf?dl=0

So td_in_frames is constant over time, which means the system clock does not drift significantly from the remote NTP timestamp. The current_delay increases at the beginning near to the value of alsa's buffer_size and then stays also constant. What increases over time is rt - nt which is, as far as I understand, the delay between the reference timestamp and the timestamp of the current audio packet, right? But why?

@mikebrady
Copy link
Owner

This is good work, thanks; it's something I haven't done. Let me think about it for a couple of days.

@joerg-krause
Copy link
Author

I repeated the test with a MacBook on ethernet and the iPhone on wifi with the WHAALE app (instead of iTunes). Both are showing similar results.

I'll think about it, too. Many thanks so far!

@mikebrady
Copy link
Owner

You must tell me how you generated the data and what tools you used to plot it. Anyway, here are my thoughts:

  • Shairport Sync (SS) sends a timing request every three seconds back to the source. It uses this and the reply to calculate the local_to_remote_time_difference in rtp.c. (If we plotted this figure, it would show the raw drift between the source and local (SS) clock.)
  • Every second or so, the source sends a control packet containing a packet’s timestamp, which is the frame number of the first frame, and the time it should play according to the source’s clock, assuming no delay. We use the local_to_remote_time_difference (which is adjusted every three seconds for drift) to turn that into the time at which that frame should play according to SS’s local clock. This is what reference_timestamp and reference_timestamp_time are. Thus, the reference_timestamp and reference_timestamp_timegives us the exact local time, allowing for drift, at which the frame bearing the reference_timestamp should be the current frame.
  • td is the difference between the time now and the reference timestamp time. By adding the number of frames in td, i.e. td_in_frames, to the reference_timestamp, we get the index number of the frame that should be output right now, assuming no delay, correctly synchronised to the source time. Thus the subexpression td_in_frames + rt gives us the frame number of what should be the current frame if there was no delay – call this the zero-delay current frame.
  • nt is the frame number of the first frame in SS’s buffers and current_delay is the number of frames in the audio backend’s buffer (usually this is the ALSA DAC’s buffer). The subexpression nt - current_delay gives the index number of the actual current frame.
  • The difference between the actual current frame number and the zero-delay current frame number gives us the actual delay, in frames. This delay is composed of the desired latency and an error term, positive or negative.
  • Thus, if you plot zero-delay frame number - actual frame number, i.e. td_in_frames + rt - (nt - current_delay) you should see a drifting trend starting close to the desired latency.
  • If the local clock is running slow, frames will be taken more slowly from the DAC buffer than they are put in, so the buffer will grow and current_delay will drift up. Conversely, the buffer will contract if frames are being taken out faster than they are being put in; current_delay will drift down. The sync_error is used to determine what to do – stuff, strip or nothing at all.

@mikebrady
Copy link
Owner

OK, so I've just pushed an update in development which outputs the drift in parts per billion. A positive figure indicates the SS clock is running faster than the source clock, so stuffing would have to be done, as the DAC would be depleting the buffer faster than frames were arriving from the source.

The drift does not quite correspond with the net corrections figure produced as part of the statistics – they should be the same, give or take. I don't know why this is – maybe it's a simple calculation error, but it needs investigation.

@joerg-krause
Copy link
Author

Many thanks for the explanation! I will try to fully understand the technical background.

I added the following line to the code after this line https://github.com/mikebrady/shairport-sync/blob/development/player.c#L986.

debug(1, "%lld %lld %lld", td_in_frames, current_delay, rt - nt);

I'm using gnuplot for plotting. To ease the plotting I only dump values, no text description or something else.

In the conf file I changed the following settings (no stuffing, no resync):

drift = 65535
log_verbosity = 1
resync_threshold = 0

I'm starting shairport-sync in non-daemon mode and copy the debug output after some minutes to a file, eg debug-delay.data:

5831 4511 89056
5952 4751 88704
6165 4885 88352
[...]

Starting gnuplot:

plot for [col=1:3] 'debug-delay.data' using col

So I will make a plot of local_to_remote_time_difference and the parts-per-billion drift.

@mikebrady
Copy link
Owner

Very useful, thanks.

@mikebrady
Copy link
Owner

I need to modify the stuff I pushed yesterday to make it output raw data suitable for plotting.

@joerg-krause
Copy link
Author

Thanks!

I did a plot of the clock offset, the local_to_remote_time_difference and the buffer_delay from alsa.

For this I added the following code snippet into rtp.c https://github.com/mikebrady/shairport-sync/blob/development/rtp.c#L402:

int64_t current_delay = config.output->delay();
uint64_t offset = ((distant_receive_time - departure_time) + (distant_transmit_time - arrival_time)) >> 1;
debug(1, "%llu %llu %lld", offset, local_to_remote_time_difference, current_delay);

https://www.dropbox.com/s/gw4xur95jf7zaxs/offset-local_to_remote_time_difference-alsa_delay.pdf?dl=0

The first graph shows the offset, the second the local_to_remote_time_difference and the third current_delay. I added current_delay to see if it somehow correlates to the others.

The x-axes represents the time in seconds where one interval represents three seconds (the timining interval). So the first 3*60 seconds the offset is decreasing, the for around 60 seconds it's increasing fast, after this it is increasing constantly, but slower.

What do you think about this?

@joerg-krause
Copy link
Author

Two obtain the raw data I changed the stuff from yesterday to:

      uint64_t clock_drift;
      if (first_local_to_remote_time_difference>=local_to_remote_time_difference)  {
        clock_drift = (((first_local_to_remote_time_difference - local_to_remote_time_difference) * 1000000000) / (get_absolute_time_in_fp() - first_local_to_remote_time_difference_time));
        //debug(1,"Clock drift is -%lld ppb.",clock_drift);
      }
      else {
        clock_drift = (((local_to_remote_time_difference - first_local_to_remote_time_difference) * 1000000000) / (get_absolute_time_in_fp() - first_local_to_remote_time_difference_time));
        //debug(1,"Clock drift is %lld ppb.",clock_drift);
      }
      debug(1, "%llu", clock_drift);

This is the corresponding plot:
https://www.dropbox.com/s/22lfvnw4drzt2aq/clock_drift.pdf?dl=0

The same plot with truncated y-axes:
https://www.dropbox.com/s/dt4o5vim1bbk8t1/clock_drift_truncated.pdf?dl=0

y-axes shows ppb, y-axes shows time with an interval of 3 seconds (timing interval).

@mikebrady
Copy link
Owner

Thanks for all these fantastic plots – they are very rich in terms of visualisations, and hopefully will hep in understanding what's happening.

A couple of things. First, the first local_to_remote_time_difference values are obtained at half-second intervals, and thereafter at three second intervals. (I think this is a trivial issue, not worth bothering about.)
Second, there's a slight error in your code. Whether the drift is positive or negative, you output it as positive.

Third, the last plot is of clock drift averaged over the entire interval, so I think it probably doesn't tell us enough about how the drift is evolving. If the code you cited above was replaced by this:

     uint64_t clock_drift;
      if (first_local_to_remote_time_difference>=local_to_remote_time_difference)  {
        clock_drift = ((first_local_to_remote_time_difference - local_to_remote_time_difference) * 1000000)>>32;
        debug(1, "-%llu", clock_drift);
        //debug(1,"Clock drift is -%lld usec.",clock_drift);
      }
      else {
        clock_drift = ((local_to_remote_time_difference - first_local_to_remote_time_difference) * 1000000) >>32;
        //debug(1,"Clock drift is %lld usec.",clock_drift);
        debug(1, "%llu", clock_drift);
      } 

it would fix that slight error and give the drift in microseconds, unaveraged.

@mikebrady
Copy link
Owner

Sorry – didn't mean to close the issue!

@mikebrady
Copy link
Owner

Nothing wrong with either of those parts :)

About the timestamps, I never thought about it too much, but I imagine that communication with each extra airplay device would be handled by separate threads, each going about its business by itself. I'd have been a bit surprised if they were the same, frankly.

@mikebrady
Copy link
Owner

Full of silly questions, sorry, but here's another: does your CPU go into a low-power mode or anything of that nature?

@joerg-krause
Copy link
Author

Full of silly questions, sorry, but here's another: does your CPU go into a low-power mode or anything of that nature?

No, any low-power features are disabled in the Kernel config:

# CONFIG_PM is not set

@joerg-krause
Copy link
Author

Yeah!!!

I have it! The divider of the SAIF clock is without software interaction not precise enough for synchronisation. A guy from the Freescale community have had the same problem. Fortunalty he shared his solution just today :-)
https://community.freescale.com/thread/357691

I will post a graph after some tests later.

@mikebrady Many thanks for all your help and work! It helped me really a lot to understand the implementation and the synchronization part. It is indeed a lot of good work you've done!

I hope the work done by @mikebrady will help others to quickly identify audio problems because of bad clock synchronization!

[Edit: Added missing link to the solution.]

@joerg-krause
Copy link
Author

One quick question: What drift value did you use for the last test #89 (comment)?

@mikebrady
Copy link
Owner

It was left as the default – 88 frames. Looking forward to seeing some graphs :)

@mikebrady
Copy link
Owner

By the way, thanks are due to you, @joerg-krause, for providing the setting, motivation and a good deal of help in understanding the synchronisation system more completely.

@joerg-krause
Copy link
Author

I applied the patches and rerun the test two times. It looks much better now compared to my last test. With one exception:

clock-adjusted

After around 145 timing packets a lot of corrections are done. Why? This was the same in both tests.

@mikebrady
Copy link
Owner

That looks somewhat better alright. Can you plot the sum of Clock Drift + Source Drift + Corrections? That should tend to zero, but allowing for some slop due to the drift figure. I'm just wondering if the error begins to cross the threshold at around 142.

@joerg-krause
Copy link
Author

I removed DAC delay in favour of Net (= Clock Drift + Source Drift + Corrections):

clock-adjusted-with-net

Just for clarification: all parameters are set to default (eg drift is 88 frames).

I'm just wondering if the error begins to cross the threshold at around 142

Which error?

@joerg-krause
Copy link
Author

Ok, I enabled debug output for the Linux audio driver to see when the frequency is adjusted. In the current test I can see that around the frequency is adjusted (faster) and the number fo corrections rapidly is growing. After 52 seconds the frequency is adjusted to run slower (it is the same value as before the adjustment).

In the current running test the number of corrections before the adjustment was done was 0!

@mikebrady
Copy link
Owner

The "error" I meant was just the latency error calculation. But that would not account for the duration of the change in the rate of corrections – there seems to be some kind of hysteresis effect working there, but I don't know where it could be coming from. Trying to understand the patches for that clock problem on the Freescale i.MX28 CPU, is it possible that the dividing ratios are being changed from time to time? Also, that clock drift figure is still rough – it might be a further clue...

@mikebrady
Copy link
Owner

Ah – our posts crossed. So my guess might be right?

@joerg-krause
Copy link
Author

Yes, it is. Every 420.1 seconds the frequency is adjusted for 52 seconds to maintain audio synchronization within +/-20ms of sample rate.

I will run a test now with setting the drift to 882 (= 20ms).

@mikebrady
Copy link
Owner

That looks like it then. The first 420 seconds is around 144 samples. The rapid change lasts to around sample 161, thus 15 samples or 45 seconds. It's plausible.

@joerg-krause
Copy link
Author

This is the graph with MCLK frequency adjustment. Drift is set to 882:

frequency-adjustment

You can see clearly that every 420 seconds the frequency (the clock is slower then the desired clock) is adjusted for 52 seconds (clock runs faster then the desired clock).

What do you think about this?

[Edit]: The tests lasts about 3000 seconds.

@mikebrady
Copy link
Owner

That graph looks exactly as I would expect. The clock drift still looks rough. What does it sound like, I wonder? I think it should sound pretty okay, though you might hear bursts of crackle every 420 seconds...

@joerg-krause
Copy link
Author

Sounds much better now! Far less crackles.

This is a graph a made when I measured the roundtrip time. Maybe that's why the clock drift looks so rough?

roundtrip_time

@mikebrady
Copy link
Owner

Yep. The filter should be discarding the outliers, but there is still plenty of variation below 10,000 (10 milliseconds?). Do you have any ideas about the round-trip time variations yourself?

@joerg-krause
Copy link
Author

Yes, 10 ms.

Do you have any ideas about the round-trip time variations yourself?

Not sure about this. Maybe some wifi driver issues? Do you think it's worth to look at this?

@mikebrady
Copy link
Owner

Off the top of my head, yes, I do – something is going on there. I guess I should do a few tests myself here to compare it with, but my vague memory is that the times were in the low single milliseconds, often much lower.

@joerg-krause
Copy link
Author

Okay, I'll have a look at this tomorrow. Many thanks so far!

@mikebrady
Copy link
Owner

Here's an example on a TP-Link 710N using its built-in WiFi here at home. Seems to be just over 2 ms on average, and pretty even.

turnaround time tp-link 710

@joerg-krause
Copy link
Author

Thanks for the test! I run several tests om my own, but did not succeed in getting an average ping time under 5ms. I will discuss this on the linux-wireless mailing list, as it is not shairport specific.

Nevertheless, I will post any results here.

@mikebrady
Copy link
Owner

Thanks. We might be able to close this issue soon, I'm thinking.

@joerg-krause
Copy link
Author

I close this thread until any new breakthrough discoveries are made.

Again, @mikebrady many thanks for all the work and investigations!

@mikebrady
Copy link
Owner

Thanks – it was very interesting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants