Kronos dependency triggers a Local Network Permission alert on iOS >= 14 #647

Open
thibauddavid opened this issue Oct 22, 2021 · 44 comments

@thibauddavid

Hi,

We are experiencing an issue with your SDK.
It relies on Kronos, which has an open issue (MobileNativeFoundation/Kronos#94) about a local network permission popup being shown.

Do you have any plans to mitigate this?

Thanks

@buranmert buranmert self-assigned this Oct 27, 2021
@buranmert
Contributor

hi @thibauddavid 👋
thanks for the heads-up!

i can reproduce the issue only with local NTP pools, such as 127.0.0.1 or 127.255.255.255:

Clock.sync(from: "127.0.0.1", ...) { ... }

otherwise i can't reproduce the issue and Kronos source code doesn't seem to try to access local network by itself.

can you please make sure that your NTP pool isn't in your local network by any chance?

@buranmert buranmert added the awaiting response Waiting for response / confirmation from the reporter label Oct 27, 2021
@thibauddavid
Author

Hi @buranmert , thanks for answering.
I've double-checked, and I'm using the default DataDog configuration, which itself uses one of these NTP servers:

    static let datadogNTPServers = [
        "0.datadog.pool.ntp.org",
        "1.datadog.pool.ntp.org",
        "2.datadog.pool.ntp.org",
        "3.datadog.pool.ntp.org"
    ]

So I don't understand why we get this popup.
Maybe Kronos somewhere makes a call using its default pool (time.apple.com), which the OS might somehow resolve to a local web server?

@buranmert
Contributor

could there be anything else in your project which triggers this popup?
can you please try Datadog SDK in a newly created project which does basically nothing else than initializing Datadog SDK?
i'm trying such a project and can't get the popup unless i use 127.0.0.1 as NTP pool address

@thibauddavid
Author

I'll have a look, but according to MobileNativeFoundation/Kronos#94 it really seems to be a problem with Kronos.
For now, I don't know why you can only trigger the popup by setting your pool to a local IP, while we seem to get it with the default one.

@buranmert
Contributor

hi again @thibauddavid 👋
it's been a while and i was wondering if you had a chance to reproduce the issue in a newly created project?

@thibauddavid
Author

I've been quite busy, I'll have a look as soon as I can !

@sergiocampama

I'm seeing a similar thing after adding the DataDog SDK to my app. It's not reproducible though, we've just gotten some random reports from some users asking why we need to scan the local network when the app shouldn't.

Would it be possible to add a config to disable clock syncing? We don't really care that much about the logs being on an exact time. We segment them by user anyways, so they should be relative to each other.

@buranmert
Contributor

@sergiocampama is it possible to ask those users for more info?
for example: did that happen on WiFi or mobile data? do they use specific DNS servers? were they in a private network, like a workplace network, etc?

i still can't reproduce the issue although i tested scenarios such as:
• mobile data and WiFi
• using my router's local IP (192.168.X.X) as DNS server (in order to make a DNS request to a local IP)
• bad connection
• inaccessible NTP server (e.g: foo.bar.ntp)
i can only reproduce the issue if i use a local IP or an address with .local extension as NTP server.

please let me know if you suspect anything else.
syncing time is essential for most of our users and we'd like to find the root cause and fix it if possible.

@sergiocampama

I'm asking a few users for this data, but they might not be savvy enough to know this. Still, an API to disable clock sync would be ideal for us. Our logs are user-relative, so aligning timestamps across all users is not really worth it for us. Is this something that could be done? More than happy to send a patch if yes.

@buranmert
Contributor

that would be great if we could have more info from those users @sergiocampama 🙏

meanwhile,
• we will be filing a radar with Apple, although there are quite a few unknowns in this case.
• we might try a different NTP implementation, but since we can't reproduce the issue, we can't validate that any new implementation solves the problem.

if that's okay for you too, you can fork our repo and patch it according to your needs.
it should be straightforward to use your fork with Swift Package Manager/Carthage.

please let me know if that helps

@AJ9

AJ9 commented Dec 22, 2021

@buranmert we've just integrated DataDog into our app and, like @sergiocampama, are getting user reports of this popup. We're not able to replicate this on our side in any way though. Did you get any further?

@buranmert
Contributor

hi @AJ9 , do you have any kind of more info about these reports by any chance? any possible patterns, a certain iOS version, device model, connection type, etc.?

to be honest, unfortunately we haven't got any further yet as we still don't have data around the issue and we still can't reproduce it.

@AJ9

AJ9 commented Dec 22, 2021

Hey 👋, information is quite minimal, but this is what we do know:

• iPhone 12 Pro
• iOS 15.1
• Wi-Fi (but quite poor Wi-Fi, possibly with connection drops?)
• Seems to have happened on the first launch of the new app version with DD in it
• No special phone setup (no VPN, custom DNS settings, etc.)

We also can't reproduce this, even whilst setting up similar conditions, it certainly isn't happening for all users. We're going to keep exploring the drop in connection problem as that seems the most likely cause...

@ncreated
Collaborator

ncreated commented Jan 7, 2022

👋 Hey all! We're actively working on tracking this issue ☝️. Unfortunately, we couldn't manage to reproduce it locally, and our efforts led to the conclusion that this might be occurring very rarely, in some very specific and flaky network circumstances.

To move our investigation forward, in #709 we're going to add extra monitoring to collect telemetry from Kronos internal execution in our dogfood 🐶 projects. Hopefully, this will let us gather enough data from production environments to nail down the problem.

NTP time synchronisation plays an important role in our SDK: it is there to guarantee that all collected telemetry is in sync (both on the client and when propagated to the backend in Datadog Distributed Tracing). Hence we're making every effort to mitigate this issue.

@buranmert buranmert removed the awaiting response Waiting for response / confirmation from the reporter label Jan 7, 2022
@leearmstrong

Hey all, have you managed to get anything from the logging at all?

@ncreated
Collaborator

Hello @leearmstrong 👋. So far we have collected ~1K samples from our dogfood projects (mainly our iOS app in the beta channel). None of the ~3.7K resolved IPs led to a local network connection (checked with the NWConnection API). In our static analysis, not a single resolved IPv4 address fell within a local network range. So far we see no evidence of faulty behaviour in this logic, but the number of samples is still pretty low; we expect to receive many more soon.

@leearmstrong

Hey @ncreated,

I had this while on a cellular network connected to a VPN. I am still digging through the real cause myself but wondered if you had managed to capture any info either?

@eseay

eseay commented Apr 23, 2022

@ncreated Any update on this issue? We are having users report the issue, and a few of our internal team members have seen it as well.

Having our app inexplicably ask customers for obscure permissions erodes trust and is overall detrimental to our product's image and the user experience.

@ncreated
Collaborator

Hey @eseay 👋. Although we haven't managed to reproduce this issue (also with a VPN, as pointed out by @leearmstrong), we recently got one hit in the telemetry we collect in our own products. We now have clear evidence that (in some circumstances) Kronos can try to query a private IP during its NTP sync, bringing up the Local Network Permission alert.

We have already discussed a mitigation plan for this problem and it is scheduled in our backlog 👍 with high priority - we will add IP range filtering to prevent Kronos from connecting to private IPs.


To leave more context for other folks coming across this issue: in #709 we added monitoring to several phases of Kronos execution, including DNS resolution with CFHostStartInfoResolution(_, .addresses, _). We used the NWConnection API to pre-check each IP and inspect its NWPath.Status.

We got one hit reporting NWPath.UnsatisfiedReason.localNetworkDenied on connecting to private IP (Class C):

Screenshot 2022-04-25 at 12 24 20

One log is not much (versus tens of thousands of successful syncs), but it proves that Kronos can indeed reach a private IP and bring up the Local Network Permission alert. It also means that IP range filtering is worth adding.

In the same log we can also see that .other was listed as available network interface - which may suggest VPN connection. However, in my tests I couldn't reproduce this problem even if using VPN.

Screenshot 2022-04-25 at 12 24 40
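
For readers landing here, the "IP range filtering" idea mentioned above can be pictured roughly as follows. This is only a minimal sketch, not the check that shipped in the SDK: `ipv4Octets` and `isPrivateIPv4` are hypothetical names, the sketch covers only the RFC 1918 private ranges plus loopback and link-local IPv4, and it ignores IPv6 entirely.

```swift
/// Hypothetical helper: parses a dotted-quad IPv4 string into its four octets,
/// or returns nil if the string is not a well-formed IPv4 address.
func ipv4Octets(_ address: String) -> [UInt8]? {
    let parts = address.split(separator: ".", omittingEmptySubsequences: false)
    guard parts.count == 4 else { return nil }
    let octets = parts.compactMap { UInt8($0) }
    return octets.count == 4 ? octets : nil
}

/// Hypothetical helper: true for RFC 1918 private ranges, loopback and
/// link-local IPv4 addresses. Anything that is not IPv4 returns false.
func isPrivateIPv4(_ address: String) -> Bool {
    guard let o = ipv4Octets(address) else { return false }
    switch (o[0], o[1]) {
    case (10, _): return true           // 10.0.0.0/8
    case (172, 16...31): return true    // 172.16.0.0/12
    case (192, 168): return true        // 192.168.0.0/16
    case (127, _): return true          // 127.0.0.0/8 (loopback)
    case (169, 254): return true        // 169.254.0.0/16 (link-local)
    default: return false
    }
}
```

With this sketch, `isPrivateIPv4("192.168.1.10")` is true and `isPrivateIPv4("82.64.172.48")` is false; a resolver could skip any address for which it returns true.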

@ncreated
Collaborator

Closing - fixed in 1.11.0. Thanks to everyone who helped nail down this problem 🍻🙂. We hope it's gone for good.

@victor-yn

victor-yn commented Jun 22, 2022

Hello @ncreated ! Coming with bad news here :(

We started implementing Datadog a few weeks ago and shipped 1.10.0 to production. We then started receiving internal reports of the local network permission popup being shown at app launch. I could not reproduce it on my side.

I came across this GitHub issue and updated our SDK to 1.11.0 - however, we just received an internal report of that local network permission being triggered on 1.11.0. The user was on 4G.

Is there anything I can do to help the investigations? I can have access to the device in question if needed

@Arlindo-g

I've also just updated to 1.11.0 and saw the prompt again.

@ncreated
Collaborator

Hello 👋🥲 thanks for the reports, folks.

There are two things that come to my mind for now:

  • The private IP check we added in RUMM-2153 Prevent Kronos logic from querying private IPs #830 is too weak and it might be missing some cases. To validate this, we'd need to know the IP Kronos is trying to hit (@victor-yn is it possible to know it?). Also, an educated guess about which IPv4/IPv6 ranges we might be missing there would be very helpful (of course, I will check it once again).
  • @victor-yn @Arlindo-g are you sure the issue is caused by Datadog SDK? I know it's hard to prove and I'm not rejecting it - rather doing sanity check and asking for more context 🙂

@victor-yn

victor-yn commented Jun 24, 2022

Hello @ncreated!

Yes, we are 100% sure this is caused by Datadog SDK, because the app release we did only contained the SDK implementation and network permission prompt reports started on that specific app version. This never happened before.

> To validate this, we'd need to know the IP Kronos is trying to hit (@victor-yn is it possible to know it?).

I can have access to the device on Monday so I can take a look. Could you tell me where to look at? (e.g. where to put a breakpoint/print, etc.)

I'm not familiar with Kronos, I would appreciate help here to avoid spending time on research 🙏

@ncreated
Collaborator

ncreated commented Jun 27, 2022

> Yes, we are 100% sure this is caused by Datadog SDK, because the app release we did only contained the SDK implementation and network permission prompt reports started on that specific app version. This never happened before.

Very clear, got it 👍👌.

> I can have access to the device on Monday so I can take a look. Could you tell me where to look at? (e.g. where to put a breakpoint/print, etc.)

The Kronos logic which resolves IPs to perform NTP queries goes through these lines:

var isPrivate: Bool {
    guard let host = host else {
        return false
    }

We decide whether a given host is a private IP (stop) or not (go). @victor-yn if you manage to capture the host value on a breakpoint before it brings up the Local Network Permission prompt, that would help a lot (💯🎯).

I'm opening this issue again for more visibility.

@ncreated ncreated reopened this Jun 27, 2022
@victor-yn

victor-yn commented Jun 28, 2022

@ncreated thanks a lot for the detailed explanation, very much appreciated. 🙌

I had access to the internal device that could reproduce this issue on every app installation; here are the findings:

I can not capture the exact host value on a breakpoint before it brings the Local Network Permission prompt as suggested, as

is the method that triggers the prompt, so I listed all the host values that were evaluated and determined not to be private.

I ran the experiment 4 times (uninstall + reinstall, get the host prints before it shows the prompt).

Host values, by printing host property

| Try 1 | Try 2 | Try 3 | Try 4 |
| --- | --- | --- | --- |
| 64:ff9b::5be0:9529 | 2a05:f480:1400:53d::123 | 64:ff9b::a29f:c801 | 64:ff9b::a29f:c801 |
| 64:ff9b::253b:3f7d | 2a05:f480:2000:1834::123 | 64:ff9b::c2b1:2274 | 64:ff9b::c2b1:2274 |
| 64:ff9b::5cf3:605 | 2001:41d0:305:2100::3f3e | 64:ff9b::33c3:7585 | 64:ff9b::33c3:7585 |
| 64:ff9b::d453:9e53 | 2001:41d0:8:7a7d::1 | 64:ff9b::5cde:7573 | 64:ff9b::5cde:7573 |
| 82.64.172.48 | 178.170.37.31 | 62.210.244.146 | 62.210.244.146 |
| missing info | 188.165.236.162 | 193.200.43.105 | 193.200.43.105 |
| 51.15.175.180 | 95.81.173.74 | 51.195.117.133 | 51.195.117.133 |
| 151.80.211.8 | 92.222.117.115 | 51.75.17.219 | 51.75.17.219 |

So at this point:

resolver.completion?(IPs)
retainedSelf.release()

IPs is an array of 8 KronosInternetAddress, each of them representing one of the values listed above. Calling resolver.completion?(IPs) does trigger the Local Network prompt.

Does that help? I can generate more tries if needed, or please feel free to guide me if I need to look for something else 🙏

@ncreated
Collaborator

@victor-yn thanks a lot! It's great data - I will analyse it tomorrow. To clarify - do you have a device at your disposal that reproduces this problem easily 😲? What is the iOS version and device model? Does it use Wi-Fi or 3G / LTE / some other kind of connection (hotspot? VPN?) when the problem occurs?

@victor-yn

victor-yn commented Jun 29, 2022

@ncreated Yes! We do have an internal user who manages to reproduce it every time. He's using an iPhone XS & iOS 15.4.1. He's always on 4G, no VPN or hotspot used. Let me know how it goes and if you need more information! 🙏

@ncreated
Collaborator

@victor-yn I looked at these IPs and I wonder whether the problem comes from the 64:ff9b prefix in the IPv6 addresses. It appears in 12 of the IPs you listed and in 3 of the 4 attempts you made. After more digging, I found that 64:ff9b::/96 is defined in RFC 6052 as the Well-Known Prefix reserved for its IPv4/IPv6 address translation algorithm.

Would it be possible for you to test this assumption with more attempts? If, with a few more, we see that having 64:ff9b in the list brings up the LNP prompt and its absence doesn't, that would let me dig deeper into this.

Also, I tried hitting exactly the IPs you listed from my iPhone (on LTE and Wi-Fi) and it didn't bring up the alert 🧩.
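
As background for the 64:ff9b lead: under the RFC 6052 Well-Known Prefix, the last 32 bits of the IPv6 address carry an ordinary IPv4 address, so each 64:ff9b entry in the list corresponds to some public IPv4 host. The sketch below decodes that embedded address. It is only an illustration: `embeddedIPv4(inNAT64:)` is a hypothetical helper, and it handles only the compressed textual form seen in this thread (for example `64:ff9b::5be0:9529`).

```swift
/// Hypothetical helper: extracts the IPv4 address embedded in a `64:ff9b::/96`
/// address written in compressed form, e.g. "64:ff9b::5be0:9529".
/// Returns nil for any other input.
func embeddedIPv4(inNAT64 address: String) -> String? {
    let prefix = "64:ff9b::"
    guard address.lowercased().hasPrefix(prefix) else { return nil }

    // After the prefix there are one or two 16-bit hex groups (leading zero
    // groups may be compressed away).
    let groups = address.dropFirst(prefix.count).split(separator: ":")
    guard (1...2).contains(groups.count) else { return nil }

    var value: UInt32 = 0
    for group in groups {
        guard group.count <= 4, let word = UInt16(group, radix: 16) else { return nil }
        value = (value << 16) | UInt32(word)
    }

    // Split the 32-bit value into four octets, most significant first.
    let octets = [value >> 24, value >> 16, value >> 8, value].map { String($0 & 0xFF) }
    return octets.joined(separator: ".")
}
```

For example, `embeddedIPv4(inNAT64: "64:ff9b::5be0:9529")` yields `"91.224.149.41"`, while a native IPv6 address such as `2001:41d0:8:7a7d::1` yields nil.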

@victor-yn

@ncreated I'm coming with bad news again 😬

As requested, I had access to the device in question this week and... we could not reproduce the prompt anymore 😭

I confirmed with the owner of the device, nothing special was done on it. Conditions were the same: 4G, no hotspot or VPN, the device did not restart, we used the exact same version of the app.

The exact same steps that produced the local network prompt on every try the day before no longer worked. That surprised the device owner as well, since he could always reproduce it before and now it's just magically gone.

I still got the logs out of it:

Host values, by printing host property

| Try 1 | Try 2 |
| --- | --- |
| 64:ff9b::c39a:aed1 | 2a00:1080:800::6:1 |
| 64:ff9b::3ed2:f492 | 2606:4700:f1::123 |
| 64:ff9b::330f:bfef | 2001:41d0:e:119e::1 |
| 64:ff9b::b220:de1d | 2001:41d0:8:4d0d::1 |
| 176.137.36.37 | 193.107.56.66 |
| 82.64.45.50 | 31.170.8.123 |
| 178.32.222.29 | 51.15.175.180 |
| 82.64.84.116 | 51.178.43.227 |

Neither try led to the local network prompt, so I assume that the 64:ff9b lead is unfortunately not the right one.

I'll keep watching for reports, and try to see if I can find a pattern; but the fact that we can no longer reproduce it does not reassure me because the issue is unfortunately still there

@eseay

eseay commented Jul 5, 2022

Has the option of just not relying on this extra dependency been considered? I understand that it was obviously added for some meaningful purpose, but if it is behaving in ways that we are unable to fully understand, I'd say that it's better to do without.

As I mentioned in an earlier comment, especially as privacy continues to be a greater concern among the general public, having an app inexplicably ask for access to a customer's local network erodes trust and is overall detrimental. The drawback of delivering that customer experience far outweighs any kind of benefit we may perceivably get from logging and tracing.

@xgouchet
Contributor

xgouchet commented Jul 7, 2022

Hi @eseay, this topic has been a long and difficult one and we still haven't been able to get a satisfying explanation for why this issue happens, which means we can't have a proper and clean fix. This is mostly due to Apple's hidden logic on how they categorize a request as local or not (apparently it goes beyond the standard private network ranges). We reached out to them to get some help, but because we can't reproduce it locally, this is taking some time (and time away from adding or improving other features in our iOS SDK).

In the meantime we're working on providing a mitigation to avoid this issue: in a few words, we will add an option to let developers provide their own NTP resolution. This means you will be able to use different NTP servers than ours and/or a different NTP library than Kronos, or write your own logic to compute the exact time. I don't yet have a timeline for when this option will be available, but it should arrive sooner rather than later.

@victor-yn

victor-yn commented Jul 9, 2022

@ncreated coming with good news again.

For some unknown reasons, the internal user who couldn't reproduce the issue anymore... could reproduce it again. Nothing was changed: still on 4G, no hotspot or VPN. No iOS update. Nothing special was done on the device. Same app version.

I ran the scenario 8 times: uninstall + reinstall, get the host prints before it shows the prompt as requested.

Host values, by printing the host property

| Try 1 | Try 2 | Try 3 | Try 4 | Try 5 | Try 6 | Try 7 | Try 8 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 64:ff9b::9750:d308 | 2a01:e0a:1f1:5cd0::1:1 | 2a01:e0a:1f1:5cd0::1:1 | 64:ff9b::25bb:7a0b | 64:ff9b::3359:2165 | 64:ff9b::3359:2165 | 64:ff9b::330f:b6a3 | 64:ff9b::330f:b6a3 |
| 64:ff9b::bca5:eca2 | 2001:41d0:8:8759::1 | 2001:41d0:8:8759::1 | 64:ff9b::95ca:5b27 | 64:ff9b::33b2:2be3 | 64:ff9b::33b2:2be3 | 64:ff9b::a3ac:96b7 | 64:ff9b::4efb:810a |
| 64:ff9b::5c4:a08b | 2001:678:8::123 | 2001:678:8::123 | 64:ff9b::3626:de3f | 64:ff9b::d433:b5f2 | 64:ff9b::d433:b5f2 | 64:ff9b::4efb:810a | 64:ff9b::d455:9e0a |
| 64:ff9b::253b:3f7d | 2001:41d0:1004:b27:: | 2001:41d0:1004:b27:: | 64:ff9b::9750:d308 | 64:ff9b::253b:3f7d | 64:ff9b::253b:3f7d | 64:ff9b::d433:b5f2 | 64:ff9b::c39a:dc59 |
| 51.254.208.135 | 91.121.68.116 | 91.121.68.116 | 51.15.175.180 | 129.250.35.251 | 212.51.181.242 | 82.64.45.50 | 54.38.222.63 |
| 164.132.166.29 | 194.177.34.116 | 194.177.34.116 | 194.57.169.1 | 162.159.200.123 | 162.159.200.123 | 37.187.5.167 | 149.202.2.105 |
| 51.68.44.27 | 51.178.43.227 | 51.178.43.227 | 162.159.200.1 | 212.51.181.242 | 51.89.33.101 | 151.80.168.4 | 37.59.63.125 |
| 195.154.220.89 | 51.15.191.239 | 51.15.191.239 | 37.187.205.149 | 95.81.173.8 | 176.31.102.171 | 195.154.200.68 | 51.195.117.133 |

In every of the 8 tries, the Local Network prompt was displayed as soon as resolver.completion?(IPs) was called.

Does that information help you? If there is anywhere else you want me to log, or something other than the host value that I should look for, don't hesitate to guide me.

@ncreated
Collaborator

ncreated commented Aug 8, 2022

> In the meantime we're working on providing a mitigation to avoid this issue: in a few words, we will add an option to let developers provide their own NTP resolution. This means you will be able to use different NTP servers than ours and/or a different NTP library than Kronos, or write your own logic to compute the exact time. I don't yet have a timeline for when this option will be available, but it should arrive sooner rather than later.

As a follow-up to this, it is now possible to provide your own server time (or opt out of using Kronos, with all its consequences) with the new API we added in 1.12.0-beta1 (please watch our Releases page to be notified of the 1.12.0 GA):

/// Sets a custom NTP synchronization interface.
///
/// By default, the Datadog SDK synchronizes with dedicated NTP pools provided by the
/// https://www.ntppool.org/ . Using different pools or setting a no-op `ServerDateProvider`
/// implementation will result in desynchronization of the SDK instance and the Datadog servers.
/// This can lead to significant time shift in RUM sessions or distributed traces.
///
/// - Parameter serverDateProvider: An object that complies with `ServerDateProvider`
/// for provider clock synchronisation.
public func set(serverDateProvider: ServerDateProvider) -> Builder {

@alarkirikal

Hey - we're also struggling with this prompt and just found this thread. Any updates since a year ago apart from allowing the usage of other NTP libraries?

@valerio-bettini

Also here. Our organisation already uses Datadog for our systems and platform and we really want to integrate it on our Mobile app (1.1 million customers) but we can't because of this!
Where is Datadog? Where are the people who should fix this?

@onelittlefish

I don’t use DataDog, but I ran into this issue while using Kronos. At the time, I was able to test the IP addresses individually. In case this helps with debugging or a workaround, none of them appeared to be private IPs, and the IPv6 addresses all triggered the local network alert on the device that was seeing the issue, while IPv4 addresses did not.
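
To repeat this kind of per-address test, it helps to split the resolved hosts by address family first so that each group can be tried separately. A minimal sketch, assuming the hosts are plain textual IP addresses like those posted in this thread; `splitByFamily` is a hypothetical helper name.

```swift
/// Hypothetical helper: splits textual IP addresses into IPv4 and IPv6 groups.
/// Heuristic: a textual IPv6 address always contains ":", a dotted-quad IPv4
/// address never does. Input is assumed to be IP literals, not hostnames.
func splitByFamily(_ hosts: [String]) -> (ipv4: [String], ipv6: [String]) {
    var ipv4: [String] = []
    var ipv6: [String] = []
    for host in hosts {
        if host.contains(":") {
            ipv6.append(host)
        } else {
            ipv4.append(host)
        }
    }
    return (ipv4, ipv6)
}
```

Each group can then be fed to the NTP client on its own to see which family brings up the alert on an affected device.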

@nchase

nchase commented Oct 2, 2023

@ncreated shared the solution - you can override serverDateProvider with a no-op, which should stop the library from requesting this permission.

We implemented this a few weeks ago and it seems to have worked for us.

@valerio-bettini

Thank you!
Will the log be associated with the correct date though?
Still, I'd love Datadog to fix it on their side, without us implementing workarounds.

@valerio-bettini

@nchase I can't manage to set the no-op. There is nothing in the documentation on how to do it:
let configuration = Datadog.Configuration(clientToken: token, env: environment, site: .eu1, service: "ourService", serverDateProvider: whatHere?)

@sergiocampama

class ZeroDateProvider: ServerDateProvider {
  func synchronize(update: @escaping (TimeInterval) -> Void) {}
}

let configuration = Datadog.Configuration(clientToken: token, env: environment, site: .eu1, service: "ourService", serverDateProvider: ZeroDateProvider())

@valerio-bettini

class ZeroDateProvider: ServerDateProvider {
  func synchronize(update: @escaping (TimeInterval) -> Void) {}
}

let configuration = Datadog.Configuration(clientToken: token, env: environment, site: .eu1, service: "ourService", serverDateProvider: ZeroDateProvider())

If I do that, no logs are recorded anymore.

@sergiocampama

we have that configuration and we get the logs just fine

@ViniciusCamposGarcia

We have the same problem here, and we removed Kronos in the following way until we develop a solution on our side.

final class DeviceDateTimeProvider: ServerDateProvider {
    func synchronize(update: @escaping (TimeInterval) -> Void) {
        update(0)
    }
}

dataDogConfiguration = dataDogConfiguration?.set(serverDateProvider: DeviceDateTimeProvider())

We added it to the configuration object used to start the library, but note that this is based on version 1 of the Datadog SDK.
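
To make the difference between the two workarounds posted in this thread concrete, here is a self-contained sketch. It assumes the `TimeInterval` passed to `update` is an offset that gets added to the device clock. `SketchServerDateProvider` is a stand-in with the same method shape as `ServerDateProvider`, and `OffsetClock` is a hypothetical consumer; neither is SDK code.

```swift
import Foundation

/// Stand-in with the same method shape as the `ServerDateProvider` shown in
/// this thread (assumption: the real protocol ships with the SDK).
protocol SketchServerDateProvider {
    func synchronize(update: @escaping (TimeInterval) -> Void)
}

/// No-op variant: never reports an offset.
final class NoOpDateProvider: SketchServerDateProvider {
    func synchronize(update: @escaping (TimeInterval) -> Void) {}
}

/// Zero-offset variant: immediately reports 0, i.e. "trust the device clock".
final class ZeroOffsetDateProvider: SketchServerDateProvider {
    func synchronize(update: @escaping (TimeInterval) -> Void) {
        update(0)
    }
}

/// Hypothetical consumer: stores the last reported offset and applies it to
/// device time. How the SDK actually consumes the offset is an assumption.
final class OffsetClock {
    private(set) var offset: TimeInterval?

    init(provider: SketchServerDateProvider) {
        provider.synchronize { self.offset = $0 }
    }

    /// Device date corrected by the known offset (or unchanged if none yet).
    func serverDate(from deviceDate: Date) -> Date {
        deviceDate.addingTimeInterval(offset ?? 0)
    }
}
```

In this sketch the no-op provider leaves `offset` as nil, while the zero-offset provider sets it to 0. Whether the SDK treats those two states differently is not confirmed anywhere in this thread.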

@maciejburda maciejburda added the bug Something isn't working label May 13, 2024