[rust] Increase user awareness of their right to opt-out #15317
base: trunk
Conversation
…anager is used

This increases users' awareness that they have the right to opt out.
CI Feedback 🧐 (Feedback updated until commit 82c8c3c): A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
That's a much broader interpretation of PII than what has been established by case law. For data to be considered PII under the GDPR, the organization collecting it must either be able to directly identify an individual or have reasonable means to access additional data that would allow such identification. Aggregated information about browsers and operating systems among millions of users does not meet that standard. Regardless, I'm not interested in playing a game of yes it is/no it isn't, so thank you for providing your opinion, but I'm only interested in specific feedback for how we might make this implementation better, not how to remove it, and I would appreciate your being respectful of that.
PR Reviewer Guide 🔍: Here are some key observations to aid the review process:
PR Code Suggestions ✨: Explore these optional code suggestions:
I believe it is a good idea to display this message. However, the implementation can be improved as follows:
- The message should be logged using the Logger struct, since this is how the bindings capture the output (currently using the JSON format).
- There is no need to create this empty file called initialized. Selenium Manager already manages a metadata file called se-metadata.json, which is located in the cache root. If you want to display the message only once, you can simply check whether that file exists.
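The reviewer's second suggestion can be sketched roughly as follows. This is a hypothetical illustration, not code from the PR: the function name below is invented, and it assumes se-metadata.json sits directly in the cache root. The idea is simply that the one-time notice is shown only while Selenium Manager's existing metadata file is absent, so no separate initialized marker is needed.

```rust
use std::fs;
use std::path::Path;

// Hypothetical helper: the notice should be shown only when Selenium
// Manager's metadata file (se-metadata.json in the cache root) does not
// exist yet, i.e. on the very first run after the cache is created/cleared.
fn should_show_first_report_msg(cache_root: &Path) -> bool {
    !cache_root.join("se-metadata.json").exists()
}

fn main() {
    // Demo in a throwaway directory standing in for the cache root.
    let dir = std::env::temp_dir().join("sm-demo-cache");
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir).unwrap();

    // First run: metadata file absent, so the notice would be shown.
    assert!(should_show_first_report_msg(&dir));

    // Simulate Selenium Manager having written its metadata file.
    fs::write(dir.join("se-metadata.json"), "{}").unwrap();
    assert!(!should_show_first_report_msg(&dir));
    println!("ok");
}
```

This piggybacks on a file Selenium Manager already maintains, so the cache layout stays unchanged.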
No, that's not the case. You seem to restrict PII to data that lets you identify a person by name. That is a huge mistake. So your data are not anonymous, only at most pseudonymous, and therefore PII. The user agent was already declared PII by the CNIL in the Criteo fine: https://gdprhub.eu/index.php?title=CNIL_(France)_-_SAN-2023-009
Sure. I'm not stating my opinion but the law. So today I will open a claim with my APD, because Selenium infringes the law. Good luck.
Your generic critiques do not apply to us, and we'll happily cooperate with any authority who asks for this information. To reiterate:
- Correlation: Selenium Manager does not persistently track users across sessions or associate multiple telemetry events.
- Inference: The collected data does not allow for behavioral profiling or inference about an individual's actions or preferences.
You are totally wrong. The PII definition does not require that you ACTIVELY want to identify a person, only that you ARE ABLE to do so if you wanted. It's even worse given the Breyer CJEU case, because this possibility must even be studied from an UNLAWFUL point of view, or for example in case of legal seizure by authorities. And the re-identification can even be performed by a third-party entity.
You definitely have no idea what the real PII definition is or what the GDPR really means.
I've continued engaging with this conversation to demonstrate to anyone who wants to view it that I understand your concerns and I'm not ignoring them. We each believe we understand the law correctly, and neither of us seems able to persuade the other, so there's no point in continuing. I'm still open to feedback from anyone with actionable steps we can take to improve our product on behalf of our users and the community.
I hate arguments from authority, but… I'm one of the most expert people on the GDPR, currently with more than 250 cases open with DPAs and a 100% win rate on the more than 150 cases decided so far. I predict future EDPB guidelines 6 months in advance, most of the time word for word. I'm currently one of only 20 experts in the world with the technical and legal skills of the Special Pool of Experts of the EDPB. I work with people who literally wrote and negotiated the GDPR with EU institutions. Usually, when I say something about the GDPR, I'm not wrong… Really.
@aeris, assuming you are coming in good faith and with a collaborative spirit, how would you help a non-profit open-source project like Selenium become aware of its usage across the community in a GDPR-compliant way? |
@diemol > Exactly as I said on the first issue: telemetry MUST be opt-in, not opt-out. This is just straightforward GDPR and ePrivacy application. And in a non-profit open-source project, we are not supposed to have to fight for such a basic privacy feature.
@aeris how would you make it opt-in in a simple way? Have you used Selenium? What would be the best way?
And in general, and even more so since Trump's election, you CAN'T rely on US providers, because the US constitution is incompatible with the GDPR (see the Schrems I & II CJEU cases, FISA 702, the Cloud Act, and more recently the dismissal of the PCLOB by Trump, with a formal letter from LIBE to the European Commission about DPF invalidation).
@diemol > The way this PR adds a warning at first use seems good, but it must be opt-in, not opt-out.
@aeris why do you mention US providers? Plausible has guaranteed us that they host everything in the EU. |
A classic provider position, but totally wrong. Data location is NOT relevant from the FISA 702 point of view. If a US agency can enforce a gag order on Plausible to exfiltrate data from an EU location, that alone is enough to rule out using this provider.
Yeah, but how do you recommend doing it in a simple way for the user? We would like to know how much the tool is used. This data is relevant to show how Selenium currently compares against other for-profit tools.
For example, because of those US laws, Microsoft is not able to guarantee data localization.
I look at the problem the other way around: why would an open-source non-profit project ever need such audience stats? Are we not supposed to be better than for-profit tools and not want to track users just for metrics?
After checking Plausible's privacy policy, they say they use only EU providers. I would have to check why I noticed Bandcamp on the first issue 🤔 But they also mention hCaptcha, already sanctioned for a GDPR violation (one of my cases with the French DPA, but not public at this time), and Mailchimp (https://gdprhub.eu/index.php?title=BayLfD_(Bavaria)_-_LDA-1085.1-12159/20-IDV)
If I may interject here, this is the core issue at hand. The project doesn't need most of your user base reporting anything, you just need a bunch of people who know and like your product, who trust you and who willingly share their anonymized usage data with you. Sure you'll see a dramatic drop in entries, but they'll also increase in quality. And you'll be compliant. |
@GuillaumeRossolini I put our reasons for needing opt-in here:
- There is no way to approximate a representative sample population with the opt-in options users would have.
- The very nature of the opt-in mechanism varying significantly by implementation prevents comparing information across those implementations.
- There is legitimate interest in knowing whether anyone is still testing on Internet Explorer, so we know if we should sunset that driver, or whether it is worth dropping the JavaScript or Ruby bindings rather than continuing to maintain them for a dwindling user base.
- And importantly for an open source project, we need to be able to demonstrate that we're still growing as a project even if other tools are gaining market share.
My frustration here is that if your broad interpretation of the statute were accurate, then there would be a huge number of much more privileged institutions and organizations in much more egregious violation. So throwing your weight around in GitHub issues on open source projects feels like you're mostly interested in trolling and bullying volunteers who are legitimately trying to make things better for their users. If things are as clear cut as you believe, then go work with Plausible so that everyone using their tool can better protect their users. Reach out to the larger organizations who are doing what we are doing and have the time to engage you the way you want to be engaged, so there is more visibility on the boundaries and clear precedent for behavior.
Or perhaps… Plausible by itself can't be compliant and there is no lawful way to use it? 🤷
Look, I'll make it easy: your views on growth at all costs, on not respecting your users' privacy, on harvesting metrics (including IP addresses) on a system you don't control, and generally on laws will drive users away.
Right, but if you prove that, then everyone who thinks they are covered by plausible's terms will know that isn't the case. Ideally you can help them find a way to be better so that everyone wins. |
To avoid any misunderstanding: it's not Plausible per se that is unlawful, just the fact that there is no lawful way to use it. You also still need:
All of those conditions are CUMULATIVE; if a single one is missing, the processing is unlawful. Plausible barely achieves article 28 compliance; all the others are very, very hard to justify, and even more so for a non-profit project.
The way is very simple: do you REALLY need such metrics? If not, just don't process data. |
@GuillaumeRossolini your response is why this feels like trolling. You are exaggerating what we are actually doing to justify your position. If you really think the legislation's scope is as broad as you claim, why do you need to characterize what we're doing as negatively as you do, and ascribe motivations to us that we don't have? @aeris the standard isn't "legitimate need" but "legitimate interest" and the fact that you want to apply the former standard makes the rest of our conversation feel disingenuous. |
@titusfortner I bet you have never read a single guideline about legitimate interest or a single DPA decision about legitimate interest revocation. The "legitimate interest" legal basis is NOT just having a legitimate interest; that is only the first step of the triple test. You ALSO need real necessity and proportionality. "Legitimate interest" is one of the most dangerous and complex legal bases in the GDPR, and most "legitimate interest" claims brought before a DPA have been revoked and placed under consent instead, usually with a fine for having infringed the GDPR.
I'm not ascribing motivations to you, because I don't know you. Nor do I need to. If your project is gathering user data without their consent (this is an established fact regardless of your intent with said data), 1- it's against the law and 2- I'll stop using it |
With that information we are able to see different things:
These are just some of the reasons why we want to do this. But there are more. |
But I can give you one trolling comment: does this data tell you how many users were driven out by this practice? |
https://www.edpb.europa.eu/system/files/2024-10/edpb_guidelines_202401_legitimateinterest_en.pdf As stated by the EDPB, processing CAN'T be based on LI (legitimate interest) if there is a less intrusive way to achieve the same result. Can you achieve the same result with regular user studies, for example twice a year? Yes? So Plausible CAN'T be legitimate.
@diemol all of this would be just as useful with consent, and more meaningful |
We have tried by counting downloads and doing surveys but we have not been able to achieve that. |
@diemol > For which reasons? Because people don't want to reply? And so you just want to force them to provide the data?
Because each package manager tracks information differently and not using the same time frames. Also because downloading a tool does not mean it is being used. I wouldn't say we are forcing them because we have been open about this in different releases, documentation, and posts, showing how to avoid sharing the information. |
Avoiding the sharing of information is not the only trouble. You also have:
- ePrivacy 5(3);
- GDPR article 12 + ePrivacy 5(3) + GDPR article 21(5): you have to inform users the first time about the telemetry, with a full description of the processing in a clear and comprehensive manner, but ALSO provide a way at that first time to refuse the processing;
- GDPR articles 15 to 22: the rights to access, object, rectify, and erase.
Once again, having to fight for GDPR compliance is something I expect to have to do against a GAFAM/for-profit organization, with lawyers nitpicking over every word, over years if not decades of legal action. Definitely not with a non-profit org that is supposed to have nothing in common with that, yet is currently arguing against article 7 of the Charter of Fundamental Rights… Every time, it is incredible to see non-profit "privacy friendly" projects without even a single clue about one of the most important privacy laws of the century…
If no-one voluntarily maintains it (e.g. because they need it), call on contributors to take care of it. Then put a warning that it could be deprecated for lack of maintainers, then disable it by default with the possibility of reactivating it explicitly (with the deprecation warning when it’s enabled), and finally deletion if nobody reacts. It's fairly unlikely that the proportion of users who absolutely need it and the proportion of users who are willing to maintain it (or pay someone to do so) won't overlap.
I disagree. For me, the "look, lots of people are using our tool" argument seems less relevant than having those "lots of people" actually recommend you themselves. Especially as there's nothing to prevent you from cheating about it, and even if you're not, there's no guarantee that it's reliable either.
I'll add a reason why you might want to concentrate on a modest open-source project rather than a giant, without trolling: you might want people to use your tool rather than the giant's. For example, if we're an organisation that also processes other people's data, we have to be compliant ourselves. If neither you nor the commercial tool is compliant, we have no solution. Then we ask the one we'd prefer to use to become compliant, so we can use it. Otherwise we'll complain to the giant, and if it becomes compliant before you do, we'll use it instead…
Today, the vast majority of projects, commercial or not, are illegal under the GDPR. Admittedly, some have greater breaches than others, but the question is not who is breaching the law the most, but who is respecting it. Being the only one compliant can be a very good selling point…
And you can find this kind of comment about Selenium.
Nothing at all distinguishes your project from a commercial one…
I am ready to give up, not because of the reasoning; I believe at some point it is possible to collaborate and reach an agreement. Rather, it is the tone used against us: every single message feels like a threat, and I just don't want that in my inbox every day.
User description
Motivation and Context
This should address the concerns raised in #14588
To clarify:
BUT, it's a nice thing to be as up front as possible about what's happening, and I think a one time message is reasonable.
The message will only appear the first time the user attempts to send data to Plausible after the cache has been created or cleared.
Looks like:
This is in draft until I get the privacy policy posted, but looking for feedback on idea and implementation while that's in progress.
PR Type
Enhancement
Description
Added a one-time message to inform users about anonymous telemetry collection.
Implemented a mechanism to check and create a marker for the first report.
Enhanced user awareness of opt-out rights with a clear message.
Changes walkthrough 📝
lib.rs: Add telemetry opt-out awareness message functionality (rust/src/lib.rs)
- Add a first_report_msg function to display a one-time message
- Call it from the stats function
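The marker-file mechanism described above can be sketched roughly as follows. This is a simplified, hypothetical reconstruction rather than the actual code from rust/src/lib.rs: the message text and the cache-directory handling are assumptions, while the function name first_report_msg and the empty initialized marker file come from the PR description.

```rust
use std::fs::{self, File};
use std::path::Path;

// Hypothetical sketch: print the telemetry notice only on the first report
// after the cache is created or cleared, using an empty marker file to
// remember that the message has already been shown.
fn first_report_msg(cache_dir: &Path) {
    let marker = cache_dir.join("initialized");
    if !marker.exists() {
        // Placeholder wording; the real message points at the privacy policy.
        println!(
            "Selenium Manager collects anonymous usage statistics. \
             You have the right to opt out; see the privacy policy."
        );
        // Create the empty marker so the notice is not printed again.
        File::create(&marker).expect("cannot create marker file");
    }
}

fn main() {
    // Demo in a throwaway directory standing in for the cache.
    let dir = std::env::temp_dir().join("sm-first-report-demo");
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir).unwrap();

    first_report_msg(&dir); // first call: prints notice, creates marker
    first_report_msg(&dir); // second call: marker exists, prints nothing
    assert!(dir.join("initialized").exists());
}
```

In the PR, this check is presumably invoked from the stats function right before data is sent to Plausible, which is why the notice appears only once per cache lifetime.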