
[Sticky: Please read first] Broken website / login ! Now what ? #83

Open
Cimbali opened this issue Mar 15, 2020 · 15 comments
Labels
whitelist/rules A redirection or functionality is broken by CleanLinks

Comments

@Cimbali
Owner

Cimbali commented Mar 15, 2020

Is there a website that is not working anymore? Maybe reloading infinitely? Here’s what you can do:

1. Whitelist the URL

  1. Open the CleanLinks menu by clicking on the toolbar button,
  2. Search for the problematic link (possibly filtering with “Embedded Link” only),
  3. Click the “Whitelist Embedded URL” button.

You’re done! In a few rare cases there might be multiple requests redirecting to the same page.

2. Consider contributing the cleaning/whitelisting rule

This is important as CleanLinks has no telemetry at all, not even anonymous, and I don’t visit every website of the internet. So I can’t possibly gather all of the needed information to build the perfect cleaning rules for everyone!

Do you think the website is used by many people, or that the rule could be useful to the wider community as a default? Please open an issue or post it here as a comment and I’ll try to integrate it. What I need to know is:

  • on which pages does the problem happen?
  • which parameters should be removed or whitelisted for the website to work?

You can search for the rule by filtering by website on CleanLinks’ configuration page, or in the rules file that you can export from that page.

@Cimbali Cimbali added the whitelist/rules A redirection or functionality is broken by CleanLinks label Mar 15, 2020
@Cimbali Cimbali pinned this issue Mar 15, 2020
@Cimbali Cimbali changed the title Broken website / login ! What should I do? [Sticky: Please read first] Broken website / login ! What should I do? Mar 15, 2020
@Cimbali Cimbali changed the title [Sticky: Please read first] Broken website / login ! What should I do? [Sticky: Please read first] Broken website / login ! Now what ? Mar 15, 2020
@Grossdm

This comment has been minimized.

@Cimbali

This comment has been minimized.

@RazielZnot

This comment has been minimized.

@Cimbali

This comment has been minimized.

@Cimbali
Owner Author

Cimbali commented Apr 7, 2020

I’ve minimised previous comments as they were about general usability of the add-on. Please open a new issue (or comment on one that already exists and is on that topic) for such problems.

In this issue, users can report websites/parameters for review and possible inclusion in the whitelist.

Please report, for every proposed entry:

  1. domain/path + parameter (or path) to whitelist (or remove)
  2. example link which fails to load
  3. page on which such a link can be found
  4. any further comments
  5. why it should be included

I’ll try to get around to each suggestion and test it and see if it should be included.


For example for this report from a different bug:

  1. On invidio.us/get_video_info, the eurl parameter contains the current URL, which causes the load to fail
  2. example link: https://invidio.us/get_video_info?html5=1&video_id=tyTTVuG6vcw&cpn=IaObfF5H62G6D93_&eurl=https%3A%2F%2Fwccftech.com%2Fcrytek-invites-developers-to-try-out-cryengine-on-android%2F&el=embedded&hl=en_US&sts=18353&lact=10&c=WEB_EMBEDDED_PLAYER&cver=20200404&cplayer=UNIPLAYER&cbr=Firefox&cbrver=73.0&cos=X11&width=740&height=416&ei=QueMXoS2H-fDxgL2k5-wAg&iframe=1&embed_config=%7B%7D
  3. example page: https://wccftech.com/crytek-invites-developers-to-try-out-cryengine-on-android/
  4. comments: invidio.us is not directly embedded in the page, but it’s a privacy proxy for youtube. Some addons redirect youtube requests to invidio.us.
  5. invidio.us is maybe not very widely used, but it’s targeted at privacy-conscious users, so CleanLinks users may well be using it.
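To illustrate what whitelisting the eurl parameter means here, a minimal Python sketch (not CleanLinks’ actual code; the function name and rule shape are made up) of detecting embedded URLs in query parameters unless they are whitelisted:

```python
from urllib.parse import urlparse, parse_qsl

def find_embedded_urls(url, whitelist=frozenset()):
    """Return (parameter, embedded URL) pairs that a cleaner might redirect to,
    skipping whitelisted parameters such as eurl."""
    query = urlparse(url).query
    # parse_qsl percent-decodes values, so an embedded URL appears in clear text
    return [(k, v) for k, v in parse_qsl(query)
            if k not in whitelist and v.startswith(("http://", "https://"))]

url = ("https://invidio.us/get_video_info?html5=1&video_id=tyTTVuG6vcw"
       "&eurl=https%3A%2F%2Fwccftech.com%2F")
find_embedded_urls(url)            # [('eurl', 'https://wccftech.com/')]
find_embedded_urls(url, {"eurl"})  # []: whitelisted, the parameter is left alone
```

Whitelisting eurl on this path tells the cleaner to leave the embedded URL in place instead of redirecting to it.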

@Cimbali
Owner Author

Cimbali commented Apr 8, 2020

Parameters which are being redirected here might be of interest to you:

https://gbhackers.com/facebook-tried-to-buy-nso-spyware/

  • One thing I want to say @Cimbali is that you should take my suggestions only after giving them a more thorough inspection. The parameters I report with some doubt, like "domain", are maybe used in some legitimate cases, so I’m not completely sure whether they should be removed entirely; that’s why I specified "in coordination with /auth" (I also found "d" & "D" used in place of "domain" in some cases). The parameters which can easily be seen breaking things without a doubt, like those of instagram, soundcloud, twitter etc., can obviously be decided on more easily. So please keep checking them first on your machine as well.

Originally posted by @Rtizer-9 in #106 (comment)

No worries @Rtizer-9, I do check every suggestion before including it. I think it makes sense to be rather lenient on the login rules though. The only reason I decided not to whitelist all parameters (i.e. .+) on these pages is that whitelists always override remove lists. Since the login rule has no domains specified, I think we should minimise its potential side-effects: if a page has /sso/ in its path but does nothing with logins (which is unlikely), we then only whitelist some of its parameters instead of all of them.
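To make that precedence concrete, here is a minimal sketch (hypothetical, not CleanLinks’ implementation; the rule shapes are made up) of why a whitelist entry always overrides a remove rule:

```python
import re

def clean_query(params, remove_rules, whitelist_rules):
    """Keep a parameter if any whitelist rule matches it, even when a remove
    rule also matches: the whitelist always wins, which is why an overly
    broad whitelist like .+ has wide side-effects."""
    kept = {}
    for name, value in params.items():
        whitelisted = any(re.fullmatch(r, name) for r in whitelist_rules)
        removed = any(re.fullmatch(r, name) for r in remove_rules)
        if whitelisted or not removed:
            kept[name] = value
    return kept

clean_query({"utm_source": "x", "token": "abc"},
            remove_rules=[r"utm_.*", r"token"],
            whitelist_rules=[r"token"])
# → {'token': 'abc'}: token is kept despite matching a remove rule
```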

@Rtizer-9

Rtizer-9 commented Apr 9, 2020

  • On Facebook, when you click on "see more posts", CL takes the click to an intermediate blank page. I've tried to reproduce it after disabling CL and the error goes away.
    The url is something like
    facebook.com/ajax/feed/substories...

  • Also, when you are viewing a post which has "see more" option and you click on it, the page scrolls to the top.
    The redirection here is of the pattern facebook.com/profileidhere ---> sameurl with #

@Cimbali
Owner Author

Cimbali commented Apr 11, 2020

I think the facebook links should be fixed in the new release. I didn’t have time to check those specific examples, but the new version includes a fix for # links.

@Rtizer-9

Rtizer-9 commented Apr 19, 2020

I was trying to solve the instagram breakage above using different regexes, and had some success, but .+ and the other patterns were not working in every case. In the end, when I accidentally hovered over the ?, it said that to match all paths you should leave the field empty, and that really worked.

Thought I should let you know.

@Cimbali
Owner Author

Cimbali commented Apr 19, 2020

.+ matches, but there must be something to match; .* matches even when the path is empty (. = any character, * = 0 or more, + = 1 or more). An empty path means we don’t even try matching, so it should indeed catch everything.
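Assuming the rules are standard regexes, as the tooltip suggests, the difference can be checked directly with Python’s re module:

```python
import re

re.fullmatch(r".+", "")       # None: + requires at least one character
re.fullmatch(r".*", "")       # matches: * allows zero characters
re.fullmatch(r".+", "p/abc")  # matches once there is something to match
```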

@Rtizer-9

CL breaks page functionality here:
https://www.timeanddate.com/worldclock/converter.html (try to add an entry)

The entries in log are
nullmenc(0)
-javascript:menc(0)

@Rtizer-9

Hey @Cimbali , hope you're safe and doing well.

I just found a bug in the facebook rules. The whole scenario is like this:

If you visit a url of the pattern https://www.facebook.com/photo.php?fbid=xxxxxxxxxxxxxxx&set=a.xxxxxxxxxxxxxxx&type=3, the fbid parameter gets stripped out by the rules. But when I looked at ClearURLs’ rules, I saw that there are some exceptions, and "photo" is one of them. So maybe when you imported the rules from there, the exceptions were not taken care of. I mean, that’s the only conclusion one can draw given how CL is playing with those links.

The fbid parameter gets stripped out, and after visiting the link you get that "page isn’t available" error, giving the user the illusion that the post is indeed private.

Please also have a look at other such rules which need exceptions.
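The missing behaviour could be sketched like this (a hypothetical rule shape loosely modelled on ClearURLs’ exceptions; the names are made up, not CleanLinks’ actual data model):

```python
import re

def should_strip(url_path, param, rules):
    """A remove rule fires only when no exception pattern matches the path."""
    for rule in rules:
        if param in rule["remove"] and \
           not any(re.search(exc, url_path) for exc in rule["exceptions"]):
            return True
    return False

facebook_rules = [{"remove": {"fbid"}, "exceptions": [r"/photo"]}]
should_strip("/photo.php", "fbid", facebook_rules)  # False: /photo exception applies
should_strip("/feed", "fbid", facebook_rules)       # True: stripped elsewhere
```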

@Faedelity

I am reading the wiki, and I didn't see the information I was looking for to allow a generally well-behaved website to do complex internal links, but not external. Do I put the domain in the parameters of what is allowed, or...?

@Cimbali
Owner Author

Cimbali commented May 1, 2023

In general it’s the target of the link you’re whitelisting. Unless you’re looking at a javascript link, in which case there’s no way of knowing the target in advance without executing the javascript (and its potential tracking actions); in that case you have to whitelist the origin domain.

So for a website on domain.com:

  • whitelisting parameters and/or path-embedded URLs on *.domain.com/.* will cover internal links on that website and links from other websites to domain.com, basically declaring that the layout and structure of the pages on this website are legitimate.
  • allowing javascript in links on *.domain.com/.* will cover all javascript actions on that domain without the possibility to filter on whether they redirect you to internal or external pages.

I hope that somewhat answers your question? From what I understand of your problem, if you trust that website, whitelist its redirects (i.e. matching *.domain.com/.*) and that won’t cover (non-javascript) external links.
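As a rough illustration of what a *.domain.com/.* rule targets (a hypothetical translation; CleanLinks’ actual matcher may differ), the host part amounts to a check like this:

```python
import re
from urllib.parse import urlparse

def rule_matches(url, domain="domain.com"):
    """True when the URL's host is domain.com or any subdomain of it,
    i.e. the set of pages a *.domain.com/.* rule covers."""
    host = urlparse(url).hostname or ""
    return bool(re.fullmatch(r"(.+\.)?" + re.escape(domain), host))

rule_matches("https://www.domain.com/page?next=%2Finner")  # True: covered
rule_matches("https://other.example/page")                 # False: external target
```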
