Problem running nodriver in headless mode #5
Comments
The irony: a library designed to keep Chrome stealthy for web scraping inadvertently reveals itself by failing to suppress the very "HeadlessChrome" signature it is supposed to conceal in headless mode.
Hello. That is unnecessary. You can replace it manually with useragent_override and the replace() method.
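For anyone landing here, the replace() part of that suggestion can be sketched in plain Python. This is only a minimal illustration: strip_headless is a hypothetical helper name, not part of nodriver, and the sample string is the user agent reported in this issue. How the cleaned string is then handed to the browser (a CDP user-agent override, a launch flag, etc.) is left out on purpose.

```python
def strip_headless(user_agent: str) -> str:
    """Return the user agent with the 'HeadlessChrome' token replaced by 'Chrome'.

    Hypothetical helper for illustration; it is not part of nodriver.
    """
    return user_agent.replace("HeadlessChrome", "Chrome")


# User agent reported in this issue when running in headless mode.
headless_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/128.0.0.0 Safari/537.36"
)

clean_ua = strip_headless(headless_ua)
print(clean_ua)
```

A string that does not contain the token is returned unchanged, so the helper is safe to call in non-headless mode too.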
I know it is not necessary or practical in the long run, but I couldn't apply your suggestion. Could you be more specific about using the useragent_override method? I can't find any documentation about it. Besides, the idea is to set the user agent without the word "Headless" before the browser is initialized, like undetected chromedriver does. If you could give me example code that applies this fix, that would be great and I could close the thread. P.S.: I have also tried this code, but it only applies the CDP override to the current tab, not to the entire browser.
Just run JavaScript that does it, or start Chrome with a custom user agent from the command line, and stop crying.
Study the documentation on user agents.
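A rough sketch of the "start Chrome with the custom agent" option, in case it helps. The two flags are the same ones that appear in this issue (--headless=new and --user-agent=); chrome_command and the "chrome" binary name are assumptions made for illustration, so adjust the path for your system.

```python
from typing import List


def chrome_command(binary: str, user_agent: str, headless: bool = True) -> List[str]:
    """Build an argument list that launches Chrome with a custom user agent.

    Hypothetical helper; `binary` is whatever path or name starts Chrome locally.
    """
    args = [binary]
    if headless:
        args.append("--headless=new")
    # One list element per flag, so spaces in the user agent need no shell quoting.
    args.append("--user-agent=" + user_agent)
    return args


custom_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"
)
cmd = chrome_command("chrome", custom_ua)  # "chrome" is an assumed binary name
print(cmd)
```

The list can be passed as-is to subprocess.Popen. Because the flag is set at launch, it applies to the whole browser rather than to a single tab.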
I was testing the nodriver module and found that enabling headless mode changes the user agent, which makes the browser detectable as a bot. Below is the user agent returned when running headless. Thank you!
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/128.0.0.0 Safari/537.36
TEMPORARY FIX:
Inside the nodriver module there is a class called Config. On line 185, right after
if self.headless: args.append("--headless=new")
I added a call using the requests module to fetch the latest Chrome user agent, which does not contain the 'Headless' token. With this in place, the 'Headless' text is gone before the browser launches. I'm leaving the code here in case it helps someone:
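# Requires `import platform` and `import requests` at the top of the file.
# Map the current OS name to the token used in user-agent strings.
so_key = {"windows": "windows", "linux": "linux", "darwin": "mac"}[platform.system().lower()]
# Take the first Chrome (non-Firefox) user agent for this OS from the published list.
ua = next(
    ua
    for ua in requests.get("https://jnrbsn.github.io/user-agents/user-agents.json").json()
    if so_key in ua.lower() and "chrome" in ua.lower() and "firefox" not in ua.lower()
)
args.append('--user-agent=' + ua)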
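One caveat with the snippet above: next() raises StopIteration if nothing in the list matches, and the network call runs on every start. A slightly more defensive variant of the selection step might look like the sketch below. pick_chrome_ua, the default parameter and the extra "edg" exclusion are my own additions rather than anything from nodriver, and SAMPLE_UAS is made-up offline data standing in for the downloaded JSON list.

```python
import platform

# Same OS-to-token mapping as in the temporary fix.
OS_KEYS = {"windows": "windows", "linux": "linux", "darwin": "mac"}


def pick_chrome_ua(user_agents, system=None, default=None):
    """Return the first Chrome user agent for the given OS, or `default` if none match.

    Hypothetical helper; `system` defaults to platform.system().
    """
    os_key = OS_KEYS.get((system or platform.system()).lower())
    if os_key is None:
        return default
    for ua in user_agents:
        low = ua.lower()
        # Skip Firefox, and (an added assumption) Edge, whose UA also contains "Chrome".
        if os_key in low and "chrome" in low and "firefox" not in low and "edg" not in low:
            return ua
    return default


# Illustrative sample data only; the real list comes from the JSON URL above.
SAMPLE_UAS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:129.0) Gecko/20100101 Firefox/129.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
]

print(pick_chrome_ua(SAMPLE_UAS, system="Windows"))
```

Returning a default instead of raising lets the caller decide whether to skip the --user-agent flag or fall back to a hard-coded string when the list is unavailable.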