No Longer Valid #1
Comments
I ran the crawler again; the URLs are still the same and the code works.
You are right. I tried again and realized you must be using Python 2 while I am using Python 3. When running this, though, how did you avoid getting blocked by LinkedIn? I am creating my own version of this, and they blocked the test accounts I used to scrape the directory after about 80 page requests.
Yes, the code uses Python 2. I will be releasing a Python 3 version of this code soon. Use an appropriate politeness policy to avoid getting blocked by LinkedIn.
I tried being polite. I have throttled it down to a random wait time between 15 and 30 seconds, but after ~80 requests I am locked out with a captcha that won't let me through even when I fill it out in person. I have gone so far as to randomly shut down for a few minutes and then start up again to look more like a person. I also have randomized headers. No luck. Any suggestions?
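For reference, the throttle described above (a random 15 to 30 second wait before each request, plus an occasional pause of a few minutes) could be sketched roughly as follows. This is only an illustration of that description, not the code from this repository: `polite_delays`, `crawl`, `fetch`, `pause_every`, and the 120 to 300 second pause range are all hypothetical names and values chosen for the example. As noted in the comment above, this kind of throttle did not prevent the lockout after ~80 requests.

```python
import random
import time


def polite_delays(n_requests, min_wait=15.0, max_wait=30.0,
                  pause_every=20, pause_range=(120.0, 300.0), rng=None):
    """Yield one wait time (in seconds) to apply before each request.

    min_wait/max_wait follow the 15-30 s range mentioned in the thread.
    pause_every and pause_range are assumptions standing in for
    "randomly shut down for a few minutes".
    """
    rng = rng or random.Random()
    for i in range(n_requests):
        wait = rng.uniform(min_wait, max_wait)
        # Every `pause_every` requests, add a longer pause on top.
        if i > 0 and i % pause_every == 0:
            wait += rng.uniform(*pause_range)
        yield wait


def crawl(urls, fetch, sleep=time.sleep, **delay_options):
    """Call the caller-supplied fetch(url) for each URL, sleeping first."""
    results = []
    for url, wait in zip(urls, polite_delays(len(urls), **delay_options)):
        sleep(wait)
        results.append(fetch(url))
    return results
```

Here `fetch` is whatever function actually retrieves a page; passing it in (along with `sleep`) keeps the timing logic separate from the HTTP code and makes it easy to try different delay settings.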
I have checked this script in Python 2.7.2. The code doesn't work for me.
Yes, it doesn't work anymore. Pages like https://www.linkedin.com/directory/topics-a/ no longer exist.
Does anyone have a solution to make it work now?
The URLs for the site have changed. This code no longer works.