WIP: Use Selenium to Enable Javascript / Real-Browser Scraping + Misc Fixes #302

Open
wants to merge 45 commits into base: master

Commits on Jun 5, 2020

  1. Implement query_js.py using Selenium

    andrew committed Jun 5, 2020
    5c1c08f
  2. make headless

    andrew committed Jun 5, 2020
    9564c6d
  3. misc fixes

    andrew committed Jun 5, 2020
    7f7b753
  4. fix retries

    andrew committed Jun 5, 2020
    3cc39c3
  5. fix typo, make headless

    andrew committed Jun 5, 2020
    f0a47c6
  6. sleep less

    andrew committed Jun 5, 2020
    c5a61aa
  7. include @abhisheksaxena1998's fix

    andrew committed Jun 5, 2020
    2dfb19d
  8. allow interoperability between old and js

    andrew committed Jun 5, 2020
    9246ad5
  9. peg selenium-wire version

    andrew committed Jun 5, 2020
    56c88f8
  10. remove get_query_url

    andrew committed Jun 5, 2020
    9b9a218
  11. remove unused functions

    andrew committed Jun 5, 2020
    38fb35e

Commits on Jun 6, 2020

  1. misc fixes

    andrew committed Jun 6, 2020
    c226cdd
  2. implement limit for selenium

    andrew committed Jun 6, 2020
    59b7f5f

Commits on Jun 7, 2020

  1. remove requests requirement

    andrew committed Jun 7, 2020
    3c256d8
  2. fix misleading log line

    andrew committed Jun 7, 2020
    8786a57
  3. enable fetching all history from user

    andrew committed Jun 7, 2020
    b584d92
  4. 4a0277d
  5. fix get user data

    andrew committed Jun 7, 2020
    ede0303
  6. fix query.py get_user_info

    andrew committed Jun 7, 2020
    9b47cce
  7. 8db4b85

Commits on Jul 23, 2020

  1. add test cases (failing rn)

    andrew committed Jul 23, 2020
    197435c

Commits on Sep 20, 2020

  1. 27ef71c
  2. fix missing return

    andrew committed Sep 20, 2020
    22b6278
  3. fix type error

    andrew committed Sep 20, 2020
    c804799

Commits on Sep 21, 2020

  1. 67fb182

Commits on Sep 22, 2020

  1. date range fix

    andrew committed Sep 22, 2020
    a81e4af

Commits on Sep 23, 2020

  1. add geckodriver

    andrew committed Sep 23, 2020
    a73b339
  2. upgrade geckodriver

    andrew committed Sep 23, 2020
    c6081e3
  3. 4fd12c8
  4. implement use_proxy

    andrew committed Sep 23, 2020
    954c696

Commits on Sep 24, 2020

  1. fix logger

    andrew committed Sep 24, 2020
    d72dde5
  2. fix logging

    andrew committed Sep 24, 2020
    faf75af
  3. Update Dockerfile

    updated dockerfile to pull install latest firefox
    webcoderz authored Sep 24, 2020
    7e28f6b
  4. Merge pull request #1 from webcoderz/patch-2

    Update Dockerfile
    lapp0 authored Sep 24, 2020
    2a7d9d5
  5. merge user query

    andrew committed Sep 24, 2020
    0a05ad7
  6. Update Dockerfile

    sym link to both geckodriver and selenium wires
    webcoderz authored Sep 24, 2020
    d39a47a
  7. Update Dockerfile

    webcoderz authored Sep 24, 2020
    34367fb
  8. Merge pull request #3 from webcoderz/patch-5

    Update Dockerfile
    lapp0 authored Sep 24, 2020
    bb31bdf

Commits on Sep 25, 2020

  1. eea4954
  2. 9f8bf13
  3. Merge pull request #4 from webcoderz/selenium

    updating dockerfile with firefox dependencies
    lapp0 authored Sep 25, 2020
    7ad6f54

Commits on Sep 26, 2020

  1. do multiple-passes, fix proxy, faster scrolling

    andrew committed Sep 26, 2020
    d6cd1d8
  2. dc816cd

Commits on Sep 27, 2020

  1. a6a76f7

Commits on Sep 28, 2020

  1. remove unused imports, increase timeout

    andrew committed Sep 28, 2020
    8d665d0