Decouple scraper from database #49

Draft: 8 commits to merge into base branch master

miguelpduarte (Contributor) commented Oct 17, 2021

Decouples the scraper from the database: scraped data is now exported to a CSV file instead.
Also clears up some technical debt and upgrades outdated packages.

A separate module should then be created to import the CSV into the DB with proper validation and queries.
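A rough sketch of what such an import module could look like, using only the standard library. The `faculty` table, the `acronym` and `name` columns, and the `import_faculties` helper are assumptions made for illustration; they are not taken from this PR. SQLite stands in for whatever database the project actually uses.

```python
import csv
import io
import sqlite3


def import_faculties(csv_file, conn):
    """Validate rows from an open CSV file and insert the valid ones.

    Returns (number of inserted rows, list of rejected CSV line numbers).
    Table and column names are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS faculty ("
        "acronym TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    reader = csv.DictReader(csv_file)
    seen = set()
    inserted, rejected = 0, []
    for row in reader:
        acronym = (row.get("acronym") or "").strip()
        name = (row.get("name") or "").strip()
        # Validation: both fields are required and the acronym must be unique.
        if not acronym or not name or acronym in seen:
            rejected.append(reader.line_num)
            continue
        seen.add(acronym)
        # Parameterized query, so scraped text is never spliced into SQL.
        conn.execute(
            "INSERT INTO faculty (acronym, name) VALUES (?, ?)",
            (acronym, name),
        )
        inserted += 1
    conn.commit()
    return inserted, rejected


if __name__ == "__main__":
    sample = io.StringIO("acronym,name\nabc,Example Faculty\n,No acronym\n")
    print(import_faculties(sample, sqlite3.connect(":memory:")))
```

Keeping validation in the importer rather than in the spiders means the scraper only has to produce rows, and rejected lines can be reported without rerunning the scrape.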

The course spider will eventually have to read from the faculties CSV, but
for now the focus is on selectors and structure.
Now fetches faculties from previous results.
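As an illustration of fetching faculties from previous results, a spider could load the faculties CSV written by an earlier run before building its requests. This is a minimal sketch: the `load_faculty_acronyms` helper, the file path, and the `acronym` column are hypothetical and may not match the actual export.

```python
import csv
from pathlib import Path


def load_faculty_acronyms(csv_path):
    """Return the acronyms listed in a previously exported faculties CSV.

    The 'acronym' column name is an assumption; rows without it are skipped.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as f:
        return [
            row["acronym"].strip()
            for row in csv.DictReader(f)
            if (row.get("acronym") or "").strip()
        ]
```

A spider could then iterate over the returned list and yield one request per faculty.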
…rses spider to resolve some of the found issues

Additionally, we now store the plan_id, since the course_units URL
can be constructed from it plus the school year (see README.md for details).