Sprint Day One report - Brisbane #2
Comments
"Comparing two sets of data for differences, e.g. two sets of article IDs - one locally and one on a vendor database - and you want to know which are missing from each - i.e. sets." I just had to do this in the past few days, so this is a reasonable use case to me :)
Pad for this at http://pad.software-carpentry.org/lc-new-python
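For anyone picking this up, the article ID use case quoted above maps directly onto Python's built-in `set` type. A minimal sketch, assuming the IDs have already been read into two collections; the ID values and variable names here are made up for illustration:

```python
# Hypothetical article IDs; in practice these would be read from
# a local export and a vendor export (e.g. two CSV files).
local_ids = {"A100", "A101", "A102", "A104"}
vendor_ids = {"A101", "A102", "A103", "A104", "A105"}

# Set difference: IDs present on one side but not the other.
missing_from_vendor = local_ids - vendor_ids
missing_from_local = vendor_ids - local_ids

# Set intersection: IDs present on both sides.
in_both = local_ids & vendor_ids

print("Missing from vendor:", sorted(missing_from_vendor))
print("Missing from local:", sorted(missing_from_local))
print("In both:", sorted(in_both))
```

This only works when the identifiers match exactly on both sides; anything that differs by even one character is treated as a different item.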
@libADS Can I ask how you did your comparing? Excel? Manually?
@richyvk Initially in Python, after loading the CSV files into memory. The tricky part was that the records did not always match exactly; for instance, titles might be spelled slightly differently between two files of articles. In the end I imported the data into a local Postgres instance, which let me try different queries much faster. For fuzzy matching in Python I used:

    def similar(a, b):
        from difflib import SequenceMatcher
        return SequenceMatcher(None, a, b).ratio()

Postgres has an extension to do this too.
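To show how a similarity ratio like this might be applied across two lists of titles, here is a rough sketch. The `best_match` helper, the 0.85 threshold, the lowercasing step, and the sample titles are all assumptions for illustration, not what was actually run:

```python
from difflib import SequenceMatcher


def similar(a, b):
    # Ratio in [0, 1]; lowercasing is an added assumption so that
    # differences in capitalisation do not lower the score.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(title, candidates, threshold=0.85):
    """Return (candidate, score) for the closest candidate,
    or None if no candidate scores at or above the threshold."""
    scored = [(c, similar(title, c)) for c in candidates]
    if not scored:
        return None
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


# Made-up titles standing in for the two article files.
local_titles = ["Introduction to Data Curation", "Open Access in Libraries"]
vendor_titles = ["Introduction to data curation.", "Metadata Standards Today"]

for title in local_titles:
    print(title, "->", best_match(title, vendor_titles))
```

The threshold is a judgment call: set it too low and unrelated titles pair up, too high and near-duplicates are missed, so it is worth checking a sample of matches by hand.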
Hi all
So, we've had a day of talking and deliberated a lot. We in Brisbane have concluded that the existing lesson leans too heavily on Pandas, but that the Software Carpentry Gapminder lesson could work really well as the basis for the LC Python lesson.
So, we're handing over to whoever wants to take this up; otherwise we'll be working on it tomorrow. We've imported the SC lesson into the data-lessons account; the repo URL is: https://github.com/data-lessons/library-python-intro
Things we think need doing:
But, we figure a lot of it can pretty much stay as it is!
We have failed to come up with one single compelling 'superpower' example to run through the lesson. But, some more ideas we've had for examples (a lot of these might be useful for certain episodes):
That's pretty much it from us for today. We'll get stuck in again tomorrow with editing the lesson. But go for it in the meantime if you want to!