Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(act): add activity 3 + test and check lessons that support it #18

Merged
merged 9 commits into from
Oct 11, 2024

Conversation

lwasser
Copy link
Member

@lwasser lwasser commented Oct 9, 2024

No description provided.

Copy link
Contributor

@ucodery ucodery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two suggestions I have for this lesson:
encourage attendees to "fail fast"
only when it is unrecoverable
this makes the failure happen earlier, wasting less time waiting
this can make the stack trace more useful, as it points the the code experiencing the error

clearly describe the error.
rather than just printing out "exiting now", try to tell the caller why it was a mistake, and what they could maybe do to fix it.
maybe also explain that exceptions can (most of the time) take a message string, and it doesn't have to be printed also.

clean-modular-code/activity-3/clean-code-activity-3.md Outdated Show resolved Hide resolved
"""
try:
return title[0]
except (TypeError, IndexError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be really helpful to the attendees if you explained how YOU knew what exceptions to catch, and why you handled (only) two exceptions here.

This is still a very hard skill, exceptions are not always documented, and even when they are, potential transitive exceptions can still be unexpected.

edit: I see the most common exception classes are touched on later, but IMHO it would still be very valuable to have attendees reason about what exceptions might occur for novel edge cases and be able to check their work. Lesson time permitting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree! I do think this content by itself could be a full workshop!
What if i fleshed out that concept a bit. more in the lesson below (this document is the activity)

then maybe we could have them spend time looking at some examples of issues in the data and identifying what type of error it may throw. i could make the notebook more interactive. And it could link to the explanations in the other lesson. i think your idea is a great one AND it might be more valuable than forcing them to debug the code honestly. debugging takes time and is super hard too. we have 4 hours and 3 activities.

"""

date_str = (
f"{date_parts[0][0]}-{date_parts[0][1]:02d}-{date_parts[0][2]:02d}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is repeating the logic you wrote in format_date (which isn't used anywhere).


:::{admonition} On your own 2
:class: attention
Ideas welcome?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An observation I have is that this code uses neither conditionals nor exception blocks, even though those were the two stated tool for data "edge cases"

@@ -68,7 +65,7 @@ After completing this lesson, you will be able to:
* Apply the PEP 8 Style Guide standards to your **Python** code.
:::

"Pythonic" code is code that follows the conventions and best practices of the Python programming language. It emphasizes code that is clear, concise, and readable--principles that adhere to Python's design philosophy. <link to zen of python>
"Pythonic" code is code that follows the conventions and best practices of the Python programming language. It emphasizes code that is clear, concise, and readable--principles that adhere to Python's design philosophy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to link out to get the zen of python, just ask the readers to import this

@lwasser
Copy link
Member Author

lwasser commented Oct 10, 2024

@ucodery i knew you'd have a lot to provide here!!

I have a few questions - specifically related to raising errors.

encourage attendees to "fail fast" only when it is unrecoverable this makes the failure happen earlier, wasting less time waiting this can make the stack trace more useful, as it points the the code experiencing the error

  1. Can you help me understand what failing fast would look like? If we catch an exception where it occurs, isn't that what you'd ideally want? Or perhaps there is a deeper concept here that I just need to better understand!

clearly describe the error. rather than just printing out "exiting now", try to tell the caller why it was a mistake and what they could maybe do to fix it. maybe also explain that exceptions can (most of the time) take a message string, and it doesn't have to be printed also.

this is a really good point. i'd love some guidance. What i've done in the past in packages is done

except ValueError as ve:
    # Handle the exception and provide a custom message
    print(f"Error: {ve}. Better message here.")

but I'm still using a print statement.

the alternative would be this

if value < 0:
        raise ValueError("The value cannot be negative.")

But that is a bit more look before you leap style right?

Any input that you have is super welcome. this is totally new content AND tricky content. I think your input here is really valuable!

@lwasser
Copy link
Member Author

lwasser commented Oct 11, 2024

Because this is a new file, it's hard to track changes in a review. I am going to merge this PR. Then, i'll integrate the changes into a new pr. That way the changes that I made are super clear. I still have the question about failing fast. but let me merge, and i'll bring your comments over into the new pr and highlight what's changed to avoid confusion!.

@lwasser
Copy link
Member Author

lwasser commented Oct 11, 2024

Ok i am going to merge this. and pull the comments into a new pr so the changes are super clear. I have thought more about the suggestions around failing fast, and I think i understand - we probably want more language in the checks and tests lesson that supports this and then we can mention it here and link over to more detail!

@lwasser lwasser merged commit 8804651 into pyOpenSci:main Oct 11, 2024
1 of 2 checks passed
@lwasser lwasser deleted the activity-3 branch October 11, 2024 20:20
@ucodery
Copy link
Contributor

ucodery commented Oct 11, 2024

  1. Can you help me understand what failing fast would look like? If we catch an exception where it occurs, isn't that what you'd ideally want? Or perhaps there is a deeper concept here that I just need to better understand!

I think that "fail fast" is an industry term in software engineering. Rather that come up with my own explanation, there are some great ones already online. Like
https://stackoverflow.com/questions/2807241/what-does-the-expression-fail-early-mean-and-when-would-you-want-to-do-so which also links to more explanations.

but here is a short contrived example of how to not fail fast

...

def do_work(num, user_choice):
    # this can fail!
    # but we've been holding on to this value for the entire execution, doing a lot of work that was meaningless because the task couldn't be completed
    # If it fails now, the stack trace doesn't point to where the value came from, or indicate how the user can correct it
    user_num = int(user_choice)

    # this can also fail! As a different, unrelated, exception
    return num / user_num

def gather_input():
    # this will fail fast, but IndexError doesn't describe the real issue to the user, that there is a required arg
    return sys.argv[1]

def main():
    user_choice = gather_input()
    data = I_load_a_large_datafile()
    number = find_magic_entry(data)
    print(do_work(number, user_choice))

...

this is a really good point. i'd love some guidance.

Building off of my earlier example, I might add this edge case logic to the program. Don't forget new excetption types are easy and cheap, and can be very descriptive to the user even without a message.

class UsageError(RuntimeError):
    pass

def gather_input():
    try:
        user_input = sys.argv[1]
    except IndexError:
        raise UsageError("You must supply a value to the program")
    try:
        user_input = int(user_input)
    except ValueError:
        raise UsageError("The value supplied to the program must be an integer")

This is still very much EAFP as the function tries grabs a good int in the happy path without conditionals. And all the messages appear only when the exception "breaks out" of the program - when it causes the program to exit, but not if any caller catches these exceptions.

Slightly off-topic, but print statements during exception handling (and most other places) should probably be replaces with logging. But that is its own potentially multi-lesson topic.

@lwasser
Copy link
Member Author

lwasser commented Oct 11, 2024

  1. Can you help me understand what failing fast would look like? If we catch an exception where it occurs, isn't that what you'd ideally want? Or perhaps there is a deeper concept here that I just need to better understand!

I think that "fail fast" is an industry term in software engineering. Rather that come up with my own explanation, there are some great ones already online. Like https://stackoverflow.com/questions/2807241/what-does-the-expression-fail-early-mean-and-when-would-you-want-to-do-so which also links to more explanations.

but here is a short contrived example of how to not fail fast

...

def do_work(num, user_choice):
    # this can fail!
    # but we've been holding on to this value for the entire execution, doing a lot of work that was meaningless because the task couldn't be completed
    # If it fails now, the stack trace doesn't point to where the value came from, or indicate how the user can correct it
    user_num = int(user_choice)

    # this can also fail! As a different, unrelated, exception
    return num / user_num

def gather_input():
    # this will fail fast, but IndexError doesn't describe the real issue to the user, that there is a required arg
    return sys.argv[1]

def main():
    user_choice = gather_input()
    data = I_load_a_large_datafile()
    number = find_magic_entry(data)
    print(do_work(number, user_choice))

...

this is a really good point. i'd love some guidance.

Building off of my earlier example, I might add this edge case logic to the program. Don't forget new excetption types are easy and cheap, and can be very descriptive to the user even without a message.

class UsageError(RuntimeError):
    pass

def gather_input():
    try:
        user_input = sys.argv[1]
    except IndexError:
        raise UsageError("You must supply a value to the program")
    try:
        user_input = int(user_input)
    except ValueError:
        raise UsageError("The value supplied to the program must be an integer")

This is still very much EAFP as the function tries grabs a good int in the happy path without conditionals. And all the messages appear only when the exception "breaks out" of the program - when it causes the program to exit, but not if any caller catches these exceptions.

Slightly off-topic, but print statements during exception handling (and most other places) should probably be replaces with logging. But that is its own potentially multi-lesson topic.

This is great, thank you! I’ll work this into #21, which should be easier to review now that the file is merged. Having your input is helpful because many scientists, like me, don’t come from a software engineering background. I’ve learned over time through reviews (like with stravalib) and reading/learning in my free time. At CU, where I taught, it was difficult /impossible for scientists outside of computer science to take CS courses, so most learn programming as a way to process data, not the other way around.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants