Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some thoughtful handling around "None" cases #12

Merged
merged 2 commits into from
Sep 9, 2024

Conversation

m3brown
Copy link
Contributor

@m3brown m3brown commented Sep 7, 2024

This PR is triggered by a scenario I had where some of my photos had thumbhash: None. This was throwing the following exception:

TypeError: '<' not supported between instances of 'NoneType' and 'str'

This made me wonder... do we want to stack files where a key is None? Probably not!

I'm not sure if it's possible for localDateTime or originalFilename values to be None or absent, but the concern would apply there as well. Imagine stacking all photos that don't have a localDateTime - it's not logical.

The adjustments I recommend in this PR are:

  • Be more explicit (in code, comments, tests) that an empty list [] is how we represent a scenario where we had trouble creating the criteria keys. This could be because of a regex mismatch, null value, or something we haven't thought of yet.
    • [] is better than putting None list because None causes with python's sort function.
  • Raise an exception if the empty key ([]) appears in the list of stacks. Also to be safe, raise an exception if any None values appear in the stack keys.
  • Repurpose SKIP_MATCH_MISS to filter any "empty key" ([]) results, not just regex mismatch.

A realistic scenario:

Suppose:

  • Our criteria is [{"key": "thumbhash"},{"key":"localDateTime"}]
  • We have 5000 photos in the system.
  • All files have localDateTime
  • 100 files have thumbhash=None

The changes in the PR will:

  1. Raise an exception on a default run because 100 files do not match the criteria
  2. With SKIP_MATCH_MISS=true, 4900 files will be processed and 100 will be skipped

The important takeaway from my perspective is there are no scenarios were all 5000 files are processed given the provided criteria.

@tenekev tenekev merged commit 32d7e6e into tenekev:main Sep 9, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants