Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zipstore delitems via override #1184

Conversation

I-am-Emmanuel
Copy link

@I-am-Emmanuel I-am-Emmanuel commented Oct 13, 2022

Fixes #828

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@I-am-Emmanuel
Copy link
Author

I want to know if this is actually expected of me to do. Please if it is not give me a details explanation of what to do.

@lgtm-com
Copy link

lgtm-com bot commented Oct 13, 2022

This pull request introduces 1 alert when merging 96d1183 into eb6d143 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Emmanuel! 🙏

Generally this seems like the right approach. Thanks for diving in 😄

Had a couple comments below on improvements we might want to make.

The next thing would be to start updating some tests that currently are not running (like this one), but should now be able to with this change.

Please let us know if you have any questions 🙂

zarr/storage.py Outdated Show resolved Hide resolved
zarr/storage.py Outdated Show resolved Hide resolved
@I-am-Emmanuel
Copy link
Author

That's incredible. I'm not sure whether that is what is expected of me. I'll continue with the testing. Thanks

@jakirkham
Copy link
Member

I think it should be doable and believe you are up to the task. I hope I've given enough info. As always though, please feel free to ask questions. Any question that occurs to you is worth asking 🙂

@I-am-Emmanuel
Copy link
Author

I-am-Emmanuel commented Oct 14, 2022 via email

@lgtm-com
Copy link

lgtm-com bot commented Oct 14, 2022

This pull request introduces 1 alert when merging a83bff2 into 2c52b17 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@I-am-Emmanuel
Copy link
Author

Something is perplexing me as I prepare to participate to the test. I see the test is checking for the pop item rather than the del item, therefore I rewrite a test for the previously worked-on function del_item. To minimize confusion, I also included a function for pop_items in the storeitem class.

@lgtm-com
Copy link

lgtm-com bot commented Oct 14, 2022

This pull request introduces 1 alert when merging 0144f95 into 2c52b17 - view on LGTM.com

new alerts:

  • 1 for Unnecessary delete statement in function

zarr/storage.py Outdated Show resolved Hide resolved
Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposing a few changes below to fix lints

zarr/storage.py Outdated Show resolved Hide resolved
zarr/storage.py Outdated Show resolved Hide resolved
zarr/storage.py Outdated Show resolved Hide resolved
zarr/storage.py Outdated Show resolved Hide resolved
zarr/storage.py Outdated Show resolved Hide resolved
Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since value is only used once, suggesting we inline it.

zarr/storage.py Outdated Show resolved Hide resolved
zarr/storage.py Outdated Show resolved Hide resolved

store.pop(self.root + 'foo')

assert store[self.root + 'foo'] == b""

def test_popitem(self):
# override because not implemented
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also update this test

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

@jakirkham
Copy link
Member

Thanks Emmanuel! 🙏

Looks like many things are done. So marked them resolved above.

The linter raised a few stylistic issues. Went ahead and just added these to the PR for simplicity.

There is one more test remaining that I've commented on above.

There is a test failure we need to look at. Will give it some thought.

@jakirkham
Copy link
Member

Think we need to tweak a few of ZipStore's methods to treat keys with values equal to b"" as missing. Once that is done this should fix the remaining test issues (and remove the need for special casing ZipStore in the tests)

@I-am-Emmanuel
Copy link
Author

Waiting for this work to be reviewed.

@jakirkham
Copy link
Member

Thanks @I-am-Emmanuel! 🙏

That's a good start. Think we still need similar changes to __contains__ and keylist (skipping empty entries)

@MSanKeys963
Copy link
Member

@I-am-Emmanuel, would you like to finish this?

Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry had some review comments that didn't post. So completing to get those out

if data:
return data
else:
raise KeyError("Key not found")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise KeyError("Key not found")
raise KeyError(key)

@@ -1789,7 +1810,8 @@ def __eq__(self, other):

def keylist(self):
with self.mutex:
return sorted(self.zf.namelist())
namelist = [key for key in self.zf.namelist() if self.zf.getinfo(key) != b""]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would suggest passing namelist through a set since there may be duplicated keys (thanks to overwrites).

Also with getinfo, this provides metadata about the file. Think we can just check the file size is non-zero (so no need to read the file).

Otherwise this seems reasonable.

Suggested change
namelist = [key for key in self.zf.namelist() if self.zf.getinfo(key) != b""]
namelist = [key for key in set(self.zf.namelist()) if self.zf.getinfo(key).file_size]

value = self.zf.getinfo(key)

if value == b"":
raise KeyError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise KeyError
raise KeyError(key)

@jhamman
Copy link
Member

jhamman commented Oct 11, 2024

I'm going to close this as stale. Folks should feel free to reopen if there is interest in continuing this work. Note there is a new implementation of the Zip store in v3 that could use the same patch for whoever is interested.

@jhamman jhamman closed this Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implementing ZipStore's __delitem__ via overwrite
5 participants