-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(datasets): Add parameter to enable/disable lazy saving for PartitionedDataset
#978
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think callable data (that is not intend to be lazy saved) is rare and is not worth to break PartitionedDataset for it. Left the comment in the original issue since I saw two PRs are associated.
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
PartitionedDataset
PartitionedDataset
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, @ElenaKhaustova ! LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and clean solution 👌
kedro-datasets/kedro_datasets/partitions/partitioned_dataset.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Elena Khaustova <[email protected]>
Description
Fixes #759
Cause of the issue: #759 (comment)
Dataset lazy saving docs: https://docs.kedro.org/en/stable/data/partitioned_and_incremental_datasets.html#partitioned-dataset-lazy-saving
Development notes
save_lazily
) to enable/disable lazy saving forPartitionedDataset
. The parameter is enabled by default but users still need to wrap objects with callable as it was required before to apply lazy saving, so the change is not breaking.AbstractDataset.save
How to test without TensorFlow
The following code will execute without errors and save two objects if
save_lazily
is set toFalse
. The code will fail andTestSaveCallable.__calll__
will be called ifsave_lazily
is set toTrue
or not set.Checklist
jsonschema/kedro-catalog-X.XX.json
if necessaryRELEASE.md
file