Update partitioned dataset lazy saving docs #4402

Merged: 7 commits, Jan 22, 2025
Updated Partitioned dataset lazy saving docs
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
ElenaKhaustova committed Jan 7, 2025
commit 4b8d9b575ec851d6c1ff30a1cebe19bbc5486cc7
6 changes: 5 additions & 1 deletion docs/source/data/partitioned_and_incremental_datasets.md
@@ -213,7 +213,7 @@ Writing to an existing partition may result in its data being overwritten, if th
### Partitioned dataset lazy saving
`PartitionedDataset` also supports lazy saving, where the partition's data is not materialised until it is time to write.

-To use this, simply return `Callable` types in the dictionary:
+To use this, simply wrap your object with a `lambda` function in the dictionary before returning it:

```python
from typing import Any, Dict, Callable
@@ -234,6 +234,10 @@ def create_partitions() -> Dict[str, Callable[[], Any]]:
}
```
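
Because the diff elides the body of the documented example, here is a minimal, self-contained sketch of the same idea. It reuses the `create_partitions` name and return type from the hunk header; the partition keys and the list payloads are made up for illustration, and no Kedro import is needed to run it.

```python
from typing import Any, Callable, Dict


def create_partitions() -> Dict[str, Callable[[], Any]]:
    """Map partition keys to zero-argument lambdas (hypothetical data)."""
    partitions: Dict[str, Callable[[], Any]] = {}
    for i in range(3):
        # Bind `i` as a default argument: a bare `lambda: [i, i * 10]` would
        # look up `i` only when called, after the loop has already finished.
        partitions[f"part_{i}"] = lambda i=i: [i, i * 10]
    return partitions


partitions = create_partitions()
# Nothing has been computed yet; calling a lambda materialises that partition.
first = partitions["part_0"]()
```

In a pipeline, the dictionary would be returned as the node output that is mapped to a `PartitionedDataset`, so each lambda would only be called at save time.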

```{note}
`Callable` types other than `lambda` will be ignored and processed as is, without applying lazy saving.
```
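
To make the distinction in the note concrete, the sketch below shows one way a lambda can be told apart from other callables in plain Python. The helper `is_lambda` and the function `build_partition` are hypothetical names introduced here; this is not taken from Kedro's source and may differ from the check the dataset actually performs.

```python
import types
from functools import partial
from typing import Any


def is_lambda(obj: Any) -> bool:
    """Hypothetical check: True only for functions created with `lambda`."""
    # `types.LambdaType` is the same type as ordinary functions, so the
    # `__name__` comparison is what separates a lambda from a `def`.
    return isinstance(obj, types.LambdaType) and obj.__name__ == "<lambda>"


def build_partition() -> list:
    """Stand-in for an expensive computation (made-up payload)."""
    return [1, 2, 3]


candidates = {
    "lambda": lambda: build_partition(),   # wrapped, so it counts as a lambda
    "named_function": build_partition,     # a plain `def` function
    "partial": partial(build_partition),   # callable, but not a function
}
results = {name: is_lambda(obj) for name, obj in candidates.items()}
```

Under this assumption, a named function or a `functools.partial` object would be treated as ordinary data, so wrapping the call in a `lambda` is the way to opt in to lazy saving.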

```{note}
When using lazy saving, the dataset will be written _after_ the `after_node_run` [hook](../hooks/introduction).
```