incremental_partitions materialization #40
Hi, I wonder whether this materialization, called "incremental_partitions", is useful. It adds another partition level called data_creation_ts to every created table, which contains the timestamp (in milliseconds) at which a data row was created. To increment a table, the "insert" command is used, which also adds the newest timestamp, so the same data can end up under several different timestamps. Therefore, a cleanup operation runs in the final post-hook and makes sure that for each partition only the latest data_creation_ts partition is kept.
This means a table can be updated at the partition level later on, for instance when a new run is invoked for a certain date partition. The partition columns are not hardcoded and can be arbitrary.
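To make the cleanup rule concrete, here is a minimal sketch in plain Python of how the stale partitions could be selected. It is not the code of this PR: the function names `partition_key` and `stale_partitions` and the dict-based partition representation are assumptions made for illustration only. The only thing taken from the description is the rule itself: within each combination of the other partition columns, keep only the highest data_creation_ts.

```python
def partition_key(partition, ts_key="data_creation_ts"):
    """Identify a partition by all of its columns except the timestamp level."""
    return tuple(sorted((k, v) for k, v in partition.items() if k != ts_key))


def stale_partitions(partitions, ts_key="data_creation_ts"):
    """Return the partitions a cleanup step would drop (hypothetical helper).

    Each partition is a dict of partition column -> value, including the
    extra timestamp level. The partition columns themselves are arbitrary.
    """
    # First pass: find the newest timestamp for every logical partition.
    latest = {}
    for p in partitions:
        key = partition_key(p, ts_key)
        if key not in latest or p[ts_key] > latest[key]:
            latest[key] = p[ts_key]

    # Second pass: everything older than the newest timestamp is stale.
    return [p for p in partitions if p[ts_key] < latest[partition_key(p, ts_key)]]


# Example: the date partition 2021-01-01 was written by two runs.
example = [
    {"date": "2021-01-01", "data_creation_ts": 1000},
    {"date": "2021-01-01", "data_creation_ts": 2000},
    {"date": "2021-01-02", "data_creation_ts": 1500},
]
to_drop = stale_partitions(example)
```

In this sketch only the older 2021-01-01 partition is selected for removal, so a rerun for a single date replaces that date's data and leaves the other dates untouched.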
It works pretty well for me and takes fewer than 50 lines of code.
Is this already covered by the existing materializations? They have far more code.