incremental_partitions materialization #40

timnon · 2021-11-01T08:23:17Z

Hi, I wonder whether this materialization, called "incremental_partitions", is useful. It adds an extra partition level called data_creation_ts to every table it creates, holding the timestamp (in milliseconds) at which a data row was created. To increment a table, it uses the "insert" command, which also adds the newest timestamp, so the same data can end up under different timestamps. Therefore, the final post-hook runs a cleanup operation that makes sure only the latest data_creation_ts partition is kept for each partition.

So a table can be updated at the partition level later on, for instance when a new run is invoked for a certain date partition. The partitions are not hardcoded, however, and can be arbitrary.
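To illustrate the cleanup rule described above, here is a minimal Python sketch of the selection logic only. It is not the code from this PR: the function name `stale_partitions`, the `(partition_key, data_creation_ts)` tuple layout, and the sample values are all assumptions made for illustration. The idea is that, for each partition key, every data_creation_ts older than the newest one is marked for removal.

```python
def stale_partitions(partitions):
    """Return the (partition_key, data_creation_ts) pairs a cleanup would drop.

    `partitions` is an iterable of (partition_key, data_creation_ts) pairs,
    where data_creation_ts is a timestamp in milliseconds. This layout is a
    hypothetical stand-in for whatever the warehouse reports.
    """
    partitions = list(partitions)

    # Find the newest data_creation_ts for each partition key.
    latest = {}
    for key, ts in partitions:
        if key not in latest or ts > latest[key]:
            latest[key] = ts

    # Everything older than the newest timestamp of its key is stale.
    return sorted((key, ts) for key, ts in partitions if ts < latest[key])


# Hypothetical example: the 2021-10-31 partition was written twice.
partitions = [
    ("date=2021-10-30", 1635580800000),
    ("date=2021-10-31", 1635667200000),
    ("date=2021-10-31", 1635753600000),  # later re-run of the same date
]
stale = stale_partitions(partitions)
```

In this sketch only the older copy of the re-run date ends up in `stale`; partitions that were written once are left alone, which matches the "keep only the latest data_creation_ts per partition" behavior described above.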

Works pretty well for me and has fewer than 50 lines of code.

Is this already covered by the existing materializations? They have far more code.

incremental_partitions materialization

d0a30b3
