You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Define multiple MATCHED conditions with different actions (e.g., UPDATE, DELETE), each with its own condition.
Define multiple NOT MATCHED BY SOURCE conditions with different actions (e.g., DELETE, UPDATE).
Specify which columns to update in an UPDATE action instead of updating all columns by default.
However, in dbt-databricks v1.9.7, these capabilities are not fully supported:
It is not possible to specify multiple MATCHED or NOT MATCHED BY SOURCE actions.
The action for MATCHED is not configurable.
We cannot specify which columns should be updated in an UPDATE action.
Describe alternatives you've considered
Manually implementing a MERGE INTO statement using a custom post-hook, but this reduces the benefits of dbt’s incremental logic and maintainability.
Using a two-step process: updating/deleting records separately by 2 merging, but this introduces additional complexity and performance overhead.
Additional context
Databricks supports defining multiple conditions within MERGE INTO, making it more flexible for incremental processing. Supporting these features in dbt-databricks would allow users to take full advantage of this flexibility and optimize their data pipelines.
Who will this benefit?
Data engineers & analysts using incremental models in dbt-databricks.
Use cases where for example:
Different update conditions need to be applied to different rows.
Certain rows should be deleted instead of updated based on specific criteria.
Only specific columns should be updated to prevent unnecessary writes.
Are you interested in contributing this feature?
The configurations could be like this:
matched_delete_condition
matched_update_condition
not_matched_by_source_delete_condition
not_matched_by_source_update_condition
The text was updated successfully, but these errors were encountered:
Describe the feature
Currently, in dbt-databricks v1.9.7, the incremental merge strategy does not fully leverage the flexibility of Databricks' MERGE INTO statement.
According to the Databricks documentation, it is possible to:
However, in dbt-databricks v1.9.7, these capabilities are not fully supported:
It is not possible to specify multiple MATCHED or NOT MATCHED BY SOURCE actions.
The action for MATCHED is not configurable.
We cannot specify which columns should be updated in an UPDATE action.
Describe alternatives you've considered
Manually implementing a MERGE INTO statement using a custom post-hook, but this reduces the benefits of dbt’s incremental logic and maintainability.
Using a two-step process: updating/deleting records separately by 2 merging, but this introduces additional complexity and performance overhead.
Additional context
Databricks supports defining multiple conditions within MERGE INTO, making it more flexible for incremental processing. Supporting these features in dbt-databricks would allow users to take full advantage of this flexibility and optimize their data pipelines.
Who will this benefit?
Data engineers & analysts using incremental models in dbt-databricks.
Use cases where for example:
Are you interested in contributing this feature?
The configurations could be like this:
matched_delete_condition
matched_update_condition
not_matched_by_source_delete_condition
not_matched_by_source_update_condition
The text was updated successfully, but these errors were encountered: