Describe the feature

We are using Databricks and would like to implement all of our data transformations, both streaming and batch, entirely within dbt. dbt already supports Streaming Tables on Databricks, which are created internally via Delta Live Tables (DLT). Additionally, per the Databricks documentation, it is possible to write stream output to a Kafka topic using writeStream in a Delta Live Tables pipeline. We would like the dbt-databricks adapter to expose this capability, so the final step of publishing streaming results can also be defined in dbt.
Describe alternatives you've considered
Additional context
The approach outlined in the Databricks documentation includes the following steps (a minimal sketch follows the list):

- Setting up Kafka configurations (broker URL, topic, security settings)
- Creating a DLT pipeline
- Defining a streaming source (files, Delta tables, etc.)
- Using writeStream with Kafka options to publish the data
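Put together, the steps above might look roughly like the PySpark sketch below. The catalog/schema/table names, broker address, topic, and checkpoint path are all placeholder assumptions, not values from this request:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_json, struct

spark = SparkSession.builder.getOrCreate()

# 1. Streaming source: a Delta table (could also be cloud files, etc.)
source_df = spark.readStream.table("my_catalog.my_schema.orders")  # placeholder table

# 2. The Kafka sink expects a `value` column (and optionally `key`),
#    so serialize each row to JSON.
kafka_df = source_df.select(to_json(struct("*")).alias("value"))

# 3. Publish to Kafka; security settings (SASL/SSL) would also go in options.
(
    kafka_df.writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # placeholder broker URL
    .option("topic", "orders-events")                    # placeholder topic
    .option("checkpointLocation", "/Volumes/checkpoints/orders_kafka")  # placeholder path
    .start()
)
```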
The new Sinks API in DLT addresses exactly this need: writing processed data to external event streams such as Apache Kafka and Azure Event Hubs, as well as to Delta tables. The API is currently in Public Preview, with plans for further expansion.
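For reference, here is a minimal sketch of what the Sinks API looks like inside a DLT pipeline notebook; the sink name, source table, broker, and topic are placeholder assumptions:

```python
import dlt
from pyspark.sql.functions import to_json, struct

# Declare a Kafka sink as a pipeline target (Public Preview API).
dlt.create_sink(
    name="orders_kafka_sink",  # placeholder sink name
    format="kafka",
    options={
        "kafka.bootstrap.servers": "broker-1:9092",  # placeholder broker URL
        "topic": "orders-events",                    # placeholder topic
    },
)

# An append flow routes a streaming query's output into the sink.
# `spark` is provided by the DLT runtime in a pipeline notebook.
@dlt.append_flow(name="orders_to_kafka", target="orders_kafka_sink")
def orders_to_kafka():
    return (
        spark.readStream.table("my_catalog.my_schema.orders")  # placeholder table
        .select(to_json(struct("*")).alias("value"))
    )
```

Exposing something like this through a dbt materialization or model config would let the publish step live alongside the rest of the dbt project.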
Who will this benefit?
This feature will benefit users who want to keep their data pipelines entirely within dbt and need to publish streaming data to external platforms such as Kafka and Azure Event Hubs. Specific use cases include real-time data processing and integration with external event streaming platforms for downstream analytics and monitoring.
Are you interested in contributing this feature?
--
@benc-db I wonder if there's a world where you could create sinks as a custom materialization (Materialize supports it in their dbt adapter, but they can do that because they're 100% SQL-based).