
satzk/modern-data-warehouse-dataops

 
 


page_type: sample
languages: python, csharp, typeScript, bicep
products: azure, microsoft-fabric, azure-sql-database, azure-data-factory, azure-databricks, azure-stream-analytics, azure-synapse-analytics
description: Code samples showcasing how to apply DevOps concepts to common data engineering patterns and architectures leveraging different Microsoft data platform technologies.

DataOps

This repository contains numerous code samples and artifacts on how to apply DevOps principles to common data engineering patterns and architectures utilizing Microsoft data platform technologies.

Each sample either focuses on a single Microsoft service (Single-Technology Samples) or showcases an end-to-end data pipeline solution as a reference implementation (End-to-End Samples). Each sample contains code and artifacts related to one or more of the following capabilities:

  • Infrastructure as Code (IaC)
  • Build and Release Pipelines (CI/CD)
  • Testing
  • Observability / Monitoring

In addition to the samples, this repository also contains Utilities. These are simple scripts or code snippets that can be used as-is or as a starting point for more complex automation tasks.

Single-Technology Samples

Technology | Samples
Microsoft Fabric | ▪️ CI/CD for Microsoft Fabric
 | ▪️ Feature engineering on Microsoft Fabric
Azure SQL Database | ▪️ CI/CD for Azure SQL Database
Azure Data Factory | ▪️ CI/CD for ADF with Auto publish
 | ▪️ Data pre-processing using Azure Batch
Azure Stream Analytics | ▪️ CI/CD for Azure Stream Analytics

End-to-End Samples

DataOps for Medallion with Azure Data Factory and Azure Databricks

This sample demonstrates a batch, end-to-end data pipeline that uses Azure Data Factory and Azure Databricks and is built according to the medallion architecture, along with a corresponding CI/CD process, observability, and automated testing.

[Figure: Medallion with Azure Data Factory and Azure Databricks]
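
As a rough illustration of the medallion idea (raw data refined in stages, commonly called bronze, silver, and gold), here is a minimal pure-Python sketch. It is not taken from the sample, which uses Azure Data Factory and Azure Databricks; the records, field names, and the `to_silver` and `to_gold` helpers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records, as they might land unvalidated in a bronze layer.
bronze = [
    {"sensor": "a", "reading": "21.5"},
    {"sensor": "a", "reading": "bad"},
    {"sensor": "b", "reading": "19.0"},
    {"sensor": "a", "reading": "22.5"},
]


def to_silver(rows):
    """Validate and type raw rows; drop rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            clean.append({"sensor": row["sensor"], "reading": float(row["reading"])})
        except (KeyError, ValueError):
            # A real pipeline would more likely quarantine bad rows than drop them.
            continue
    return clean


def to_gold(rows):
    """Aggregate cleaned rows into an average reading per sensor."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["sensor"]].append(row["reading"])
    return {sensor: sum(vals) / len(vals) for sensor, vals in grouped.items()}


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'a': 22.0, 'b': 19.0}
```

Keeping each layer as a separate function makes each stage testable on its own, which fits the automated testing that the sample description mentions.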

DataOps for Medallion with Microsoft Fabric

  • This sample is intended to demonstrate end-to-end batch data processing with Microsoft Fabric, built according to the medallion architecture, along with a corresponding CI/CD process, observability, and automated testing.

    The current version showcases the deployment of Azure and Fabric resources together using Terraform. The deployment uses a service principal or managed identity for authentication where supported, and falls back to Entra user authentication where it is not.
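
The authentication fallback described above can be sketched as a small decision function. This is only an illustration of the selection logic, not the sample's Terraform implementation. The function name, the `EXAMPLE_CLIENT_ID` and `EXAMPLE_CLIENT_SECRET` variables, and the rule for choosing between a service principal and a managed identity are all assumptions made for this example.

```python
import os


def choose_auth_method(supports_service_identity, env=None):
    """Pick an authentication method for one deployment step.

    supports_service_identity: whether the target operation accepts a
    service principal or managed identity (hypothetical flag).
    env: mapping of environment variables; defaults to os.environ.
    """
    env = os.environ if env is None else env
    if supports_service_identity:
        # Hypothetical variable names; assume explicit credentials mean a
        # service principal, otherwise rely on a managed identity.
        if env.get("EXAMPLE_CLIENT_ID") and env.get("EXAMPLE_CLIENT_SECRET"):
            return "service_principal"
        return "managed_identity"
    # Fall back to interactive Entra user authentication.
    return "entra_user"


print(choose_auth_method(True, {"EXAMPLE_CLIENT_ID": "id", "EXAMPLE_CLIENT_SECRET": "s"}))
print(choose_auth_method(True, {}))
print(choose_auth_method(False, {}))
```

Passing the environment as an argument keeps the function easy to test without changing real process variables.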

Utilities

Technology | Utility Description
Microsoft Fabric | ▪️ Script to upload a file from a Git repo to a Fabric lakehouse

Contributing

This project welcomes contributions and suggestions. Please see our Contributing guide.

About

DataOps for Microsoft Data Platform technologies. https://aka.ms/dataops-repo
