This GitHub repository houses Terraform scripts that deploy persistent Azure infrastructure to support Big Data operations for Data Engineering using Azure Databricks (Spark).
- Azure Resource Group - The logical wrapper that groups application-specific resources together.
- Azure Datalake Storage - Stores data in the form of Files and Blobs.
- Azure Databricks - The computation layer that provides Spark for Data Engineering.
- Azure Virtual Network - The network boundary that restricts open internet access to the Data Engineering infrastructure.
- Azure Subnet - The resource- or application-level segmentation that restricts traffic between infrastructure components.
- Azure Network Security Group (NSG) - The security group that houses the rules restricting ingress and egress network traffic.
- Azure Network security rules - The rules inside NSGs that restrict application-specific ingress/egress traffic.
- Azure NSG Subnet association - The association between an Azure Subnet and an Azure NSG, used to apply specific rules to specific applications.
- Azure SQL Server - The SQL server that houses SQL databases on Azure.
- Azure Synapse - The data warehousing layer on SQL Server that stores and processes large amounts of data on Azure.
- Azure Virtual machine - The Linux virtual machines that support Data Engineering needs and visualization.
- Azure Keyvault - The vault that stores Keys, Secrets and Certificates on Azure instead of hard-coding them.
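
As an illustration of how the networking pieces in the list above fit together, here is a minimal Terraform sketch. It is not taken from this repository's scripts: every resource name, the region, the address ranges, the provider version constraint and the SSH rule are assumptions chosen for the example. It only shows the general shape of a resource group, virtual network, subnet, NSG, one security rule and the NSG-to-subnet association using the `azurerm` provider.

```hcl
# Hypothetical example: names, region, CIDR ranges and the rule are illustrative only.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumed version constraint
    }
  }
}

provider "azurerm" {
  features {}
}

# Logical wrapper for all resources in this example.
resource "azurerm_resource_group" "data_eng" {
  name     = "rg-data-eng-example"
  location = "West Europe"
}

# Private network that keeps the infrastructure off the open internet.
resource "azurerm_virtual_network" "data_eng" {
  name                = "vnet-data-eng-example"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.data_eng.location
  resource_group_name = azurerm_resource_group.data_eng.name
}

# Subnet that segments traffic between infrastructure components.
resource "azurerm_subnet" "data_eng" {
  name                 = "snet-data-eng-example"
  resource_group_name  = azurerm_resource_group.data_eng.name
  virtual_network_name = azurerm_virtual_network.data_eng.name
  address_prefixes     = ["10.0.1.0/24"]
}

# Security group that holds the ingress/egress rules.
resource "azurerm_network_security_group" "data_eng" {
  name                = "nsg-data-eng-example"
  location            = azurerm_resource_group.data_eng.location
  resource_group_name = azurerm_resource_group.data_eng.name
}

# Example rule: allow SSH only from inside the virtual network.
resource "azurerm_network_security_rule" "allow_ssh_from_vnet" {
  name                        = "allow-ssh-from-vnet"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "22"
  source_address_prefix       = "VirtualNetwork"
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.data_eng.name
  network_security_group_name = azurerm_network_security_group.data_eng.name
}

# Attach the NSG to the subnet so its rules apply to resources placed there.
resource "azurerm_subnet_network_security_group_association" "data_eng" {
  subnet_id                 = azurerm_subnet.data_eng.id
  network_security_group_id = azurerm_network_security_group.data_eng.id
}
```

A configuration like this would typically be applied with `terraform init`, `terraform plan` and `terraform apply`. The actual scripts in this repository may organize resources, variables and naming differently.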

The repository also houses:
- A Dockerfile to support a Jenkins slave for DevOps automation.
- PowerShell scripts to provision access using az cli commands and modify resource-level configuration.
- A Jenkins file to support automated DevOps deployment integrated with GitHub.

The latest enhancements are merged into the master branch for release.
- master branch
  - Merging needs Pull Request review/approval.
  - Once changes are reviewed and merged into develop, raise a Pull Request against master for the enhancement to be released.
- develop branch
  - Create enhancement-wise branches out of it.
  - Contribute work enhancement by enhancement.
  - Push the latest code with Pull Requests and get it reviewed.
  - Merging needs Pull Request review/approval.