# Azure ML Federated Learning Sandboxes

This page describes the different sandboxes that you can fully provision and use out-of-the-box with our real-world examples. Each sandbox has distinct properties depending on what you'd like to test.

- **Minimal sandbox**: the quickest path to a demo environment (horizontal FL only).
- **Eyes-on sandboxes**: sandboxes where you can debug your code, but the data remains accessible to the users of your subscription.
- **Eyes-off sandboxes**: sandboxes where the data is kept in storage accounts without public network access, accessible only by the computes through a vnet.
- **Private sandboxes**: eyes-off sandboxes where the Azure ML workspace and its resources are also protected behind a vnet.
- **Confidential VM sandboxes**: eyes-off sandboxes where the computes are Confidential VMs.
- **Configurable sandboxes**: the generic bicep scripts underlying our eyes-on/eyes-off sandboxes, with multiple parameters you can modify to fit your needs.

🚨 🚨 🚨 **IMPORTANT**: These sandboxes require you to be the **Owner** of an Azure resource group; the Contributor role is not enough. Depending on your subscription's admin policies, even if you can create a resource group yourself, you might not be its Owner. Without ownership, you will not be able to set the RBAC roles necessary to provision these sandboxes. Ask your subscription administrator for help.
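To check up front whether you have the Owner role on your resource group, you can list your role assignments with the az CLI (the placeholders below are yours to fill in):

```bash
# list your role assignments scoped to the resource group; look for "Owner"
az role assignment list \
  --assignee "<your user principal name or object id>" \
  --resource-group "<resource group name>" \
  --query "[].roleDefinitionName" -o tsv
```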

## Minimal sandbox

Deploy a completely open sandbox that lets you try things out in an eyes-on environment. This setup is intended for demo purposes only: the data remains accessible to the users of your subscription who open the storage accounts, and data exfiltration is possible. It supports horizontal FL scenarios only.

Deploy to Azure

### ❗ Important parameters

| Parameter | Description | Values |
| --- | --- | --- |
| `compute1SKU` | SKU of the first compute to provision. | ex: `Standard_DS4_v2` |
| `siloRegions` | List of regions used for the silos. All our samples work with 3 regions. | ex: `["australiaeast", "eastus", "westeurope"]` |
| `kaggleUsername` and `kaggleKey` | Optional: some of our samples require Kaggle credentials to download datasets; providing them here ensures the credentials get injected into the workspace secret store properly (you can also do that manually later). | |
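If you prefer the CLI over the portal button, the same parameters can be passed to an `az deployment group create` call; a sketch, assuming the minimal sandbox template lives at `./mlops/bicep/open_sandbox_setup.bicep` (check this repo for the exact path before running):

```bash
# NOTE: the template path below is an assumption -- verify it in this repo
az deployment group create \
  --template-file ./mlops/bicep/open_sandbox_setup.bicep \
  --resource-group <resource group name> \
  --parameters demoBaseName="fldemo" \
      compute1SKU="Standard_DS4_v2" \
      siloRegions='["australiaeast","eastus","westeurope"]'
```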

### Relevant Documentation

To manually reproduce this full provisioning, see the relevant documentation:

## Eyes-on sandboxes

Deploy a sandbox where the computes are located in a vnet and can communicate with one another through vnet peering for Vertical FL, while the storage accounts remain eyes-on to allow for debugging. This is a good sandbox for figuring things out on synthetic data.

These sandboxes are typical of a cross-geo federated learning scenario: the silos are all provisioned within a single tenant, but in different regions.

| Deploy | Description |
| --- | --- |
| Deploy to Azure | Eyes-on with 1 CPU compute per silo |
| Deploy to Azure | Eyes-on with 1 GPU compute per silo |
| Deploy to Azure | Eyes-on with 2 computes per silo (1 CPU, 1 GPU) |

### ❗ Important parameters

| Parameter | Description | Values |
| --- | --- | --- |
| `primarySKU` | SKU of the first compute to provision. | ex: `Standard_DS4_v2` |
| `secondarySKU` | SKU of the second compute to provision. | ex: `STANDARD_NC6` |
| `siloRegions` | List of regions used for the silos. All our samples work with 3 regions. | ex: `["australiaeast", "eastus", "westeurope"]` |
| `applyVNetPeering` | Peer the silo networks to the orchestrator network to allow for live private communication between jobs (required for Vertical FL). | `true` or `false` |
| `kaggleUsername` and `kaggleKey` | Optional: some of our samples require Kaggle credentials to download datasets; providing them here ensures the credentials get injected into the workspace secret store properly (you can also do that manually later; see the sketch after this table). | |
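If you skip `kaggleUsername`/`kaggleKey` at provisioning time, you can inject the credentials manually later by writing them into the workspace's key vault; a sketch, in which the secret names are assumptions (check what the samples actually read):

```bash
# store kaggle credentials in the workspace key vault
# (the secret names below are assumptions -- check what the samples expect)
az keyvault secret set --vault-name <workspace key vault name> --name kaggleusername --value "<kaggle username>"
az keyvault secret set --vault-name <workspace key vault name> --name kagglekey --value "<kaggle api key>"
```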

### Architecture

*Architecture schema of the eyes-on sandboxes.*

### Relevant Documentation

To manually reproduce this full provisioning, see the relevant documentation:

## Eyes-off sandboxes

Deploy a sandbox where the silo storage accounts are kept eyes-off behind a private service endpoint, accessible only by the computes through a vnet. This sandbox is typical of a cross-geo federated learning scenario: each silo is provisioned within a single tenant, but in a different region. Each silo has a distinct virtual network enabling private communication between the silo compute and the silo storage.
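Once deployed, you can sanity-check that a silo storage account indeed denies public network access with the az CLI (the actual account names depend on your `demoBaseName`):

```bash
# should print "Disabled" for an eyes-off silo storage account
az storage account show \
  --name <silo storage account name> \
  --resource-group <resource group name> \
  --query publicNetworkAccess -o tsv
```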

| Deploy | Description |
| --- | --- |
| Deploy to Azure | Eyes-off with 1 CPU compute per silo |
| Deploy to Azure | Eyes-off with 1 GPU compute per silo |
| Deploy to Azure | Eyes-off with 2 computes per silo (1 CPU, 1 GPU) |

### ❗ Important parameters

| Parameter | Description | Values |
| --- | --- | --- |
| `primarySKU` | SKU of the first compute to provision. | ex: `Standard_DS4_v2` |
| `secondarySKU` | SKU of the second compute to provision. | ex: `STANDARD_NC6` |
| `siloRegions` | List of regions used for the silos. All our samples work with 3 regions. | ex: `["australiaeast", "eastus", "westeurope"]` |
| `orchestratorEyesOn` | Sets the orchestrator network access to either public (`true`) or private (`false`, default). | `true` or `false` |
| `applyVNetPeering` | Peer the silo networks to the orchestrator network to allow for live private communication between jobs (required for Vertical FL). | `true` or `false` |
| `kaggleUsername` and `kaggleKey` | Optional: some of our samples require Kaggle credentials to download datasets; providing them here ensures the credentials get injected into the workspace secret store properly (you can also do that manually later). | |

### Architecture

*Architecture schema of the eyes-off sandboxes.*

### Relevant Documentation

To manually reproduce this full provisioning, see the relevant documentation:

## Private sandboxes

This is an eyes-off sandbox where, in addition, the Azure ML workspace and all its related resources (container registry, key vault, etc.) are also provisioned behind a vnet. All of these are made accessible to each silo through private endpoints.
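After deployment, you can check the workspace's network access setting from the CLI; a quick sketch, assuming the Azure ML CLI extension is installed (the exact output field name may vary across CLI versions):

```bash
# inspect the workspace network setting (requires the Azure ML CLI extension)
az ml workspace show \
  --name <workspace name> \
  --resource-group <resource group name> \
  --query public_network_access -o tsv
```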

| Deploy | Description |
| --- | --- |
| Deploy to Azure | Private with 1 CPU compute per silo |
| Deploy to Azure | Private with 1 GPU compute per silo |
| Deploy to Azure | Private with 2 computes per silo (1 CPU, 1 GPU) |

### ❗ Important parameters

| Parameter | Description | Values |
| --- | --- | --- |
| `primarySKU` | SKU of the first compute to provision. | ex: `Standard_DS4_v2` |
| `secondarySKU` | SKU of the second compute to provision. | ex: `STANDARD_NC6` |
| `siloRegions` | List of regions used for the silos. All our samples work with 3 regions. | ex: `["australiaeast", "eastus", "westeurope"]` |
| `workspaceNetworkAccess` | To make debugging easier, use `public` to make the Azure ML workspace accessible through its public IP in the Azure portal (default: `public`). | `public` or `private` |
| `applyVNetPeering` | Peer the silo networks to the orchestrator network to allow for live private communication between jobs (required for Vertical FL). | `true` or `false` |
| `kaggleUsername` and `kaggleKey` | Optional: some of our samples require Kaggle credentials to download datasets; providing them here ensures the credentials get injected into the workspace secret store properly (you can also do that manually later). | |

### Architecture

*Architecture schema of the private sandboxes.*

### Relevant Documentation

To manually reproduce this full provisioning, see the relevant documentation:

## Confidential sandboxes

Deploy an eyes-off sandbox where the computes leverage confidential computing to keep your training and processing within an enclave.

| Deploy | Description |
| --- | --- |
| Deploy to Azure | A sandbox with AKS clusters with confidential computes per silo and orchestrator. |

Note: to take full advantage of the confidential VMs, you will need to finalize the setup of the AKS cluster by creating an instance type and using it in your pipeline configs, as sketched below.
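For reference, Azure ML's Kubernetes compute targets jobs at an `InstanceType` custom resource on the cluster; below is a minimal sketch of creating one, assuming kubectl access to the provisioned AKS cluster (the instance type name and resource values are illustrative, tune them to your confidential VM SKU):

```bash
# connect kubectl to the AKS cluster
az aks get-credentials --name <aks cluster name> --resource-group <resource group name>

# create an instance type for Azure ML jobs to target
# (name and resource values are illustrative)
kubectl apply -f - <<EOF
apiVersion: amlarc.azureml.com/v1alpha1
kind: InstanceType
metadata:
  name: confidentialinstance
spec:
  resources:
    requests:
      cpu: "2"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "16Gi"
EOF
```

You can then reference the instance type by name in the `resources` section of your pipeline configs (e.g. `instance_type: confidentialinstance`).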

### ❗ Important parameters

| Parameter | Description | Values |
| --- | --- | --- |
| `computeSKU` | VM SKU to provision in the AKS cluster (the default uses a confidential VM from the DCasv5 series). You can also use any non-confidential SKU. | ex: `Standard_DC4as_v5` |
| `siloRegions` | List of regions used for the silos. All our samples work with 3 regions. ❗ Make sure you have quota in those regions, in particular for confidential compute (see the quota check after this table). | ex: `["australiaeast", "eastus", "westeurope"]` |
| `orchestratorEyesOn` | Sets the orchestrator network access to either public (`true`) or private (`false`, default). | `true` or `false` |
| `applyVNetPeering` | Peer the silo networks to the orchestrator network to allow for live private communication between jobs (required for Vertical FL). | `true` or `false` |
| `kaggleUsername` and `kaggleKey` | Optional: some of our samples require Kaggle credentials to download datasets; providing them here ensures the credentials get injected into the workspace secret store properly (you can also do that manually later). | |
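To check your confidential compute quota in a given region before deploying, `az vm list-usage` shows per-family usage; the family filter below is an assumption, adjust it to the SKU series you pick:

```bash
# check usage/quota for the DCasv5 confidential VM family in a region
az vm list-usage --location eastus -o table | grep -i dcas
```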

### Architecture

Note: in the current sandbox, we provision only in the `eastus` region by default, to allow for capacity and quick deployment.

*Architecture schema of the sandboxes with confidential compute and vnets.*

### Relevant Documentation

To manually reproduce this full provisioning, see the relevant documentation:

## Configurable sandboxes

This section exposes the generic provisioning scripts we use to create the sandboxes above. Feel free to adapt their parameters to your own needs.

### Using the Azure Portal

| Deploy | Description |
| --- | --- |
| Deploy to Azure | Deploy a sandbox with a vnet and a public IP for the orchestrator, either eyes-on or eyes-off, with or without vnet peering. |
| Deploy to Azure | Deploy a sandbox with a vnet and a public IP for the orchestrator, using confidential computes in AKS, either eyes-on or eyes-off, with or without vnet peering. |

### Using bicep

In this section, we use bicep scripts to automatically provision a set of resources for an FL sandbox.

1. Using the az CLI, log into your Azure subscription:

   ```bash
   az login
   az account set --name <subscription name>
   ```

2. Optional: create a new resource group for the demo resources. A dedicated group makes it easier to clean up afterwards (deleting the resource group deletes all resources within it).

   ```bash
   # create a resource group for the resources
   az group create --name <resource group name> --location <region>
   ```

   Note: if you have the Owner role only in a given resource group (as opposed to the whole subscription), use that resource group instead of creating a new one.

3. Run the bicep deployment script in a resource group you own:

   ```bash
   # deploy the demo resources in your resource group
   az deployment group create --template-file ./mlops/bicep/vnet_publicip_sandbox_setup.bicep --resource-group <resource group name> --parameters demoBaseName="fldemo"
   ```

   Notes:

   - If someone has already provisioned a demo with the same name in your subscription, change the `demoBaseName` parameter to a unique value.
   - By default, only one CPU compute is created for each silo. Set the `compute2` parameter to `true` if you wish to create both CPU and GPU computes for each silo.
   - Some regions don't have enough quota to provision GPU computes. Look at the headers of the bicep script to change the region/`computeSKU`.

4. Alternatively, you can provision the confidential compute sandbox the same way:

   ```bash
   # deploy the demo resources in your resource group
   az deployment group create --template-file ./mlops/bicep/vnet_publicip_sandbox_aks_confcomp_setup.bicep --resource-group <resource group name> --parameters demoBaseName="fldemo"
   ```
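To combine several of the options above in one command, you can override the corresponding template parameters at deployment time; a sketch using parameters mentioned in the notes and tables above (adapt the names to the script you deploy):

```bash
# example: override several sandbox parameters in a single deployment
az deployment group create \
  --template-file ./mlops/bicep/vnet_publicip_sandbox_setup.bicep \
  --resource-group <resource group name> \
  --parameters demoBaseName="fldemo" \
      compute2=true \
      applyVNetPeering=true \
      orchestratorEyesOn=false
```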