Task Hub configuration (#10)
* Various breaking changes (see CHANGELOG.md)
* Upgraded package versions to 0.6.0
* Updated documentation
cgillum authored Mar 15, 2021
1 parent 3025fa0 commit 3df2541
Showing 22 changed files with 425 additions and 153 deletions.
18 changes: 14 additions & 4 deletions CHANGELOG.md
@@ -1,17 +1,27 @@
# Changelog

## Unreleased
## v0.6.0-alpha

### New

* Support for sub-orchestrations (#7) - contributed by @usemam
* Support for sub-orchestrations ([#7](https://github.com/microsoft/durabletask-mssql/pull/7)) - contributed by [@usemam](https://github.com/usemam)
* Support for explicit task hub name configuration
* Added `dt.GlobalSettings` table and `dt.SetGlobalSetting` stored procedure
* Added new permissions.sql setup script for setting up database permissions
* Added task hub documentation page

### Breaking changes

* Renamed `SqlProviderOptions` to `SqlOrchestrationServiceSettings` and added required constructor parameters
* User-based multitenancy is now disabled by default
* The `dt_runtime` role is now granted access to only specific stored procedures rather than all of them

## v0.5.0-alpha

### New

* Added support for .NET Standard 2.0 (DTFx only) (#6)
* Made batch size configurable (#5) - contributed by @usemam
* Added support for .NET Standard 2.0 (DTFx only) ([#6](https://github.com/microsoft/durabletask-mssql/pull/6))
* Made batch size configurable ([#5](https://github.com/microsoft/durabletask-mssql/pull/5)) - contributed by [@usemam](https://github.com/usemam)

### Improved

1 change: 1 addition & 0 deletions docs/architecture.md
@@ -18,6 +18,7 @@ The tables are as follows:
* **dt.NewTasks**: Contains a queue of unprocessed activity tasks for running instances.
* **dt.Versions**: Contains a record of schema versions that have been provisioned in this database.
* **dt.Payloads**: Contains the payload blobs for all instances, events, tasks, and history records.
* **dt.GlobalSettings**: Key-value configuration pairs that control the runtime behavior of the provider.

You can find the current version of the database schema in the `dt.Versions` table. If you create an app using one version of the SQL provider and then later upgrade to a newer version of the provider, the provider will automatically take care of upgrading the database schema, without introducing any downtime.
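
As a minimal sketch, the provisioned schema versions can be inspected by querying this table directly. The column layout of `dt.Versions` is not described here, so the query below selects every column rather than assuming any column names.

```sql
-- List the schema versions that have been provisioned in this database.
-- Column names are not assumed; inspect the result set to see what is recorded.
SELECT * FROM dt.Versions;
```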

6 changes: 2 additions & 4 deletions docs/introduction.md
@@ -1,16 +1,14 @@
# Introduction

The [Durable Task Framework](https://github.com/Azure/durabletask) (DTFx) is a lightweight and portable framework that allows developers to build reliable workflows (orchestrations) using .NET tasks and standard C# async/await syntax. Task orchestrations and their activities are written using standard, imperative code. No DSLs or DAGs.

The Microsoft SQL provider is a backend for DTFx that persists all task hub state in a Microsoft SQL database, which can be hosted in the cloud or in your own infrastructure. This provider includes support for all DTFx features, including orchestrations, activities, and entities, and has full support for [Azure Durable Functions](https://docs.microsoft.com/azure/azure-functions/durable/durable-functions-overview).
The Durable Task SQL Provider is a backend for the [Durable Task Framework](https://github.com/Azure/durabletask) (DTFx) and [Azure Durable Functions](https://docs.microsoft.com/azure/azure-functions/durable/durable-functions-overview) that persists all task hub state in a Microsoft SQL database. It's compatible with [on-premises SQL Server](https://www.microsoft.com/sql-server/), [SQL Server for Docker containers](https://hub.docker.com/_/microsoft-mssql-server), and the cloud-hosted [Azure SQL Database](https://azure.microsoft.com/services/azure-sql/). It includes support for orchestrations, activities, and durable entities.

## Features

The Microsoft SQL provider is just one of [many supported providers for the Durable Task Framework](https://github.com/Azure/durabletask#supported-persistance-stores). Each backend storage provider has its own strengths and weaknesses. We believe that the Microsoft SQL provider has many strengths that make it worth creating and supporting.

### Portability

Microsoft SQL Server is an industry leading database server available as a managed service or as a standalone installation and is supported by the leading cloud providers ([Azure SQL](https://azure.microsoft.com/services/azure-sql/), [SQL Server on AWS](https://aws.amazon.com/sql/), [Google Cloud SQL](https://cloud.google.com/sql/), etc.). It also is supported on multiple OS platforms, like [Windows Server](https://www.microsoft.com/sql-server/), [Linux containers](https://hub.docker.com/_/microsoft-mssql-server), and more recently on [IoT/Edge](https://azure.microsoft.com/services/sql-edge/) devices. All your orchestration data is contained in a single database that can easily be exported from one host to another, so there is no need to worry about having your data locked to a particular vendor.
Microsoft SQL Server is an industry-leading database server available as a managed service or as a standalone installation and is supported by the leading cloud providers ([Azure SQL](https://azure.microsoft.com/services/azure-sql/), [SQL Server on AWS](https://aws.amazon.com/sql/), [Google Cloud SQL](https://cloud.google.com/sql/), etc.). It is also supported on multiple OS platforms, such as [Windows Server](https://www.microsoft.com/sql-server/), [Linux Docker containers](https://hub.docker.com/_/microsoft-mssql-server), and, more recently, [IoT/Edge](https://azure.microsoft.com/services/sql-edge/) devices. All your orchestration data is contained in a single database that can easily be exported from one host to another, so there is no need to worry about having your data locked to a particular vendor.

### Control

30 changes: 21 additions & 9 deletions docs/multitenancy.md
@@ -1,20 +1,32 @@
# Multitenancy

One of the goals for the Microsoft SQL provider for the Durable Task Framework (DTFx) is to create a foundation for safe [multi-tenant deployments](https://en.wikipedia.org/wiki/Multitenancy). This is especially valuable when your organization has many small apps but prefers to manage only a single backend database. Different apps can connect to this database using different database login credentials. Database administrators will be able to query data across all tenants but individual apps will only have access to their own data.
This article describes the multitenancy features of the Durable Task SQL backend and how to enable them.

## Task hubs
## Overview

A **task hub** is an abstract grouping concept in DTFx and Durable Functions. Orchestrators, activities, and entities can only interact with each other when they belong to the same task hub. This is enforced at runtime by the underlying DTFx storage provider. In the case of the DTFx SQL provider, all stored procedures used by the runtime will only ever access data that belongs to the current task hub.
One of the goals for the Microsoft SQL provider for the Durable Task Framework (DTFx) is to enable [multi-tenant deployments](https://en.wikipedia.org/wiki/Multitenancy) with multiple apps sharing the same database. This is often valuable when your organization has many small apps but prefers to manage only a single backend database. When multitenancy is enabled, different apps connect to a shared database using different database login credentials. Database administrators will be able to query data across all tenants but individual apps will only have access to their own data.

The current task hub is determined by the credentials used to log into the database. For example, if your app connects to a Microsoft SQL database using **dbo** credentials (the default, built-in admin user for most databases), then the name of the connected task hub will be "dbo". It is not necessary to explicitly create or delete task hubs. All orchestrations and entities created under that connection will automatically be associated with the corresponding task hub.
Multitenancy works by isolating each app into a separate [task hub](taskhubs.md). The current task hub is determined by the credentials used to log into the database. For example, if your app connects to a Microsoft SQL database using **dbo** credentials (the default, built-in admin user for most databases), then the name of the connected task hub will be "dbo". Task hubs provide data isolation, ensuring that two users in the same database will not be able to access each other's data.

?> One difference between the Microsoft SQL provider and the Azure Storage provider is that all task hubs in the Microsoft SQL provider share the same tables. In the Azure Storage provider, each task hub is given a completely separate table in Azure Storage (along with isolated queues and blob containers). More importantly, however, is that the SQL provider allows task hubs to be securely isolated from each other. This is not possible with the Azure Storage provider - different tenants would need to be assigned to different storage accounts. The ability for multiple tenants to securely share a SQL databases is therefore much more cost-effective for implementing multitenancy.
?> Task hub isolation in the current version of the SQL provider prevents one tenant from accessing data that belongs to another tenant. However, it doesn't impose any restrictions on data volumes or database CPU usage. If this kind of strict resource isolation is required, then each tenant should instead be separated into their own database.

Each table in the Durable Task schema includes a `TaskHub` column that indicates the name of the tenant that a particular row belongs to. The stored procedures used to access data in the database will always filter data using the current task hub context. This ensures that each credential can only access data that is part of the same task hub. The task hub is also the first component in all primary keys within the database and is thus part of the identity of all instances.
## Enabling multitenancy

## Getting started
If you want to have multiple apps share a database (multitenancy) but want to ensure no app can access any data owned by another app, then you can configure a task hub via database login credentials. In this model, database administrators provide individual app owners with SQL credentials known only to them, and each credential maps to an isolated task hub within the database. When using this model, you do not configure a task hub name in code or configuration. Instead, the SQL login username is used as the task hub name.

To enable multitenancy, each tenant must be given its own login and user ID for the target database. To ensure that each tenant can only access its own data, you should add each user to the `dt_runtime` role that is created automatically by the setup scripts using the following T-SQL syntax.
Multitenancy is disabled by default. To enable multitenancy, a database administrator must set `TaskHubMode` to `1` in the `dt.GlobalSettings` table. This can be done using the `dt.SetGlobalSetting` stored procedure.

```sql
EXECUTE dt.SetGlobalSetting @Name='TaskHubMode', @Value=1
```

The value `1` instructs all runtime stored procedures to infer the current task hub from the [`USER_NAME()`](https://docs.microsoft.com/sql/t-sql/functions/user-name-transact-sql) function of SQL Server. Multitenancy can be disabled by setting `TaskHubMode` to `0`.
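
The two operations described above can be sketched as follows. This assumes the statements are run by a database administrator and that `dt.SetGlobalSetting` accepts the value `0` in the same way it accepts `1`. The `InferredTaskHubName` column alias is only an illustrative name. Switching modes changes which task hub subsequent logins see, so it should not be done while apps are actively running.

```sql
-- Preview the name that would be inferred as the task hub
-- for the current connection when TaskHubMode is 1.
SELECT USER_NAME() AS InferredTaskHubName;

-- Switch back to the default mode (multitenancy disabled).
EXECUTE dt.SetGlobalSetting @Name='TaskHubMode', @Value=0
```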

!> Enabling or disabling multitenancy may result in subsequent logins using a different task hub name. Any orchestrations or entities created using a previous task hub name will not be visible to an app that switches to a new task hub name. Switching between task hub modes must therefore be done with careful planning and should not be done while apps are actively running.

## Managing user credentials

Once multitenancy is enabled, each tenant must be given its own login and user ID for the target database. To ensure that each tenant can only access its own data, you should add each user to the `dt_runtime` role that is created automatically by the database setup scripts.

The following SQL statements illustrate how this can be done for a SQL database that supports username/password authentication.

@@ -34,4 +46,4 @@ GO

Each tenant should then use a SQL connection string with the above login credentials for their assigned user account. See [this SQL Server documentation](https://docs.microsoft.com/sql/relational-databases/security/authentication-access/create-a-database-user) for more information about how to create and manage database users.

!> Task hub names are limited to 50 characters. Database username lengths must therefore not exceed 50 characters.
?> Task hub names are limited to 50 characters. When multitenancy is enabled, the username is used as the task hub name. If the username exceeds 50 characters, the task hub name value used in the database will be a truncated version of the username followed by an MD5 hash of the full username.
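
To see in advance which usernames would be affected by this truncation, a database administrator could run a query along the following lines. This is a sketch that relies only on the standard `sys.database_principals` catalog view; the `NameLength` alias is illustrative, and the result also includes roles and other principals, not just users.

```sql
-- Find database principals whose names exceed the 50-character task hub limit.
-- These names would be truncated and suffixed with a hash when used as task hub names.
SELECT name, type_desc, LEN(name) AS NameLength
FROM sys.database_principals
WHERE LEN(name) > 50;
```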
1 change: 1 addition & 0 deletions docs/sidebar.md
@@ -1,4 +1,5 @@
* [Introduction](introduction.md "Durable Task SQL Provider")
* [Getting started](quickstart.md)
* [Architecture](architecture.md)
* [Task Hubs](taskhubs.md)
* [Multitenancy](multitenancy.md)
50 changes: 50 additions & 0 deletions docs/taskhubs.md
@@ -0,0 +1,50 @@
# Task Hubs

This article describes what task hubs are and how they can be configured.

## Overview

A **task hub** is a logical grouping concept in both the Durable Task Framework (DTFx) and Durable Functions. Orchestrators, activities, and entities all belong to a single task hub and can only interact directly with other orchestrations, activities, and entities that are defined in the same task hub. In the SQL provider, a single database can contain multiple task hubs. Task hub data isolation is enforced at runtime by the underlying DTFx storage provider and its SQL stored procedures. In the case of the DTFx SQL provider, all stored procedures used by the runtime will only ever access data that belongs to the current task hub.

Task hubs are also the primary unit of isolation within a database. Each table in the Durable Task schema includes a `TaskHub` column as part of its primary key and stored procedures will only access data that belongs to the current _task hub context_. This isolation serves two primary purposes: supporting side-by-side deployments of different application versions and [enabling multitenancy](multitenancy.md), as explained in other articles.

?> One difference between the Microsoft SQL provider and the Azure Storage provider is that all task hubs in the Microsoft SQL provider share the same tables. In the Azure Storage provider, each task hub is given a completely separate table in Azure Storage (along with isolated queues and blob containers). More importantly, the SQL provider allows task hubs to be securely isolated from each other. This is not possible with the Azure Storage provider, where different tenants would need to be assigned to different storage accounts. The ability for multiple tenants to securely share a single SQL database therefore makes implementing multitenancy much more cost-effective.

## Configuring task hub names

Task hubs can be configured explicitly in the SQL provider configuration or can be inferred from details of the SQL connection string. For self-hosted DTFx apps, you can configure the task hub directly in the `SqlProviderOptions` class.

```csharp
var options = new SqlProviderOptions
{
    TaskHub = "MyTaskHub",
    ConnectionString = Environment.GetEnvironmentVariable("SQLDB_Connection"),
};
```

For Durable Functions apps, the task hub name can be configured in the `extensions/durableTask/hubName` property of the **host.json** file.

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "MyTaskHub",
      "storageProvider": {
        "type": "MicrosoftSQL",
        "connectionStringName": "SQLDB_Connection"
      }
    }
  }
}
```

Task hub names can alternatively be inferred from database user credentials. For more information, see [Multitenancy](multitenancy.md).

?> Task hub names are limited to 50 characters. If the specified task hub name exceeds 50 characters, the configured task hub name will be truncated and suffixed with an MD5 hash of the full task hub name to keep it within 50 characters.

## Case sensitivity

Whether task hub names are case-sensitive depends on the collation of the SQL database. For example, if a [binary collation](https://docs.microsoft.com/sql/relational-databases/collations/collation-and-unicode-support#Binary-collations) is configured on the database, task hub names will be case-sensitive. Non-binary collations may result in case-insensitive string comparisons, making task hub names effectively case-insensitive. For more information on SQL database collations, see [Collation and Unicode support](https://docs.microsoft.com/sql/relational-databases/collations/collation-and-unicode-support) in the Microsoft SQL documentation.

?> The preferred database collation for the Durable Task SQL provider is `Latin1_General_100_BIN2_UTF8`, which is a binary collation.
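
To check which collation a database is currently using, and therefore whether task hub names will be compared case-sensitively, a query like the following can be used. It is a sketch built on the standard `DATABASEPROPERTYEX` and `DB_NAME` functions; the `DatabaseCollation` alias is illustrative.

```sql
-- Show the collation of the current database.
-- A name containing BIN or BIN2 indicates a binary (case-sensitive) collation.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;
```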
44 changes: 43 additions & 1 deletion src/DurableTask.SqlServer.AzureFunctions/SqlDurabilityOptions.cs
@@ -4,19 +4,61 @@
namespace DurableTask.SqlServer.AzureFunctions
{
    using System;
    using Microsoft.Azure.WebJobs.Extensions.DurableTask;
    using Microsoft.Data.SqlClient;
    using Microsoft.Extensions.Logging;
    using Microsoft.Extensions.Logging.Abstractions;
    using Newtonsoft.Json;

    public class SqlDurabilityOptions
    {
        [JsonProperty("connectionStringName")]
        public string ConnectionStringName { get; set; } = "SQLDB_Connection";

        [JsonProperty("taskHubName")]
        public string TaskHubName { get; set; } = "default";

        [JsonProperty("taskEventLockTimeout")]
        public TimeSpan TaskEventLockTimeout { get; set; } = TimeSpan.FromMinutes(2);

        [JsonProperty("taskEventBatchSize")]
        public int TaskEventBatchSize { get; set; } = 10;

        internal SqlProviderOptions ProviderOptions { get; set; } = new SqlProviderOptions();
        internal ILoggerFactory LoggerFactory { get; set; } = NullLoggerFactory.Instance;

        internal SqlOrchestrationServiceSettings GetOrchestrationServiceSettings(
            IConnectionStringResolver connectionStringResolver)
        {
            if (connectionStringResolver == null)
            {
                throw new ArgumentNullException(nameof(connectionStringResolver));
            }

            string? connectionString = connectionStringResolver.Resolve(this.ConnectionStringName);
            if (string.IsNullOrEmpty(connectionString))
            {
                throw new InvalidOperationException(
                    $"No SQL connection string configuration was found for the app setting or environment variable named '{this.ConnectionStringName}'.");
            }

            // Validate the connection string
            try
            {
                new SqlConnectionStringBuilder(connectionString);
            }
            catch (ArgumentException e)
            {
                throw new ArgumentException("The provided connection string is invalid.", e);
            }

            var settings = new SqlOrchestrationServiceSettings(connectionString, this.TaskHubName)
            {
                LoggerFactory = this.LoggerFactory,
                WorkItemLockTimeout = this.TaskEventLockTimeout,
                WorkItemBatchSize = this.TaskEventBatchSize,
            };

            return settings;
        }
    }
}