From b77253c82703ba251ae882dbb3073aa2d0b10cba Mon Sep 17 00:00:00 2001 From: Sylvie Date: Tue, 3 Sep 2024 15:49:51 -0500 Subject: [PATCH 1/2] deployment slots ADR Co-Authored-By: jherrflexion <118225331+jherrflexion@users.noreply.github.com> Co-Authored-By: Samuel Aquino --- adr/023-deployment-slots.md | 40 +++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 adr/023-deployment-slots.md diff --git a/adr/023-deployment-slots.md b/adr/023-deployment-slots.md new file mode 100644 index 000000000..8d084452e --- /dev/null +++ b/adr/023-deployment-slots.md @@ -0,0 +1,40 @@ +# 23. Deployment Slots for Zero Downtime Deploys + +Date: 2024-09-03 + +## Decision +We will use Azure Web App Deployment Slots to facilitate zero-downtime deploys of the TI app + +## Status + +Accepted. + +## Context +Because TI is driven by web traffic from ReportStream, we can receive http calls at any time. +If TI fails to respond, ReportStream will have to try sending the data again later, causing delays. +By implementing zero-downtime deploys, our service can remain available to any incoming calls. + +## Impact +### Positive +- **Zero-downtime deploys**: Zero-downtime deploys keep us from dropping incoming calls during deployment. +- **Easy rollback**: Deployment slots make it easy to roll back to the previous version of the +app if we find errors after deploy. +- **Consistency**: Deployment Slots are an Azure feature specifically designed to enable +zero-down time deployment. We use deployment slots in all TI environments and +in the SFTP Ingestion Service. + +### Negative +- **Incomplete support for Linux**: The auto-swap feature is not available for Linux-based web apps like ours. + so we had to include an explicit swapping step in our updated deployment process. +- **Opaque responses from `az webapp deployment slot swap` CLI**: When there are issues swapping slots, the CLI doesn't +return any details about the issue. The swapping operation can also take as much as 20 minutes +to time out if there's a silent failure, which slows down deploy and validation. +- **Steep learning curve**: Most of the official docs and unofficial resources +(such as blogs and tutorials) for deployment slots are written for people using Windows +servers and Microsoft-published programing languages. This lack of support for other platforms +and languages means a lot more trial and error is involved. + +### Risks +- Because of the incomplete support for and documentation of our usecase, we may not have +chosen the optimal implementation of this feature. It may also be time-consuming to +troubleshoot if we run into future issues. From 6fa2322ce36de752b09df829eea82d4988379283 Mon Sep 17 00:00:00 2001 From: Sylvie Date: Wed, 4 Sep 2024 12:37:00 -0500 Subject: [PATCH 2/2] Add more info about decision process Co-Authored-By: jcrichlake <145698165+jcrichlake@users.noreply.github.com> --- adr/023-deployment-slots.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/adr/023-deployment-slots.md b/adr/023-deployment-slots.md index 8d084452e..b715afd29 100644 --- a/adr/023-deployment-slots.md +++ b/adr/023-deployment-slots.md @@ -14,6 +14,12 @@ Because TI is driven by web traffic from ReportStream, we can receive http calls If TI fails to respond, ReportStream will have to try sending the data again later, causing delays. By implementing zero-downtime deploys, our service can remain available to any incoming calls. +Even though there are some significant downsides to Deployment Slots, they're Azure's recommended +approach to zero-downtime deploys (ZDD), and they're lower effort and lower risk than the alternatives. +Other options to achieve ZDD are Kubernetes (significantly more complexity and effort), creating +our own custom deploy system (significantly more complexity, effort, and risk), or switching to +a cloud service provider that makes this easier, like AWS (not currently in scope as an option). + ## Impact ### Positive - **Zero-downtime deploys**: Zero-downtime deploys keep us from dropping incoming calls during deployment.