Test Azure outage alert #1459

jherrflexion · 2024-10-18T19:10:06Z

Add a PR title

Describe what changed in this PR at a high level.

Issue

Add a link to the issue here. Consider using closing keywords if this PR isn't for a story (stories will be closed through different means).

Checklist

I have added tests to cover my changes
I have added logging where useful (with appropriate log level)
I have added JavaDocs where required
I have updated the documentation accordingly

Note: You may remove items that are not applicable

Co-Authored-By: Samuel Aquino <[email protected]>

github-actions · 2024-10-18T19:10:52Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review
Configuration Consistency: The new alert configuration might not be consistent with existing patterns or best practices in terms of resource naming and lifecycle management.

github-actions · 2024-10-18T19:10:53Z

operations/template/alert.tf

@@ -28,6 +28,48 @@ resource "azurerm_monitor_action_group" "notify_slack_email" {
  }
 }

+resource "azurerm_monitor_activity_log_alert" "azure_service_health_alert" {
+  count               = local.non_pr_environment ? 1 : 0
+  name                = "cdcti-${var.environment}-azure-status-alert"


Consider using a variable for the alert name prefix instead of hardcoding 'cdcti-' to maintain consistency and configurability across different environments. [important]
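One way the prefix suggestion could be applied is sketched below. The variable `alert_name_prefix` and the local `azure_status_alert_name` are hypothetical names introduced for illustration; only `var.environment` and the `-azure-status-alert` suffix come from the diff above.

```hcl
# Hypothetical variable (not part of this PR): makes the name prefix configurable
# while keeping the current value as the default.
variable "alert_name_prefix" {
  type        = string
  description = "Prefix applied to alert resource names"
  default     = "cdcti"
}

locals {
  # Renders the same name as the hardcoded string when the default prefix is used.
  azure_status_alert_name = "${var.alert_name_prefix}-${var.environment}-azure-status-alert"
}
```

The resource would then set `name = local.azure_status_alert_name`. Whether the extra variable is worth it depends on whether any other prefix is ever expected in this repository.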

github-actions · 2024-10-18T19:10:53Z

operations/template/alert.tf

@@ -28,6 +28,48 @@
  }
 }

+resource "azurerm_monitor_activity_log_alert" "azure_service_health_alert" {


It's recommended to add a tag to the new resource for better resource management and to align with the existing infrastructure as code practices. [important]
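A minimal sketch of what the tagging suggestion might look like, assuming the team wants explicit tags on the alert. The local `alert_tags` and both tag keys are hypothetical; the PR does not show which tags, if any, the project sets in Terraform.

```hcl
locals {
  # Hypothetical tag set for the service health alert; keys are illustrative only.
  alert_tags = {
    environment = var.environment
    purpose     = "azure-service-health"
  }
}
```

The `azurerm_monitor_activity_log_alert` resource accepts a `tags` argument, so the resource could reference this with `tags = local.alert_tags`. The tag keys listed under `ignore_changes` elsewhere in this diff appear to be managed outside Terraform, so they would not be set here.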

github-actions · 2024-10-18T19:10:53Z

operations/template/alert.tf

+      tags["security_steward"],
+      tags["support_group"],
+      tags["system"],
+      tags["technical_steward"],


Ensure that the 'ignore_changes' lifecycle policy is reviewed to confirm that it aligns with the operational requirements and does not inadvertently ignore important changes that should trigger updates. [medium]
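For context, the lines quoted above sit inside a `lifecycle` block of roughly the following shape. This is a sketch built only from the four keys visible in the excerpt; the excerpt starts mid-list, so the actual list in the PR may contain further entries, and the other resource arguments are omitted.

```hcl
resource "azurerm_monitor_activity_log_alert" "azure_service_health_alert" {
  # Other arguments from the PR are omitted in this sketch.

  lifecycle {
    # Terraform skips drift on these specific tag keys only;
    # changes to any other tag or argument still show up in the plan.
    ignore_changes = [
      tags["security_steward"],
      tags["support_group"],
      tags["system"],
      tags["technical_steward"],
    ]
  }
}
```

Because each entry names a single map key rather than `tags` as a whole, the scope of what is ignored stays narrow, which is the property the review comment asks to confirm.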

github-actions · 2024-10-18T19:11:17Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Enhancement (Score: 5)
Suggestion: Replace the wildcard in the `services` field with specific service categories to enhance alert specificity
Consider specifying more granular service categories in the `service_health` block instead of using a wildcard. This can improve the alert's accuracy and relevance.
operations/template/alert.tf [44]
-services = ["*"]
+services = ["Compute", "Storage"] # Specify relevant services explicitly
Suggestion importance[1-10]: 5
Why: Suggesting more specific service categories could indeed improve the alert's specificity and relevance. However, the use of a wildcard might be intentional to cover all services, so this suggestion is context-dependent.

Category: Possible issue (Score: 4)
Suggestion: Adjust the count condition to correctly enable or disable the alert based on the environment
Ensure that the `count` condition for the resource `azurerm_monitor_activity_log_alert` is correctly set to enable or disable the alert based on the environment. If the intention is to disable the alert in production environments, consider revising the condition or adding a comment for clarity.
operations/template/alert.tf [32]
-count = local.non_pr_environment ? 1 : 0
+count = local.non_pr_environment ? 0 : 1 # Assuming the alert should be disabled in production
Suggestion importance[1-10]: 4
Why: The suggestion to adjust the 'count' condition is valid but assumes the opposite intention of the existing code without clear evidence. The existing code disables the alert in production environments, which might be intentional.

Category: Possible issue (Score: 3)
Suggestion: Ensure the `scopes` field accurately targets all relevant Azure resources
Verify the `scopes` field to ensure that it correctly targets the intended Azure resources for monitoring. If the scope is intended to include more than the container registry, additional resources should be added to the array.
operations/template/alert.tf [36]
-scopes = [azurerm_container_registry.registry.id]
+scopes = [azurerm_container_registry.registry.id, additional_resource.id] # Add other required resources
Suggestion importance[1-10]: 3
Why: The suggestion to verify the `scopes` field is relevant, but the improved code assumes additional resources need to be monitored without specific evidence from the PR context.

Category: Best practice (Score: 2)
Suggestion: Add validation for `service_health_locations` to ensure they are valid Azure regions
Add validation to the `service_health_locations` variable to ensure that the provided values are valid Azure regions. This can prevent configuration errors.
operations/template/alert.tf [42]
-locations = var.service_health_locations
+locations = var.service_health_locations # Ensure these are valid Azure regions
Suggestion importance[1-10]: 2
Why: The suggestion to add validation is a good practice, but the improved code does not actually implement any validation mechanism; it only adds a comment which does not enforce any checks.
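The last suggestion only adds a comment. An actual check could be expressed with a Terraform `validation` block, sketched below. The variable's type, description, and the allowed region list are assumptions for illustration; the PR does not show the variable's declaration or which regions the project monitors.

```hcl
variable "service_health_locations" {
  # Assumed type; the declaration is not shown in the PR.
  type        = list(string)
  description = "Azure locations to monitor for service health events"

  validation {
    # Illustrative allow-list only; replace with the regions the project actually uses.
    condition = alltrue([
      for loc in var.service_health_locations :
      contains(["Global", "East US", "East US 2", "West US"], loc)
    ])
    error_message = "Each entry in service_health_locations must be one of the allowed Azure locations."
  }
}
```

With this in place, `terraform plan` fails early with the error message when a value falls outside the list, instead of the problem surfacing at apply time.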

sonarqubecloud · 2024-10-18T19:12:34Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

jherrflexion and others added 5 commits October 18, 2024 09:33

WIP Azure Outage Alert

a629617

Co-Authored-By: Samuel Aquino <[email protected]>

Attempt action_group_id fix

07fbaf3

Co-Authored-By: Samuel Aquino <[email protected]>

Removed unnecessary email_subject

e4ec0a4

Refactoring location

71c1056

Remove temp change

1741c97

jherrflexion had a problem deploying to pr October 18, 2024 19:10 — with GitHub Actions Failure

github-actions bot reviewed Oct 18, 2024


jherrflexion had a problem deploying to pr October 18, 2024 19:11 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 19:54 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 20:46 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 21:49 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 21, 2024 14:28 — with GitHub Actions Failure

jherrflexion closed this Oct 21, 2024

jherrflexion had a problem deploying to pr October 21, 2024 14:45 — with GitHub Actions Failure

jherrflexion temporarily deployed to pr October 21, 2024 14:46 — with GitHub Actions Inactive

jherrflexion had a problem deploying to pr October 21, 2024 16:04 — with GitHub Actions Failure

jherrflexion deleted the azure-outage-alert branch November 4, 2024 21:02
