WIP Azure Outage Alert #1455

jherrflexion · 2024-10-18T16:16:13Z

Add a PR title

Describe what changed in this PR at a high level.

Issue

Add a link to the issue here. Consider using closing keywords if this PR isn't for a story (stories will be closed through different means).

Checklist

I have added tests to cover my changes
I have added logging where useful (with appropriate log level)
I have added JavaDocs where required
I have updated the documentation accordingly

Note: You may remove items that are not applicable

Co-Authored-By: Samuel Aquino <[email protected]>

github-actions · 2024-10-18T16:17:01Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Configuration Logic: The condition for creating the 'azurerm_monitor_activity_log_alert' resource is based on 'local.non_pr_environment'. This logic should be reviewed to ensure it aligns with the intended environments for deployment.

Hardcoded Values: The alert configuration contains hardcoded values for locations and services which might not be suitable for all deployment scenarios. Consider making these values configurable.
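For reference, a minimal sketch of how a resource is usually gated on a boolean local with `count`. The PR does not show how `local.non_pr_environment` is defined, so the `locals` expression, the `environment` variable, and the resource label `outage` below are assumptions for illustration only.

```hcl
# Hypothetical input; the real template may derive the environment differently.
variable "environment" {
  description = "Deployment environment name, e.g. \"pr\", \"staging\", \"prod\""
  type        = string
}

locals {
  # Assumed definition: true everywhere except the ephemeral "pr" environment.
  non_pr_environment = var.environment != "pr"
}

resource "azurerm_monitor_activity_log_alert" "outage" {
  # Create one alert in non-PR environments and none in PR environments.
  count = local.non_pr_environment ? 1 : 0

  # Required arguments (name, resource_group_name, scopes, criteria)
  # are omitted from this fragment.
}
```

With `count` in place, any other reference to the alert needs an index, for example `azurerm_monitor_activity_log_alert.outage[0].id`, and that reference is only valid in environments where the resource exists.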

github-actions · 2024-10-18T16:17:02Z

operations/template/alert.tf

+    category = "ServiceHealth"
+    levels   = ["Error"]
+    service_health {
+      locations = ["East US", "Global"]


Consider parameterizing the 'locations' and 'services' fields in the service_health criteria to enhance flexibility and maintainability of the alert configuration. [important]
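One way the parameterization could look, as a sketch rather than the PR's actual code. The variable names and the resource label are hypothetical; the default locations are the values from the diff above, and the `services` default is left as `null` because the PR excerpt does not show which services were listed.

```hcl
# Hypothetical variable names; defaults mirror the hardcoded values in the diff.
variable "service_health_locations" {
  description = "Azure locations whose Service Health events should raise the alert"
  type        = list(string)
  default     = ["East US", "Global"]
}

variable "service_health_services" {
  description = "Azure services to watch; null leaves the argument unset"
  type        = list(string)
  default     = null
}

resource "azurerm_monitor_activity_log_alert" "outage" {
  name                = "azure-outage-alert" # hypothetical name
  resource_group_name = data.azurerm_resource_group.group.name
  scopes              = [azurerm_container_registry.registry.id] # value currently in the PR

  # Some azurerm provider versions also require a `location` argument here;
  # check the provider version in use. The action block is omitted because
  # the excerpt does not show it.

  criteria {
    category = "ServiceHealth"
    levels   = ["Error"]

    service_health {
      locations = var.service_health_locations
      services  = var.service_health_services
    }
  }
}
```

Keeping the current values as defaults means existing environments behave the same until a caller overrides them.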

github-actions · 2024-10-18T16:17:02Z

operations/template/alert.tf

+
+  lifecycle {
+    ignore_changes = [
+      tags["business_steward"],


Review the necessity of ignoring so many tags in the lifecycle configuration. This could potentially lead to overlooking important changes in these tags. [important]

github-actions · 2024-10-18T16:17:34Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Possible issue
Suggestion: Adjust the scope to target the correct Azure resources for monitoring
Ensure that the `scopes` field in the `azurerm_monitor_activity_log_alert` resource is correctly set to target the intended Azure resources. Currently, it is set to use the ID of `azurerm_container_registry.registry`, which might not be relevant for monitoring Azure service health.
operations/template/alert.tf [36]
-scopes = [azurerm_container_registry.registry.id]
+scopes = [data.azurerm_resource_group.group.id]
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies a potential misconfiguration in the 'scopes' field, which could lead to monitoring the wrong resources. Adjusting this to the correct resource group ID as suggested could significantly improve the relevance of the monitoring setup.
Score: 7

Category: Best practice
Suggestion: Reevaluate lifecycle ignore changes to maintain compliance and security oversight
Review the `ignore_changes` lifecycle configuration to ensure it aligns with the operational requirements. Ignoring changes to critical tags like "security_compliance" and "pii_data" might lead to oversight in compliance tracking.
operations/template/alert.tf [57-69]
 ignore_changes = [
   tags["business_steward"],
   ...
-  tags["pii_data"],
-  tags["security_compliance"],
+  tags["support_group"],
+  tags["system"],
   ...
 ]
Suggestion importance[1-10]: 6
Why: The suggestion to review the 'ignore_changes' configuration is valid as ignoring critical tags like "security_compliance" and "pii_data" could lead to compliance issues. Adjusting this configuration could enhance the security and compliance monitoring of the infrastructure.
Score: 6

Category: Enhancement
Suggestion: Expand the severity levels monitored to enhance alert coverage
Consider adding more severity levels in the `levels` field of the `criteria` block to ensure comprehensive monitoring. Currently, only "Error" level is monitored.
operations/template/alert.tf [40]
-levels = ["Error"]
+levels = ["Error", "Critical", "Warning"]
Suggestion importance[1-10]: 5
Why: Adding more severity levels to the monitoring criteria can help in capturing a broader range of issues, thus enhancing the alert system's effectiveness. However, the necessity of this change depends on the specific monitoring needs and might not be critical.
Score: 5

Category: Enhancement
Suggestion: Include additional event types in the alert criteria to improve monitoring effectiveness
Verify and potentially expand the `events` list under `service_health` to include other relevant event types like "Maintenance" alongside "Incident" to ensure all pertinent service health issues are captured.
operations/template/alert.tf [43]
-events = ["Incident"]
+events = ["Incident", "Maintenance"]
Suggestion importance[1-10]: 5
Why: Including more event types such as "Maintenance" alongside "Incident" could provide a more comprehensive monitoring of service health. This suggestion is beneficial for capturing a wider range of service health issues.
Score: 5
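To make the suggestions easier to weigh together, here is a sketch of what the alert might look like with the scope, severity, and event-type suggestions all applied. Every value comes from the bot's proposed diffs; the resource label `outage` is hypothetical, the remaining arguments are not shown in the excerpt, and none of this has been validated against a real deployment.

```hcl
resource "azurerm_monitor_activity_log_alert" "outage" {
  # name, resource_group_name, the action block, and the lifecycle block
  # are omitted from this fragment.

  # Suggested scope (was azurerm_container_registry.registry.id).
  scopes = [data.azurerm_resource_group.group.id]

  criteria {
    category = "ServiceHealth"

    # Suggested: widen from ["Error"] to also cover Critical and Warning.
    levels = ["Error", "Critical", "Warning"]

    service_health {
      # Suggested: add planned maintenance alongside incidents.
      events    = ["Incident", "Maintenance"]
      locations = ["East US", "Global"]
    }
  }
}
```

Whether a resource group is an appropriate scope for Service Health events, as opposed to the subscription, is worth confirming in the Azure documentation before adopting the first suggestion.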

Co-Authored-By: Samuel Aquino <[email protected]>

sonarqubecloud · 2024-10-18T16:54:25Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

WIP Azure Outage Alert

a629617

Co-Authored-By: Samuel Aquino <[email protected]>

jherrflexion had a problem deploying to pr October 18, 2024 16:16 — with GitHub Actions Failure

github-actions bot reviewed Oct 18, 2024

Attempt action_group_id fix

07fbaf3

Co-Authored-By: Samuel Aquino <[email protected]>

jherrflexion had a problem deploying to pr October 18, 2024 16:20 — with GitHub Actions Failure

Removed unnecessary email_subject

e4ec0a4

jherrflexion had a problem deploying to pr October 18, 2024 16:22 — with GitHub Actions Failure

Refactoring location

71c1056

jherrflexion had a problem deploying to pr October 18, 2024 16:41 — with GitHub Actions Failure

Remove temp change

1741c97

jherrflexion had a problem deploying to pr October 18, 2024 16:52 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 18:32 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 19:04 — with GitHub Actions Failure

jherrflexion closed this Oct 18, 2024

jherrflexion temporarily deployed to pr October 18, 2024 19:08 — with GitHub Actions Inactive

jherrflexion temporarily deployed to pr October 18, 2024 19:09 — with GitHub Actions Inactive

jherrflexion had a problem deploying to pr October 18, 2024 19:10 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 19:11 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 19:54 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 20:46 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 18, 2024 21:49 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 21, 2024 14:28 — with GitHub Actions Failure

jherrflexion had a problem deploying to pr October 21, 2024 14:45 — with GitHub Actions Failure

jherrflexion temporarily deployed to pr October 21, 2024 14:46 — with GitHub Actions Inactive

jherrflexion had a problem deploying to pr October 21, 2024 16:04 — with GitHub Actions Failure

jherrflexion deleted the azure-outage-alert branch November 4, 2024 21:02
