[Draft] CSPL-3354: Add Lifecycle Hooks and Configurable Termination Grace Period to Splunk Operator #1424
Overview
This Pull Request introduces enhancements to the Splunk Operator by integrating Lifecycle Hooks and allowing customers to configure the Termination Grace Period via the Custom Resource (`Common Spec`). These changes aim to ensure graceful shutdowns of Splunk pods, thereby maintaining data integrity and improving the reliability of Splunk deployments on Kubernetes.
Problem Statement
Customers running Splunk on Kubernetes have reported issues related to abrupt pod terminations, especially during node recycling or maintenance operations. Without proper shutdown procedures, Splunk instances may not decommission gracefully, leading to potential data loss and increased operational churn. Additionally, the lack of configurable grace periods limits customers' ability to tailor shutdown behaviors to their specific environments and requirements.
Proposed Solution
Integrate Lifecycle Hooks:
`preStop` Hook: Executes the `splunk offline` and `splunk stop` commands before the pod is terminated. This ensures that Splunk instances decommission gracefully, preventing data corruption and loss.
Configurable Termination Grace Period:
Extend the `Common Spec` of the Splunk Operator’s Custom Resource to allow customers to specify `terminationGracePeriodSeconds`.
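As a rough illustration of the two pieces above, the following Python sketch builds the affected part of a pod spec as a plain dictionary. The field names (`terminationGracePeriodSeconds`, `lifecycle`, `preStop`, `exec`, `command`) are standard Kubernetes pod-spec fields. The helper name `build_pod_spec_fragment`, the container name, the `/opt/splunk/bin/splunk` path, and the way the two commands are chained are assumptions for illustration and are not taken from this PR.

```python
import json

# Assumed location of the Splunk CLI inside the container (not stated in this PR).
SPLUNK_BIN = "/opt/splunk/bin/splunk"


def build_pod_spec_fragment(grace_period_seconds: int) -> dict:
    """Return the pod-spec fields touched by this change, as a plain dict."""
    # Chain the two commands named in the PR; stop only runs if offline succeeds.
    pre_stop_script = f"{SPLUNK_BIN} offline && {SPLUNK_BIN} stop"
    return {
        # Pod-level field: total time allowed for shutdown before a forced kill.
        "terminationGracePeriodSeconds": grace_period_seconds,
        "containers": [
            {
                "name": "splunk",  # hypothetical container name
                "lifecycle": {
                    # Runs inside the container before it receives SIGTERM.
                    "preStop": {
                        "exec": {"command": ["/bin/sh", "-c", pre_stop_script]}
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # 300 is an arbitrary example value.
    print(json.dumps(build_pod_spec_fragment(300), indent=2))
```

Kubernetes counts the time spent in the `preStop` hook against the grace period, so the configured value needs to cover both commands.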
Changes Made
Custom Resource Definition:
Added `terminationGracePeriodSeconds` under the `commonSpec` section to allow customization.
StatefulSet Template Update:
Added a `lifecycle` section with the `preStop` hook.
Set the `terminationGracePeriodSeconds` value from the `Common Spec`.
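A minimal sketch of how the value could flow from the Custom Resource into the StatefulSet template. It assumes the common spec is available as a dictionary and that an unset field should leave the pod template alone, so that Kubernetes applies its own default of 30 seconds. The function name `apply_grace_period` is hypothetical, and the operator itself is not written in Python.

```python
import copy


def apply_grace_period(common_spec: dict, statefulset: dict) -> dict:
    """Return a copy of the StatefulSet with the grace period from common_spec.

    If the field is absent, the copy is returned unchanged so that the
    Kubernetes default applies.
    """
    result = copy.deepcopy(statefulset)
    value = common_spec.get("terminationGracePeriodSeconds")
    if value is None:
        return result
    # Kubernetes requires a non-negative integer here; bool is excluded explicitly
    # because it is a subclass of int in Python.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("terminationGracePeriodSeconds must be a non-negative integer")
    pod_spec = (
        result.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("spec", {})
    )
    pod_spec["terminationGracePeriodSeconds"] = value
    return result


if __name__ == "__main__":
    sts = {"spec": {"template": {"spec": {"containers": []}}}}
    updated = apply_grace_period({"terminationGracePeriodSeconds": 600}, sts)
    print(updated["spec"]["template"]["spec"])
```

Working on a copy keeps the input manifest unchanged, which makes the mapping straightforward to unit test.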
Benefits
Related Issues
Testing Performed
Unit Tests:
Verified that `terminationGracePeriodSeconds` from the Custom Resource is correctly applied to the StatefulSet.
Verified that the `preStop` lifecycle hook executes the appropriate Splunk commands.
Integration Tests:
Confirmed that the `splunk offline` and `splunk stop` commands were executed before termination.
Tested different `terminationGracePeriodSeconds` values to ensure flexibility and correctness.
Manual Testing:
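The configuration checks listed under Unit Tests could be expressed along the following lines. This is a sketch, not the PR's actual test code: `check_statefulset` is a hypothetical helper that inspects a StatefulSet manifest already loaded as a dictionary (for example from the JSON output of `kubectl get statefulset`) and lists the expectations that fail. It assumes every container in the pod template should carry the hook.

```python
def check_statefulset(manifest: dict, expected_grace_period: int) -> list:
    """Return a list of problems found; an empty list means all checks pass."""
    problems = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})

    actual = pod_spec.get("terminationGracePeriodSeconds")
    if actual != expected_grace_period:
        problems.append(
            f"terminationGracePeriodSeconds is {actual!r}, expected {expected_grace_period}"
        )

    containers = pod_spec.get("containers", [])
    if not containers:
        problems.append("no containers found in the pod template")

    for container in containers:
        command = (
            container.get("lifecycle", {})
            .get("preStop", {})
            .get("exec", {})
            .get("command", [])
        )
        joined = " ".join(command)
        # Both commands named in this PR should appear in the hook.
        for needed in ("splunk offline", "splunk stop"):
            if needed not in joined:
                problems.append(
                    f"container {container.get('name')!r}: preStop does not run '{needed}'"
                )
    return problems
```

This only checks that the hook and grace period are configured. Whether the commands actually ran before termination still has to be confirmed on a live pod.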
Documentation Updates
Operator README:
Documented the `terminationGracePeriodSeconds` field in the Custom Resource.
Configuration Guides:
Added guidance on setting `terminationGracePeriodSeconds` based on different deployment scenarios.
How to Test
Update Custom Resource:
Set `terminationGracePeriodSeconds` in your Splunk Operator Custom Resource.
Deploy or Update Splunk Cluster:
Verify StatefulSet Configuration:
Check that the StatefulSet includes the `preStop` lifecycle hook and the correct `terminationGracePeriodSeconds`.
Simulate Pod Termination:
Delete a pod and confirm that it shuts down through the `preStop` hook.
Future Considerations
Evaluate `splunk decommission` if it provides more comprehensive shutdown procedures compared to `splunk offline` and `splunk stop`.
Allow changes to `terminationGracePeriodSeconds` without requiring full cluster redeployments.
Reviewer Notes
Existing deployments that do not set the `terminationGracePeriodSeconds` field continue to operate with the default grace period.
Pull Request Checklist: