-
Notifications
You must be signed in to change notification settings - Fork 931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[KYUUBI #5795][K8S] Support to cleanup the spark driver pod periodically #5806
Conversation
@@ -147,6 +150,26 @@ class KubernetesApplicationOperation extends ApplicationOperation with Logging { | |||
} | |||
}) | |||
.build() | |||
if (conf.get(KyuubiConf.KUBERNETES_SPARK_CLEANUP_TERMINATED_DRIVE_POD_IMMEDIATELY)) { | |||
expireCleanUpTriggerCacheExecutor = Executors.newSingleThreadScheduledExecutor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ThreadUtils to create the executor
@@ -1241,6 +1241,15 @@ object KyuubiConf { | |||
.stringConf | |||
.createWithDefault(KubernetesCleanupDriverPodStrategy.NONE.toString) | |||
|
|||
val KUBERNETES_SPARK_CLEANUP_TERMINATED_DRIVE_POD_IMMEDIATELY: ConfigEntry[Boolean] = | |||
buildConf("kyuubi.kubernetes.spark.cleanupTerminatedDriverPodImmediately") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for making you back and forth, I think the configuration name is confusing now. how about?
kyuubi.kubernetes.spark.cleanupTerminatedDriverPod.kind=[NONE|ALL|COMPLETED]
kyuubi.kubernetes.spark.cleanupTerminatedDriverPod.checkInterval=[PT1M]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
val keys = cleanupTerminatedAppInfoTrigger.asMap().keySet().asScala | ||
for (key <- keys) { | ||
// do get to trigger cache eviction | ||
cleanupTerminatedAppInfoTrigger.getIfPresent(key) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val keys = cleanupTerminatedAppInfoTrigger.asMap().keySet().asScala | |
for (key <- keys) { | |
// do get to trigger cache eviction | |
cleanupTerminatedAppInfoTrigger.getIfPresent(key) | |
} | |
val keys = cleanupTerminatedAppInfoTrigger.asMap().asScala.foreach { case (key, _) => | |
// do get to trigger cache eviction | |
cleanupTerminatedAppInfoTrigger.getIfPresent(key) | |
} |
expireCleanUpTriggerCacheExecutor = Executors.newSingleThreadScheduledExecutor( | ||
new ThreadFactoryBuilder().setDaemon(true).setNameFormat( | ||
s"pod-clean-up-trigger-thread").build()) | ||
expireCleanUpTriggerCacheExecutor.scheduleAtFixedRate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to use scheduleAtFixedDelay
if possible
I have encountered a performance issue when the task execution time is longer than the scheduling interval, which causes high CPU usage.
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #5806 +/- ##
============================================
- Coverage 61.27% 61.25% -0.03%
Complexity 23 23
============================================
Files 608 608
Lines 36027 36047 +20
Branches 4951 4951
============================================
+ Hits 22076 22080 +4
- Misses 11556 11573 +17
+ Partials 2395 2394 -1 ☔ View full report in Codecov by Sentry. |
Merging to master/1.8 |
# 🔍 Description ## Issue References 🔗 This pull request fixes #5795 ## Describe Your Solution 🔧 Create a single daemon thread to traverse cache map periodically, which will evict expired cache and trigger a pod clean up operation. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [ ] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] New and existing unit tests pass locally with my changes - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [ ] Test coverage is ok - [x] Assignees are selected. - [x] Minimum number of approvals - [x] No changes are requested **Be nice. Be informative.** Closes #5806 from liaoyt/master. Closes #5795 75c2b68 [yeatsliao] cleanup driver pod periodically Authored-by: yeatsliao <[email protected]> Signed-off-by: Cheng Pan <[email protected]> (cherry picked from commit 27ad102) Signed-off-by: Cheng Pan <[email protected]>
I'm using v1.9 but this change seems not working 🤔 |
I'm sorry currently there is not any logs to check if the deamon is still active. |
|
but still doesn't delete pods |
I use the v1.9.1 version and configure the following two parameters kyuubi.kubernetes.spark.cleanupTerminatedDriverPod.kind=ALL |
🔍 Description
Issue References 🔗
This pull request fixes #5795
Describe Your Solution 🔧
Create a single daemon thread to traverse cache map periodically, which will evict expired cache and trigger a pod clean up operation.
Types of changes 🔖
Test Plan 🧪
Behavior Without This Pull Request ⚰️
Behavior With This Pull Request 🎉
Related Unit Tests
Checklists
📝 Author Self Checklist
📝 Committer Pre-Merge Checklist
Be nice. Be informative.