[KYUUBI #5731][K8S] Support to cleanup the spark driver pod with specified cleanup strategy

# 🔍 Description

## Describe Your Solution 🔧

#5714 introduced a feature that lets Kyuubi clean up Spark driver pods automatically, but it cleans up every pod without considering the application's terminated state.
This PR lets users choose which pods are deleted by setting a cleanup strategy.
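
The strategy semantics can be sketched in isolation. The following is a minimal, self-contained Scala sketch, not the code from this patch: `CleanupSketch`, `State`, `Strategy`, and the simplified `isFailed` are hypothetical stand-ins for `ApplicationState` and `KubernetesCleanupDriverPodStrategy`, and the sketch assumes that only `FAILED` counts as a failed state.

```scala
object CleanupSketch {
  // Hypothetical, reduced stand-in for Kyuubi's ApplicationState.
  object State extends Enumeration {
    val FINISHED, FAILED = Value
  }

  // Mirrors the three strategy names introduced by this PR.
  object Strategy extends Enumeration {
    val NONE, ALL, COMPLETED = Value
  }

  // Assumption: only FAILED is treated as a failed terminal state here.
  def isFailed(state: State.Value): Boolean = state == State.FAILED

  // Decide whether the driver pod of a terminated application is deleted.
  def shouldDelete(strategy: Strategy.Value, state: State.Value): Boolean =
    strategy match {
      case Strategy.NONE => false                 // never delete
      case Strategy.ALL => true                   // delete regardless of outcome
      case Strategy.COMPLETED => !isFailed(state) // keep pods of failed apps
    }
}
```

Under these assumptions, setting `kyuubi.kubernetes.spark.cleanupTerminatedDriverPod` to `COMPLETED` keeps the driver pods of failed applications and deletes the others once the retain period expires. Because `Enumeration.withName` throws `NoSuchElementException` for a name that matches no value, a strategy parsed this way has to be spelled exactly `NONE`, `ALL`, or `COMPLETED`.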

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Test locally.

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes #5728 from liaoyt/master.

Closes #5731

d2cc8cb [yeatsliao] regenerate docs
4caf8b1 [yeatsliao] rename conf 'KUBERNETES_SPARK_DELETE_DRIVER_POD_ON_TERMINATION' to 'KUBERNETES_SPARK_CLEANUP_TERMINATED_DRIVER_POD'
4d970fa [yeatsliao] [K8S] Support to cleanup the spark driver pod with specified clean up strategy

Authored-by: yeatsliao <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit dc03687)
Signed-off-by: Cheng Pan <[email protected]>
liaoyt authored and pan3793 committed Nov 20, 2023
1 parent e70bfc5 commit fd4516f
Showing 3 changed files with 24 additions and 11 deletions.
2 changes: 1 addition & 1 deletion docs/configuration/settings.md
@@ -323,7 +323,7 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
 | kyuubi.kubernetes.master.address | &lt;undefined&gt; | The internal Kubernetes master (API server) address to be used for kyuubi. | string | 1.7.0 |
 | kyuubi.kubernetes.namespace | default | The namespace that will be used for running the kyuubi pods and find engines. | string | 1.7.0 |
 | kyuubi.kubernetes.namespace.allow.list || The allowed kubernetes namespace list, if it is empty, there is no kubernetes namespace limitation. | set | 1.8.0 |
-| kyuubi.kubernetes.spark.deleteDriverPodOnTermination.enabled | false | If set to true then Kyuubi server will delete the spark driver pod after the application terminates for kyuubi.kubernetes.terminatedApplicationRetainPeriod. | boolean | 1.8.1 |
+| kyuubi.kubernetes.spark.cleanupTerminatedDriverPod | NONE | Kyuubi server will delete the spark driver pod after the application terminates for kyuubi.kubernetes.terminatedApplicationRetainPeriod. Available options are NONE, ALL, COMPLETED and default value is None which means none of the pod will be deleted | string | 1.8.1 |
 | kyuubi.kubernetes.spark.forciblyRewriteDriverPodName.enabled | false | Whether to forcibly rewrite Spark driver pod name with 'kyuubi-<uuid>-driver'. If disabled, Kyuubi will try to preserve the application name while satisfying K8s' pod name policy, but some vendors may have stricter pod name policies, thus the generated name may become illegal. | boolean | 1.8.1 |
 | kyuubi.kubernetes.spark.forciblyRewriteExecutorPodNamePrefix.enabled | false | Whether to forcibly rewrite Spark executor pod name prefix with 'kyuubi-<uuid>'. If disabled, Kyuubi will try to preserve the application name while satisfying K8s' pod name policy, but some vendors may have stricter Pod name policies, thus the generated name may become illegal. | boolean | 1.8.1 |
 | kyuubi.kubernetes.terminatedApplicationRetainPeriod | PT5M | The period for which the Kyuubi server retains application information after the application terminates. | duration | 1.7.1 |
@@ -1231,13 +1231,20 @@ object KyuubiConf {
       .checkValue(_ > 0, "must be positive number")
       .createWithDefault(Duration.ofMinutes(5).toMillis)
 
-  val KUBERNETES_SPARK_DELETE_DRIVER_POD_ON_TERMINATION_ENABLED: ConfigEntry[Boolean] =
-    buildConf("kyuubi.kubernetes.spark.deleteDriverPodOnTermination.enabled")
-      .doc("If set to true then Kyuubi server will delete the spark driver pod after " +
-        s"the application terminates for ${KUBERNETES_TERMINATED_APPLICATION_RETAIN_PERIOD.key}.")
+  val KUBERNETES_SPARK_CLEANUP_TERMINATED_DRIVER_POD: ConfigEntry[String] =
+    buildConf("kyuubi.kubernetes.spark.cleanupTerminatedDriverPod")
+      .doc("Kyuubi server will delete the spark driver pod after " +
+        s"the application terminates for ${KUBERNETES_TERMINATED_APPLICATION_RETAIN_PERIOD.key}. " +
+        "Available options are NONE, ALL, COMPLETED and " +
+        "default value is None which means none of the pod will be deleted")
       .version("1.8.1")
-      .booleanConf
-      .createWithDefault(false)
+      .stringConf
+      .createWithDefault(KubernetesCleanupDriverPodStrategy.NONE.toString)
+
+  object KubernetesCleanupDriverPodStrategy extends Enumeration {
+    type KubernetesCleanupDriverPodStrategy = Value
+    val NONE, ALL, COMPLETED = Value
+  }
 
   val KUBERNETES_APPLICATION_STATE_CONTAINER: ConfigEntry[String] =
     buildConf("kyuubi.kubernetes.application.state.container")
@@ -30,8 +30,9 @@ import io.fabric8.kubernetes.client.informers.{ResourceEventHandler, SharedIndex
 
 import org.apache.kyuubi.{KyuubiException, Logging, Utils}
 import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.config.KyuubiConf.KubernetesApplicationStateSource
+import org.apache.kyuubi.config.KyuubiConf.{KubernetesApplicationStateSource, KubernetesCleanupDriverPodStrategy}
 import org.apache.kyuubi.config.KyuubiConf.KubernetesApplicationStateSource.KubernetesApplicationStateSource
+import org.apache.kyuubi.config.KyuubiConf.KubernetesCleanupDriverPodStrategy.{ALL, COMPLETED, NONE}
 import org.apache.kyuubi.engine.ApplicationState.{isTerminated, ApplicationState, FAILED, FINISHED, NOT_FOUND, PENDING, RUNNING, UNKNOWN}
 import org.apache.kyuubi.util.KubernetesUtils
 
@@ -107,14 +108,19 @@ class KubernetesApplicationOperation extends ApplicationOperation with Logging {
     submitTimeout = conf.get(KyuubiConf.ENGINE_KUBERNETES_SUBMIT_TIMEOUT)
     // Defer cleaning terminated application information
     val retainPeriod = conf.get(KyuubiConf.KUBERNETES_TERMINATED_APPLICATION_RETAIN_PERIOD)
-    val deleteSparkDriverPodOnTermination =
-      conf.get(KyuubiConf.KUBERNETES_SPARK_DELETE_DRIVER_POD_ON_TERMINATION_ENABLED)
+    val cleanupDriverPodStrategy = KubernetesCleanupDriverPodStrategy.withName(
+      conf.get(KyuubiConf.KUBERNETES_SPARK_CLEANUP_TERMINATED_DRIVER_POD))
     cleanupTerminatedAppInfoTrigger = CacheBuilder.newBuilder()
       .expireAfterWrite(retainPeriod, TimeUnit.MILLISECONDS)
       .removalListener((notification: RemovalNotification[String, ApplicationState]) => {
         Option(appInfoStore.remove(notification.getKey)).foreach { case (kubernetesInfo, removed) =>
           val appLabel = notification.getKey
-          if (deleteSparkDriverPodOnTermination) {
+          val shouldDelete = cleanupDriverPodStrategy match {
+            case NONE => false
+            case ALL => true
+            case COMPLETED => !ApplicationState.isFailed(notification.getValue)
+          }
+          if (shouldDelete) {
             val podName = removed.name
             try {
               val kubernetesClient = getOrCreateKubernetesClient(kubernetesInfo)