
Ensure default Spark executor SA is created in Spark clusters #3921

Merged: 4 commits into master on Jul 17, 2024

Conversation

nemacysts (Member)

This is mostly a disaster-recovery change: this SA already exists in our current clusters, but to avoid any unpleasant surprises should we decide to start these clusters from scratch, it'd be best to ensure that this SA actually gets created :)

(I was talking to @CaptainSame in Slack and he reminded me about this SA)

@@ -97,6 +97,26 @@ def ensure_service_accounts(job_configs: List[TronJobConfig]) -> None:
namespace=spark_tools.SPARK_EXECUTOR_NAMESPACE,
kube_client=spark_kube_client,
)
elif (
Contributor
Do we need this block? action.get_iam_role() always returns an iam_role when action.get_executor() == "spark", so we will never hit this condition.
If the objective is to safeguard against action.get_iam_role() being None, then we could just take the Spark executor service account creation logic out of this if/else block and put it under if action.get_executor() == "spark" and action.get_spark_executor_iam_role().
What do you think?
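A rough sketch of that suggested restructure, for illustration only. It reuses the names visible in the diff context above (ensure_service_account, spark_tools.SPARK_EXECUTOR_NAMESPACE, spark_kube_client); their exact signatures here are assumptions, not the merged code:

```python
# Hypothetical sketch: hoist the executor-SA creation out of the
# if/elif chain into its own guard, so it no longer depends on
# whether action.get_iam_role() happened to be falsy.
if action.get_executor() == "spark" and action.get_spark_executor_iam_role():
    ensure_service_account(
        action.get_spark_executor_iam_role(),
        namespace=spark_tools.SPARK_EXECUTOR_NAMESPACE,
        kube_client=spark_kube_client,
    )
```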

nemacysts (Member, Author)

ah, you're right: this should just be action.get_executor() == "spark" and action.get_spark_executor_iam_role()!

nemacysts (Member, Author)

ah, i see - we can just combine the previous logic with this and call ensure_service_account twice: once with get_iam_role (for the driver) and once with get_spark_executor_iam_role (for the executor)
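A minimal sketch of that combined shape, assuming the helper names from the diff context above (ensure_service_account, spark_tools.SPARK_EXECUTOR_NAMESPACE, spark_kube_client). This illustrates the comment, not the merged implementation, and the namespace choice for the driver SA is an assumption:

```python
if action.get_executor() == "spark":
    # Driver SA: in normal operation get_iam_role() is always truthy
    # for spark actions (see the commit message below).
    if action.get_iam_role():
        ensure_service_account(
            action.get_iam_role(),
            namespace=spark_tools.SPARK_EXECUTOR_NAMESPACE,
            kube_client=spark_kube_client,
        )
    # Executor SA: the default executor role this PR is about.
    if action.get_spark_executor_iam_role():
        ensure_service_account(
            action.get_spark_executor_iam_role(),
            namespace=spark_tools.SPARK_EXECUTOR_NAMESPACE,
            kube_client=spark_kube_client,
        )
```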

The previous iteration of this would never have been hit: in normal operation, action.get_iam_role() will always be truthy for executor=spark.

i am dumb and sameer reminded me about how this all actually works (i think i was getting my wires crossed between how spark works from the batch boxes vs. on k8s :p)
@CaptainSame (Contributor) left a comment:
LGTM


nemacysts merged commit 5d000db into master on Jul 17, 2024