-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[worker] fix kill remote shell task and print exception log #15569 #15628
Conversation
@@ -156,7 +156,6 @@ public void kill(String taskId) throws IOException { | |||
String pid = getTaskPid(taskId); | |||
String killCommand = String.format(COMMAND.KILL_COMMAND, pid); | |||
runRemote(killCommand); | |||
cleanData(taskId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain more why we need to delete cleanData() here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rickchengx Hi, I read the logic again, we can not delete cleanData()
here. I think, The rm
command in cleanData()
should be replaced to rm -f
to avoid the printing of the error log when the file is not existed. what you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rickchengx Hi, I read the logic again, we can not delete
cleanData()
here. I think, Therm
command incleanData()
should be replaced torm -f
to avoid the printing of the error log when the file is not existed. what you think?
I think we should make sure the remote task can be killed successfully. And There can be various reasons for errors when cleaning data, and I think it's acceptable to make users aware of these errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rickchengx Hi, I read the logic again, we can not delete
cleanData()
here. I think, Therm
command incleanData()
should be replaced torm -f
to avoid the printing of the error log when the file is not existed. what you think?I think we should make sure the remote task can be killed successfully. And There can be various reasons for errors when cleaning data, and I think it's acceptable to make users aware of these errors.
sure,I will revert to this logic later。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rickchengx Hi, I read the logic again, we can not delete
cleanData()
here. I think, Therm
command incleanData()
should be replaced torm -f
to avoid the printing of the error log when the file is not existed. what you think?I think we should make sure the remote task can be killed successfully. And There can be various reasons for errors when cleaning data, and I think it's acceptable to make users aware of these errors.
Hi, I reverted cleanData()
logic.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## dev #15628 +/- ##
============================================
- Coverage 38.96% 38.96% -0.01%
+ Complexity 4840 4839 -1
============================================
Files 1316 1316
Lines 45010 45010
Branches 4818 4818
============================================
- Hits 17539 17538 -1
Misses 25575 25575
- Partials 1896 1897 +1 ☔ View full report in Codecov by Sentry. |
Please retry analysis of this Pull-Request directly on SonarCloud |
@@ -90,7 +90,6 @@ public void init() { | |||
taskId = TASK_ID_PREFIX + taskExecutionContext.getTaskInstanceId(); | |||
} | |||
setAppIds(taskId); | |||
taskExecutionContext.setAppIds(taskId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's appropriate to delete the appId directly here. I suggest modifying the logic of ProcessUtils. Currently, it seems that the appId is considered to be only yarn app id, but in fact there are other external application ids, such as remote shell.
cc @Radeity @caishunfeng @SbloodyS
Lines 187 to 193 in 738da1c
if (CollectionUtils.isEmpty(appIds)) { | |
log.info("The appId is empty"); | |
return; | |
} | |
ApplicationManager applicationManager = applicationManagerMap.get(ResourceManagerType.YARN); | |
applicationManager.killApplication(new YarnApplicationManagerContext(executePath, tenantCode, appIds)); | |
} |
This pull request has been automatically marked as stale because it has not had recent activity for 120 days. It will be closed in 7 days if no further activity occurs. |
This pull request has been closed because it has not had recent activity. You could reopen it if you try to continue your work, and anyone who are interested in it are encouraged to continue work on this pull request. |
Purpose of the pull request
This pull request to fix #15569 , not print exception log when kill a remote shell task.
appIds
inTaskExecutionContext
. when we set it in remote-sehll task, an error logic will be triggered inorg.apache.dolphinscheduler.plugin.task.api.utils.ProcessUtils#cancelApplication
cleanData(taskId)
inorg.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor#kill()
. BecausecleanData(taskId)
inorg.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor#getTaskExitCode()
.Brief change log
Verify this pull request
This pull request is code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(or)
If your pull request contain incompatible change, you should also add it to
docs/docs/en/guide/upgrede/incompatible.md