-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
finish implementing task expiration #5934
Conversation
commit a3da49837f2b60e768784c416c511fe1d1b6d720 Merge: e793ca1 c07392d Author: Hilary James Oliver <[email protected]> Date: Tue Jan 16 15:32:58 2024 +1300 Merge branch 'master' into cylc-set-task commit e793ca1 Author: Hilary James Oliver <[email protected]> Date: Tue Dec 19 17:49:33 2023 +1300 working on flow_wait reset ...[skip ci] commit f6a4f33 Author: Hilary James Oliver <[email protected]> Date: Tue Dec 19 15:46:11 2023 +1300 Handle flow-number skipping. commit 2f8536a Merge: 3fddad0 ca8637e Author: Hilary James Oliver <[email protected]> Date: Tue Dec 19 12:12:41 2023 +1300 Merge pull request #35 from oliver-sanders/cylc-set-stuff Cylc set stuff commit ca8637e Author: Hilary James Oliver <[email protected]> Date: Tue Dec 19 12:12:19 2023 +1300 Apply suggestions from code review [skip ci] commit 3fddad0 Author: Hilary James Oliver <[email protected]> Date: Sun Dec 17 13:05:48 2023 +1300 Fix wrong-outputs warning. commit 7c75fbc Author: Oliver Sanders <[email protected]> Date: Fri Dec 15 10:28:35 2023 +0000 tui: support the "cylc set" command with default opts commit cd4a3e3 Author: Oliver Sanders <[email protected]> Date: Fri Dec 15 10:18:49 2023 +0000 set: add note about CLI completion commit 06fbdea Author: Hilary James Oliver <[email protected]> Date: Fri Dec 15 16:24:39 2023 +1300 Fix C7 back-compat task removal. commit ad6e5d7 Merge: 9d320a5 6f62691 Author: Hilary James Oliver <[email protected]> Date: Fri Dec 15 15:32:00 2023 +1300 Merge branch 'master' into cylc-set-task commit 9d320a5 Author: Hilary James Oliver <[email protected]> Date: Fri Dec 15 13:43:05 2023 +1300 Final functionality tweaks. commit 8f5c0aa Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 21:33:08 2023 +1300 Fix select prev instance from task_states table. commit 8c34c75 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 17:50:41 2023 +1300 Fix functional tests. commit e65f2c6 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 16:37:24 2023 +1300 Logging tweaks. commit 1148487 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 16:15:10 2023 +1300 Fix an integration test. commit 50a5b49 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 15:55:34 2023 +1300 Transient task proxy comment. commit 4dc5a55 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 10:40:46 2023 +1300 Tweak satisfy_me methods. commit cceb740 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 14 10:10:17 2023 +1300 Remove duplicate block. commit 6503ac5 Author: Hilary James Oliver <[email protected]> Date: Wed Dec 13 16:29:24 2023 +1300 Demote warning to debug. commit 70fd970 Author: Hilary James Oliver <[email protected]> Date: Tue Dec 12 22:25:10 2023 +1300 Code review tweaks 2. commit 1e93707 Author: Hilary James Oliver <[email protected]> Date: Tue Dec 12 16:03:40 2023 +1300 Update cylc/flow/task_pool.py [skip ci] Co-authored-by: Oliver Sanders <[email protected]> commit 9c8b64a Author: Hilary James Oliver <[email protected]> Date: Tue Dec 12 15:29:33 2023 +1300 Update cylc/flow/task_proxy.py [skip ci] Co-authored-by: Oliver Sanders <[email protected]> commit f592516 Author: Hilary James Oliver <[email protected]> Date: Tue Dec 12 13:23:00 2023 +1300 code review tweaks [skip ci] commit d2839b4 Author: Hilary James Oliver <[email protected]> Date: Fri Dec 8 14:32:48 2023 +1300 Update tui test. commit 08938bd Author: Hilary James Oliver <[email protected]> Date: Thu Dec 7 18:24:50 2023 +1300 Tweak wording of set --help. [skip ci] commit 6e0f050 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 7 16:30:12 2023 +1300 Don't spawn parentless if removing after flow-stop. commit 452c05d Author: Hilary James Oliver <[email protected]> Date: Thu Dec 7 14:30:36 2023 +1300 New func test. commit b22b817 Author: Hilary James Oliver <[email protected]> Date: Thu Dec 7 14:29:59 2023 +1300 Handle removal of parentless runahead tasks. commit d9be15c Author: Hilary James Oliver <[email protected]> Date: Wed Dec 6 19:41:43 2023 +1300 Remove useless func test. commit 69f2923 Author: Hilary James Oliver <[email protected]> Date: Wed Dec 6 18:51:04 2023 +1300 Record task completion in DB, for spawning. commit a66b50e Author: Hilary James Oliver <[email protected]> Date: Wed Dec 6 18:09:28 2023 +1300 Separate flows_nums stringify function. commit 759faf9 Author: Hilary James Oliver <[email protected]> Date: Wed Dec 6 13:04:36 2023 +1300 Fix some func tests. commit 365a9c7 Author: Hilary James Oliver <[email protected]> Date: Wed Dec 6 12:17:41 2023 +1300 expired tasks: dequeue and don't log as incomplete commit acefbba Author: Hilary James Oliver <[email protected]> Date: Tue Dec 5 17:21:58 2023 +1300 Better handling of implied outputs. commit 9fa07f9 Author: Hilary James Oliver <[email protected]> Date: Mon Dec 4 15:41:07 2023 +1300 working on tui test... commit 9702fd8 Author: Hilary James Oliver <[email protected]> Date: Mon Dec 4 14:42:28 2023 +1300 Fix a tui test. commit a79de36 Author: Hilary James Oliver <[email protected]> Date: Mon Dec 4 14:06:53 2023 +1300 Simplify and tidy. commit a7ee355 Author: Hilary James Oliver <[email protected]> Date: Mon Dec 4 10:42:28 2023 +1300 mypy fix commit cf197a1 Author: Hilary James Oliver <[email protected]> Date: Fri Dec 1 15:23:58 2023 +1300 Add cylc-set func tests TEMP commit d0e3c78 Author: Hilary James Oliver <[email protected]> Date: Thu Nov 23 13:29:19 2023 +1300 cylc-set: glob in pool for now. commit 78d5599 Author: Hilary James Oliver <[email protected]> Date: Sun Nov 19 11:40:59 2023 +1300 Revert CI change. commit ec954c2 Author: Hilary James Oliver <[email protected]> Date: Thu Nov 16 20:22:10 2023 +1300 Fix command logging and test. commit c61476f Author: Hilary James Oliver <[email protected]> Date: Thu Nov 16 18:11:36 2023 +1300 set command: dead end, not alias. commit 6c308d8 Author: Hilary James Oliver <[email protected]> Date: Wed Nov 15 15:32:05 2023 +1300 cylc set --pre: infer succeeded commit 44d84df Author: Hilary James Oliver <[email protected]> Date: Wed Nov 15 15:08:45 2023 +1300 Clean up command logging. commit cf8a26a Author: Oliver Sanders <[email protected]> Date: Tue Nov 14 14:48:41 2023 +0000 completion_server: support "cylc set" arguments * Support the `--pre` and `--out` arguments to `cylc set`. * This requires the task ID(s) to be provided *before* the `--pre` / `--out` option because otherwise we don't have the required information to complete the arguments. * This lists prereqs/outputs from `cylc show` which is currently restricted to n=1 tasks. * This does not support completing comma separared prereqs/outputs, use the `--pre` / `--out` options multiple times to do this. commit a01a56b Author: Hilary James Oliver <[email protected]> Date: Tue Nov 14 20:23:17 2023 +1300 Fix simulation job stuff. commit e81caf4 Author: Hilary James Oliver <[email protected]> Date: Tue Nov 14 18:34:51 2023 +1300 Fix tests. commit a7cbdd1 Author: Hilary James Oliver <[email protected]> Date: Tue Nov 14 16:14:48 2023 +1300 Better command logging. commit da2ced9 Author: Hilary James Oliver <[email protected]> Date: Tue Nov 14 15:30:53 2023 +1300 Don't log state changes for transient tasks. commit 34fc997 Author: Hilary James Oliver <[email protected]> Date: Tue Nov 14 10:35:34 2023 +1300 Some fixes; and remove cylc-set functional tests. commit 8acb090 Author: Hilary James Oliver <[email protected]> Date: Sun Nov 12 16:41:29 2023 +1300 Add integration tests. commit eac5276 Author: Hilary James Oliver <[email protected]> Date: Sun Nov 12 09:42:48 2023 +1300 Clean up future trigger comments. commit ba759c9 Author: Hilary James Oliver <[email protected]> Date: Sat Nov 11 23:04:24 2023 +1300 tiny docstring tweak commit c100520 Author: Hilary James Oliver <[email protected]> Date: Sat Nov 11 15:12:56 2023 +1300 cylc-set doctests commit 93152f2 Author: Hilary James Oliver <[email protected]> Date: Sat Nov 11 15:01:33 2023 +1300 Handle missed started events. commit 799f6ff Author: Hilary James Oliver <[email protected]> Date: Sat Nov 11 13:57:13 2023 +1300 Remove duplicate blocks, post merge from master. commit 8cea473 Author: Hilary James Oliver <[email protected]> Date: Fri Nov 10 10:36:17 2023 +1300 Squash "cylc set" dev branch. (History too messy!)
* Add test for use case 5 where a custom output is set without also setting a final output. * Add some integration tests to cover interaction with other features.
set: extra tests
51ae7e2
to
9b9e64d
Compare
ddcdc96
to
515a521
Compare
515a521
to
eec08d3
Compare
Questions... "force-expiring"I'm guessing that "force-expiring" means, setting the task to the expired state i.e: Question 1: should force-expiring an active task kill the job?The
Setting the succeeded or failed output on a task does not kill an active job, the expire output is no different. To kill the job, use Question 2: should force-expire have to be flagged in the graph?Expiry doesn't have to be handled in the graph for every task, but if it isn't handled, and the task is expired, then the workflow will stall.
This is correct as it is not possible to guess the user's intention in this situation, in most cases assuming expiry is optional will either result in premature shutdown or a delayed stall which is not a desirable default. Assuming the user's intention is aligned with optional expiry, then Remove i.e. make it as if this task never existed, rather than set the task state to expired then resolve the downstream implications. |
Yes, as opposed to clock-expire which is automatic, and therefore potentially surprising if you made a mistake when writing the workflow months ago. Force-expiring a task is by definition not surprising. You asked for it, and you got it. Unless you want to count "I didn't realize what expire does" as surprising, which would be bad form - we should provide powerful tools, and not presume that users don't understand them (the usual
That statement does not preclude killing an active job. "Intervening in completion status" should (at least could) have consequences for the jobs of running tasks.
Well, IF we decide that's that case. (Actually, maybe setting failed should kill the job - otherwise the failed task can be re-triggered while the original job is still running, with dire consequences for job IO).
Those don't cover quite the same use cases, at least not without extra effort from users to get the right result, especially for scripted applications (e.g. in task scripting) where you don't necessarily know if the task is running when you want to force-expire it. If I expire a task, by the meaning of "expire", I want the task not to run, and not to trigger anything downstream except via
|
I'm more on the fence about Question 2, because otherwise (as you noted) we are in a sense assuming that expire is optional, when it's not marked as such. However, it is very easy to argue that point:
[As an aside: why not put the same safety net under the "default" behaviour of Also, this isn't true:
We're talking about a deliberate manual intervention here, so there's no guessing needed. The user issued an "expire" command, so their intention is to expire the task, and expire has a clear meaning: there's no need to run that task or its children, except via the :expire trigger. To say we can't guess their intention here is to assume that they don't understand what they're doing (which might be true in some cases, but we can't assume that as a default!) |
I'll try to distill the two main issues here. I'm arguing: Q1
What exactly is your argument against that? Q2
(For Q1 I think it would be acceptable to have a |
Makes sense, they do what they say on the tin right. I'm not getting the "set a finished output on an active task" use requirement, it feels invalid. Note, however, that orphaning active jobs (proposed behaviour) is a requirement, see use case 4.
No, it is very much true! Consider this example:
And this command:
What is the user's intention here?
They haven't given you enough information to know. IMO outcome (1) would a very surprising outcome for this command. Removing (subtractive) the task is a more intuitive way to achieve this than setting (additive) a task state. |
That doesn't counter my argument for what it means to expire a task. Exactly where we locate it within the command set is of secondary importance, and can be changed very easily. I certainly agree that
Do you think we should not allow an active task to be expired? I'll try to be absolutely clear about the use case:
Yes, but that's an entirely different use case. If the scheduler cannot contact the job platform, the best we can do is set the task to failed and (potentially) orphan the job. If the scheduler could contact the job platform, we would not want to kill or orphan the job. |
OK this is the essence of our disagreement. You are essentially saying we should not respect an explicit command issued by the user because they don't understand what it means to expire a task. I'm saying that task expiry has a clear meaning and clear consequences, and we should do what the user asks; the onus is on them to understand what they're doing when they issue a command. The lack of an expire trigger in the graph is "information". If the graph is I agree that clock-expiry should be flagged in the graph because (a) it happens automatically and possibly very infrequently, and so is potentially surprising and dangerous, but also (b) it is reasonable to do that because you by definition know in advance which tasks may clock expire - that is not the case for direct manual interventions.
As per previous comments, I do agree that |
For Q1 (job kill when expiring an active task), we more or less agree on the "set" command not being ideal for killing jobs, and you haven't really argued against the conceptual issue, so I think this could be the way forward:
|
For Q2 (manually expiring tasks that have no "expired?" in the graph) I think I'll concede ☮️ I'm OK with stalling as a safety net, even though as argued above I don't really agree that it should be necessary. BTW the extension proposal actually says this:
but I think we both agree we do need to allow manually expiring tasks that do not have Scenario:
But how should I un-stall the workflow now? What I really want to do is expire Options:
Hmmm, actually the second option is best. We can explain it like this:
(That fits perfectly with what it means to expire a task). |
(I imagine it goes without saying, but there's no point in looking at the code on this branch until these two discussions are resolved) |
[UPDATE after off-line discussions] We've agreed:
For the record: I'm not 100% happy with 2. because:
However, it's consistent and easy to explain, and a couple of extra steps won't hurt anyone. So that's OK. Closing as superseded by #5981 ... |
BLOCKED: BUILT ON TOP OF #5658 - merge that PR first.
Question 1: should force-expiring an active task kill the job?
I think it should, and this branch does that, but the proposal does not cover this.
An already-active task will not auto expire on the clock (although I think it can be argued that it should).
But if the user explicitly commands the scheduler to expire an active task, it should kill the job then expire the task.
Rationale: expire means "don't bother to run this task, or its (non-expire) children"; therefore, if the task is already active there is no point in letting it continue, so we should cancel or kill the job. (We could expire the task without killing the job, which would orphan the job, but that leaves potential for duplicate job trouble if the task gets triggered again in a new flow).
Question 2: should force-expire have to be flagged in the graph?
I've implemented this as per the proposal (i.e.,
<task>:expire?
must appear in the graph) but I think that's wrong.Check List
CONTRIBUTING.md
and added my name as a Code Contributor.setup.cfg
(andconda-environment.yml
if present).CHANGES.md
entry included if this is a change that can affect users?.?.x
branch.