diff --git a/docs/proposal-task-expire.md b/docs/proposal-task-expire.md
index 13d36dfd..d9bcb50c 100644
--- a/docs/proposal-task-expire.md
+++ b/docs/proposal-task-expire.md
@@ -15,29 +15,33 @@ not needed anymore.
 NOTE that forced expiration is exactly the same, except that it is caused by
 the user rather than the clock. In both cases we expire an incomplete task
 (whether waiting or finished-but-incomplete) because it does not need to run
-(or rerun) in order to be considered complete.
+(or rerun) - its completion is no longer required.
 
-### Expired Tasks Are "Complete" and Should Not Cause a Stall
+### Expired Tasks Should Not Cause a Stall
 
 If a task expires, the workflow definition, or the user, has decreed that under
 current conditions it *should* expire, i.e., we don't need to run it anymore.
-That is an kind of success, and we should therefore mark expired tasks as
-complete so the scheduler can forget them - not retain them as incomplete to
-cause a stall.
+Therefore expired tasks should not be retained as incomplete tasks (which is a
+surprising/error condition) and allowed to stall the workflow.
 
 ### Tasks Downstream of an Expired Task Should Not Cause a Stall
 
+If the workflow definition, or the user, has decreed that under current
+circumstances a task should expire, i.e. that it should not run at all, then
+by implication nothing downstream of it should run either.
+
+
 ```
 a => foo => bar # a succeeds, foo expires so foo never runs
 ```
 
-There is no reason to stall on account of `bar` here. Required outputs are always
-conditional on the owner task actually running in the first place, and the
-downstream graph should not spawn at all if the outputs it hangs off are not
-generated.
+So there is no reason to stall on account of `bar` here. Required outputs are
+always conditional on the owner task actually running in the first place, and
+the downstream graph should not spawn at all if the outputs it hangs off are
+not generated.
 
-From the perspective of `bar` this no different than if `foo` not running because
-the branch it lives on wasn't taken at all:
+From the perspective of `bar` this is no different than if `foo` were not
+running because the branch it lives on wasn't taken at all:
 
 ```
 a:x? => foo => bar # a does not generate :x, so foo never runs
@@ -64,6 +68,8 @@ expires, then :expire triggers must be used to achieve that.
 
 We should enforce that :expire triggers be marked optional, because "required
 expiration" doesn't really make sense.
+(N.B. this is moot if we decide that `:expire` is not a task output - see
+below).
 
 ### Optional :expire Does Not Mean :succeed Must Be Optional
 
@@ -75,13 +81,17 @@ means, *if the task runs, its success is required*.
 The status of an expired task's required outputs is the same as that of the
 required outputs of a task on a branch not taken at runtime.
 
+```
 foo:x? => bar
 foo:y? => baz
+```
 
 Here, bar:succeed is required ONLY if bar runs, i.e. if branch x is taken. If
 branch y is taken, the scheduler does not car that bar did not succeed.
 
+```
 foo | foo:expire? => bar
+```
 
 Here, foo:succeed is required ONLY if foo runs, i.e. if it doesn't expire. If
 foo expires, the scheduler should not care that foo (and bar) did not succeed.
@@ -94,8 +104,8 @@ information: we could instantly distinguish between waiting tasks that expired
 without running to achieve completion, and finished-but-incomplete tasks were
 force-expired without re-running to achieve completion.
 
-Note this would also make sense in the context of getting rid of `:expire` as a
-task output.
+Note this would also make sense in the context of NOT having `:expire` as a
+task output at all (see below).
 
 I don't think anyone is deeply invested in expired as a state.
 
@@ -109,17 +119,16 @@ every other output!) must be optional too?
 
 As discussed above, optional expire does NOT imply optional success - because
 expiration prevents a task from running at all, and the required nature of
-real outputs is always contingent on the task running in the first place.
-
-So that's a nuance of the optional outputs system that users will have to
-understand, OR we could remove expiration from the outputs system.
+"real" outputs is always contingent on the task running in the first place.
+How can a task that never even submits generate an output?
 
-All the other outputs can only be generated once a job is submitted to run. Which makes sense - how can a task that never even submits generate an output?
+So that's unfortunately a nuance of optional outputs that users will have to
+understand, **OR we could remove expiration from the outputs system.**
 
 Expiration prevents a task from running in the first place. It makes more sense
 to think of expiration as something the scheduler does TO the task, not
-something done BY the task. (In fact, a task can in principle expire long
-before the active window of the workflow catches up with it).
+something that is done BY the task. (In fact, a task can in principle expire
+long before workflow activity even reaches it in the graph).
 
 So we could use different notation to express this in the graph and allow
 triggering off of expiration, e.g.: