rfc15: describe signal forwarding details
Problem: the IMP kill subcommand is briefly mentioned as the way
to signal guest processes, but this is inadequate in practice.

Now that the IMP lingers, just have it forward signals to the job
shell.  In addition, describe a surrogate signal that tells the
IMP to do its best to clean up the entire job container.

See also: flux-framework/flux-core#6011
garlick committed Oct 17, 2024
1 parent 128de0f commit ba16b1d
Showing 1 changed file with 18 additions and 6 deletions.
24 changes: 18 additions & 6 deletions spec_15.rst
@@ -372,13 +372,25 @@ A multi-user instance of Flux not only requires the ability to execute
work as a guest user, but it must also have privilege to monitor and
kill these processes as part of normal resource manager operation.

-Signaling and terminating jobs in a multi-user instance
--------------------------------------------------------
+Signal Handling
+---------------

-For terminating and signaling processes the IMP SHALL include a ``kill``
-subcommand which, using the process tracking functionality, SHALL allow
-an instance owner to signal or terminate any guest processes including
-ancestors thereof that were started by the owner’s instance.
+The IMP runs with an effective user ID of root and a real user id of the
+system instance owner, thus the system instance owner is permitted to signal
+the IMP. In contrast, the system instance owner is not permitted to signal
+guest user processes.
+
+To enable the instance owner to signal guest jobs, the IMP SHALL act
+as a proxy for the job by trapping common signals and forwarding them to
+the job shell.
+
+To enable the instance owner to fully clean up when the job shell is unable
+to do so, the IMP SHALL handle SIGUSR1 as a surrogate for SIGKILL. Upon
+receipt of this signal, the IMP SHOULD deliver SIGKILL to all processes in
+the job's container, including the job shell.
+
+The mechanism by which processes are identified to receive SIGKILL is
+outside the scope of this document.

IMP configuration
=================
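
As an illustration of the Signal Handling text in the diff above, the following is a minimal Python sketch of the proxy behavior, not the IMP's actual implementation. The names ``ImpSignalProxy``, ``FORWARDED_SIGNALS``, and ``list_container_pids`` are hypothetical. The spec does not enumerate the "common signals" and leaves the mechanism for identifying container processes out of scope, so both are placeholders here.

```python
import os
import signal

# Assumption: an illustrative set; the spec only says "common signals".
FORWARDED_SIGNALS = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP)


class ImpSignalProxy:
    """Forward trapped signals to the job shell; treat SIGUSR1 as a
    surrogate for SIGKILL delivered to every process in the container."""

    def __init__(self, shell_pid, list_container_pids, kill=os.kill):
        self.shell_pid = shell_pid
        # Hypothetical callback returning the PIDs in the job's container.
        self.list_container_pids = list_container_pids
        # Injectable so the logic can be exercised without real processes.
        self.kill = kill

    def _kill_quietly(self, pid, signum):
        try:
            self.kill(pid, signum)
        except ProcessLookupError:
            pass  # the target already exited; nothing left to signal

    def handle(self, signum, frame=None):
        if signum == signal.SIGUSR1:
            # Surrogate SIGKILL: clean up the whole container,
            # including the job shell.
            for pid in self.list_container_pids():
                self._kill_quietly(pid, signal.SIGKILL)
        else:
            # Ordinary case: act as a proxy and forward to the job shell.
            self._kill_quietly(self.shell_pid, signum)

    def install(self):
        # Trap the forwarded signals plus the SIGUSR1 surrogate.
        for signum in FORWARDED_SIGNALS + (signal.SIGUSR1,):
            signal.signal(signum, self.handle)
```

A surrogate is needed because SIGKILL itself cannot be trapped: sent directly to the IMP, it would terminate the IMP without giving it a chance to clean up the container.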