You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It seems polars 1.15 installs a hook to warn the user about possible deadlocks when using multiprocessing.Pool() with the fork method (default on Unix platforms). However, that hook triggers every time the controls goes back to the Python interpreter after a fork, which unfortunately happens when using the preexec_fn parameter of subprocess.Popen(), which we do.
Do we actually need these preexec_fn ?
defpreexec_function():
# Change process group in case we have to kill the subprocess and all of# its children later.# TODO: this is Unix-specific; would be good to find an OS-agnostic way# to do this in case we wanna port WA to Windows.os.setpgrp()
We very much need to allow having threads in an application using devlib.
Also, the following note provides a possible fix:
Note
If you need to modify the environment for the child use the env parameter rather than doing it in a preexec_fn. The start_new_session and process_group parameters should take the place of code using preexec_fn to call os.setsid() or os.setpgid() in the child.
In practice from my experimentation, processes are not killed upon Popen garbage collection. However, if the parent process exits due to an exception (e.g. KeyboardInterrupt), then the processes spawned by Popen seem to be terminated, even if the process outlived its Popen handler in the Python world. If the parent process is killed (so exits not via an exception), then the children outlive the parent process.
So to summarize:
I don't think the advantages of calling os.setpgrp() as described in preexec_fn comment justifies the risk of undefined behavior outlined in the Python doc, especially when correct resource control on the Python side (e.g. context managers) can allow clean handling in cooperative requests (KeyboardInterrupt).
In practice, this conflicts with polars. It can be argued that polars should not install that handler, but really the problem is that Python has a broken default that lets no-one win. This will change in 3.14, but until then we have to deal with it.
So I would suggest we simply remove those preexec_fn without replacement.
It seems polars 1.15 installs a hook to warn the user about possible deadlocks when using
multiprocessing.Pool()
with the fork method (default on Unix platforms). However, that hook triggers every time the controls goes back to the Python interpreter after a fork, which unfortunately happens when using thepreexec_fn
parameter ofsubprocess.Popen()
, which we do.Do we actually need these
preexec_fn
?The Python documentation states:
We very much need to allow having threads in an application using devlib.
Also, the following note provides a possible fix:
https://docs.python.org/3/library/subprocess.html#subprocess.Popen
This however was introduced in Python 3.11, so we would either loose compat with older Python versions, or if we want, have a degraded experience on older versions.
In practice from my experimentation, processes are not killed upon Popen garbage collection. However, if the parent process exits due to an exception (e.g.
KeyboardInterrupt
), then the processes spawned byPopen
seem to be terminated, even if the process outlived its Popen handler in the Python world. If the parent process is killed (so exits not via an exception), then the children outlive the parent process.So to summarize:
I don't think the advantages of calling
os.setpgrp()
as described inpreexec_fn
comment justifies the risk of undefined behavior outlined in the Python doc, especially when correct resource control on the Python side (e.g. context managers) can allow clean handling in cooperative requests (KeyboardInterrupt).In practice, this conflicts with polars. It can be argued that polars should not install that handler, but really the problem is that Python has a broken default that lets no-one win. This will change in 3.14, but until then we have to deal with it.
So I would suggest we simply remove those
preexec_fn
without replacement.The associated polars issue is: pola-rs/polars#20000
The text was updated successfully, but these errors were encountered: