Kill Globus and Prefect tasks if PERMISSION_DENIED #37

Open
davramov opened this issue Oct 22, 2024 · 0 comments

Finished in state Failed('Flow run encountered an exception. TransferError: Configured to wait 600, elapsed is 605.2364678382874 Last globus transfer nice_status PERMISSION_DENIED. Job may complete in background.')
12:55:49 PM
prefect.flow_runs
Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 877, in orchestrate_flow_run
    result = await flow_call.aresult()
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 327, in aresult
    return await asyncio.wrap_future(self.future)
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 352, in _run_sync
    result = self.fn(*self.args, **self.kwargs)
  File "/tmp/tmponkxk_1_prefect/orchestration/flows/bl832/prune.py", line 53, in prune_spot832
    prune_files(
  File "/tmp/tmponkxk_1_prefect/orchestration/flows/bl832/prune.py", line 35, in prune_files
    prune_one_safe(
  File "/tmp/tmponkxk_1_prefect/orchestration/globus/transfer.py", line 289, in prune_one_safe
    delete_id = prune_files(
  File "/tmp/tmponkxk_1_prefect/orchestration/globus/transfer.py", line 202, in prune_files
    task_wait(
  File "/tmp/tmponkxk_1_prefect/orchestration/globus/transfer.py", line 230, in task_wait
    raise TransferError(
orchestration.globus.transfer.TransferError: Configured to wait 600, elapsed is 605.2364678382874 Last globus transfer nice_status PERMISSION_DENIED. Job may complete in background.
12:55:49 PM
prefect.flow_runs
done waiting for completion of task 
12:55:49 PM
prefect.flow_runs
waiting for task with task_id 399a5c2c-90ae-11ef-bf6a-cf076a6040b9 to complete PERMISSION_DENIED

When a PERMISSION_DENIED error occurs, the Globus task is not cancelled: the flow keeps waiting until the 600-second timeout and then fails, while the task may still complete in the background.
Solution: break out of the Prefect pruning flow as soon as PERMISSION_DENIED is reported, and make sure to cancel that Globus task. The task_wait() function in orchestration/globus/transfer.py looks like the right place to do this, since it already has a catch for FILE_NOT_FOUND.
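
A minimal sketch of what the early exit could look like, assuming `client` behaves like a `globus_sdk.TransferClient` (its `task_wait`, `get_task`, and `cancel_task` methods). The names `wait_or_cancel` and `FATAL_NICE_STATUSES` and the stand-in `TransferError` class are hypothetical; the real task_wait() in orchestration/globus/transfer.py is not shown in this issue and may be structured differently.

```python
import time


class TransferError(Exception):
    """Stand-in for orchestration.globus.transfer.TransferError (assumption)."""


# Assumption: these nice_status values will not resolve on their own,
# so there is no point waiting out the full timeout.
FATAL_NICE_STATUSES = {"PERMISSION_DENIED"}


def wait_or_cancel(client, task_id, max_wait_seconds=600, poll_seconds=5):
    """Poll a Globus task; cancel it and raise as soon as it reports a fatal status."""
    start = time.monotonic()
    nice_status = None
    while time.monotonic() - start < max_wait_seconds:
        # task_wait returns True once the task has finished, False on timeout.
        if client.task_wait(task_id, timeout=poll_seconds, polling_interval=poll_seconds):
            return True
        nice_status = client.get_task(task_id)["nice_status"]
        if nice_status in FATAL_NICE_STATUSES:
            # Cancel the Globus task so it does not linger in the background,
            # then fail right away so the Prefect flow run stops too.
            client.cancel_task(task_id)
            raise TransferError(
                f"Globus task {task_id} cancelled: nice_status {nice_status}"
            )
    elapsed = time.monotonic() - start
    raise TransferError(
        f"Configured to wait {max_wait_seconds}, elapsed is {elapsed} "
        f"Last globus transfer nice_status {nice_status}. "
        "Job may complete in background."
    )
```

Raising from inside the wait loop lets the exception propagate up through prune_files() and fail the flow run immediately, instead of after the full 600 seconds. The timeout path is left unchanged here; whether a timed-out task should also be cancelled is a separate decision.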
