Catch silent subproc_execute failures #2404

Merged · 2 commits · Jun 27, 2024
5 changes: 5 additions & 0 deletions flytekit/extras/tasks/shell.py
@@ -80,6 +80,8 @@
# Execute the command and capture stdout and stderr
result = subprocess.run(command, **kwargs)

assert result.stderr == ""

Codecov (codecov/patch) warning on flytekit/extras/tasks/shell.py#L83: added line #L83 was not covered by tests.
Collaborator:
Sometimes commands write debug information to stderr, so we should have a more resilient way of checking whether a subprocess ran to completion.
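
For illustration, a minimal sketch (not part of the PR) of the behaviour being described; the command here is arbitrary and just prints progress to stderr while exiting cleanly:

import subprocess
import sys

# A process that writes "debug"/progress output to stderr but still succeeds,
# so a non-empty stderr alone is not a reliable failure signal.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print('progress: 50%', file=sys.stderr)"],
    capture_output=True,
    text=True,
)
print(result.returncode)  # 0: success despite non-empty stderr
print(result.stderr)      # "progress: 50%"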

Contributor Author:
I wish they didn't, but you're right... Unfortunately, subprocess.run doesn't seem to capture PIPESTATUS; it only reports the return code of the last command in the pipe. Do you have any thoughts? The only thing I can think of would be to check stderr for "command not found" or "error", but that's pretty awkward.
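
A minimal sketch (not from the PR) of the behaviour under discussion; the pipeline and output path are made up:

import subprocess

# With shell=True, subprocess.run only reports the exit status of the last
# command in a pipeline, so a failure earlier in the pipe is invisible here.
result = subprocess.run(
    "nonexistent-tool --flag | gzip -c > /tmp/out.gz",
    shell=True,
    capture_output=True,
    text=True,
)
print(result.returncode)  # 0, because gzip (the last command) succeeded
print(result.stderr)      # something like "/bin/sh: nonexistent-tool: command not found"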

Collaborator:
Unfortunately I don't have a great solution.

PIPESTATUS is a bash feature, which complicates things a little: we use /bin/sh as the default shell in ShellTask, and depending on the OS that may or may not be bash (e.g. on macOS /bin/sh resolves to bash:

❯ uname -a
Darwin gondor 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:31:00 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6020 arm64

❯ readlink /private/var/select/sh
/bin/bash

).

A low-lift solution is to write a warning to the flytekit logs when we detect a pipe in the command, saying that we might not catch a failure from an earlier command in the pipe. That's not a great solution either, but at least we let users know that they are in pipeland and will have to handle it themselves (maybe mention PIPESTATUS?). wdyt?
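
A rough sketch of that warning approach (not the code that landed in this PR); the helper name is hypothetical and the logger is assumed to be flytekit's standard one:

from flytekit.loggers import logger  # assumed: flytekit's default logger


def _warn_if_piped(command, shell: bool) -> None:
    # Hypothetical helper: log a heads-up when a shell pipeline is detected.
    cmd_str = command if isinstance(command, str) else " ".join(command)
    if shell and "|" in cmd_str:
        logger.warning(
            "Detected a pipe in the shell command; only the last command's exit "
            "status is checked, so earlier failures in the pipe may go unnoticed. "
            "Consider bash's PIPESTATUS or `set -o pipefail` to surface them."
        )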

Contributor Author:
Yeah, I think that's the best solution given the current constraints. It's a bit of a corner case, and I don't think we should be expected to handle every failure state. How does the latest commit look?


# Access the stdout and stderr output
return ProcessResult(result.returncode, result.stdout, result.stderr)

@@ -93,6 +95,9 @@
custom dependencies?\n{e}"""
)

except AssertionError:
raise Exception(f"Command: {command}\nexperienced a silent failure, likely due to shell=True:\n{result.stderr}")

Codecov (codecov/patch) warning on flytekit/extras/tasks/shell.py#L98-L99: added lines #L98-#L99 were not covered by tests.


def _dummy_task_func():
"""
24 changes: 24 additions & 0 deletions tests/flytekit/unit/extras/tasks/test_shell.py
@@ -370,3 +370,27 @@ def test_subproc_execute_with_shell():
subproc_execute(cmd, shell=True)
cont = open(opth).read()
assert "hello" in cont


def test_subproc_execute_missing_dep():
cmd = ["non-existent", "blah"]
with pytest.raises(Exception) as e:
subproc_execute(cmd)
assert "executable could not be found" in str(e.value)


def test_subproc_execute_error():
cmd = ["ls", "--banana"]
with pytest.raises(Exception) as e:
subproc_execute(cmd)
assert "Failed with return code" in str(e.value)


def test_subproc_execute_shell_error(tmp_path):
# This is a corner case I ran into that really shouldn't
# ever happen. The assert catches anything in stderr despite
# a 0 exit.
cmd = " ".join(["bcftools", "isec", "|", "gzip", "-c", ">", f"{tmp_path.joinpath('out')}"])
Collaborator:
The exit code 0 only happens because the second command (i.e. gzip) returns 0. Bash implements a feature called PIPESTATUS, which captures the exit codes of all commands in a pipe. It's just an array that you can inspect after the pipe has run, for example:

bash-5.2$ inexistet command --flag | echo bla | gzip -c > out
bash: inexistet: command not found
bash-5.2$ echo "${PIPESTATUS[@]}"
127 0 0

Notice how the first command returns 127 and the other two return 0.

Contributor Author:
TIL, thank you!
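
Following the PIPESTATUS point above, a minimal sketch (not part of this PR) of one way to surface pipe failures from Python: run the pipeline under bash with pipefail enabled. The pipeline and output path here are made up:

import subprocess

# With pipefail, the pipeline's exit status reflects the failing command
# instead of only the final one.
result = subprocess.run(
    ["bash", "-o", "pipefail", "-c", "nonexistent-tool | gzip -c > /tmp/out.gz"],
    capture_output=True,
    text=True,
)
print(result.returncode)  # 127: the earlier failure now propagates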

with pytest.raises(Exception) as e:
subproc_execute(cmd, shell=True)
assert "silent failure" in str(e.value)