
Make it possible to kill guest execution when running a host function. #192

Open
danbugs opened this issue Jan 29, 2025 · 1 comment
danbugs commented Jan 29, 2025

Currently, it is not possible to interrupt or cancel execution while the guest is calling a host function. If the host function hangs, the call never returns and cannot be cancelled. This is surfaced as:

HyperlightError::GuestExecutionHungOnHostFunctionCall() => {}

One possible solution

When running with the seccomp feature on, host functions are wrapped in their own thread like so:

let join_handle = std::thread::Builder::new()
    .name(format!("Host Function Worker Thread for: {:?}", name_cloned))
    .spawn(move || {
        // We have a `catch_unwind` here because, if a disallowed syscall is issued,
        // we handle it by panicking. This is to avoid returning execution to the
        // offending host function, for two reasons: (1) if a host function is issuing
        // disallowed syscalls, it could be unsafe to return to, and (2) returning
        // execution after trapping the disallowed syscall can lead to UB (e.g., try
        // running a host function that attempts to sleep without `SYS_clock_nanosleep`,
        // you'll block the syscall but panic in the aftermath).
        match std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
            call_func(&host_funcs_cloned, &name_cloned, args_cloned)
        })) {
            Ok(val) => val,
            Err(err) => {
                if let Some(crate::HyperlightError::DisallowedSyscall) =
                    err.downcast_ref::<crate::HyperlightError>()
                {
                    return Err(crate::HyperlightError::DisallowedSyscall);
                }
                crate::log_then_return!("Host function {} panicked", name_cloned);
            }
        }
    })?;
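
For readers without the codebase at hand, the wrapping pattern above can be reduced to a standalone sketch using only the standard library. `HostCallError` and `call_on_worker` are hypothetical names invented for this illustration (they stand in for `HyperlightError` and the real dispatch code), and the `i32` return type is an arbitrary simplification.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::thread;

/// Hypothetical stand-in for `HyperlightError`; not the real type.
#[derive(Debug, PartialEq)]
pub enum HostCallError {
    DisallowedSyscall,
    Panicked(String),
}

/// Runs `func` on a dedicated, named worker thread and turns any panic
/// into an error value instead of letting it propagate to the caller.
pub fn call_on_worker<F>(name: &str, func: F) -> Result<i32, HostCallError>
where
    F: FnOnce() -> Result<i32, HostCallError> + Send + 'static,
{
    let name_owned = name.to_string();
    let handle = thread::Builder::new()
        .name(format!("Host Function Worker Thread for: {:?}", name))
        .spawn(move || match panic::catch_unwind(AssertUnwindSafe(func)) {
            Ok(val) => val,
            Err(payload) => {
                // A panic whose payload is our error type marks a trapped syscall.
                if let Some(HostCallError::DisallowedSyscall) =
                    payload.downcast_ref::<HostCallError>()
                {
                    return Err(HostCallError::DisallowedSyscall);
                }
                // Any other panic is reported with the host function's name.
                Err(HostCallError::Panicked(name_owned))
            }
        })
        // Thread creation itself can fail (e.g. resource limits).
        .map_err(|e| HostCallError::Panicked(e.to_string()))?;

    // `join` blocks until the worker finishes; a hung `func` hangs here too.
    handle
        .join()
        .map_err(|_| HostCallError::Panicked(name.to_string()))?
}
```

Note that `join` in this sketch still blocks indefinitely if `func` never returns, which is exactly the gap this issue describes: the worker thread exists, but nothing interrupts it.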

You could leverage these threads to cancel execution in the same way we cancel execution in the guest:

let thread_id = self.execution_variables.get_thread_id()?;
if thread_id == u64::MAX {
    log_then_return!("Failed to get thread id to signal thread");
}
let mut count: i32 = 0;
// We need to send the signal multiple times in case the thread was between checking if it
// should be cancelled and entering the run loop.
// We cannot do this forever (if the thread is calling a host function that never
// returns we will sit here forever), so use `max_wait_for_cancellation` to limit the
// number of iterations.
let number_of_iterations =
    self.configuration.max_wait_for_cancellation.as_micros() / 500;
while !self.execution_variables.run_cancelled.load() {
    count += 1;
    if count > number_of_iterations.try_into().unwrap() {
        break;
    }
    info!(
        "Sending signal to thread {} iteration: {}",
        thread_id, count
    );
    let ret = unsafe { pthread_kill(thread_id, SIGRTMIN()) };
    // `pthread_kill` returns 0 on success and a positive error number on failure.
    // We may get ESRCH if we try to signal a thread that has already exited.
    if ret != 0 && ret != ESRCH {
        log_then_return!("error {} calling pthread_kill", ret);
    }
    std::thread::sleep(Duration::from_micros(500));
}

However, this would mean always wrapping host function calls in an extra thread, not only when the seccomp feature is on, and that might be naive in terms of performance.
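
To make the bounded retry logic of the cancellation loop easier to reason about, here is a minimal sketch that separates it from the signalling mechanism. Everything named here is an assumption for illustration: `signal_until_cancelled` is hypothetical, the `send_signal` closure stands in for the `pthread_kill` call, and `ESRCH` is hard-coded to its Linux value to avoid a `libc` dependency.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// Linux value of ESRCH ("no such process"); hard-coded for this sketch only.
const ESRCH: i32 = 3;

/// Calls `send_signal` repeatedly until `cancelled` becomes true or the
/// iteration budget derived from `max_wait / interval` is used up.
/// Returns the number of signals sent, or the first unexpected error code.
pub fn signal_until_cancelled<F>(
    cancelled: &AtomicBool,
    max_wait: Duration,
    interval: Duration,
    mut send_signal: F,
) -> Result<u32, i32>
where
    F: FnMut() -> i32,
{
    // Bound the retries so a target that never acknowledges cannot stall us forever.
    let max_iterations = (max_wait.as_micros() / interval.as_micros().max(1)) as u32;
    let mut sent = 0;
    while !cancelled.load(Ordering::SeqCst) && sent < max_iterations {
        sent += 1;
        let ret = send_signal();
        // 0 means success; ESRCH means the thread already exited, which is fine.
        if ret != 0 && ret != ESRCH {
            return Err(ret);
        }
        std::thread::sleep(interval);
    }
    Ok(sent)
}
```

In a real implementation the closure would signal the host function's worker thread, and the worker would need a signal handler that sets the cancellation flag. Neither is shown here.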

@syntactically

See also: #243 for API design around this
