Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dmz: don't use runc-dmz in complicated capability setups #4137

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 34 additions & 2 deletions libcontainer/container_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -454,11 +454,43 @@ func slicesContains[S ~[]E, E comparable](slice S, needle E) bool {
return false
}

func isDmzBinarySafe(c *configs.Config) bool {
// No longer needed in Go 1.21.
func slicesEqual[S ~[]E, E comparable](s1, s2 S) bool {
if len(s1) != len(s2) {
return false
}
for idx := range s1 {
if s1[idx] != s2[idx] {
return false
}
}
return true
}

func shouldUseDmzBinary(p *Process, c *configs.Config) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be the simplest solution, but it seems like a bit of a shame to have this code and not use it... Should we remove the SELinux logic too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we should define a ternary env var like RUNC_USE_DMZ=(1|0|auto).

The default value should be auto, however, for runc v1.2, I'd suggest to just treat this as an alias for 0 (false) to minimize the incompatibility.

In a future version of runc, we may implement more clever logic for auto.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @lifubang WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RUNC_DMZ=legacy can disable the dmz feature now. You mean you worry about there will be more imcompatible reasons not included in #4158 ? But we should know that if we set the default value to legacy, the k8s e2e test case about this area will fail? How to improve this test case in k8s?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might masquerade this in k8s if we disable runc-dmz if the container is not running as root. I think if it runs as root we don't need to change the capabilities.

I'm not sure if the root detection is hard or not safe and that is why it wasn't done here. I haven't looked into it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone mentioned that checking if we are root would be sufficient. To be honest, I struggle to understand all of the interactions of capabilities with everything else in the kernel (some of the functions in commoncap are actual line noise to my eyes).

The issue is that runc binary overwrites are only relevant for uid 0 in most cases. However, if runc-dmz is only used for unprivileged container users maybe that'd be okay for now (not uid 0 and no caps).

// Older versions container-selinux (< 2.224.0) had permission issues when
// using runc-dmz, so don't use runc-dmz for those systems.
if !dmz.WorksWithSELinux(c) {
return false
}

// runc-dmz changes how capabilities are inherited by the process "runc
// init" execs if the container process is not root and the capabilities
// are not in the ambient set. There is no nice solution for this problem,
// so we can't use runc-dmz unless the four non-effective capability sets
// are the same.
//
// TODO: Unfortunately, our configuration format is such that we don't know
// the uid/gid of the container process at the moment. See newProcess() in
// utils_linux.go. So we have to assume that any User other than "0:0" is
// not root.
if p.User != "0:0" && c.Capabilities != nil &&
(!slicesEqual(c.Capabilities.Inheritable, c.Capabilities.Ambient) ||
!slicesEqual(c.Capabilities.Bounding, c.Capabilities.Ambient) ||
!slicesEqual(c.Capabilities.Permitted, c.Capabilities.Ambient)) {
return false
}

// Because we set the dumpable flag in nsexec, the only time when it is
// unsafe to use runc-dmz is when the container process would be able to
// race against "runc init" and bypass the ptrace_may_access() checks.
Expand Down Expand Up @@ -513,7 +545,7 @@ func (c *Container) newParentProcess(p *Process) (parentProcess, error) {
exePath = "/proc/self/exe"
} else {
var err error
if isDmzBinarySafe(c.config) {
if shouldUseDmzBinary(p, c.config) {
dmzExe, err = dmz.Binary(c.stateDir)
if err == nil {
// We can use our own executable without cloning if we are using
Expand Down
Loading