Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treating unknown/unavailable capabilities #4426

Open
kolyshkin opened this issue Oct 4, 2024 · 5 comments
Open

Treating unknown/unavailable capabilities #4426

kolyshkin opened this issue Oct 4, 2024 · 5 comments

Comments

@kolyshkin
Copy link
Contributor

kolyshkin commented Oct 4, 2024

Description

This is more of a request for comments than a bug.

Background

(a) runc and runtime-spec used to treat capabilities from config.json as a must (i.e. if a requested capability could not be set, an error must be returned).

(b) This has changed in runtime-spec since opencontainers/runtime-spec#1094. The spec now says:

Any value which cannot be mapped to a relevant kernel interface, or cannot be granted otherwise MUST be logged as a warning by the runtime. Runtimes SHOULD NOT fail if the container configuration requests capabilities that cannot be granted, for example, if the runtime operates in a restricted environment with a limited set of capabilities.

(c) The above change was partially implemented in #2854. By "partially" I mean that runc now warns, not errors, in the following two cases:

  • when a capability name specified is not recognized (for example, a misspelled or made up name, or a recently added capability that runc is not yet aware);
  • when a capability is known, but not (yet) supported by a currently running kernel (i.e. it's an older kernel).

(d) Also, due to a bug in capability package (fixed in it's moby/sys fork, see kolyshkin/capability#3), any error when raising an ambient capability (using prctl(PR_CAP_AMBIENT)) which is not present in both the permitted and the inheritable sets was silently ignored.

(e) For all other cases, such as when a valid (known and supported) capability can not be granted, runc still returns an error. This includes:

  • dropping capabilities from the bounding set (using prctl(PR_CAPBSET_DROP)) when runc doesn't have CAP_SETPCAP(unlikely to happen);
  • any errors from capset(2); quoting capset(2) man page:
  • An attempt was made to add a capability to the permitted set;
  • An attempt was made to set a capability in the effective set that is not in the permitted set.
  • An attempt was made to add a capability to the inheritable set, and either:
    • that capability was not in the caller's bounding set; or
    • the capability was not in the caller's permitted set and the caller lacked the CAP_SETPCAP capability in its effective set.

Questions

  1. In (d) above, do we want to switch from a silence to a warning (this is what libct/cap: switch to moby/sys/capability, lazy init #4358 does) or to an error?
  2. In (e) above, do we want to switch from an error to a warning?

@opencontainers/runc-maintainers PTAL.

@kolyshkin
Copy link
Contributor Author

@kolyshkin
Copy link
Contributor Author

kolyshkin commented Oct 4, 2024

Here is how I see this.

We should distinguish between two types of issues:

  • Unknown or unsupported capability (as in (c) above). Those we want to warn about.
  • Improper capabilities configuration (as in (d) and (e) above). Those we want to error out on (like we do now).

We should probably clarify that in runtime-spec, too.

An exception to 2 is (d). Since we happened to ignore errors from setting ambient capabilities, we should not error out as it might break some users. Instead, introduce a warning and change it to an error in a later version.

So, answering my questions above:

  1. Yes, switch to warning (with an intention to make it an error in a later version).
  2. No, keep it an error as this is clearly a bug in configuration, not an issue with the older kernel and/or runc.

@lifubang
Copy link
Member

lifubang commented Oct 5, 2024

Maybe we can add some option fields in runtime-spec, the aim of it is to let the user decide whether show an error or a warning.
The default value of these options should honor the past bugs or partial implementation to have a compatibility.

@kolyshkin
Copy link
Contributor Author

Maybe we can add some option fields in runtime-spec, the aim of it is to let the user decide whether show an error or a warning.

In general I'm against such things. Unless there is a strong use case for this, I'd rather not add options like that. I mean, if there is some practical scenario which demonstrates a need for such an option, I'd rather not add it. Software is already very complicated, and by adding such options, we're making it even more complicated.

I'd rather have some sane and backward-compatible default and stick with it.

@opencontainers/runtime-maintainers WDYT?

@rata
Copy link
Member

rata commented Oct 9, 2024

@kolyshkin thanks for providing all the links, it really makes a difference!

I agree with you here about no adding more config options, and here about the answers.

The short answers for me would be:

  1. SGTM, let's add a warning.
  2. No, I don't see the advantage of creating the container without honoring the capabilities when we know we should/can do it, and we've been failing in earlier versions of runc. I can only see disadvantages.

Also, it seems worth noting that Kubernetes doesn't support to set ambient caps (there was this KEP kubernetes/enhancements#2763, that was never implemented and dropped for userns) and it seems docker doesn't set ambient capabilities either. Therefore, a big chunk of our users won't be affected by this.

This is what I tested wtith docker:

$ docker run --cap-add=SYS_ADMIN -ti --rm debian bash
root@a3b5a94522a2:/# grep -i cap /proc/self/status 
CapInh:	0000000000000000
CapPrm:	00000000a82425fb
CapEff:	00000000a82425fb
CapBnd:	00000000a82425fb
CapAmb:	0000000000000000

@cyphar cyphar mentioned this issue Oct 21, 2024
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants