Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

guest-components: Bump guest-components dependency #1865

Merged

Conversation

mkulke
Copy link
Collaborator

@mkulke mkulke commented Jun 13, 2024

There has been a change in build flags in the way attestation-agent is built. cc_kbc is now always enabled as part of the coco-as and kbs features. a new ATTESTER Makefile flag has been introduced to pick the attesters that should be included in the attestation-agent build. By default all attesters are being built, which won't build ootb, since it's missing dependencies (e.g. sgx libraries)

For peerpods only a limited set of attesters actually make sense and usually you'd want to define it at build time for a given TEE architecture (e.g. azure vtpm or ibm se attester modules), so we default to ATTESTER=sample in most cases.

The AA_KBC param is now only used for templating the aa-kbc-params value in the podvm's static kata-agent config.

@mkulke mkulke requested review from bpradipt and stevenhorsman June 13, 2024 11:17
@stevenhorsman stevenhorsman requested a review from huoqifeng June 13, 2024 11:36
@mkulke mkulke added dependencies Pull requests that update a dependency file test_e2e_libvirt Run Libvirt e2e tests labels Jun 13, 2024
@@ -23,12 +23,13 @@ ARCH := $(or $(ARCH),$(HOST_ARCH))
# Normalise x86_64 / amd64 for input ARCH
ARCH := $(subst amd64,x86_64,$(ARCH))
DEB_ARCH := $(subst x86_64,amd64,$(ARCH))
AA_KBC ?= cc_kbc
AA_KBC ?= offline_fs_kbc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Magnus, just to check I understand - the attestation-agent doesn't build in the KBCs supported now, but uses attesters that communicate specifics with the matching verifier that runs on the KBS? What is the behaviour for aa_kbc_params=offline_fs_kbc::null? is the offline_fs_kbc always included as well as cc_kbc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this part well either. it's probably good that we upgrade. cc_kbc doesn't exist anymore as a specific flag. AA now exposes only this API:

#[async_trait]
pub trait AttestationAPIs {
    /// Get attestation Token
    async fn get_token(&mut self, token_type: &str) -> Result<Vec<u8>>;

    /// Get TEE hardware signed evidence that includes the runtime data.
    async fn get_evidence(&mut self, runtime_data: &[u8]) -> Result<Vec<u8>>;

    /// Extend runtime measurement register
    async fn extend_runtime_measurement(
        &mut self,
        events: Vec<Vec<u8>>,
        register_index: Option<u64>,
    ) -> Result<()>;

    /// Check the initdata binding
    async fn check_init_data(&mut self, init_data: &[u8]) -> Result<InitdataResult>;

While it used to be like this:

#[async_trait]
pub trait AttestationAPIs {
    /// Decrypt the encrypted information in `annotation`.
    ///
    /// The specific format of `annotation` is defined by different KBC and corresponding KBS.
    /// The decryption method may be to obtain the key from KBS for decryption, or
    /// directly send the `annotation` to KBS for decryption, which depends on the
    /// specific implementation of each KBC module.
    ///
    /// TODO: move this API to Confidential Data Hub
    async fn decrypt_image_layer_annotation(
        &mut self,
        kbc_name: &str,
        kbs_uri: &str,
        annotation: &str,
    ) -> Result<Vec<u8>>;

    /// Request KBS to obtain confidential resources, including confidential data or files.
    ///
    /// `resource_uri` is a KBS Resource URI pointing to a specific resource.
    ///
    /// TODO: remove this API
    async fn download_confidential_resource(
        &mut self,
        kbc_name: &str,
        resource_path: &str,
        kbs_uri: &str,
    ) -> Result<Vec<u8>>;

    /// Get attestation Token
    async fn get_token(&mut self, token_type: &str) -> Result<Vec<u8>>;

    /// Get TEE hardware signed evidence that includes the runtime data.
    async fn get_evidence(&mut self, runtime_data: &[u8]) -> Result<Vec<u8>>;

    /// Extend runtime measurement register
    async fn extend_runtime_measurement(
        &mut self,
        events: Vec<Vec<u8>>,
        register_index: Option<u64>,
    ) -> Result<()>;

    /// Check the initdata binding
    async fn check_init_data(&mut self, init_data: &[u8]) -> Result<()>;
}

/// Attestation agent to provide attestation service.
pub struct AttestationAgent {
    kbc_module_list: KbcModuleList,
    kbc_instance_map: HashMap<String, KbcInstance>,
    config: Option<Config>,
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, I assume, the kbc is picked in the CDH config now:

# KBC related configs.
[kbc]
# Required. The KBC name. It could be `cc_kbc`, `online_sev_kbc` or
# `offline_fs_kbc`. All the items under `[credentials]` will be
# retrieved using the kbc.
name = "cc_kbc"

# Required. The URL of KBS. If `name` is either `cc_kbc` or
# `online_sev_kbc`, this URL will be used to connect to the
# CoCoKBS (for cc_kbc) or Simple-KBS (for online_sev_kbc). If
# `name` is `offline_fs_kbc`, This URL will be ignored.
url = "http://example.io:8080"

that means for offline_fs_kbc to work, the CDH would in theory need to be populated. we currently do this via CAA + Cloud-Config, if the --aa-kbc-params is set. However, if it isn't set, there will be no CDH config file. However² CDH falls back to offline_fs_kbc if there's no config file afaik ...so it should work ootb 🍀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I guess we don't have any upstream testing of offline_fs_kbc any more (and no downstream testing since we switched to main?), so we might have found a(nother) hole, but that's not a reason to not try and stay current.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure. things might also just miraculously keep working... let's see what the e2e tests will report.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory everything should continue working. The startup-up code is really convoluted tho. Hopefully we will clean it up when we finally switch to the init-data approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkulke does it really make sense to keep AA_KBC option to populate agent-config.toml anymore?
Or should we use AA_KBC to create a static CDH config during pod vm image building ?

Copy link
Collaborator Author

@mkulke mkulke Jun 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed, that makes sense, we would always create a CDH config file, defaulting to offline_fs_kbc if —aa-kbc-params is not provided.

it would need more work though, since CDH depends on the AA sock atm, and AA also needs the templated kata agent-config.toml

So, I suggest we defer this to another PR, introducing an AA config file.

@@ -169,7 +170,10 @@ $(GUEST_COMPONENTS_SRC):
$(call git_clone_repo_ref,$(GUEST_COMPONENTS_REPO),$(GUEST_COMPONENTS_SRC),$(GUEST_COMPONENTS_VERSION))

$(ATTESTATION_AGENT): $(FORCE_TARGET) | $(GUEST_COMPONENTS_SRC)
cd "$(GUEST_COMPONENTS_SRC)/attestation-agent" && CC= ARCH=$(ARCH) $(MAKE) KBC="$(AA_KBC)" ttrpc=true LIBC="$(LIBC)"
cd "$(GUEST_COMPONENTS_SRC)/attestation-agent" && CC= ARCH=$(ARCH) $(MAKE) \
$(if $(AA_ATTESTER),ATTESTER=$(AA_ATTESTER),) \
Copy link
Member

@stevenhorsman stevenhorsman Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we are building all-attesters now:

cd attestation-agent && RUSTFLAGS=" -C linker=s390x-linux-gnu-gcc" cargo build --release --no-default-features --features "coco_as,kbs bin,ttrpc all-attesters,openssl" --bin ttrpc-aa --target s390x-unknown-linux-gnu

and the tss libraries are only present on x86, so the s390x build it failing.

It's also worth noting that the se attester isn't included in all-attesters either:
https://github.com/confidential-containers/guest-components/blob/3f2fd793c0a67c74f7ce62b115ed5be293616386/attestation-agent/attestation-agent/Cargo.toml#L61
@huoqifeng - was this done deliberately due to different arch support?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The x86 binaries build is also failing with:

 The system library `openssl` required by crate `openssl-sys` was not found.
#10 587.3   The file `openssl.pc` needs to be installed and the PKG_CONFIG_PATH environment variable must contain its parent directory.
#10 587.3   The PKG_CONFIG_PATH environment variable is not set.
#10 587.3 
#10 587.3   HINT: if you have installed the library, try setting PKG_CONFIG_PATH to the directory containing `openssl.pc`.

so we might need to add another package like libssl-dev?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was assuming that we install those on the CI machine, but since the old build defaulted to sample, I suggest we just set it as default here, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool - hopefully in future (once we sync versions with kata), we can just consume their binaries for some of these cases and let them solve it properly once there (though I was the one that did the guest-components build update, so it might need a closer look!)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaiu the "kata stance" has been to build TEE-agnostic guest-components to minimize binary variance across deployments. It's probably a bit too ambitious, especially across CPU architectures. if you are bundling e.g. tpm libraries, you'll get a dynamic openssl dep for free, similarly for the se attester I think. all this won't work well with static musl libraries.

So, I assume we'll end up with specific binaries after all, but maybe we can get canonical builds from guest-components (like aa-x86_64-gnu, aa-x86_64-musl, aa-s390x-gnu, ...)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also worth noting that the se attester isn't included in all-attesters either: https://github.com/confidential-containers/guest-components/blob/3f2fd793c0a67c74f7ce62b115ed5be293616386/attestation-agent/attestation-agent/Cargo.toml#L61 @huoqifeng - was this done deliberately due to different arch support?

Yes, because IBMSE attester code can only been run on a s390x instance, while some other attesters can not been compiled on s390x, I'm not sure whether we should leverage the code in kata CI to build artifacts per TEE, I have a PR in-progress to fix it in kata-containers/kata-containers#9817

@mkulke mkulke force-pushed the mkulke/bump-guest-components branch 2 times, most recently from 48d3829 to 0f36de5 Compare June 13, 2024 14:20
@mkulke mkulke mentioned this pull request Jun 13, 2024
5 tasks
@mkulke
Copy link
Collaborator Author

mkulke commented Jun 13, 2024

oh my. you cannot specify a sample attester. you have to pick one attester that will definitely compile, otherwise the AA makefile will assign all-attesters 😭

edit: apparently ATTESTER=none will do that (sample is implicitly included)

@mkulke mkulke force-pushed the mkulke/bump-guest-components branch 3 times, most recently from 2ca3907 to 68e0f23 Compare June 13, 2024 20:25
@mkulke
Copy link
Collaborator Author

mkulke commented Jun 14, 2024

hrm. the libvirt e2e test suite is still sort of flaky, sadly. I think this has to do with large kbs images, which is being addressed atm. But if it does eventually execute there is also a real problem: The new ASR will fail with statuscode 500 on policy rejections, which will make the wget process that the e2e test uses fail (not tweakable for busybox wget). Have to decide on how to fix that:

  • Add support for asserting non-0 exit codes in the e2e framework (probably better than fishing out a substring from the response and hoping this doesn't change)
  • Disable/Remove the negative test with a deny-all policy (not sure if CAA should have any business testing the policy engine of AS, there's nothing on the CAA side that you can fix if a "deny-all" policy produces a secret)
  • Fix ASR (i hope it's just that) to not produce a 500 on policy rejections. 500s should not be a valid error code in any case (sort of contradicting my previous point that this particular e2e test might not be useful 😅 )

I'm leaning towards the last option. opinions?

Edit: fixing the ASR response code is not trivial, since information is lost in the RPC with CDH. trying to use curl, which tolerates a 500 response and opened an issue on GC.

@mkulke mkulke force-pushed the mkulke/bump-guest-components branch from 68e0f23 to 705e7cb Compare June 14, 2024 08:27
@stevenhorsman
Copy link
Member

hrm. the libvirt e2e test suite is still sort of flaky, sadly. I think this has to do with large kbs images, which is being addressed atm. But if it does eventually execute there is also a real problem: The new ASR will fail with statuscode 500 on policy rejections, which will make the wget process that the e2e test uses fail (not tweakable for busybox wget). Have to decide on how to fix that:

  • Add support for asserting non-0 exit codes in the e2e framework (probably better than fishing out a substring from the response and hoping this doesn't change)
  • Disable/Remove the negative test with a deny-all policy (not sure if CAA should have any business testing the policy engine of AS, there's nothing on the CAA side that you can fix if a "deny-all" policy produces a secret)
  • Fix ASR (i hope it's just that) to not produce a 500 on policy rejections. 500s should not be a valid error code in any case (sort of contradicting my previous point that this particular e2e test might not be useful 😅 )

I'm leaning towards the last option. opinions?

I'm not an expert on the attestation flow, but I guess my initial thoughts are both 1 & 3 - improving the e2e test to make it less brittle (or more brittle by combining error code and message checking!) sounds good, but improving the ASR behaviour sounds like a good improvement too.

Based on 3 being tricky we can start with 1 and then hope 3 happens later.

WRT 2 - I encouraged QiFeng to add the negative test, I'm probably just not trusting enough of other components and testing, and wanted to ensure that we hadn't misconfigured something such that we were getting false positives on the other test, but maybe that's something I need to work on personally and not project into CAA!

@mkulke mkulke force-pushed the mkulke/bump-guest-components branch from 705e7cb to 1edea24 Compare June 14, 2024 10:36
There has been a change in build flags in the way attestation-agent is
built. cc_kbc is now always enabled as part of the `coco-as` and `kbs`
features. a new `ATTESTER` Makefile flag has been introduced to pick the
attesters that should be included in the attestation-agent build. By
default all attesters are being built, which won't build ootb, since
it's missing dependencies (e.g. sgx libraries)

For peerpods only a limited set of attesters actually make sense and
usually you'd want to define it at build time for a given TEE
architecture (e.g. azure vtpm or ibm se attester modules), so we default
to `ATTESTER=sample` in most cases.

The `AA_KBC` param is now only used for templating the `aa-kbc-params`
value in the podvm's static kata-agent config.

Signed-off-by: Magnus Kulke <[email protected]>
@mkulke mkulke force-pushed the mkulke/bump-guest-components branch from 1edea24 to 5199e57 Compare June 14, 2024 11:35
@mkulke mkulke requested review from fitzthum and stevenhorsman June 14, 2024 13:09
Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some things we could do to make the configuration of the guest components easier, but that's related to the init data conversation and outside the scope of just trying to get more current in this PR, so I'm happy enough with this. Thanks!

ContainerName: pod.Spec.Containers[0].Name,
TestCommandStdoutFn: func(stdout bytes.Buffer) bool {
if strings.Contains(stdout.String(), "request unautorized") {
body := stdout.String()
if strings.Contains(strings.ToLower(body), "error") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to have a more specific error, but I think that depends on confidential-containers/guest-components#587, so this is okay for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what we want to do is assert on an http status code (401 in this case once we have this). instead of relying on error strings

@mkulke mkulke requested a review from surajssd June 14, 2024 14:22
@mkulke mkulke merged commit 419bddf into confidential-containers:main Jun 15, 2024
27 checks passed
@mkulke mkulke deleted the mkulke/bump-guest-components branch June 15, 2024 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file test_e2e_libvirt Run Libvirt e2e tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants