SecureComms: Add Support for activating using InitData #2072

davidhadas · 2024-09-30T10:10:41Z

See:

Add apf.json to InitData.
apf.json includes an sc key used to activate secure comms (if not already activated using an agent-protocol-forwarder.service flag).
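
The activation rule in this description can be sketched in Go roughly as follows. This is only an illustration, not the code in this PR: the `apfConfig` type, the `secure-comms` JSON key name (the description calls it an sc key), and the `secureCommsEnabled` helper are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apfConfig is a hypothetical shape for apf.json; the real file may use
// different keys.
type apfConfig struct {
	SecureComms bool `json:"secure-comms"`
}

// secureCommsEnabled reports whether secure comms should be active.
// A flag passed to the service takes effect on its own; otherwise the
// apf.json delivered through InitData decides.
func secureCommsEnabled(flagValue bool, apfJSON []byte) (bool, error) {
	if flagValue {
		return true, nil
	}
	if len(apfJSON) == 0 {
		// No apf.json was provided, so nothing activates secure comms.
		return false, nil
	}
	var cfg apfConfig
	if err := json.Unmarshal(apfJSON, &cfg); err != nil {
		return false, fmt.Errorf("failed to decode apf.json: %w", err)
	}
	return cfg.SecureComms, nil
}

func main() {
	enabled, err := secureCommsEnabled(false, []byte(`{"secure-comms": true}`))
	fmt.Println(enabled, err)
}
```

In this sketch the flag and the InitData key are combined with a logical OR, which matches the "if not already activated" wording above.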

davidhadas · 2024-10-08T04:25:30Z

cc: @bpradipt

Install KBS and test SecureComms with KBS

Based on confidential-containers#2072 which should be merged first

Signed-off-by: David Hadas <[email protected]>

stevenhorsman

Some initial comments. It would be good to fix the typos in the commit message too.

stevenhorsman · 2024-10-08T13:18:03Z

src/cloud-api-adaptor/cmd/agent-protocol-forwarder/main.go

@@ -52,6 +54,8 @@ func load(path string, obj interface{}) error {
 		return fmt.Errorf("failed to decode a Agent Protocol Forwarder config file file: %s: %w", path, err)
 	}

+	logger.Printf("succesful loading config from %s\n", path)


Is this supposed to say "successfully loaded config..."?

stevenhorsman · 2024-10-08T13:20:49Z

src/cloud-api-adaptor/cmd/cloud-api-adaptor/main.go

@@ -114,7 +114,7 @@ func (cfg *daemonConfig) Setup() (cmd.Starter, error) {
 		flags.BoolVar(&secureComms, "secure-comms", false, "Use SSH to secure communication between cluster and peer pods")
 		flags.StringVar(&secureCommsInbounds, "secure-comms-inbounds", "", "Inbound tags for secure communication tunnels")
 		flags.StringVar(&secureCommsOutbounds, "secure-comms-outbounds", "", "Outbound tags for secure communication tunnels")
-		flags.StringVar(&secureCommsKbsAddr, "secure-comms-kbs", "kbs-service.kbs-operator-system:8080", "Address of a KBS Service for Secure-Comms")
+		flags.StringVar(&secureCommsKbsAddr, "secure-comms-kbs", "kbs-service.trustee-operator-system:8080", "Address of a KBS Service for Secure-Comms")


The trustee operator namespace changes should be in a separate commit, with an explanation that they are triggered by the change in the trustee-operator project.

I've just noticed a bunch of these changes are in #2073. Is this PR supposed to depend on that one?

Yes - #2073 should be merged first

stevenhorsman · 2024-10-08T13:27:36Z

src/cloud-api-adaptor/docs/SecureComms.md

+```sh
+kubectl get secrets -n trustee-operator-system
+NAME                  TYPE     DATA   AGE
+kbs-auth-public-key   Opaque   1      28h
+kbs-client            Opaque   1      28h
+```


What's the reason for this command being added?

I've just seen this code is under #2065 now. What's going on with these PRs and their duplication of code?

This is a slight improvement to the SecureComms doc: it shows the expected result after following the trustee operator instructions and the recommendation: "Make sure to uncomment the secret generation as recommended for both public and private key (kbs-auth-public-key and kbs-client secrets)."

We can add it to #2073 if it will make things clearer

I prefer to leave this extra documentation detail as is, although it also appears in #2065 - unless someone finds this very disturbing.
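
For readers who want to check the listing programmatically rather than by eye, here is a minimal Go sketch that scans the text output shown in the doc snippet for the two expected secrets. The `missingSecrets` helper is hypothetical and simply parses the table text; it does not call the Kubernetes API.

```go
package main

import (
	"fmt"
	"strings"
)

// missingSecrets scans `kubectl get secrets` table output and returns the
// names from want that do not appear in the NAME column.
func missingSecrets(output string, want []string) []string {
	found := make(map[string]bool)
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] == "NAME" {
			continue // skip blank lines and the header row
		}
		found[fields[0]] = true
	}
	var missing []string
	for _, name := range want {
		if !found[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	out := `NAME                  TYPE     DATA   AGE
kbs-auth-public-key   Opaque   1      28h
kbs-client            Opaque   1      28h`
	missing := missingSecrets(out, []string{"kbs-auth-public-key", "kbs-client"})
	fmt.Println("missing secrets:", len(missing))
}
```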

stevenhorsman · 2024-10-08T13:32:55Z

src/cloud-api-adaptor/docs/SecureComms.md

+kubectl -n confidential-containers-system  get cm peer-pods-cm  -o yaml | sed "s/SECURE_COMMS: \"false\"/SECURE_COMMS: \"true\"/"|kubectl apply -f -
+```
+
+Set InitData to point KBC services to IP address 127.0.0.1 


Should this have a heading, as the "Build a podvm that enforces Secure-Comms (Optional)" section does, since it's an alternative to it?

Some expansion of the explanation of what this is doing and why would be nice.

I will be adapting the documentation based on this comment. Please reconsider the next version.

stevenhorsman · 2024-10-08T13:46:12Z

I've just noticed a bunch of these changes are in #2073. Is this PR supposed to depend on that one?

stevenhorsman · 2024-10-08T14:08:56Z

I've just seen this code is under #2065 now. What's going on with these PRs and their duplication of code?

mkulke

I understand that kata is proceeding with an init-data approach that is passed from the runtime to the agent via a SetInitData() RPC call for the upcoming CoCo release.

This looks like a CAA-specific extension to InitData. If the project adopts kata's init-data approach, would this still work?

davidhadas · 2024-10-21T10:44:06Z

@huoqifeng, can you please review this PR, as it extends your work in #1895?

davidhadas · 2024-10-21T11:07:16Z

@mkulke, thanks for pointing this out.

> I understand that kata is proceeding with an init-data approach that is passed from the runtime to the agent via a SetInitData() RPC call for the upcoming CoCo release.
>
> This looks like a CAA-specific extension to InitData. If the project adopts kata's init-data approach, would this still work?

This PR extends #1895 which introduced a CAA-specific extension to InitData. Here we added configuration of the APF to the already merged configuration of AA and CDH.

Regardless, it seems that the SetInitData() RPC call cannot be used under peer-pods (even without this PR) - see kata-containers/kata-containers#10163 (comment)

mkulke · 2024-10-21T11:50:29Z

> @mkulke, thanks for pointing this out.
>
> > I understand that kata is proceeding with an init-data approach that is passed from the runtime to the agent via a SetInitData() RPC call for the upcoming CoCo release.
> > This looks like a CAA-specific extension to InitData. If the project adopts kata's init-data approach, would this still work?
>
> This PR extends #1895 which introduced a CAA-specific extension to InitData. Here we added configuration of the APF to the already merged configuration of AA and CDH.
>
> Regardless, it seems that the SetInitData() RPC call cannot be used under peer-pods (even without this PR) - see kata-containers/kata-containers#10163 (comment)

Can you elaborate on why a SetInitData() RPC wouldn't work for CAA? I think I POC'd that approach when the init-data design was being discussed to make sure it would work and I didn't run into issues.

My understanding is that agent-protocol-forwarder, daemon.json and user-data are implementation details of CAA that are responsible for setting the stage to allow runtime <=> agent communication and kata should ideally not be concerned about those.

stevenhorsman · 2024-10-21T12:08:05Z

> Can you elaborate on why a SetInitData() RPC wouldn't work for CAA? I think I POC'd that approach when the init-data design was being discussed to make sure it would work and I didn't run into issues.

In your prototype, did you keep the guest-components managed by systemd, or by the kata-agent? It seems to me that if the kata-agent isn't managing them and they are started before the kata-agent, then relying on config from a kata-agent endpoint doesn't really make sense, as we then have this weird undefined start until the kata-agent is up and running and receiving the SetInitData request.

In fairness, I'm still not sure why bare-metal doesn't also adopt the approach of systemd managing the processes, as peer pods do, to avoid this, so I'm obviously missing something.

davidhadas · 2024-10-21T12:45:03Z

@mkulke,

> Can you elaborate on why a SetInitData() RPC wouldn't work for CAA?

To add to what @stevenhorsman indicated, we use attestation when connecting CAA to APF in SecureComms.
To do attestation, we need the measurements of the configuration.
To get the measurements, we need InitData.
We connect runtime to kata-agent only after the attestation is done and keys delivered - keys we are using to complete the establishment of the CAA to APF secure communication channel.

I would assume that allowing someone to change the InitData (using the kata-agent RPC) after attestation has already been done and keys have been delivered to the podvm is not what we want. Am I missing something?

mkulke · 2024-10-21T13:01:28Z

> In your prototype, did you keep the guest-components managed by systemd, or by the kata-agent? It seems to me that if the kata-agent isn't managing them and they are started before the kata-agent, then relying on config from a kata-agent endpoint doesn't really make sense, as we then have this weird undefined start until the kata-agent is up and running and receiving the SetInitData request.
>
> In fairness, I'm still not sure why bare-metal doesn't also adopt the approach of systemd managing the processes, as peer pods do, to avoid this, so I'm obviously missing something.

I would have to dig a bit to find the details, but I think in the PoC code kata-agent was talking via ttRPC to attestation-agent to update the configuration, so it was indeed half-initialized. However, as I understand it, that endpoint is about to be removed. AA will now somehow have to perform a binding to the TEE itself (e.g. compare host-data, extend an init-data PCR).

mkulke · 2024-10-21T13:42:49Z

> We connect runtime to kata-agent only after the attestation is done and keys delivered - keys we are using to complete the establishment of the CAA to APF secure communication channel.

I understand this applies if you use the secure communications feature, but in a default CAA installation there is no attestation ceremony before runtime <=> agent communication is established, afaik?

davidhadas requested a review from a team as a code owner September 30, 2024 10:10

davidhadas force-pushed the secComms_initData branch 2 times, most recently from 8630009 to 206ce0f Compare October 7, 2024 12:37

davidhadas mentioned this pull request Oct 8, 2024

SecureComms: E2e test SecureComms with KBS #2093

Open

stevenhorsman reviewed Oct 8, 2024

davidhadas force-pushed the secComms_initData branch from 206ce0f to 1a5d0a9 Compare October 9, 2024 08:50

mkulke reviewed Oct 11, 2024

davidhadas mentioned this pull request Oct 16, 2024

Feat | Implement initdata for bare-metal/qemu hypervisor kata-containers/kata-containers#10163

Draft

SecureComms: Activate PP SC using InitData

914490c

Add apf.json to InitData

apf.json includes a secure-comms key used to activate secure comms (if not already activated using an agent-protocol-forwarder.service flag)

Signed-off-by: David Hadas <[email protected]>

davidhadas force-pushed the secComms_initData branch from 1a5d0a9 to 914490c Compare October 21, 2024 05:31
