Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connectivity: extend node to node encryption tests #1870

Merged

Conversation

3u13r
Copy link
Contributor

@3u13r 3u13r commented Jul 27, 2023

This PR extends the node-to-node encryption tests. Before no actions would fail when running them with encryption disabled (Before cilium/cilium#26712 it was 1/3, because the tunneling feature wasn't correctly detected) in one of my environments (WireGuard, tunnel, vxlan, endpointRoutes). I've adapted them so they work with native routing and tunneling with WireGuard (as far as I could test it), i.e. all actions fail when encryption is disabled and succeed if it is enabled.

The first two commits are just some minor cleanup and documentation in the feature detection. Feedback regarding the detection preference comment is especially welcome.

Output before:

[=] Test [node-to-node-encryption]
  [-] Scenario [node-to-node-encryption/node-to-node-encryption]
  🐛 Detected cilium_vxlan iface for communication among client and server nodes
  🐛 Running in bg: tcpdump -i cilium_vxlan --immediate-mode -w /tmp/node-to-node-encryption.pcap src host 10.244.2.111 and dst host 192.168.179.114 and icmp -c 1
  [.] Action [node-to-node-encryption/node-to-node-encryption/ping-ipv4: cilium-test/client-c4bfddc44-jgzxz (10.244.2.111) -> cilium-test/host-netns-h6nn8 (192.168.179.114:0)]
  🐛 Executing command [ping -c 1 -W 2 -w 10 192.168.179.114]
  🐛 Detected cilium_vxlan iface for communication among client and server nodes
  🐛 Running in bg: tcpdump -i cilium_vxlan --immediate-mode -w /tmp/node-to-node-encryption.pcap src host 192.168.179.108 and dst host 192.168.179.114 and icmp -c 1
  [.] Action [node-to-node-encryption/node-to-node-encryption/ping-ipv4: cilium-test/host-netns-2k57p (192.168.179.108) -> cilium-test/host-netns-h6nn8 (192.168.179.114:0)]
  🐛 Executing command [ping -c 1 -W 2 -w 10 192.168.179.114]
  🐛 Detected cilium_vxlan iface for communication among client and server nodes
  🐛 Running in bg: tcpdump -i cilium_vxlan --immediate-mode -w /tmp/node-to-node-encryption.pcap src host 192.168.179.108 and dst host 10.244.1.159 and tcp -c 1
  [.] Action [node-to-node-encryption/node-to-node-encryption/curl-ipv4: cilium-test/host-netns-2k57p (192.168.179.108) -> cilium-test/echo-other-node-646976b7dd-nxl4x (10.244.1.159:8080)]
  🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.244.1.159:8080]

Output after:

[=] Test [node-to-node-encryption]
  [-] Scenario [node-to-node-encryption/node-to-node-encryption]
  🐛 Running /bin/sh -c ip -o route get 192.168.179.114 | grep -oE 'dev [^ ]*' | cut -d' ' -f2
  🐛 Running /bin/sh -c ip -o route get 192.168.179.114 | grep -oE 'src [^ ]*' | cut -d' ' -f2
  🐛 Running in bg: tcpdump -i ens5 --immediate-mode -w /tmp/node-to-node-encryption.pcap src host 192.168.179.108 and dst host 192.168.179.114 and icmp
  [.] Action [node-to-node-encryption/node-to-node-encryption/ping-ipv4: cilium-test/client-c4bfddc44-jgzxz (10.244.2.111) -> cilium-test/host-netns-h6nn8 (192.168.179.114:0)]
  🐛 Executing command [ping -c 1 -W 2 -w 10 192.168.179.114]
  ❌ Captured unencrypted pkt (count=1 packet)
  🐛 Captured pkts:
10:44:05.285615 IP 192.168.179.108 > 192.168.179.114: ICMP echo request, id 11628, seq 1, length 64

  🐛 Running /bin/sh -c ip -o route get 192.168.179.114 | grep -oE 'dev [^ ]*' | cut -d' ' -f2
  🐛 Running /bin/sh -c ip -o route get 192.168.179.114 | grep -oE 'src [^ ]*' | cut -d' ' -f2
  🐛 Running in bg: tcpdump -i ens5 --immediate-mode -w /tmp/node-to-node-encryption.pcap src host 192.168.179.108 and dst host 192.168.179.114 and icmp
  [.] Action [node-to-node-encryption/node-to-node-encryption/ping-ipv4: cilium-test/host-netns-2k57p (192.168.179.108) -> cilium-test/host-netns-h6nn8 (192.168.179.114:0)]
  🐛 Executing command [ping -c 1 -W 2 -w 10 192.168.179.114]
  ❌ Captured unencrypted pkt (count=1 packet)
  🐛 Captured pkts:
10:44:07.285568 IP 192.168.179.108 > 192.168.179.114: ICMP echo request, id 2, seq 1, length 64

  🐛 Running /bin/sh -c ip -o route get 10.244.1.159 | grep -oE 'dev [^ ]*' | cut -d' ' -f2
  🐛 Running /bin/sh -c ip -o route get 10.244.1.159 | grep -oE 'src [^ ]*' | cut -d' ' -f2
  🐛 Running in bg: tcpdump -i cilium_vxlan --immediate-mode -w /tmp/node-to-node-encryption.pcap src host 10.244.2.4 and dst host 10.244.1.159 and tcp
  [.] Action [node-to-node-encryption/node-to-node-encryption/curl-ipv4: cilium-test/host-netns-2k57p (192.168.179.108) -> cilium-test/echo-other-node-646976b7dd-nxl4x (10.244.1.159:8080)]
  🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.244.1.159:8080]
  ❌ Captured unencrypted pkt (count=6 packets)
  🐛 Captured pkts:
10:44:09.107766 IP 10.244.2.4.55498 > 10.244.1.159.8080: Flags [S], seq 1124524169, win 64860, options [mss 1410,sackOK,TS val 2752214173 ecr 0,nop,wscale 7], length 0
10:44:09.108262 IP 10.244.2.4.55498 > 10.244.1.159.8080: Flags [.], ack 766383770, win 507, options [nop,nop,TS val 2752214174 ecr 3559090258], length 0
10:44:09.108331 IP 10.244.2.4.55498 > 10.244.1.159.8080: Flags [P.], seq 0:80, ack 1, win 507, options [nop,nop,TS val 2752214174 ecr 3559090258], length 80: HTTP: GET / HTTP/1.1
10:44:09.109436 IP 10.244.2.4.55498 > 10.244.1.159.8080: Flags [.], ack 2446, win 498, options [nop,nop,TS val 2752214175 ecr 3559090259], length 0
10:44:09.109563 IP 10.244.2.4.55498 > 10.244.1.159.8080: Flags [F.], seq 80, ack 2446, win 502, options [nop,nop,TS val 2752214175 ecr 3559090259], length 0
10:44:09.110085 IP 10.244.2.4.55498 > 10.244.1.159.8080: Flags [.], ack 2447, win 502, options [nop,nop,TS val 2752214176 ecr 3559090260], length 0

@3u13r 3u13r requested a review from a team as a code owner July 27, 2023 09:45
@3u13r 3u13r requested a review from derailed July 27, 2023 09:45
@maintainer-s-little-helper
Copy link

Commits 7c5c9d4, 05a5030, 7c7653d, 3e0b341 do not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@3u13r 3u13r temporarily deployed to ci July 27, 2023 09:45 — with GitHub Actions Inactive
@3u13r 3u13r force-pushed the 3u13r/feat/node-to-node-encryption-tests branch from 3e0b341 to 58b7b18 Compare July 27, 2023 09:54
@3u13r 3u13r temporarily deployed to ci July 27, 2023 09:54 — with GitHub Actions Inactive
@3u13r
Copy link
Contributor Author

3u13r commented Aug 1, 2023

I'm already working on the follow-up PR to this (porting the pod-to-pod-strict-mode test to the cilium-cli). Since the future changes are based on this PR, could someone maybe take a look. Maybe @brb?

@brb brb self-requested a review August 1, 2023 11:20
@maintainer-s-little-helper
Copy link

Commit e9833c2 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper
Copy link

Commit e9833c2 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@3u13r 3u13r temporarily deployed to ci August 2, 2023 10:47 — with GitHub Actions Inactive
@3u13r 3u13r force-pushed the 3u13r/feat/node-to-node-encryption-tests branch from b4bfd7b to 7ed9b37 Compare August 2, 2023 11:12
@3u13r 3u13r temporarily deployed to ci August 2, 2023 11:12 — with GitHub Actions Inactive
@3u13r
Copy link
Contributor Author

3u13r commented Aug 4, 2023

I rebased the PR to main to fix the CI. btw: the follow-up code implementing the strict mode pod-to-pod test is already finished. I'll create a PR once this is merged, as I depend on this PR.

@3u13r
Copy link
Contributor Author

3u13r commented Aug 10, 2023

Are there any updates regarding the timeline of the review?

@brb
Copy link
Member

brb commented Aug 11, 2023

@3u13r Sorry, a bit swamped. I will try to review in 1-2 weeks.

@3u13r
Copy link
Contributor Author

3u13r commented Aug 14, 2023

No worries, just want to make sure that this PR isn't forgotten.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I did a quick skim of the PR. I will do the review once the commit msgs have been extended.

connectivity/check/features.go Show resolved Hide resolved
connectivity/tests/encryption.go Show resolved Hide resolved
The default value is not necessarily reflected in the ConfigMap.
Therefore, we need to set the right values when the keys are not found.

Accoding to the documented defaults in Cilium's deployment, tunneling
with vxlan is enabled if not specified otherwise.

Signed-off-by: Leonard Cohnen <[email protected]>
When tunneling is enabled, and the feature is correctly picked up the
the cilium-cli, the encryption test always listens on the tunneling
interface for leaked packets. This is wrong as e.g. node-to-node
communication is not encapsulated and therefore does not pass the
cilium_vxlan interface.

Signed-off-by: Leonard Cohnen <[email protected]>
@3u13r 3u13r force-pushed the 3u13r/feat/node-to-node-encryption-tests branch from 7ed9b37 to 9e40129 Compare August 14, 2023 11:27
@3u13r 3u13r temporarily deployed to ci August 14, 2023 11:27 — with GitHub Actions Inactive
@3u13r
Copy link
Contributor Author

3u13r commented Aug 14, 2023

I extended both commit messages. I assume that the other two commits are small enough to be explained by only the commit message. If not, I'm happy to extend those also.
It seems to me that the failing test is unrelated to the code I changed.

@3u13r 3u13r requested a review from brb August 15, 2023 08:44
@brb
Copy link
Member

brb commented Aug 16, 2023

Thanks! HOWTO test it against Cilium (we should put it into the contrib docs):

  1. Get the CLI CI image tag from https://github.com/cilium/cilium-cli/actions/runs/5855144718/job/15872443885?pr=1870.
  2. Open a draft PR against cilium/cilium with the image tag set in https://github.com/cilium/cilium/blob/main/.github/workflows/conformance-e2e.yaml#L56 and https://github.com/cilium/cilium/blob/main/.github/workflows/conformance-e2e.yaml#L55 unset.
  3. Comment /ci-e2e in that PR.

@3u13r
Copy link
Contributor Author

3u13r commented Aug 16, 2023

Done. Not 100% how to interpret the test failure but I assume that it is unrelated. On the other hand, I couldn't find a recent similar failure.

@brb
Copy link
Member

brb commented Aug 17, 2023

Thanks. The ci-e2e has passed (https://github.com/cilium/cilium/actions/runs/5880176520). The rest is irrelevant.

@3u13r
Copy link
Contributor Author

3u13r commented Aug 17, 2023

ci-ipsec-e2e also looks good: https://github.com/cilium/cilium/actions/runs/5889185248

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Unrelated to your PR, but the encryption leak test is a bit brittle, as it depends on the src and netdev derivations, which might become wrong, thus giving a false negative.

To eliminate the src detection, we could use a unique server port, and use it in the tcpdump filter.

However, improving/eliminating the netdev detection is not that trivial.

@brb brb added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. area/CI Continuous Integration testing issue or flake labels Aug 18, 2023
@3u13r
Copy link
Contributor Author

3u13r commented Aug 18, 2023

Thanks for the review and the pointers for testing it against cilium!

I can have a look at making this test less brittle. What I always do locally is to disable the encryption and check that the test fails. Since the tests are run in the ci, inverting the test when encryption is disabled is technically an option. But I'll look at the simpler options first.

@michi-covalent michi-covalent merged commit 33ddb44 into cilium:main Aug 21, 2023
19 checks passed
@3u13r 3u13r deleted the 3u13r/feat/node-to-node-encryption-tests branch September 18, 2023 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants