New network topology for firecracker VMs #798
e3464df to a19d27a
LGTM, please check the tests
@ustiugov I made an isolated change (i.e., the new networking module is not used yet), so, AFAIC, the test failures are not related to my changes (and the unit test for the networking module passes).
I'm restarting the runners, might be related to this
@lrq619 looks like the workflows are all failing due to the same error:
[09:32:45] [Info] Installing Knative Serving component (gvisor mode) >>>>>
[09:33:34] [Error] [exit 1] -> error: unable to read URL "https://raw.githubusercontent.com/vhive-serverless/vHive/main/configs/knative_yamls/serving-core.yaml", server reported 503 Service Unavailable, status code=503
[09:33:34] [Error] Failed to install Knative Serving component!
[09:33:34] [Error] Faild subcommand: start_onenode_vhive_cluster!
All tests passed
Merge is blocked because the commit is deemed unsigned. I think the novel you put in the commit message is too much for this automatic check. Can you please remove it and update the branch?
22c3440 to 87b3e72
Currently, each Firecracker VM needs a TAP network device to route its packets into the network stack of the physical host. When a function instance is saved and restored, the TAP device name and the IP address of the function's server, running inside the container, are preserved (see also the current requirements for vanilla Firecracker snapshot loading [1]). This leads to networking conflicts on the host and limits snapshot restoration to a single instance per physical machine. To bypass this obstacle, the following network topology is proposed:

1. A new network namespace (e.g., VMns4) is created for each VM, in which the TAP device from the snapshotted VM is rebuilt and receives the original IP address of the function. The TAP device forwards all incoming and outgoing packets between the serverless function and the VM's network interface. Each VM runs in its own network namespace, so networking resources no longer conflict on the host.

2. A local virtual tunnel is established between the VM inside its network namespace and the host node via a virtual ethernet pair (veth). A link is then established between the two ends of the pair: one in the network namespace (veth4-0) and one in the host namespace (veth4-1). In contrast, the default vHive configuration sets up a similar forwarding system through network bridges.

3. Inside the network namespace, we add a routing rule that redirects all packets via the VM end of the veth pair towards a default gateway (172.17.0.17). Thus, all packets sent by the function show up at the host's end of the tunnel.

4. To avoid IP conflicts when routing packets to and from functions, each VM is assigned a unique clone address (172.18.0.5). All packets leaving the VM end of the veth pair get their source address rewritten to the clone address of the corresponding VM. Packets entering the host end of the veth pair get their destination address rewritten to the original address of the VM. As a result, each VM still thinks it is using the original address, while in reality its address is translated to a clone address that is different for every VM. This is accomplished using two rules in the NAT table of the VM's network namespace: one in the POSTROUTING chain and one in the PREROUTING chain. The POSTROUTING rule alters packets before they are sent into the virtual tunnel, from the VM namespace to the host, and rewrites their source IP address. Similarly, the PREROUTING rule overwrites the destination address of incoming packets before routing. Together, the two rules ensure that packets going into the namespace have the original IP address of the VM (172.16.0.2) as their destination, while packets coming out of the namespace have the clone IP address (172.18.0.5) as their source. The source IP address inside the namespace remains the same for all VMs in the enhanced snapshotting mode, namely 172.16.0.2.

5. In the routing table of the host, we add a rule stating that any packet whose destination IP is the clone IP of a VM is routed towards the end of the tunnel situated in the corresponding network namespace, through a set gateway (172.17.0.18). This ensures that packets arriving on the host for a VM are sent directly down the right virtual tunnel.

6. In the host's NFT filter table, we add two rules to the FORWARD chain that allow traffic from the host end of the veth pair (veth4-1) to the default host interface (eno49) and vice versa.

Introduce a new networking management component for the topology described above.

[1] https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md#loading-snapshots

Closes vhive-serverless#797
Part of vhive-serverless#794

Signed-off-by: Georgiy Lebedev <[email protected]>
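For readers who want to see steps 2 to 6 of the commit message as concrete host configuration, here is a minimal sketch that only generates the shell commands and does not run them. It is not the code from this PR: the function name `topology_commands` is hypothetical, the `/30` prefix length on the veth addresses is an assumption, and the FORWARD rules are written with `iptables` rather than the NFT filter table the commit message mentions. The TAP device from step 1 is left out because its name is not given above.

```python
def topology_commands(
    ns="VMns4",
    veth_vm="veth4-0",         # end of the veth pair inside the namespace
    veth_host="veth4-1",       # end of the veth pair on the host
    host_gw="172.17.0.17",     # default gateway seen from inside the namespace
    ns_gw="172.17.0.18",       # gateway the host uses to reach the clone address
    original_ip="172.16.0.2",  # address the snapshotted VM believes it has
    clone_ip="172.18.0.5",     # unique per-VM address visible on the host
    host_iface="eno49",
    prefix_len=30,             # assumption: not stated in the commit message
):
    """Return the commands for one VM's namespace, in the order they would run."""
    in_ns = f"ip netns exec {ns}"
    return [
        # Step 1 (namespace only; TAP creation omitted).
        f"ip netns add {ns}",
        # Step 2: veth pair, one end moved into the namespace.
        f"ip link add {veth_host} type veth peer name {veth_vm}",
        f"ip link set {veth_vm} netns {ns}",
        f"{in_ns} ip addr add {ns_gw}/{prefix_len} dev {veth_vm}",
        f"ip addr add {host_gw}/{prefix_len} dev {veth_host}",
        f"{in_ns} ip link set {veth_vm} up",
        f"ip link set {veth_host} up",
        # Step 3: default route out of the namespace.
        f"{in_ns} ip route add default via {host_gw}",
        # Step 4: rewrite source on the way out, destination on the way in.
        f"{in_ns} iptables -t nat -A POSTROUTING -o {veth_vm} "
        f"-s {original_ip} -j SNAT --to-source {clone_ip}",
        f"{in_ns} iptables -t nat -A PREROUTING -i {veth_vm} "
        f"-d {clone_ip} -j DNAT --to-destination {original_ip}",
        # Step 5: host route for the clone address.
        f"ip route add {clone_ip} via {ns_gw}",
        # Step 6: allow forwarding between the veth host end and the uplink.
        f"iptables -A FORWARD -i {veth_host} -o {host_iface} -j ACCEPT",
        f"iptables -A FORWARD -i {host_iface} -o {veth_host} -j ACCEPT",
    ]


if __name__ == "__main__":
    print("\n".join(topology_commands()))
```

Printing the list instead of executing it keeps the sketch safe to run without root; applying the commands on a real host would require root privileges and an interface actually named `eno49`.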
87b3e72 to 02edc1e
@leokondrashov No, the problem was that this repository requires signed commits. Fixed it by setting up commit signing.
Summary
Closes #797
Part of #794
Implementation Notes ⚒️
See #797 for details.
This PR introduces a new networking manager which implements the new network topology described in #797, but does not replace the existing tap manager, as it requires a patch to firecracker-containerd that adds a network namespace parameter to the CreateVM request.
Unit tests for the new networking manager are provided.
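The example values used in the topology description (VMns4, veth4-0/veth4-1, 172.17.0.17, 172.17.0.18, 172.18.0.5) are consistent with deriving every per-VM resource from a single integer ID. The sketch below illustrates one such derivation; it is an inference from those example values, not a description of how the networking manager in this PR allocates addresses, and `VMNetConfig` and `config_for` are hypothetical names.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

# Base addresses inferred from the examples in the description (assumption).
VETH_BASE = IPv4Address("172.17.0.0")
CLONE_BASE = IPv4Address("172.18.0.0")
ORIGINAL_IP = IPv4Address("172.16.0.2")  # shared by all restored VMs


@dataclass(frozen=True)
class VMNetConfig:
    namespace: str
    veth_vm: str               # veth end inside the namespace
    veth_host: str             # veth end on the host
    host_gateway: IPv4Address  # default gateway inside the namespace
    ns_gateway: IPv4Address    # gateway for the host route to the clone IP
    clone_ip: IPv4Address
    original_ip: IPv4Address


def config_for(vm_id: int) -> VMNetConfig:
    """Derive a VM's network resources from its ID (hypothetical scheme)."""
    if vm_id < 0:
        raise ValueError("vm_id must be non-negative")
    # Each VM gets its own block of four addresses for the veth pair.
    block = VETH_BASE + 4 * vm_id
    return VMNetConfig(
        namespace=f"VMns{vm_id}",
        veth_vm=f"veth{vm_id}-0",
        veth_host=f"veth{vm_id}-1",
        host_gateway=block + 1,
        ns_gateway=block + 2,
        clone_ip=CLONE_BASE + vm_id + 1,
        original_ip=ORIGINAL_IP,
    )
```

With `vm_id=4` this reproduces the addresses quoted in the description, and distinct IDs never share a clone address or a veth address block, which is the property the topology relies on.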
External Dependencies 🍀
Breaking API Changes ⚠️