New network topology for firecracker VMs
Currently, each firecracker VM needs a TAP network device to route its packets into the network stack of the physical host. When saving and restoring a function instance, the TAP device name and the IP address of the function's server, running inside the container, are preserved (see also the current requirements for vanilla firecracker snapshot loading [1]). This leads to networking conflicts on the host and limits snapshot restoration to a single instance per physical machine.

To bypass this obstacle, the following network topology is proposed:

1. A new network namespace (e.g. VMns4) is created for each VM, in which the TAP device from the snapshotted VM is rebuilt and receives the original IP address of the function. The TAP device carries all packets to and from the serverless function via the VM's network interface. Because each VM runs in its own network namespace, VMs no longer conflict over networking resources on the host.

2. A local virtual tunnel is established between the VM inside its network namespace and the host node via a virtual ethernet (veth) pair. A link is then established between the two ends of the pair, one in the network namespace (veth4-0) and one in the host namespace (veth4-1). In contrast, the default vHive configuration sets up a similar forwarding system through network bridges.

3. Inside the network namespace, we add a routing rule that sends all packets via the VM end of the veth pair towards a default gateway (172.17.0.17). Thus, all packets sent by the function appear at the host's end of the tunnel.

4. To avoid IP conflicts when routing packets to and from functions, each VM is assigned a unique clone address (e.g. 172.18.0.5). All packets leaving the VM end of the veth pair get their source address rewritten to the clone address of the corresponding VM. Packets entering the host end of the veth pair get their destination address rewritten to the original address of the VM. As a result, each VM still thinks it is using the original address, while in reality its address is translated to a clone address that is different for every VM.

   This is accomplished with two rules in the NAT table of the VM's network namespace: one in the POSTROUTING chain and one in the PREROUTING chain. The POSTROUTING rule rewrites the source IP address of packets before they are sent out through the virtual tunnel, from the VM namespace to the host. Similarly, the PREROUTING rule rewrites the destination address of incoming packets before routing. Together, the two rules ensure that packets going into the network namespace have the original IP address of the VM (172.16.0.2) as their destination, while packets coming out of the namespace have the clone IP address (172.18.0.5) as their source. The original IP address stays the same for all VMs in the enhanced snapshotting mode, namely 172.16.0.2.

5. In the routing table of the host, we add a rule stating that any packet whose destination IP is the clone IP of a VM is routed towards the end of the tunnel located in the corresponding network namespace, through a set gateway (172.17.0.18). This ensures that packets arriving on the host for a VM are sent down the right virtual tunnel.

6. In the host's NFT filter table, we add two rules to the FORWARD chain that allow traffic from the host end of the veth pair (veth4-1) to the default host interface (eno49) and vice versa.

The tap manager will be refactored into a new network manager component responsible for the network topology described above.

[1] https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md#loading-snapshots

Closes #797
Part of #794

Signed-off-by: Georgiy Lebedev <[email protected]>
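
For illustration only, the six steps of the proposed topology can be sketched as the list of commands a network manager might issue for one VM. This is not the implementation in this commit. The names VMNetConfig and build_commands are hypothetical, and several values are assumptions not stated in the description: the TAP name (tap0), the TAP address and prefix (172.16.0.1/24), the /30 prefix on the veth pair, and the assignment of 172.17.0.17 to the host end and 172.17.0.18 to the namespace end. The sketch also uses iptables syntax as a stand-in for the nftables rules mentioned in the description. It only builds the argument lists and does not execute anything.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VMNetConfig:
    """Per-VM values (hypothetical type); defaults follow the examples in the text."""
    netns: str = "VMns4"
    tap: str = "tap0"                  # assumed TAP name, not given in the text
    tap_cidr: str = "172.16.0.1/24"    # assumed TAP address on the guest subnet
    veth_vm: str = "veth4-0"           # veth end inside the namespace
    veth_host: str = "veth4-1"         # veth end in the host namespace
    veth_vm_ip: str = "172.17.0.18"    # assumed to sit on the namespace end
    veth_host_ip: str = "172.17.0.17"  # assumed to sit on the host end
    veth_prefix: int = 30              # assumed prefix length
    guest_ip: str = "172.16.0.2"       # original address, same for every VM
    clone_ip: str = "172.18.0.5"       # unique per VM
    host_iface: str = "eno49"


def build_commands(cfg: VMNetConfig) -> List[List[str]]:
    """Return argv lists for steps 1-6; nothing is executed here."""
    in_ns = ["ip", "netns", "exec", cfg.netns]
    vm_cidr = f"{cfg.veth_vm_ip}/{cfg.veth_prefix}"
    host_cidr = f"{cfg.veth_host_ip}/{cfg.veth_prefix}"
    return [
        # Step 1: namespace with the rebuilt TAP device.
        ["ip", "netns", "add", cfg.netns],
        in_ns + ["ip", "tuntap", "add", "dev", cfg.tap, "mode", "tap"],
        in_ns + ["ip", "addr", "add", cfg.tap_cidr, "dev", cfg.tap],
        in_ns + ["ip", "link", "set", cfg.tap, "up"],
        # Step 2: veth pair created in the namespace; the host end is then
        # moved to the namespace of PID 1 (assumed to be the host namespace).
        in_ns + ["ip", "link", "add", cfg.veth_host, "type", "veth",
                 "peer", "name", cfg.veth_vm],
        in_ns + ["ip", "link", "set", cfg.veth_host, "netns", "1"],
        in_ns + ["ip", "addr", "add", vm_cidr, "dev", cfg.veth_vm],
        in_ns + ["ip", "link", "set", cfg.veth_vm, "up"],
        ["ip", "addr", "add", host_cidr, "dev", cfg.veth_host],
        ["ip", "link", "set", cfg.veth_host, "up"],
        # Step 3: default route out of the namespace via the host end.
        in_ns + ["ip", "route", "add", "default", "via", cfg.veth_host_ip],
        # Step 4: translate between the original and the clone address.
        in_ns + ["iptables", "-t", "nat", "-A", "POSTROUTING",
                 "-o", cfg.veth_vm, "-s", cfg.guest_ip,
                 "-j", "SNAT", "--to-source", cfg.clone_ip],
        in_ns + ["iptables", "-t", "nat", "-A", "PREROUTING",
                 "-i", cfg.veth_vm, "-d", cfg.clone_ip,
                 "-j", "DNAT", "--to-destination", cfg.guest_ip],
        # Step 5: host route for the clone address via the namespace end.
        ["ip", "route", "add", cfg.clone_ip, "via", cfg.veth_vm_ip],
        # Step 6: allow forwarding between the veth host end and the uplink.
        ["iptables", "-A", "FORWARD", "-i", cfg.veth_host,
         "-o", cfg.host_iface, "-j", "ACCEPT"],
        ["iptables", "-A", "FORWARD", "-i", cfg.host_iface,
         "-o", cfg.veth_host, "-j", "ACCEPT"],
    ]


if __name__ == "__main__":
    # Print the commands for inspection instead of running them.
    for argv in build_commands(VMNetConfig()):
        print(shlex.join(argv))
```

A second VM restored from the same snapshot would keep guest_ip and the TAP settings unchanged and differ only in the namespace name, the veth names and addresses, and the clone address, which is what lets several restored instances coexist on one host.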