Allow snapshot tap changes #4731

Open
wants to merge 9 commits into
base: main
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -119,15 +119,20 @@ and this project adheres to
Support for VMGenID via DeviceTree bindings exists only on mainline 6.10 Linux
onwards. Users of Firecracker will need to backport the relevant patches on
top of their 6.1 kernels to make use of the feature.

- [#4732](https://github.com/firecracker-microvm/firecracker/pull/4732),
[#4733](https://github.com/firecracker-microvm/firecracker/pull/4733),
[#4741](https://github.com/firecracker-microvm/firecracker/pull/4741),
[#4746](https://github.com/firecracker-microvm/firecracker/pull/4746): Added
official support for 6.1 microVM guest kernels.

- [#4743](https://github.com/firecracker-microvm/firecracker/pull/4743): Added
support for `-h` help flag to the Jailer. The Jailer will now print the help
message with either `--help` or `-h`.

- [#4731](https://github.com/firecracker-microvm/firecracker/pull/4731): Added
support for modifying the host TAP device name during snapshot restore.

### Changed

### Deprecated
60 changes: 60 additions & 0 deletions docs/snapshotting/network-for-clones.md
@@ -142,6 +142,66 @@ Otherwise, packets originating from the guest might be using old Link Layer
Address for up to arp cache timeout seconds. After said timeout period,
connectivity will work both ways even without an explicit flush.

### Renaming host device names

In environments where the jailer is not used, restoring a snapshot can be
tricky because the host tap device may differ from the one the original VM was
attached to when the snapshot was taken, for example when tap devices are
allocated from a pool.

In this case, use the `network_overrides` parameter of the snapshot load
request to specify which guest network device maps to which host tap device.

For example, if the snapshotted microVM has a network interface named `eth0`,
we can override it to point to the host device `vmtap01` when loading the
snapshot:


```bash
curl --unix-socket /tmp/firecracker.socket -i \
-X PUT 'http://localhost/snapshot/load' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"snapshot_path": "./snapshot_file",
"mem_backend": {
"backend_path": "./mem_file",
"backend_type": "File"
},
"enable_diff_snapshots": true,
"resume_vm": false,
"network_overrides": [
{
"iface_id": "eth0",
"host_dev_name": "vmtap01"
}
]
}'
```
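
Callers that build this request programmatically can derive the
`network_overrides` array from a mapping of guest interface names to host tap
names. The following is a minimal sketch in Python; the helper name
`build_snapshot_load_body` is hypothetical and not part of Firecracker, and the
field names are the ones used in the request above.

```python
import json


def build_snapshot_load_body(snapshot_path, mem_path, overrides=None, resume_vm=False):
    """Return the JSON body for `PUT /snapshot/load` as a string.

    `overrides` maps a guest interface name to a host tap device name.
    Illustrative helper only; it is not provided by Firecracker.
    """
    body = {
        "snapshot_path": snapshot_path,
        "mem_backend": {"backend_path": mem_path, "backend_type": "File"},
        "resume_vm": resume_vm,
    }
    # Only send the field when there is something to override.
    if overrides:
        body["network_overrides"] = [
            {"iface_id": iface_id, "host_dev_name": host_dev}
            for iface_id, host_dev in overrides.items()
        ]
    return json.dumps(body)


body = build_snapshot_load_body("./snapshot_file", "./mem_file", {"eth0": "vmtap01"})
```

Omitting the field when no override is requested keeps the request compatible
with Firecracker versions that do not know about `network_overrides`.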

This may require reconfiguring the network inside the VM so that it remains
routable externally. The "In The Guest" section of the
[network setup documentation](../network-setup.md) describes the typical
setup. If you are not using network namespaces or the jailer, the guest has to
be told (via vsock or another channel) to reconfigure its network to match the
network configured on the tap device.

If the new TAP device, say `vmtap3`, has been configured to use a guest address
of `172.16.3.2`, then after snapshot restore you would run something like:

```bash
# In the guest

# Clear out the previous addr and route
ip addr flush dev eth0
ip route flush dev eth0

# Configure the new address
ip addr add 172.16.3.2/30 dev eth0
ip route add default via 172.16.3.1 dev eth0
```
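
If the host side sends these commands to the guest (for example over vsock),
they can be derived from the guest address alone. The sketch below is one way
to do that in Python. It assumes a point-to-point style subnet such as `/30`
in which the tap device holds the only other usable address and acts as the
default gateway; the function name `guest_reconfig_commands` is hypothetical.

```python
import ipaddress


def guest_reconfig_commands(guest_addr, prefix_len=30, dev="eth0"):
    """Build the `ip` commands that move `dev` to `guest_addr`.

    Assumption: the gateway is the single other usable host address in the
    subnet, as in the /30 layout used in this document.
    """
    iface = ipaddress.ip_interface(f"{guest_addr}/{prefix_len}")
    peers = [host for host in iface.network.hosts() if host != iface.ip]
    if len(peers) != 1:
        raise ValueError("expected a subnet with exactly one peer address")
    gateway = peers[0]
    return [
        f"ip addr flush dev {dev}",
        f"ip route flush dev {dev}",
        f"ip addr add {iface.with_prefixlen} dev {dev}",
        f"ip route add default via {gateway} dev {dev}",
    ]


commands = guest_reconfig_commands("172.16.3.2")
```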

# Ingress connectivity

The above setup only provides egress connectivity. If in addition we also want
44 changes: 43 additions & 1 deletion src/firecracker/src/api_server/request/snapshot.rs
@@ -105,6 +105,7 @@ fn parse_put_snapshot_load(body: &Body) -> Result<ParsedRequest, RequestError> {
mem_backend,
enable_diff_snapshots: snapshot_config.enable_diff_snapshots,
resume_vm: snapshot_config.resume_vm,
network_overrides: snapshot_config.network_overrides,
};

// Construct the `ParsedRequest` object.
@@ -120,7 +121,7 @@ fn parse_put_snapshot_load(body: &Body) -> Result<ParsedRequest, RequestError> {

#[cfg(test)]
mod tests {
- use vmm::vmm_config::snapshot::{MemBackendConfig, MemBackendType};
+ use vmm::vmm_config::snapshot::{MemBackendConfig, MemBackendType, NetworkOverride};

use super::*;
use crate::api_server::parsed_request::tests::{depr_action_from_req, vmm_action_from_request};
@@ -181,6 +182,7 @@ mod tests {
},
enable_diff_snapshots: false,
resume_vm: false,
network_overrides: vec![],
};
let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
assert!(parsed_request
@@ -208,6 +210,7 @@ mod tests {
},
enable_diff_snapshots: true,
resume_vm: false,
network_overrides: vec![],
};
let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
assert!(parsed_request
@@ -235,6 +238,44 @@ mod tests {
},
enable_diff_snapshots: false,
resume_vm: true,
network_overrides: vec![],
};
let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
assert!(parsed_request
.parsing_info()
.take_deprecation_message()
.is_none());
assert_eq!(
vmm_action_from_request(parsed_request),
VmmAction::LoadSnapshot(expected_config)
);

let body = r#"{
"snapshot_path": "foo",
"mem_backend": {
"backend_path": "bar",
"backend_type": "Uffd"
},
"resume_vm": true,
"network_overrides": [
{
"iface_id": "eth0",
"host_dev_name": "vmtap2"
}
]
}"#;
let expected_config = LoadSnapshotParams {
snapshot_path: PathBuf::from("foo"),
mem_backend: MemBackendConfig {
backend_path: PathBuf::from("bar"),
backend_type: MemBackendType::Uffd,
},
enable_diff_snapshots: false,
resume_vm: true,
network_overrides: vec![NetworkOverride {
iface_id: String::from("eth0"),
host_dev_name: String::from("vmtap2"),
}],
};
let mut parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
assert!(parsed_request
@@ -259,6 +300,7 @@ mod tests {
},
enable_diff_snapshots: false,
resume_vm: true,
network_overrides: vec![],
};
let parsed_request = parse_put_snapshot(&Body::new(body), Some("load")).unwrap();
assert_eq!(
24 changes: 24 additions & 0 deletions src/firecracker/swagger/firecracker.yaml
@@ -1216,6 +1216,24 @@ definitions:
Type of snapshot to create. It is optional and by default, a full
snapshot is created.

NetworkOverride:
type: object
description:
Allows for changing the backing TAP device of a network interface
during snapshot restore.
required:
- iface_id
- host_dev_name
properties:
iface_id:
type: string
description:
The name of the interface to modify
host_dev_name:
type: string
description:
The new host device of the interface

SnapshotLoadParams:
type: object
description:
@@ -1247,6 +1265,12 @@ definitions:
type: boolean
description:
When set to true, the vm is also resumed if the snapshot load is successful.
network_overrides:
type: array
description: Network host device names to override
items:
$ref: "#/definitions/NetworkOverride"


TokenBucket:
type: object
4 changes: 2 additions & 2 deletions src/vmm/src/devices/virtio/net/persist.rs
@@ -55,8 +55,8 @@ impl RxBufferState {
/// at snapshot.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NetState {
- id: String,
- tap_if_name: String,
+ pub id: String,
+ pub tap_if_name: String,
rx_rate_limiter_state: RateLimiterState,
tx_rate_limiter_state: RateLimiterState,
/// The associated MMDS network stack.
18 changes: 17 additions & 1 deletion src/vmm/src/persist.rs
@@ -418,7 +418,21 @@
params: &LoadSnapshotParams,
vm_resources: &mut VmResources,
) -> Result<Arc<Mutex<Vmm>>, RestoreFromSnapshotError> {
- let microvm_state = snapshot_state_from_file(&params.snapshot_path)?;
+ let mut microvm_state = snapshot_state_from_file(&params.snapshot_path)?;
for entry in &params.network_overrides {
let net_devices = &mut microvm_state.device_states.net_devices;
if let Some(device) = net_devices
.iter_mut()
.find(|x| x.device_state.id == entry.iface_id)
{
device
.device_state
.tap_if_name
.clone_from(&entry.host_dev_name);
} else {
return Err(SnapshotStateFromFileError::UnknownNetworkDevice.into());
}
}
let track_dirty_pages = params.enable_diff_snapshots;

let vcpu_count = microvm_state
@@ -490,6 +504,8 @@
Meta(std::io::Error),
/// Failed to load snapshot state from file: {0}
Load(#[from] crate::snapshot::SnapshotError),
/// Unknown Network Device.
UnknownNetworkDevice,
}

fn snapshot_state_from_file(
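
For readers who want to reason about the override loop in this file without
the surrounding Rust types, its behavior can be modeled in a few lines of
Python. This is a simplified sketch, not the Firecracker implementation: the
saved net device states are stood in for by plain dicts with `id` and
`tap_if_name` keys, and `UnknownNetworkDevice` mirrors the new error variant.

```python
class UnknownNetworkDevice(Exception):
    """Raised when an override names an interface absent from the snapshot."""


def apply_network_overrides(net_devices, overrides):
    """Rewrite `tap_if_name` in place for every overridden interface.

    `net_devices`: list of dicts with `id` and `tap_if_name` keys.
    `overrides`: list of dicts with `iface_id` and `host_dev_name` keys.
    """
    for entry in overrides:
        # Look up the saved device whose id matches the requested interface.
        device = next((d for d in net_devices if d["id"] == entry["iface_id"]), None)
        if device is None:
            # Fail the restore instead of silently ignoring the override.
            raise UnknownNetworkDevice(entry["iface_id"])
        device["tap_if_name"] = entry["host_dev_name"]
    return net_devices
```

The two branches, a matching interface and an unknown one, are the cases a
unit test for this logic would need to exercise.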
1 change: 1 addition & 0 deletions src/vmm/src/rpc_interface.rs
@@ -1269,6 +1269,7 @@ mod tests {
},
enable_diff_snapshots: false,
resume_vm: false,
network_overrides: vec![],
},
)));
check_unsupported(runtime_request(VmmAction::SetEntropyDevice(
15 changes: 15 additions & 0 deletions src/vmm/src/vmm_config/snapshot.rs
@@ -47,6 +47,16 @@ pub struct CreateSnapshotParams {
pub mem_file_path: PathBuf,
}

/// Allows for changing the host TAP device that backs a guest network
/// interface during snapshot restore.
#[derive(Debug, PartialEq, Eq, Deserialize)]
pub struct NetworkOverride {
/// The ID of the network interface to modify
pub iface_id: String,
/// The name of the host TAP device to attach the interface to
pub host_dev_name: String,
}

/// Stores the configuration that will be used for loading a snapshot.
#[derive(Debug, PartialEq, Eq)]
pub struct LoadSnapshotParams {
@@ -60,6 +70,8 @@ pub struct LoadSnapshotParams {
/// When set to true, the vm is also resumed if the snapshot load
/// is successful.
pub resume_vm: bool,
/// The network devices to override on load.
pub network_overrides: Vec<NetworkOverride>,
}

/// Stores the configuration for loading a snapshot that is provided by the user.
@@ -82,6 +94,9 @@ pub struct LoadSnapshotConfig {
/// Whether or not to resume the vm post snapshot load.
#[serde(default)]
pub resume_vm: bool,
/// The network devices to override on load.
#[serde(default)]
pub network_overrides: Vec<NetworkOverride>,
}

/// Stores the configuration used for managing snapshot memory.
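
The effect of `#[serde(default)]` on `network_overrides` is that a request
without the field deserializes to an empty list, so existing clients keep
working, while each override entry still needs both of its fields. The
following Python sketch models that contract under those assumptions; the
function `parse_load_snapshot_config` is hypothetical and only covers the two
defaulted fields shown in this hunk.

```python
def parse_load_snapshot_config(raw):
    """Model the defaults applied to optional snapshot load fields.

    `raw` is the decoded JSON request body as a dict.
    """
    overrides = []
    # A missing `network_overrides` behaves like an empty list.
    for entry in raw.get("network_overrides", []):
        missing = {"iface_id", "host_dev_name"} - set(entry)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        overrides.append(
            {"iface_id": entry["iface_id"], "host_dev_name": entry["host_dev_name"]}
        )
    return {
        "resume_vm": raw.get("resume_vm", False),
        "network_overrides": overrides,
    }
```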
2 changes: 2 additions & 0 deletions src/vmm/tests/integration_tests.rs
@@ -261,6 +261,7 @@ fn verify_load_snapshot(snapshot_file: TempFile, memory_file: TempFile) {
},
enable_diff_snapshots: false,
resume_vm: true,
network_overrides: vec![],
}))
.unwrap();

@@ -344,6 +345,7 @@ fn verify_load_snap_disallowed_after_boot_resources(res: VmmAction, res_name: &s
},
enable_diff_snapshots: false,
resume_vm: false,
network_overrides: vec![],
});
let err = preboot_api_controller.handle_preboot_request(req);
assert!(
17 changes: 17 additions & 0 deletions tests/framework/microvm.py
@@ -972,6 +972,7 @@ def restore_from_snapshot(
snapshot: Snapshot,
resume: bool = False,
uffd_path: Path = None,
rename_interfaces: dict = None,
):
"""Restore a snapshot"""
jailed_snapshot = snapshot.copy_to_chroot(Path(self.chroot()))
@@ -999,11 +1000,27 @@ def restore_from_snapshot(
# Adjust things just in case
self.kernel_file = Path(self.kernel_file)

iface_overrides = []
if rename_interfaces is not None:
iface_overrides = [
{"iface_id": k, "host_dev_name": v}
for k, v in rename_interfaces.items()
]

optional_kwargs = {}
if iface_overrides:
# For backwards-compatible A/B testing we avoid sending new
# parameters until there is a release baseline that supports them.
# Once the release baseline has moved, this assignment can be
# inlined in the snapshot_load call below.
optional_kwargs["network_overrides"] = iface_overrides

self.api.snapshot_load.put(
mem_backend=mem_backend,
snapshot_path=str(jailed_vmstate),
enable_diff_snapshots=snapshot.is_diff,
resume_vm=resume,
**optional_kwargs,
)
# This is not a "wait for boot", but rather a "VM still works after restoration"
if snapshot.net_ifaces and resume: