Allow snapshot tap changes #4731
Hi @andrewla, thank you for your contribution! We would like to understand the use case better in case it can be resolved through other means first. We recommend using a network namespace where you can create TAP devices with the same name, but that probably requires Could you elaborate on your use case? Is there a way you could create the namespace in a privileged setting and then use something like
That assessment is correct -- basically, to run the jailer in a network namespace you need the setns syscall, which requires CAP_SYS_ADMIN, so nsenter is not an option. Our particular case is running in a containerized environment where our privileges are limited by the nature of the general environment. Once we're in our particular container we have lost all relevant privileges.
Hi again @andrewla, we have been talking internally about this PR and we may need to spend some time to decide on the API aspects of it to make sure it doesn't conflict with other efforts. In the meantime, we thought of another workaround. The For example we imagine the tool would work like this:
snapshot-editor edit-vmstate rename-network eth0 tap1
Would this work within your environment?
This was our initial approach, as it required minimal changes. But we found that the cost of copying the snapshot (as opposed to hardlinking it) during the operation, plus the serde costs, was higher than we were willing to tolerate in our environment.
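To make the copy-versus-hardlink trade-off concrete, here is a minimal sketch. The function name and the "will_edit" flag are hypothetical and are not part of Firecracker or snapshot-editor. It only illustrates why editing the vmstate forces a full copy: a hardlink shares the inode with the original, so a file that gets rewritten per clone cannot be linked.

```python
import os
import shutil


def stage_snapshot_file(src, dst, will_edit):
    """Place a snapshot file at dst for a clone (hypothetical helper).

    Hardlink when the file is used unmodified; copy when it will be
    edited, since edits through a hardlink would change the original too.
    Returns "link" or "copy" to show which path was taken.
    """
    if not will_edit:
        try:
            os.link(src, dst)
            return "link"
        except OSError:
            pass  # e.g. cross-device link; fall back to copying
    shutil.copy2(src, dst)
    return "copy"
```

With a rename applied at restore time, every clone can take the "link" path; with an offline edit of the vmstate, each clone pays for the "copy" path plus deserializing and reserializing the state.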
Hi @pb8o -- is there anything we can do to help move this forward?
Hi @andrewla I haven't had time to look at this, but this is next on my list now. Thanks for your patience!
On a related note, another reason why renaming the tap device is a better approach than namespaced NAT from the "Network for Clones" guide is that the namespaced NAT imposes measurable overhead onto the host kernel due to the addition of about 5 more Even though I made an effort to support namespaced NAT in fcnet, it increased complexity by a factor of 4-5x in comparison to regular NAT, only to support one use case: two simultaneous microVM clones. So I'd be in favor of this change, or a
Hello @andrewla! I apologize for the long time between updates, but some other stuff came up. We have decided to go ahead with this. I did a first review and I only have some minor comments; it mostly looks good to me. I just have a question if the
Re: config -- currently there is no config support for snapshots (https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/resources.rs) -- the snapshot configuration and restore has to be done with a running firecracker instance
It generally looks good, and thanks for the contribution, Andrew.
A few comments/questions from me.
Also, as I commented on the documentation changes, could you please squash the commits for the test changes into a single commit as well?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4731      +/-   ##
==========================================
- Coverage   83.13%   83.09%   -0.04%
==========================================
  Files         245      245
  Lines       26697    26710      +13
==========================================
+ Hits        22194    22196       +2
- Misses       4503     4514      +11
It turns out that the test for renaming devices was failing when run with other tests that used network devices. After some experimentation, it seems that we are not cleaning up network devices from other tests, and modifying a network device results in an incompatible network configuration, rendering the VM unreachable. For now I've patched this by having the new test use an unallocated network device, but I'm not sure if we're comfortable with this or if we want to try to figure out why the test passes when run alone but not when run in tandem with other tests.
  vm = uvm_nano
- iface1 = dataclasses.replace(base_iface, tap_name="tap1")
+ iface1 = dataclasses.replace(base_iface, tap_name="tap8")
Oh, this may be because, since #4966, we re-use network namespaces and devices during the integration tests.
Great progress! All tests are passing except on c5n.metal https://buildkite.com/firecracker/firecracker-pr/builds/12381#01948d21-de06-43e5-abe7-758b149c3501/62-1644
Interesting -- might be worth making the tap_name non-optional, or disabling this behavior if it differs, to avoid this issue cropping up again
I tried some debugging but couldn't get anywhere -- the failure doesn't appear when you run the test in isolation; even the earlier repro only worked when run in tandem with other tests.
I upped the id of the interface; hopefully that will let this squeeze by while I try to root cause it.
So I was looking at this today on c5n. I think it's because the re-used network namespace conflicts with a previous TAP device. It is surprising that it only happens on c5n though; I wonder if it's because of the exact test sequence.
So basically re-using network namespaces is just not worth it in this test, and we can simplify it to this:
def test_snapshot_rename_interface(uvm_nano, microvm_factory):
    """
    Test that we can restore a snapshot and point its interface to a
    different host interface.
    """
    vm = uvm_nano
    base_iface = vm.add_net_iface()
    vm.start()
    snapshot = vm.snapshot_full()

    # We don't re-use the network namespace as it may conflict with
    # previous/future devices
    restored_vm = microvm_factory.build(netns=NetNs(str(uuid.uuid4())))
    # Override the tap name, but keep the same IP configuration
    iface_override = dataclasses.replace(base_iface, tap_name="tap_override")

    restored_vm.spawn()
    snapshot.net_ifaces.clear()
    snapshot.net_ifaces.append(iface_override)
    restored_vm.restore_from_snapshot(
        snapshot,
        rename_interfaces={iface_override.dev_name: iface_override.tap_name},
        resume=True,
    )
(I don't think it's worth it to do the negative test actually so I removed it here)
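For readers following along, here is a rough sketch of how a mapping like the rename_interfaces argument in the test above could be turned into a snapshot-load request body. Treat it as an assumption-laden illustration: the function name is hypothetical, and the override field names ("network_overrides", "iface_id", "host_dev_name") are not confirmed anywhere in this thread. The length check reflects the Linux limit on interface names (IFNAMSIZ is 16, including the terminating NUL).

```python
import json

IFNAMSIZ = 16  # Linux interface names must fit in IFNAMSIZ - 1 bytes


def build_load_request(snapshot_path, mem_path, rename_interfaces, resume=True):
    """Build a snapshot-load body carrying tap renames (hypothetical helper)."""
    overrides = []
    # Sort for a deterministic request body
    for iface_id, tap_name in sorted(rename_interfaces.items()):
        if not tap_name or len(tap_name.encode()) >= IFNAMSIZ:
            raise ValueError(f"invalid tap name for {iface_id}: {tap_name!r}")
        # Field names below are assumptions, not taken from this PR thread
        overrides.append({"iface_id": iface_id, "host_dev_name": tap_name})
    return {
        "snapshot_path": snapshot_path,
        "mem_backend": {"backend_type": "File", "backend_path": mem_path},
        "resume_vm": resume,
        "network_overrides": overrides,
    }


if __name__ == "__main__":
    body = build_load_request("vmstate", "mem", {"eth0": "tap_override"})
    print(json.dumps(body, indent=2))
```

Validating the tap name before sending the request gives a clear error on the client side instead of a failed restore.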
In some scenarios it is not possible to use the jailer, especially in limited privilege environments where the security is external to firecracker itself. But in these cases a snapshot may have to use a different tap device than the one that it was using when it was snapshotted. Signed-off-by: Andrew Laucius <[email protected]>
Test that we can correctly parse configuration and API calls in a backwards compatible way. Signed-off-by: Andrew Laucius <[email protected]>
Documenting the ability to rename network interfaces on snapshot restore. Signed-off-by: Andrew Laucius <[email protected]>
Passing in the new flag breaks tests that compare behavior to main. Signed-off-by: Andrew Laucius <[email protected]>
It appears that if we use a tap device that has already been configured by another test, then we cannot effectively change the configuration of that device. Because for this test we need to rename the device, make sure that we use an unallocated tap device. Signed-off-by: Andrew Laucius <[email protected]>
Signed-off-by: Andrew Laucius <[email protected]>
Adding an expected failure case to ensure that the renaming code is not silently failing. Signed-off-by: Andrew Laucius <[email protected]>
Changes
Allow renaming of tap devices on snapshot restore
Reason
In some scenarios it is not possible to use the jailer, especially in limited privilege environments where the security is external to firecracker itself. But in these cases a snapshot may have to use a different tap device than the one that it was using when it was snapshotted.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.
PR Checklist