server: improve behavior of starting VMs that are waiting for Crucible activation #873
base: gjcolombo/refactor-server-request-queue
Conversation
@hawkw Thanks for the review! Everything should be addressed in 0c7a689 (except for the "wait for five seconds to make sure we don't start running" bit; I'll think a bit more about whether there's some other signal we could look at, but this mostly seems like one of those cases where we kind of want to prove a negative...).
New changes look great! I imagine you might want to wait for an approval from @leftwo as well?
Modify the state driver's VM startup procedure to allow the driver to process Crucible volume configuration changes while block backends are being activated. This fixes a livelock that occurs when starting a VM with a Crucible VCR that points to an unavailable downstairs: the unavailable downstairs prevents Crucible activation from proceeding; Nexus sends a corrected VCR that, if applied, would allow the upstairs to activate; but the state driver never applies the new VCR because it's blocked trying to activate using the broken VCR.

Modify the PHD VCR replacement smoke test so that it checks this behavior. Add an affordance to PHD Crucible disks that allows a test to specify that the disk's generated VCRs should contain an invalid downstairs IP. Start a VM with a disk configured this way, then replace the broken VCR with a corrected VCR and verify that the VM boots normally.
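As a rough illustration of the server-side fix (a minimal sketch with hypothetical names and a stand-in activation future, assuming tokio; these are not propolis-server's actual types), the startup path can race activation against incoming volume configuration changes instead of awaiting activation unconditionally:

use std::time::Duration;
use tokio::sync::mpsc;

// Stand-in for a Crucible volume construction request (VCR).
struct Vcr(String);

async fn activate_with_replacements(mut vcr_rx: mpsc::Receiver<Vcr>) {
    // Stand-in for the block backend activation future; with a broken
    // VCR the real future pends until a corrected VCR is applied.
    let mut activation =
        Box::pin(tokio::time::sleep(Duration::from_millis(200)));

    loop {
        tokio::select! {
            _ = &mut activation => {
                println!("backends activated");
                return;
            }
            Some(vcr) = vcr_rx.recv() => {
                // Before the fix the driver could not reach this arm
                // while awaiting activation, hence the livelock.
                println!("applying replacement VCR: {}", vcr.0);
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(4);
    tx.send(Vcr("corrected".into())).await.unwrap();
    activate_with_replacements(rx).await;
}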
0c7a689 to e794e68
let downstairs_addrs = self
    .downstairs_instances
    .iter()
    .map(|ds| ds.vcr_address())
Nit: I'm surprised Clippy hasn't suggested

    .map(Downstairs::vcr_address)

in place of

    .map(|ds| ds.vcr_address())

since it usually prefers point-free style.
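For context, the two forms are interchangeable here; a standalone sketch (this Downstairs type is an illustrative stand-in, not the one in this PR). The likely reason Clippy stayed quiet is that the relevant lint, redundant_closure_for_method_calls, lives in the allow-by-default pedantic group:

use std::net::SocketAddr;

// Illustrative stand-in for the Downstairs type in this PR.
struct Downstairs {
    addr: SocketAddr,
}

impl Downstairs {
    fn vcr_address(&self) -> SocketAddr {
        self.addr
    }
}

fn main() {
    let instances =
        vec![Downstairs { addr: "127.0.0.1:3810".parse().unwrap() }];

    // The closure form and the point-free form behave identically; the
    // lint that suggests the latter (redundant_closure_for_method_calls)
    // is pedantic and allow-by-default, which may be why it never fired.
    let _closure: Vec<SocketAddr> =
        instances.iter().map(|ds| ds.vcr_address()).collect();
    let _point_free: Vec<SocketAddr> =
        instances.iter().map(Downstairs::vcr_address).collect();
}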
Fixed in 65f862f.
Add a PHD test case for this. Writing it revealed a bug in the request queue: if a VM is stuck waiting to activate a Crucible volume and a stop request arrives, a VCR replacement that might have unstuck the volume is rejected, because the queue denies replacement requests that arrive after a stop request has been queued. Fix this by leaving the VCR replacement request disposition unchanged when a stop request is received.
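A minimal sketch of the disposition fix (hypothetical names; the real request queue types in propolis-server differ):

#[derive(Clone, Copy, PartialEq, Debug)]
enum Disposition {
    Enqueue,
    Deny,
}

struct RequestQueue {
    stop: Disposition,
    vcr_replace: Disposition,
}

impl RequestQueue {
    fn new() -> Self {
        Self { stop: Disposition::Enqueue, vcr_replace: Disposition::Enqueue }
    }

    // Called when a stop request is queued. The bug: this also set
    // `vcr_replace` to Deny, rejecting the very replacement that could
    // unstick a VM blocked on Crucible activation.
    fn on_stop_queued(&mut self) {
        self.stop = Disposition::Deny; // a second stop would be redundant
        // The fix: leave `self.vcr_replace` unchanged here.
    }
}

fn main() {
    let mut q = RequestQueue::new();
    q.on_stop_queued();
    assert_eq!(q.vcr_replace, Disposition::Enqueue);
}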
@mkeeter: Thanks for the detailed review! I think I've addressed all your feedback in the latest push.

@hawkw: There are enough changes in the new commits that I'd appreciate another review from you when you have a moment. (In particular, I've changed things so that Propolis now reports that a VM is "Starting" and not "Creating" before the component start sequence actually begins. IIRC, we discussed this in DMs recently and decided this would be OK to do, since Omicron collapses these two states into a single "Starting" VMM state anyway.)
Ran PHD, but forgot to run …
@gjcolombo the new changes look pretty good! I'm now wondering if …
Interesting question. I think this actually illuminates a subtle problem with the queue dispositions we expose with this change:
So the answer to your question is "we could do that," at least insofar as the request queue logic will allow a new stop request to be enqueued if we do that after marking the VM as Running in step 5. But that disposition change is itself a problem--the goal is to reject requests that won't be handled because the VM is going to stop first, and setting the "now running" dispositions allows exactly that. I'm going to mull this over a little bit more--the more I look at the request queue's complexity, the less I like it, and the more I want to try to see how much of it is really essential.
So this blew up. (Shame I don't have a SoundCloud.) I've described this problem in more detail and proposed a fix in #879. My current plan is to put up a request queue refactoring PR that will fix #879, then amend this PR to take advantage of it. I have a working version of this PR that uses the improved request queue API to ditch the "self request" queue (the improved API lets the state driver pop VCR replacement requests without popping or reordering "state change" requests like stop requests). Assuming the fix for #879 is acceptable, I'll merge it and then push the necessary amendments to this PR.
I've opened #880 to deal with the request queue improvements I discussed above and have changed this PR's merge base to that branch.

I also added commit 0fa8e37, which changes what happens when the state driver gets a stop request while block backends are still activating: instead of buffering the request, the driver simply returns and drops the block backend startup future. The only block backend that has an actually-asynchronous startup procedure is Crucible, and its only async call is to […].

Doing this eliminates the need for a "self-request" queue in the state driver, so the state driver logic turns out to be a bit simpler.
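For illustration, the "return and drop the startup future" behavior leans on async Rust's cancellation-by-drop; a minimal sketch (tokio assumed, all names are stand-ins rather than propolis-server's API):

use std::time::Duration;
use tokio::sync::oneshot;

// Stand-in for a block backend activation that pends indefinitely,
// e.g. a Crucible upstairs pointed at an unreachable downstairs.
async fn activate_backends() {
    tokio::time::sleep(Duration::from_secs(3600)).await;
}

#[tokio::main]
async fn main() {
    let (stop_tx, stop_rx) = oneshot::channel::<()>();

    // Simulate a stop request arriving while activation is stuck.
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_millis(100)).await;
        let _ = stop_tx.send(());
    });

    tokio::select! {
        _ = activate_backends() => println!("backends activated"),
        _ = stop_rx => {
            // Leaving the select! drops the activation future, which
            // cancels it at its next await point; nothing is buffered.
            println!("stop requested; startup future dropped");
        }
    }
}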
(N.B. Currently stacked on #880.)
Modify the state driver's VM startup procedure to allow the driver to process Crucible volume configuration changes while block backends are being activated. This fixes a livelock that occurs when starting a VM with a Crucible VCR that points to an unavailable downstairs: the unavailable downstairs prevents Crucible activation from proceeding; Nexus sends a corrected VCR that, if applied, would allow the upstairs to activate; but the state driver never applies the new VCR because it's blocked trying to activate using the broken VCR.
Since the state driver can now dequeue VM state change requests during startup, also teach it to abort startup if a stop request is received while block backends are starting. This is very slightly spicy, because it can cancel a block backend startup future in a way that was not possible before, but (a) the only affected backend is the Crucible backend, and (b) there should be no requests in flight anyway because, if this case is reached, the VM's vCPUs have not started.
Add PHD coverage of the new behaviors.
To assist with this, add a feature to PHD Crucible disks that allows a test to specify that a disk's generated VCRs should contain an invalid downstairs IP address. Starting a VM with a disk configured this way will cause activation to block until the disk's VCR is replaced with a corrected VCR.
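A minimal sketch of the idea behind that affordance (a hypothetical helper; PHD's real disk builder produces a full Crucible VolumeConstructionRequest): substitute one downstairs target with an address nothing listens on, so activation blocks until the VCR is replaced.

use std::net::{Ipv6Addr, SocketAddr, SocketAddrV6};

// Hypothetical helper, not PHD's actual API: poison the first target.
fn downstairs_targets(real: &[SocketAddr], poison: bool) -> Vec<SocketAddr> {
    let mut targets = real.to_vec();
    if poison {
        if let Some(first) = targets.first_mut() {
            // No downstairs listens here, so the upstairs cannot
            // activate until a corrected VCR restores the real address.
            *first = SocketAddr::V6(SocketAddrV6::new(
                Ipv6Addr::UNSPECIFIED,
                0,
                0,
                0,
            ));
        }
    }
    targets
}

fn main() {
    let real: Vec<SocketAddr> = vec!["127.0.0.1:3810".parse().unwrap()];
    println!("{:?}", downstairs_targets(&real, true));
}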
Tests: cargo test, PHD w/Alpine and Debian 11 guests.
Fixes #841.