
server: improve behavior of starting VMs that are waiting for Crucible activation #873

Open · wants to merge 9 commits into gjcolombo/refactor-server-request-queue from gjcolombo/vcr-replace-during-activate

Conversation

@gjcolombo (Contributor) commented Feb 26, 2025:

(N.B. Currently stacked on #880.)

Modify the state driver's VM startup procedure to allow the driver to process Crucible volume configuration changes while block backends are being activated. This fixes a livelock that occurs when starting a VM with a Crucible VCR that points to an unavailable downstairs: the unavailable downstairs prevents Crucible activation from proceeding; Nexus sends a corrected VCR that, if applied, would allow the upstairs to activate; but the state driver never applies the new VCR because it's blocked trying to activate using the broken VCR.

Since the state driver can now dequeue VM state change requests during startup, also teach it to abort startup if a stop request is received while block backends are starting. This is very slightly spicy, because it can cancel a block backend startup future in a way that was not possible before, but (a) the only affected backend is the Crucible backend, and (b) there should be no requests in flight anyway because, if this case is reached, the VM's vCPUs have not started.
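For illustration, here is a minimal sketch of the kind of select-based startup loop this describes, assuming a Tokio runtime; the names (`ExternalRequest`, `apply_vcr_replacement`, the channel-based queue) are hypothetical stand-ins, not Propolis's actual identifiers:

```rust
use std::future::Future;
use std::pin::pin;

use tokio::sync::mpsc;

// Illustrative request type; the real queue has more variants.
enum ExternalRequest {
    ReconfigureCrucibleVolume(String), // stand-in for a real VCR payload
    Stop,
}

// Await block backend activation and the external request queue
// concurrently, so a corrected VCR can be applied while activation is
// still pending. Returns Ok(true) if the VM started, Ok(false) if
// startup was aborted by a stop request.
async fn start_vm(
    activate_backends: impl Future<Output = anyhow::Result<()>>,
    requests: &mut mpsc::Receiver<ExternalRequest>,
) -> anyhow::Result<bool> {
    let mut activation = pin!(activate_backends);
    loop {
        tokio::select! {
            result = &mut activation => {
                result?;          // all backends are up
                return Ok(true);  // startup can finish normally
            }
            Some(req) = requests.recv() => match req {
                ExternalRequest::ReconfigureCrucibleVolume(vcr) => {
                    // Applying the replacement VCR here is what breaks the
                    // livelock: it can unblock the pending activation.
                    apply_vcr_replacement(vcr).await?;
                }
                // Returning drops `activation`, cancelling the in-flight
                // Crucible activation future; safe because no vCPUs have
                // started, so no I/O is outstanding.
                ExternalRequest::Stop => return Ok(false),
            },
        }
    }
}

async fn apply_vcr_replacement(_vcr: String) -> anyhow::Result<()> {
    Ok(()) // stand-in for the real Crucible VCR-replacement call
}
```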

Add PHD coverage of the new behaviors:

  • Modify the PHD VCR replacement smoke test to check that start requests with a bad set of Crucible targets can be unblocked by replacing a bad target with a good target.
  • Add a server state machine test that checks that a VM that is blocked waiting for Crucible activation can be explicitly stopped.

To assist with this, add a feature to PHD Crucible disks that allows a test to specify that a disk's generated VCRs should contain an invalid downstairs IP address. Starting a VM with a disk configured this way will cause activation to block until the disk's VCR is replaced with a corrected VCR.
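As a rough sketch of that affordance (the type and field names here are hypothetical, not PHD's real API):

```rust
use std::net::{IpAddr, Ipv6Addr, SocketAddr};

// Hypothetical PHD-style disk config: when the flag is set, generated
// VCRs contain one unreachable downstairs target.
struct CrucibleDisk {
    downstairs_addrs: Vec<SocketAddr>,
    use_invalid_downstairs_addr: bool,
}

impl CrucibleDisk {
    fn vcr_targets(&self) -> Vec<SocketAddr> {
        let mut targets = self.downstairs_addrs.clone();
        if self.use_invalid_downstairs_addr {
            // Point one target at an address no downstairs listens on;
            // activation will block until a corrected VCR replaces it.
            if let Some(target) = targets.first_mut() {
                *target = SocketAddr::new(IpAddr::V6(Ipv6Addr::LOCALHOST), 0);
            }
        }
        targets
    }
}
```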

Tests: cargo test, PHD w/Alpine and Debian 11 guests.

Fixes #841.

@gjcolombo (Author) commented:

@hawkw Thanks for the review! Everything should be addressed in 0c7a689 (except for the "wait for five seconds to make sure we don't start running" bit; I'll think a bit more about whether there's some other signal we could look at, but this mostly seems like one of those cases where we kind of want to prove a negative...).

@hawkw (Member) left a comment:

New changes look great! I imagine you might want to wait for an approval from @leftwo as well?

Base automatically changed from gjcolombo/phd-vcr-smoke-test to master February 27, 2025 00:18
Modify the state driver's VM startup procedure to allow the driver to
process Crucible volume configuration changes while block backends are
being activated. This fixes a livelock that occurs when starting a VM
with a Crucible VCR that points to an unavailable downstairs: the
unavailable downstairs prevents Crucible activation from proceeding;
Nexus sends a corrected VCR that, if applied, would allow the upstairs
to activate; but the state driver never applies the new VCR because it's
blocked trying to activate using the broken VCR.

Modify the PHD VCR replacement smoke test so that it checks this
behavior. Add an affordance to PHD Crucible disks that allows a test to
specify that the disk's generated VCRs should contain an invalid
downstairs IP. Start a VM with a disk configured this way, then replace
the broken VCR with a corrected VCR and verify that the VM boots
normally.
@gjcolombo gjcolombo force-pushed the gjcolombo/vcr-replace-during-activate branch from 0c7a689 to e794e68 on February 27, 2025 00:21
let downstairs_addrs = self
.downstairs_instances
.iter()
.map(|ds| ds.vcr_address())
Review comment (Contributor):

Nit: I'm surprised Clippy hasn't suggested

Suggested change:
-    .map(|ds| ds.vcr_address())
+    .map(Downstairs::vcr_address)

since it usually prefers point-free style.

@gjcolombo (Author) replied:

Fixed in 65f862f.

Add a PHD test case for this. This also revealed a bug in the request
queue: if a VM is stuck waiting to activate a Crucible volume, and a
stop request arrives, the VCR replacement that might have fixed it will
be rejected, because the queue rejects replacement requests that arrive
after a stop request is queued. Fix this by keeping the same VCR
replacement request disposition when a stop request is received.
@gjcolombo gjcolombo requested review from hawkw and mkeeter March 4, 2025 01:34
@gjcolombo (Author) commented:

@mkeeter: Thanks for the detailed review! I think I've addressed all your feedback in the latest push.

@hawkw: There are enough changes in the new commits that I'd appreciate another review from you when you have a moment. (In particular I've changed things so that Propolis now reports that a VM is "Starting" and not "Creating" before the component start sequence actually begins. IIRC, we discussed this in DMs recently and decided this would be OK to do, since Omicron collapses these two states into a single "Starting" VMM state anyway.)

@gjcolombo (Author) commented:

Ran PHD, but forgot to run cargo test before pushing. I'll fix the request queue unit test to match the new expected behavior and push that tomorrow.

@mkeeter (Contributor) commented Mar 4, 2025:

@gjcolombo the new changes look pretty good! I'm now wondering if self_request is necessary at all, since it only ever stores a single ExternalRequest::Stop – could we just push that to the normal external_requests queue instead?

@gjcolombo (Author) commented:

> I'm now wondering if self_request is necessary at all, since it only ever stores a single ExternalRequest::Stop – could we just push that to the normal external_requests queue instead?

Interesting question. I think this actually illuminates a subtle problem with the queue dispositions we expose with this change:

  1. VM begins to start
  2. External requestor asks to stop the VM; this sets the queue disposition to ignore new stop requests (for idempotency), reject reboot/migration requests (since the VM will stop before these are processed), and allow VCR change requests (new in this PR)
  3. start_vm pops the stop request and sets stopped_while_starting
  4. Block backends finish starting up
  5. start_vm calls publish_steady_state(Running); this sets the queue dispositions to appropriate dispositions for a running VM (reboot/migration out requests are allowed again; VCR change requests are still allowed; stop requests are enqueued, not ignored)
  6. the state driver processes its buffered stop request; this corrects the queue dispositions again (to reject everything but stop requests, which are ignored)

So the answer to your question is "we could do that," at least insofar as the request queue logic will allow a new stop request to be enqueued if we do that after marking the VM as Running in step 5. But that disposition change is itself a problem--the goal is to reject requests that won't be handled because the VM is going to stop first, and setting the "now running" dispositions defeats that goal by accepting them anyway.
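To make that window concrete, here is the step-5 transition added to the illustrative RequestQueue sketch from the commit message above (still hypothetical names): between steps 5 and 6 the queue briefly advertises running-VM dispositions even though a stop is already buffered.

```rust
impl RequestQueue {
    // Step 5: publish_steady_state(Running) re-opens the queue as though
    // the VM will keep running...
    fn on_running(&mut self) {
        self.stop = Disposition::Enqueue;
        self.reboot = Disposition::Enqueue;
        self.vcr_replace = Disposition::Enqueue;
    }
    // ...but a stop request was already buffered in step 3, so a reboot
    // or migration accepted in this window will never be serviced. Step 6
    // restores the stopping dispositions only once the buffered stop is
    // finally processed.
}
```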

I'm going to mull this over a little bit more--the more I look at the request queue's complexity, the less I like it, and the more I want to try to see how much of it is really essential.

@gjcolombo (Author) commented:

> I'm going to mull this over a little bit more--the more I look at the request queue's complexity, the less I like it, and the more I want to try to see how much of it is really essential.

So this blew up. (Shame I don't have a SoundCloud.) I've described this problem in more detail and proposed a fix in #879.

My current plan is to put up a request queue refactoring PR that will fix #879, then amend this PR to take advantage of it. I have a working version of this PR that uses the improved request queue API to ditch the "self request" queue (the improved API lets the state driver pop VCR replacement requests without popping or reordering "state change" requests like stop requests). Assuming the fix for #879 is acceptable, I'll merge it and then push the necessary amendments to this PR.

@gjcolombo gjcolombo changed the base branch from master to gjcolombo/refactor-server-request-queue March 7, 2025 00:44
@gjcolombo gjcolombo changed the title from "server: allow VCR replacement during block backend start" to "server: improve behavior of starting VMs that are waiting for Crucible activation" Mar 7, 2025
@gjcolombo (Author) commented:

I've opened #880 to deal with the request queue improvements I discussed above and have changed this PR's merge base to that branch.

I also added commit 0fa8e37, which changes what happens when the state driver gets a stop request while block backends are still activating: instead of buffering the request, the driver simply returns and drops the block backend startup future. The only block backend that has an actually-asynchronous startup procedure is Crucible, and its only async call is to activate, so I believe this should be safe. I've added a warning label to the Backend trait describing this. (Note that, in any case, the "abort" path can only be reached if the VM's vCPUs haven't started yet, so there should be no outstanding I/O that could be damaged by unceremoniously dropping the future and the backends.)

Doing this eliminates the need for a "self-request" queue in the state driver, so the state driver logic turns out to be a bit simpler.
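For a sense of what that warning label might look like (the trait name and signature here are assumptions, not Propolis's exact API):

```rust
// Hypothetical sketch of a block backend trait carrying a cancel-safety
// warning; the real Propolis trait differs in its details.
#[async_trait::async_trait]
trait BlockBackend {
    /// Starts this backend.
    ///
    /// # Cancel safety
    ///
    /// The VM state driver may drop the future returned by this method
    /// if the VM is asked to stop before its vCPUs start. Implementations
    /// must leave the backend in a consistent state if startup is
    /// cancelled at any await point (for Crucible, the only await is the
    /// activation call itself).
    async fn start(&self) -> anyhow::Result<()>;
}
```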

Successfully merging this pull request may close these issues:

  • want Crucible VCR replacement to be able to proceed before a VM has fully started (#841)