
vsock: Fix issues with sibling VM communication #385

Merged (3 commits into rust-vmm:main, Jul 20, 2023)

Conversation

@techiepriyansh (Contributor) commented on Jul 5, 2023:

This PR attempts to resolve issues with sibling VM communication described in #384.

Partially resolves #384.

@techiepriyansh marked this pull request as draft on July 5, 2023 19:10
@techiepriyansh force-pushed the fix-sibling-vm-communication branch from 6a11b01 to 600f053 on July 5, 2023 19:14
@techiepriyansh (Contributor, Author):

CI fails with `cargo audit`. Need to manually run `cargo update` again.

@stefano-garzarella (Member):

> CI fails with `cargo audit`. Need to manually run `cargo update` again.

@vireshk is fixing that in #380, we can rebase on that after the merge ;-)

Review threads on crates/vsock/src/main.rs and crates/vsock/src/vhu_vsock.rs (outdated, resolved).
@techiepriyansh force-pushed the fix-sibling-vm-communication branch from 600f053 to 7ae35ad on July 7, 2023 13:05
@stefano-garzarella (Member):

/cc @germag

@techiepriyansh force-pushed the fix-sibling-vm-communication branch from 7ae35ad to 409dfb6 on July 13, 2023 09:45
@techiepriyansh (Contributor, Author):

Resolved starvation of the raw packets queue by the standard RX queue.

@techiepriyansh force-pushed the fix-sibling-vm-communication branch from 409dfb6 to 808a45e on July 13, 2023 09:59
@techiepriyansh (Contributor, Author):

Rebased onto main.

The deadlock occurs when two sibling VMs simultaneously try to send each
other packets. The `VhostUserVsockThread`s corresponding to both the VMs
hold their own locks while executing `thread_backend.send_pkt` and then
try to lock each other to access their counterpart's `raw_pkts_queue`.
This ultimately results in a deadlock.

Resolved by separating the mutex over `raw_pkts_queue` from the mutex over
`VhostUserVsockThread`.

Signed-off-by: Priyansh Rathi <[email protected]>
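
A minimal sketch of the AB/BA lock-ordering problem described in the commit message above, and the shape of the fix; all names here are illustrative stand-ins, not the actual crate code:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Illustrative stand-in for `VhostUserVsockThread` *before* the fix: the raw
// packets queue lives behind the same mutex as the rest of the thread state.
#[allow(dead_code)]
struct ThreadBefore {
    raw_pkts_queue: VecDeque<Vec<u8>>,
}

// Before: enqueueing a packet for a sibling locks the *entire* peer thread,
// while the sender's own thread lock is already held inside `send_pkt`. If
// the peer is concurrently doing the same towards us, both sides block
// forever: a classic AB/BA lock-ordering deadlock.
#[allow(dead_code)]
fn send_to_sibling_before(peer: &Arc<Mutex<ThreadBefore>>, pkt: Vec<u8>) {
    peer.lock().unwrap().raw_pkts_queue.push_back(pkt);
}

// After: `raw_pkts_queue` has its own mutex, so enqueueing for a sibling
// never needs the sibling's thread-wide lock.
struct ThreadAfter {
    raw_pkts_queue: Arc<Mutex<VecDeque<Vec<u8>>>>,
}

fn send_to_sibling_after(peer: &ThreadAfter, pkt: Vec<u8>) {
    peer.raw_pkts_queue.lock().unwrap().push_back(pkt);
}

fn main() {
    let peer = ThreadAfter {
        raw_pkts_queue: Arc::new(Mutex::new(VecDeque::new())),
    };
    send_to_sibling_after(&peer, vec![0u8; 8]);
    assert_eq!(peer.raw_pkts_queue.lock().unwrap().len(), 1);
}
```

Narrowing the lock to just the queue means a sender holds only its own thread lock plus the peer's queue lock, and the queue lock is never held while taking a thread lock, so the circular wait cannot form.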
@techiepriyansh force-pushed the fix-sibling-vm-communication branch from 808a45e to 9e048b2 on July 13, 2023 10:08
@techiepriyansh marked this pull request as ready for review on July 13, 2023 10:13
@stefano-garzarella (Member) left a review comment:

LGTM, just a comment about rx handling

Review thread on crates/vsock/src/vhu_vsock.rs (outdated, resolved).
@stefano-garzarella (Member):

Regarding issue #384, what is still missing that needs to be resolved?

@techiepriyansh force-pushed the fix-sibling-vm-communication branch from 9e048b2 to 6fe6d98 on July 13, 2023 18:35
@techiepriyansh (Contributor, Author):

Moved the RX handling (queue switching on each iteration to prevent starvation) to `VhostUserVsockThread`.
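
A rough sketch of what such per-iteration queue switching could look like; `RxSource` and `RxScheduler` are hypothetical names, not the crate's actual types:

```rust
// Which RX source was served last; alternating each iteration keeps the
// standard RX queue from starving the raw packets queue (and vice versa).
#[derive(Clone, Copy, PartialEq, Debug)]
enum RxSource {
    Standard,
    RawPkts,
}

struct RxScheduler {
    last_served: RxSource,
}

impl RxScheduler {
    // Serve whichever source was *not* served on the previous iteration.
    fn next(&mut self) -> RxSource {
        self.last_served = match self.last_served {
            RxSource::Standard => RxSource::RawPkts,
            RxSource::RawPkts => RxSource::Standard,
        };
        self.last_served
    }
}

fn main() {
    let mut sched = RxScheduler { last_served: RxSource::RawPkts };
    assert_eq!(sched.next(), RxSource::Standard);
    assert_eq!(sched.next(), RxSource::RawPkts);
}
```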

@techiepriyansh (Contributor, Author) commented on Jul 13, 2023:

> Regarding issue #384, what is still missing that needs to be resolved?

Even with the fixes introduced in this PR, iperf-vsock does not seem to work. Logging the packet traffic received by the vhost-user-vsock application reveals that after sending only a few RW packets, the client stops sending any more packets. The following could be the reasons for this problem:

* There might be a notification problem inside the vhost-user-vsock application
* It could be due to the way iperf works internally
* It might have something to do with vsock credit updates

@stefano-garzarella (Member) left a review comment:

LGTM!

Just a little thing if you want: I'd replace `process_standard_rx` with `process_unix_sockets`. We can also remove `pub` from it.

@stefano-garzarella (Member):

> > Regarding issue #384, what is still missing that needs to be resolved?
>
> Even with the fixes introduced in this PR, iperf-vsock does not seem to work. Logging the packet traffic received by the vhost-user-vsock application reveals that after sending only a few RW packets, the client stops sending any more packets. The following could be the reasons for this problem:
>
> * There might be a notification problem inside the vhost-user-vsock application
> * It could be due to the way iperf works internally
> * It might have something to do with vsock credit updates

Okay thanks! Can you update the issue with this information?

Currently, the `raw_pkts_queue` is processed only when a
`SIBLING_VM_EVENT` is received. But it may happen that the
`raw_pkts_queue` cannot be drained completely due to insufficient
space in the RX virtqueue at that time. So, try to process raw packets on
other events too, similar to what happens in the RX path for standard packets.

Signed-off-by: Priyansh Rathi <[email protected]>
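
A hedged sketch of the retry idea from this commit message; the event id, function signature, and queue representation are all assumptions for illustration, not the crate's real API:

```rust
use std::collections::VecDeque;

// Assumed event id, for illustration only; the crate defines its own value.
const SIBLING_VM_EVENT: u16 = 7;

// Called for every epoll event. Raw packets are drained opportunistically on
// *any* event, so packets that did not fit into the RX virtqueue during an
// earlier SIBLING_VM_EVENT get retried once descriptors free up.
fn handle_event(event: u16, raw_pkts: &mut VecDeque<Vec<u8>>, rx_free_slots: &mut usize) {
    if event == SIBLING_VM_EVENT {
        // Siblings enqueued new raw packets; nothing extra to do here,
        // the common drain below handles them.
    }
    while *rx_free_slots > 0 {
        let Some(_pkt) = raw_pkts.pop_front() else { break };
        // In the real device, `_pkt` would be copied into an RX descriptor.
        *rx_free_slots -= 1;
    }
}

fn main() {
    let mut q = VecDeque::from(vec![vec![0u8; 4], vec![1u8; 4]]);
    let mut free = 1; // only one descriptor free: one packet stays queued
    handle_event(SIBLING_VM_EVENT, &mut q, &mut free);
    assert_eq!(q.len(), 1);
    // A later, unrelated event drains the leftover packet.
    free = 8;
    handle_event(0, &mut q, &mut free);
    assert!(q.is_empty());
}
```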
@techiepriyansh (Contributor, Author):

Done.

@stefano-garzarella (Member):

@mathieupoirier @vireshk @stsquad can you take a look?
I think we should merge this before the v0.1.0 release.

@vireshk enabled auto-merge (rebase) on July 20, 2023 07:56
@vireshk merged commit 6feb938 into rust-vmm:main on Jul 20, 2023