Spmc queue #74

Open
wants to merge 1 commit into
base: main
@bartoszmodelski bartoszmodelski commented Jun 2, 2023

Motivation

Currently, the blessed data structure for Domainslib is the deque. It's LIFO, and while that's optimal for locality, it's quite easy to shoot oneself in the foot with liveness issues. For example, if a web server is processing a stream of requests and starts some compute in the background, all requests will eventually have to wait for the compute to finish. Even knowing about the issue, there's not much that can be improved here (without re-engineering the workload or creating multiple schedulers), because by design LIFO keeps working on the existing subtree of tasks until it is done. FIFO, on the other hand, juggles all subtrees and treats a single task as the unit of work. I believe it's a much safer choice for the default scheduling strategy.

Thus, this PR adds a simple single-producer multi-consumer queue inspired by the work-stealing deque and Golang's scheduler. It's useful as a general structure but has been written mostly with Domainslib in mind.

Design

The design is similar to the deque's. The array is not atomic. The writer first inserts the item and then increments the tail index. Stealers first read the item in the array and then try to claim it with a CAS on the head. Thus the writer operates on the region of the array between tail (incl.) and head (excl.), while stealers operate between head (incl.) and tail (excl.). A stealer may do a non-linearizable read of the array, but that value won't be returned to the caller because the CAS fails in such a case.
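To make the ordering concrete, here is a minimal sketch of the enqueue and steal paths described above. It is not the code in this PR: the record fields, `create`, `enqueue`, `steal` and the power-of-two capacity are assumptions chosen for illustration.

```ocaml
(* Illustrative sketch only; names and layout are assumptions. *)
type 'a t = {
  head : int Atomic.t;       (* next index to take; advanced by CAS *)
  tail : int Atomic.t;       (* next index to fill; written only by the owner *)
  mask : int;                (* capacity - 1, capacity being a power of two *)
  buffer : 'a option array;  (* plain, non-atomic storage *)
}

let create ~size_exponent =
  let capacity = 1 lsl size_exponent in
  { head = Atomic.make 0;
    tail = Atomic.make 0;
    mask = capacity - 1;
    buffer = Array.make capacity None }

(* Owner only: write the slot first, then publish it by bumping [tail]. *)
let enqueue t v =
  let tail = Atomic.get t.tail in
  if tail - Atomic.get t.head > t.mask then false (* full *)
  else begin
    t.buffer.(tail land t.mask) <- Some v;
    Atomic.set t.tail (tail + 1);
    true
  end

(* Any domain: read the slot first, then try to claim it with a CAS on [head]. *)
let rec steal t =
  let head = Atomic.get t.head in
  if head >= Atomic.get t.tail then None (* empty *)
  else
    (* This read may be stale if the slot was taken and reused meanwhile. *)
    let candidate = t.buffer.(head land t.mask) in
    if Atomic.compare_and_set t.head head (head + 1) then candidate
    else steal t (* [head] moved: discard the candidate and retry *)
```

If the CAS succeeds, `head` has not moved since it was read, so the slot cannot have been overwritten and the candidate is valid. A stale candidate is only ever seen by a stealer whose CAS then fails.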

Local dequeue could be identical to stealing. I've changed it to first update the index and then read the array to ensure wait-freedom (in particular, to eliminate the risk of the local dequeue competing with steals). I've added further explanations in the code.
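One way to read "update the index first, then read the array" is the sketch below. It is a guess at the shape, not the PR's code: `local_dequeue`, the record fields and the rollback step are assumptions. It also assumes that stealers treat `head >= tail` as empty and never CAS in that state.

```ocaml
(* Illustrative sketch only; the record mirrors a head/tail ring buffer. *)
type 'a t = {
  head : int Atomic.t;       (* next index to take *)
  tail : int Atomic.t;       (* next index to fill; written only by the owner *)
  mask : int;                (* capacity - 1, capacity being a power of two *)
  buffer : 'a option array;  (* plain, non-atomic storage *)
}

(* Owner only. A single fetch-and-add claims an index, so there is no retry loop. *)
let local_dequeue t =
  let index = Atomic.fetch_and_add t.head 1 in
  if index = Atomic.get t.tail then begin
    (* The queue was empty, so [head] now overshoots [tail] by one.
       Assumption: stealers see head >= tail as empty and leave [head] alone,
       which makes this plain store a safe rollback. *)
    Atomic.set t.head index;
    None
  end
  else
    (* The index is ours: any stealer that read the old [head] fails its CAS. *)
    t.buffer.(index land t.mask)
```

Because the owner never loops on a failed CAS, its cost per dequeue stays bounded regardless of how many stealers are active.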

The structure is wait-free for the owner of the queue and lock-free for stealers. This design should help it keep stable performance as the system becomes loaded and stealing decreases.

Testing

Tests:

  • DSCheck tests. I've used the granular dependency branch, where they take around 0.05s.
  • Standard multicore tests with multiple domains hammering the structure.
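For readers unfamiliar with the second kind of test, the sketch below shows the general shape of a "hammering" test on OCaml 5. It is not the test suite in this PR: `hammer` is a hypothetical harness, and the mutex-guarded `Stdlib.Queue` is only a stand-in so the example runs on its own. In practice the `enqueue` and `steal` closures would wrap the queue under test.

```ocaml
(* Hypothetical harness: [enqueue] is called only from the owner (main) domain,
   [steal] concurrently from [stealers] other domains. Returns the sum of all
   stolen values so the caller can check nothing was lost or duplicated. *)
let hammer ~enqueue ~steal ~stealers ~items =
  let remaining = Atomic.make items in (* items not yet stolen *)
  let stealer () =
    let sum = ref 0 in
    while Atomic.get remaining > 0 do
      match steal () with
      | Some v ->
          sum := !sum + v;
          Atomic.decr remaining
      | None -> Domain.cpu_relax ()
    done;
    !sum
  in
  let domains = List.init stealers (fun _ -> Domain.spawn stealer) in
  for i = 1 to items do
    (* Spin while the queue reports that it is full. *)
    while not (enqueue i) do Domain.cpu_relax () done
  done;
  List.fold_left (fun acc d -> acc + Domain.join d) 0 domains

(* Stand-in queue (an assumption for this example): Stdlib.Queue behind a mutex. *)
let total =
  let q = Queue.create () and m = Mutex.create () in
  let locked f =
    Mutex.lock m;
    let r = f () in
    Mutex.unlock m;
    r
  in
  let enqueue v = locked (fun () -> Queue.add v q; true) in
  let steal () = locked (fun () -> Queue.take_opt q) in
  hammer ~enqueue ~steal ~stealers:3 ~items:10_000
```

Comparing `total` with `items * (items + 1) / 2` catches lost or duplicated items, although a passing run is evidence rather than proof; that gap is what the DSCheck tests are for.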

We could also add a lock-free steal-half function, which would improve work distribution on skewed workloads, but I'm keeping it simple for now.

@bartoszmodelski bartoszmodelski requested a review from a team June 2, 2023 19:16