
Michael-Scott queue : safe - unsafe versions #146

Merged: 11 commits into ocaml-multicore:main, Aug 19, 2024

Conversation

@lyrm (Collaborator) commented Jul 11, 2024

This PR follows up on PR #122: the optimizations and changes made in that PR are split in two, giving a safe Michael-Scott queue and an "unsafe" one (which uses Obj.magic).

TODO:

  • For now, the dscheck tests for the unsafe version get segmentation faults.
  • Documentation

@lyrm changed the title from "Ms queue safe unsafe" to "Michael-Scott queue : safe - unsafe versions" on Jul 11, 2024
@lyrm marked this pull request as draft on July 11, 2024 21:00
@lyrm (Collaborator, Author) commented Jul 23, 2024

@polytypic: The dscheck tests of the unsafe version result in a segmentation fault. I encountered the same issue with PR #122 after rebasing on main and adding the few necessary changes. Could you have a look?

@lyrm marked this pull request as ready for review on July 25, 2024 15:03
@lyrm (Collaborator, Author) commented Jul 29, 2024

@polytypic: this PR is ready for review. Could you have a look?

README.md Outdated
@@ -286,6 +304,16 @@ Because of the great properties of OCaml 5 memory model (see the
more details), not a lot can go wrong here. At least, data corruption or
segmentation fault won't happen like it can in other languages.

## Safe and unsafe data structures
Some data structures are available in two versions: a normal version and a more optimized but **unsafe** version. The **unsafe** version utilizes `Obj.magic` in a way that may be unsafe with `flambda2` optimizations.
Contributor

Let's format .md files with prettier. I recommend setting up your editor so that it does that automatically.

Collaborator Author

It should have been working this way, but apparently my editor config was not working right! Thanks for catching it.

@@ -33,15 +34,24 @@ is distributed under the

# Contents

- [Saturn — Parallelism-Safe Data Structures for Multicore OCaml](#saturn--parallelism-safe-data-structures-for-multicore-ocaml)
@polytypic (Contributor) Jul 29, 2024

This is not an issue with this PR, but some time ago GitHub started automatically producing a kind of contents/navigation dropdown for .md files. So I've been thinking of potentially dropping the contents section from the .md files (less stuff to maintain). At a quick look, ocaml.org/packages doesn't yet produce such a navigation helper, so I guess there is still value in having a contents section. Just mentioning this here as a potential thing to discuss at some point.

@@ -75,7 +72,7 @@ let rec fix_tail tail new_tail =
&& not (Atomic.compare_and_set tail old_tail new_tail)
then fix_tail tail new_tail

let push { tail; _ } value =
let push_exn { tail; _ } value =
Contributor

Hmm... The convention has been to use the _exn suffix on operations that raise an expected exception in some normal cases (such as when the queue is empty or full), but in this case no such exception is being used. Why should we use the _exn suffix in this case?
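For reference, a minimal illustration of the convention in question, using a trivial (non-concurrent) list-backed stand-in for a queue; only the suffixes matter here:

exception Empty

type 'a t = { mutable items : 'a list }

(* Never raises in normal use, so no suffix. *)
let push t x = t.items <- t.items @ [ x ]

(* Signals emptiness with an option instead of raising. *)
let pop_opt t =
  match t.items with
  | [] -> None
  | x :: xs ->
      t.items <- xs;
      Some x

(* Raises the expected [Empty] exception on an empty queue, hence [_exn]. *)
let pop_exn t = match pop_opt t with Some x -> x | None -> raise Empty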

Collaborator Author

Good catch! I made the change in saturn_benchmarks to match the generic Queue signature. I should not have moved it here!

@polytypic (Contributor)

Looking at the "safe" version, the push logic:

(* ... *)

let push { tail; _ } value =
  let rec find_tail_and_enq curr_end node =
    if not (Atomic.compare_and_set curr_end Nil node) then
      match Atomic.get curr_end with
      | Nil -> find_tail_and_enq curr_end node
      | Next (_, n) -> find_tail_and_enq n node
  in
  let new_tail = Atomic.make Nil in
  let newnode = Next (value, new_tail) in
  let old_tail = Atomic.get tail in
  find_tail_and_enq old_tail newnode;
  if not (Atomic.compare_and_set tail old_tail new_tail) then
    fix_tail tail new_tail

doesn't have some of the (entirely safe) optimizations from the "unsafe" version:

(* ... *)

let push { tail; _ } value =
  let (Next _ as new_node : (_, [ `Next ]) Node.t) = Node.make value in
  let old_tail = Atomic.get tail in
  let link = Node.as_atomic old_tail in
  if Atomic.compare_and_set link Nil new_node then
    Atomic.compare_and_set tail old_tail new_node |> ignore
  else
    let backoff = Backoff.once Backoff.default in
    push tail link new_node backoff

In particular, notice how the "unsafe" version does not call any recursive functions in the fast path:

let push { tail; _ } value =
  let (Next _ as new_node : (_, [ `Next ]) Node.t) = Node.make value in
  let old_tail = Atomic.get tail in
  let link = Node.as_atomic old_tail in
  (* first we try the fast path *)
  if Atomic.compare_and_set link Nil new_node then
    Atomic.compare_and_set tail old_tail new_node |> ignore
  else
    (* failed, so we do the slow path *)

It should be possible to incorporate this optimization into the "safe" version and it might speed it up a tiny bit.
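For illustration, a rough sketch of what folding that fast path into the "safe" version could look like, reusing the Nil/Next representation and the fix_tail helper from the snippets above (a sketch of the idea, not the actual patch):

let push { tail; _ } value =
  let new_tail = Atomic.make Nil in
  let new_node = Next (value, new_tail) in
  let old_tail = Atomic.get tail in
  (* Fast path: [tail] is up to date and its link is still [Nil], so a
     single CAS appends the node without calling any recursive function. *)
  if not (Atomic.compare_and_set old_tail Nil new_node) then begin
    (* Slow path: walk from the stale tail to the real end of the list,
       exactly as before. *)
    let rec find_tail_and_enq curr_end node =
      if not (Atomic.compare_and_set curr_end Nil node) then
        match Atomic.get curr_end with
        | Nil -> find_tail_and_enq curr_end node
        | Next (_, n) -> find_tail_and_enq n node
    in
    find_tail_and_enq old_tail new_node
  end;
  if not (Atomic.compare_and_set tail old_tail new_tail) then
    fix_tail tail new_tail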

Comment on lines +159 to +162
let module Safe = Dscheck_ms_queue (Michael_scott_queue) in
let safe_test = Safe.tests "safe" in
let module Unsafe = Dscheck_ms_queue (Michael_scott_queue_unsafe) in
let unsafe_test = Unsafe.tests "unsafe" in
Contributor

Not a major issue, but this would probably be a bit less verbose with first-class modules. It could be like:

let safe_tests = tests "safe" (module Michael_scott_queue) in
let unsafe_tests = tests "unsafe" (module Michael_scott_queue_unsafe) in
...

The module language can be a bit cumbersome at times, and in a case like this, where we are only interested in a simple (non-module) return value (a list of tests), first-class modules can be (IMHO) more convenient.
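To make the suggested shape concrete, here is an illustrative sketch; the QUEUE signature and the test body are hypothetical stand-ins, not the actual test code:

(* Sketch only: [QUEUE] stands for whatever signature the two Michael-Scott
   modules share. *)
module type QUEUE = sig
  type 'a t
  val create : unit -> 'a t
  val push : 'a t -> 'a -> unit
  val pop_opt : 'a t -> 'a option
end

(* Returns a plain list of named test thunks; no module needs to escape. *)
let tests name (module Q : QUEUE) =
  [ (name ^ " push/pop", fun () ->
        let q = Q.create () in
        Q.push q 42;
        assert (Q.pop_opt q = Some 42)) ]

(* Usage would then look like:
     tests "safe" (module Michael_scott_queue)
     @ tests "unsafe" (module Michael_scott_queue_unsafe) *)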

Collaborator Author

So, I pushed the dscheck tests with a first-class module, and maybe I did not do it right, but it does not feel less verbose :) It works, though.

Comment on lines +68 to +73
let module Safe = STM_ms_queue (Ms_queues.Michael_scott_queue) in
let exit_code = Safe.run () in
if exit_code <> 0 then exit exit_code
else
let module Unsafe = STM_ms_queue (Ms_queues.Michael_scott_queue_unsafe) in
Unsafe.run () |> exit
Contributor

This could also be a bit less verbose using first-class modules. We are not really interested in the Safe and Unsafe modules as such. We just want to run the tests and get an int as a result.

@lyrm (Collaborator, Author) commented Jul 31, 2024

About the missing safe optimizations: I added the fast path in the push function. However, the benchmarks suggest that adding backoff (either in fix_tail or in find_tail_and_enq) does not improve performance (at least on my computer). Did I miss other safe optimizations?

@polytypic (Contributor)

> About the missing safe optimizations: I added the fast path in the push function. However, the benchmarks suggest that adding backoff (either in fix_tail or in find_tail_and_enq) does not improve performance (at least on my computer). Did I miss other safe optimizations?

Hmm... One could also leave the backoff out. It is possible that the backoff helps on other CPU (micro)architectures and we can add that later.
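(For context: "adding backoff" here refers to the usual CAS retry pattern built on the Backoff module from the backoff package. A generic sketch, not code from this PR:)

(* Retry a failed CAS after an exponentially growing wait; whether this
   helps depends on contention and on the CPU (micro)architecture. *)
let rec incr_with_backoff ?(backoff = Backoff.default) counter =
  let n = Atomic.get counter in
  if not (Atomic.compare_and_set counter n (n + 1)) then
    incr_with_backoff ~backoff:(Backoff.once backoff) counter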

@polytypic (Contributor) left a comment

The logic looks good to me. 🎉

@lyrm merged commit 1e3024b into ocaml-multicore:main on Aug 19, 2024
9 checks passed
@toots commented Aug 26, 2024

Hi there! While I understand and appreciate the hard work that went into optimizing the queue, what is the recommended alternative for grabbing the elements of the queue now that snapshots are gone? A pop loop followed by a push loop?
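(For concreteness, the pop-then-push workaround mentioned above could be written generically along these lines; note that it is destructive and not atomic with respect to concurrent pushes and pops:)

(* Sketch only: [pop_opt] and [push] are passed in, so nothing is assumed
   about the particular queue API. *)
let drain_and_restore ~pop_opt ~push q =
  let rec drain acc =
    match pop_opt q with None -> List.rev acc | Some v -> drain (v :: acc)
  in
  let xs = drain [] in
  (* Push the elements back in their original order. *)
  List.iter (push q) xs;
  xs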

@polytypic (Contributor) commented Aug 26, 2024

As a side note, the removal of the snapshot mechanism wasn't just about optimization.

Here is a puzzle about the snapshot mechanism. The code below uses the previous 0.4.1 version of this library, which is the only published version of this library that has the MS queue with the snapshot mechanism and no (additional) space leaks. Let's first define a module alias:

module Q = Saturn.Queue

Here is a function that converts a "snapshot" into a list:

let[@tail_mod_cons] rec to_list s =
  match Q.next s with
  | None -> []
  | Some (v, s) -> v :: to_list s

Question A: What is the return value of the below program?

let q = Q.create () in
let s = Q.snapshot q in
Q.push q 101;
ignore (Q.pop q);
to_list s

Question B: What is the return value of the below program?

let q = Q.create () in
Q.push q 101;
let s = Q.snapshot q in
ignore (Q.pop q);
Q.push q 42;
ignore (Q.pop q);
to_list s
Answer and explanation:

The first program returns [] and the second program returns [101; 42]. Note that there is no point in time when the queue in the second program contains both 101 and 42.

To me, a "snapshot" should be an immutable instant. But that is not how the mechanism worked. Instead, snapshot simply returned a pointer to the first non-dummy (if any) node of the queue and the next operation then dereferenced the (mutable) pointer to the next node.

This meant that if the queue happened to be empty, the snapshot would be empty. If the queue happened to be non-empty, the snapshot would contain all elements pushed to the queue up to that point and potentially all the elements pushed to the queue in the future.

This behaviour of the snapshot mechanism is, of course, not limited to simple sequential programs. Concurrent pushes to the queue are also potentially visible through the "snapshot".

@toots commented Aug 26, 2024

Good to know! Given the nature of the concurrent queue, I was never expecting strict control over interleaved calls, but this is clearly the opposite of what it was advertising.

I think it'd be nice to have a way to get through the elements in the queue without removing them, basically a deeper peek API.

I might see if I can propose a PR for that.

@polytypic (Contributor) commented Aug 27, 2024

Note that providing a snapshot operation can be easier with some other lock-free queue approaches.

The "lazy" queue, for example, makes it particularly easy.

The approach using two (essentially immutable) stacks also makes it relatively easy to take a snapshot of the queue. The Picos project has an implementation of a two-stack multi-producer, multi-consumer queue that should be relatively easy to extend with a snapshot operation.
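To illustrate why the two-stack approach makes this easy (purely a sketch, not the Picos implementation): the whole queue state is one immutable value behind a single Atomic, so a consistent snapshot is just an atomic read.

type 'a state = { front : 'a list; back : 'a list }
type 'a t = 'a state Atomic.t

let create () : 'a t = Atomic.make { front = []; back = [] }

let rec push t x =
  let old = Atomic.get t in
  if not (Atomic.compare_and_set t old { old with back = x :: old.back })
  then push t x

(* A consistent snapshot of the queue contents, front to back. *)
let snapshot t =
  let { front; back } = Atomic.get t in
  front @ List.rev back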

Also, the queue provided with Kcas has snapshot functionality.
