Replies: 3 comments
I'm running an experiment with a fork of the stm crate. I tried replacing some internal data structures with no-alloc variants (using […]). There is one structure that I haven't been able to replace, though: a […]. Essentially, […]
perf report of the sequential impl, benching the […]

perf report of the parallel impl:

It seems like the stm variables require many more cycles to run; for example, the share of cycles dedicated to computing orbits goes from 10% down to less than 1%.
parallel impl:
I noticed a huge performance regression after the changes in #201.
Observed changes
While some slowdown is expected from synchronization overhead, the observed changes were disproportionately large:
builder-time
grisubal-time
Cause
I haven't fully identified the cause(s) yet, but my first impression is that the initialization of `TVar`s is very costly; see the snippet below from the stm crate source code:
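Separately from the stm source, here is a rough, std-only sketch of how one could check that impression. It is not the stm crate's code: `FakeVar`, `FakeBlock`, `build_plain` and `build_wrapped` are hypothetical names, and the layout (a shared control block holding a lock around a type-erased value) is only an assumption about what a transactional variable roughly needs. The point is that each variable pays for at least two heap allocations plus lock setup, whereas a plain `Vec<u64>` pays for one allocation in total.

```rust
use std::any::Any;
use std::sync::{Arc, Mutex, RwLock};
use std::time::Instant;

/// Hypothetical control block; an assumed layout, NOT the stm crate's.
struct FakeBlock {
    #[allow(dead_code)]
    waiters: Mutex<Vec<usize>>, // placeholder for per-variable bookkeeping
    value: RwLock<Arc<dyn Any + Send + Sync>>, // type-erased, shared value
}

/// Hypothetical stand-in for a transactional variable.
struct FakeVar {
    block: Arc<FakeBlock>,
}

impl FakeVar {
    /// Two heap allocations per variable: one for the value, one for the block.
    fn new<T: Any + Send + Sync>(val: T) -> Self {
        FakeVar {
            block: Arc::new(FakeBlock {
                waiters: Mutex::new(Vec::new()),
                value: RwLock::new(Arc::new(val)),
            }),
        }
    }

    /// Returns a copy of the value, or None if `T` is not the stored type.
    fn read<T: Any + Clone>(&self) -> Option<T> {
        let guard = self.block.value.read().ok()?;
        let out = (**guard).downcast_ref::<T>().cloned();
        out
    }
}

/// Baseline: n values stored inline in a single allocation.
fn build_plain(n: u64) -> Vec<u64> {
    (0..n).collect()
}

/// Same values, each wrapped in its own FakeVar.
fn build_wrapped(n: u64) -> Vec<FakeVar> {
    (0..n).map(FakeVar::new).collect()
}

fn main() {
    let n = 1_000_000;

    let t = Instant::now();
    let plain = build_plain(n);
    let plain_time = t.elapsed();

    let t = Instant::now();
    let wrapped = build_wrapped(n);
    let wrapped_time = t.elapsed();

    // Use both results so the work cannot be optimized away entirely.
    let s1: u64 = plain.iter().sum();
    let s2: u64 = wrapped.iter().filter_map(|v| v.read::<u64>()).sum();
    assert_eq!(s1, s2);

    println!("plain init:   {:?}", plain_time);
    println!("wrapped init: {:?}", wrapped_time);
}
```

Build with `--release` before comparing the two timings. The numbers only hint at the order of magnitude of per-variable setup cost; a perf report on the real builder remains the reference.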