edgecases: disadvantages for parallelization? #3113
-
Your work looks more than interesting. It would be great to understand whether your parallel approach is always beneficial, or whether there are edge cases where it is a disadvantage. Example:
https://neo4j.com/developer-blog/speed-up-queries-neo4j-parallel-runtime/
-
Hi @hpvd,

Thanks for your interest. In general the answer is yes, parallelism should always help. We adopt the morsel-driven parallelism approach to parallelize queries: https://db.in.tum.de/~leis/papers/morsels.pdf. This is a state-of-the-art approach adopted in many DBMSs. Overall, though, this is a hard question to answer, because there are likely to be some queries where the overhead of trying to parallelize makes them slower. Frankly, I don't ever expect this to be a major slowdown, and if it is, it is likely a performance bug we can fix.

The other thing is that some parts of queries are not parallelized, so even if you have 64 threads, we will intentionally run those parts with 1 thread. For example, if you had a query that ordered its results, picked the top 10 nodes, and then performed further computation, we would run the latter parts of the computation single-threaded to maintain the order. Something like:
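
A minimal sketch of such a query, assuming a hypothetical `Person` node table with `name` and `age` properties and a hypothetical `Knows` relationship table (these names are illustrative, not the original example):

```cypher
// Hypothetical schema: Person(name, age) nodes connected by Knows relationships.
// First part: scan Person nodes, order them by age, and keep the top 10.
MATCH (a:Person)
WITH a ORDER BY a.age DESC LIMIT 10
// Second part: further computation on the 10 ordered nodes.
MATCH (a)-[:Knows]->(b:Person)
RETURN a.name, b.name;
```
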
The second MATCH, after the WITH line, will run single-threaded because we assume the user wants to keep the order (at least for now). Hope this helps.