Efficient node selection based on node IDs #1394
-
Is there a way to efficiently select nodes by their IDs and use them in a query in place of the whole node set? I have a node table (inproceeding) with over 2.5 million instances, and each node has a relation to its proceedings and to the conference it took place at. A MATCH query that follows these relations for a single record id works well, but I want that information not just for one id but for up to 80'000 record ids. I could loop over the 80'000 ids and run each query individually, but that takes very long. I also tried UNWIND (UNWIND [list of record ids] AS rec_ids), but that does not work: I get a buffer manager exception because it runs out of memory.

From a logical perspective the query seems very simple: I know exactly which inproceeding nodes I am interested in and just need to follow their relations, so it should be possible to get the results fast, but I haven't found a way that works. Any help would be highly appreciated.
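For reference, a minimal sketch of the kind of single-id query I mean (the rec_id property and the relationship names are placeholders for my actual schema):

```cypher
// Look up one inproceeding by its record id and follow its relations
// to the proceedings it is part of and the conference it took place at.
MATCH (i:inproceeding)-[:isPartOf]->(p:proceeding)-[:heldAt]->(c:conference)
WHERE i.rec_id = 12345
RETURN i, p, c;
```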
-
Hi,
The most common way to select nodes from a large set of IDs is to use the list_contains function, e.g.:
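A minimal sketch, assuming the node table is inproceeding with a rec_id property and placeholder relationship names (adjust to your schema):

```cypher
// Keep only the inproceeding nodes whose rec_id appears in the literal list,
// then follow the relations to the proceedings and the conference.
MATCH (i:inproceeding)-[:isPartOf]->(p:proceeding)-[:heldAt]->(c:conference)
WHERE list_contains([12345, 12346, 12347], i.rec_id)
RETURN i, p, c;
```

With your workload, the literal list would hold one batch of record ids per query (see the size limitation below).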
In our release v0.0.2, we have a limitation that a list literal cannot exceed 4KB, so you might need to chunk your 80,000 ids into batches of 200 ids and run multiple queries. Apologies for this constraint; we will fix it very soon.
Best,
Xiyang
P.S. UNWIND is also an alternative. It runs out of memory because we pick a bad plan; I'll fix that too. But the most performant way should be list_contains, because running a filter is preferred (from a performance perspective) over running a join.
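For reference, the UNWIND variant would look roughly like this (same placeholder schema as the sketch above):

```cypher
// UNWIND turns the id list into rows, which are then joined against
// the inproceeding table; today this can pick a memory-hungry plan.
UNWIND [12345, 12346, 12347] AS rec_id
MATCH (i:inproceeding)-[:isPartOf]->(p:proceeding)-[:heldAt]->(c:conference)
WHERE i.rec_id = rec_id
RETURN i, p, c;
```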