
BatchFetch IN corrupts cache by overwriting BatchFetchPolicy dataResults #1998

Open
andilem opened this issue Nov 14, 2023 · 8 comments · May be fixed by #2367
Comments

@andilem

andilem commented Nov 14, 2023

EclipseLink version 4.0.2

When loading more referenced entities via @BatchFetch(value = BatchFetchType.IN) than the batch-fetch size (which appears to be fixed at 500; the annotation's size attribute is not used), multiple queries are issued to retrieve all results. However, the IN (...) arguments of the last issued query are stored in the cache as the "source" query arguments for every retrieved entity, which means that all entities that were not fetched by the last query end up with corrupted @OneToMany references (loading them returns an empty collection because the query uses the wrong IDs).
The cause is BatchFetchPolicy.setDataResults, which overwrites the previous rows. I implemented a dirty workaround by adding the results in executeObjectLevelReadQuery() instead of overwriting them, but this could hurt performance because the number of rows can grow beyond the batch-fetch size, resulting in multiple queries when querying again.
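The overwriting can be illustrated with a self-contained sketch (plain Java; `Policy`, `Entity`, and `load` are hypothetical stand-ins for BatchFetchPolicy and the cached entities, not EclipseLink code). Every entity holds a reference to the same shared policy, and setDataResults replaces the rows on each batch, so after the last batch only the entities fetched by that batch can still resolve their rows:

```java
import java.util.ArrayList;
import java.util.List;

class BatchFetchSketch {
    // Hypothetical stand-in for BatchFetchPolicy: holds the IN (...)
    // arguments of the most recently executed batch query.
    static class Policy {
        List<Integer> dataResults = new ArrayList<>();
        void setDataResults(List<Integer> rows) {
            dataResults = rows; // overwrites the previous batch's rows
        }
    }

    // Each "entity" keeps a reference to the shared policy, as the cached
    // source query does in the bug report.
    record Entity(int id, Policy policy) {
        boolean childVisible() {
            // A child is found only if this entity's id is among the rows
            // the policy currently holds.
            return policy.dataResults.contains(id);
        }
    }

    static List<Entity> load(int count, int batchSize) {
        Policy shared = new Policy();
        List<Entity> entities = new ArrayList<>();
        for (int start = 1; start <= count; start += batchSize) {
            List<Integer> batch = new ArrayList<>();
            for (int id = start; id < start + batchSize && id <= count; id++) {
                batch.add(id);
                entities.add(new Entity(id, shared));
            }
            shared.setDataResults(batch); // last call wins
        }
        return entities;
    }
}
```

Loading 501 entities with a batch size of 500 leaves only entity 501 able to see its children, matching the reproduction described in this report.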

This is relatively complicated to describe but very easy to reproduce. There is a simple test case attached with

  • two entity types (Parent and Child)
  • bidirectional @OneToMany/@ManyToOne with @BatchFetch(value = BatchFetchType.IN)
  • persist 501 parents with 1 child each
  • query all parents -> OK
  • query all children
  • observe that all parents apart from the 501st seem to have no children anymore (enable FINE logging to see the wrong query)

eclipselink-batchfetch.zip

@andilem
Author

andilem commented Jan 9, 2025

@Sheikah45 you seem to have some understanding of the BatchFetchPolicy; do you have any idea how to fix this properly?

@impacsl

impacsl commented Feb 17, 2025

We have the same issue with @OneToOne references using BatchFetchType IN/JOIN/EXISTS, also with EclipseLink version 4.0.5.

When removing @BatchFetch or deactivating the cache, it works as expected, but with a performance penalty.

I first thought it was related to #2011, but the fix for that issue didn't solve the problem.

@Sheikah45
Contributor

@andilem I have taken a look at this and agree with your assessment that the cause is the overwriting of the dataResults when executing the batched queries.

Unfortunately, this seems like a difficult issue to fix when the relationship itself is lazy. When the relationship is eager, the lifetime of the dataResults is fine, since the objects are fully constructed before the dataResults are overwritten by the next batch. However, in the case of a lazy relationship, the query, and thus the BatchFetchPolicy and its dataResults, are stored inside the BatchValueHolder to be used later for instantiating the object. As a result, only the most recent batch's rows are seen when the relationships are later instantiated by the BatchValueHolder.

I believe a proper solution might entail creating clones of the query and BatchFetchPolicy for each batch to ensure that each BatchValueHolder has the correct set of rows. This does mean greater memory usage, though: currently the stored rows are limited to a single batch, whereas this change would keep all the needed rows in memory for as long as the BatchValueHolder exists.
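The cloning idea can be sketched as follows (plain Java; `Policy`, `ValueHolder`, and `holderFor` are hypothetical stand-ins, not the actual EclipseLink classes). Each value holder receives a copy of the policy taken at batch time, so mutating the shared policy for the next batch no longer affects holders created earlier:

```java
import java.util.List;

class ClonePerBatchSketch {
    // Hypothetical stand-in for BatchFetchPolicy.
    static class Policy {
        List<Integer> dataResults;

        Policy copy() {
            Policy c = new Policy();
            c.dataResults = dataResults == null ? null : List.copyOf(dataResults);
            return c;
        }
    }

    // Stand-in for BatchValueHolder: defers instantiation and keeps the
    // policy it was handed at creation time.
    record ValueHolder(Policy policy) {
        List<Integer> instantiate() {
            return policy.dataResults;
        }
    }

    // The shared policy is mutated for every batch (as setDataResults does),
    // but each value holder receives a copy, so earlier holders keep the
    // rows of their own batch.
    static ValueHolder holderFor(Policy shared, List<Integer> batchRows) {
        shared.dataResults = batchRows;
        return new ValueHolder(shared.copy());
    }
}
```

The memory trade-off mentioned above shows up here as well: each holder pins its copied row list until the holder itself is garbage-collected.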

@andilem
Author

andilem commented Feb 17, 2025

Thanks for looking into this! Unfortunately, the current code is wrong even for eager relationships, and, much worse, it corrupts the cache: the query originally used to fetch a row appears to be stored in the cache together with the latest batch arguments (not the ones actually used to fetch that row), so every future cache hit yields parents with no children.

@Sheikah45
Contributor

Sheikah45 commented Feb 17, 2025

I have opened a PR that clones the batch query so that it can maintain its own BatchFetchPolicy independent of further batches. As for the eager cache corruption: do you have an example? The test cases I used all operated correctly when set to eager fetching, or they resulted in a StackOverflowError.

@andilem
Author

andilem commented Feb 17, 2025

I currently cannot reproduce the issue when the parent-child relationship is eager because of the StackOverflowError. The primary (owning) child-parent relationship doesn't seem to matter.
But in many cases it is simply not possible to eagerly load all children; that would take too long and load too much data.
That StackOverflowError is probably a separate issue.

@Sheikah45
Contributor

Sheikah45 commented Feb 17, 2025

Understood. The stack overflow is due to the cache not being populated until the object is fully constructed, which causes an issue with the early batch fetching. I am not aware of any easy solution there.

For some reason, the eager fetch in the PR's test did not result in a stack overflow, but I haven't dug into why.

If you are able to test the source in the PR against your use case, that would be good confirmation that the lazy case is resolved.

@andilem
Author

andilem commented Feb 17, 2025

I just tested it and I can confirm that the issue is resolved for my case.

In the test, there is no StackOverflowError because you're only testing three objects.
In my case, there are simply too many calls for the stack.
When loading 501 parents whose children have to be fetched eagerly, it loads all 501 children, but each child requires its parent to be loaded, so it loads the remaining 500 parents, which load the remaining 500 children, which load the remaining 499 parents, and so on.
There is no infinite recursion; the recursion is just too deep.
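The alternating parent/child loads described above can be sketched as mutual recursion whose depth grows linearly with the number of parents (plain Java; loadParents/loadChildren are hypothetical stand-ins, not EclipseLink methods):

```java
class EagerDepthSketch {
    // loadParents and loadChildren call each other: loading parents eagerly
    // loads their children, and each child in turn needs its not-yet-built
    // parent, pulling in the remaining parents before anything can unwind.
    static int loadParents(int remaining, int depth) {
        if (remaining == 0) return depth; // everything loaded; start unwinding
        return loadChildren(remaining, depth + 1);
    }

    static int loadChildren(int remaining, int depth) {
        // Each child requires its parent, so one fewer parent remains
        // to be loaded before the recursion can bottom out.
        return loadParents(remaining - 1, depth + 1);
    }
}
```

With 501 parents this sketch reaches a depth of 1002 frames. That alone would not overflow, but in the real object-building code each load involves many stack frames, so the default JVM stack is exhausted well before the recursion bottoms out.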
