You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Recently, I had a look at the ValidationSupportChain class and found that the logic for caching of StructureDefinitions is distributed around the class and feels a bit brittle. I thought a bit about the code and found two execution sequences that can be problematic:
fetchAllStructureDefinitions() returns a copy of myStructureDefinitionsAsList, but does not use a synchronized block. If this is called concurrently to invalidateCaches() all bets are off (according to the API guarantees of ArrayList) how myStructureDefinitionsAsList looks like when being read. Since instructions may be re-orderd, until the end of the synchronized block in invalidateCaches(), the resulting list could e.g. contain an array partially or completely filled with null values (at least in OpenJDK implementation). This seems very unlikely to happen (the critical re-ordering has to occur and may depend on the processor architecture), but is still a possibility.
There is another interaction between fetchAllStructureDefinitions() and invalidateCaches() that is problematic: If invalidateCaches() is called, the flag myHaveFetchedAllStructureDefinitions is set to false. If, after this instruction, fetchAllStructureDefinitions() is called concurrently, all StructureDefinitions are cached and myHaveFetchedAllStructureDefinitions is set to true. When invalidateCaches() continues, myStructureDefinitionsByUrl and myStructureDefinitionsAsList are cleared. This leaves myHaveFetchedAllStructureDefinitions set to true, but the collections are empty. When fetchAllStructureDefinitions() is called again, an empty list is returned.
To Reproduce
I created a simple test case, that tries to trigger the condition (by running in the loop until the race condition is met) and the failure occurs on every run on my machine.
Expected behavior
A thread safe ValidationSupportChain.
Environment (please complete the following information):
HAPI FHIR master
Additional context
I find it much easier to wrap my head around thread safety of multiple values (that have to be kept in a consistent state) by using immutable state that is completely swapped out leveraging compare-and-swap instructions (using AtomicReference). This makes it impossible to read inconsistent state.
In an branch of mine, I refactored the code of ValidationSupportChain to use these principles and extracted the logic for management of the cached StructureDefinitions into a separate class.
You can have a look at the commit here, to see how this may look like: master...Boereck:hapi-fhir:feature/cas_structdef_cache
In theory the lockfree copy-on-write approach can be more expensive on writing and more performant when reading, compared to the current appoach using synchronized blocks. However, in a few local tests I did with JMH, performing some operations onValidationSupportChain
showed no measurable difference (differences are in the margin of error).
Another advantage of factoring the caching logic into a seperate class is that the caching logic can be tested in isolation, as done on the branch.
The proposed API has method names aligned to naming from java collections, so it feels familiar. The API can, of course, be changed if alignment with the collections API is not valued as much. E.g. methods names starting with compute could have a fetch prefix instead. The computeIfAbsent fetching function takes the URL as parameter, which is alligned to the compute methods from java.util.Map and is implemented this way to potentially avoid lambdas capturing the key variable from the context (since creation of capturing lambdas is more expensive). However, the calls in ValidationSupportChain do capture the this reference anyway, so we could turn the Function into a Supplier and just capture the URL parameter.
Let me know if you consider this a good way to pursue, if I should open a PR, or if the APIs need some change.
The text was updated successfully, but these errors were encountered:
Describe the bug
Recently, I had a look at the
ValidationSupportChain
class and found that the logic for caching of StructureDefinitions is distributed around the class and feels a bit brittle. I thought a bit about the code and found two execution sequences that can be problematic:fetchAllStructureDefinitions()
returns a copy ofmyStructureDefinitionsAsList
, but does not use a synchronized block. If this is called concurrently toinvalidateCaches()
all bets are off (according to the API guarantees ofArrayList
) howmyStructureDefinitionsAsList
looks like when being read. Since instructions may be re-orderd, until the end of the synchronized block ininvalidateCaches()
, the resulting list could e.g. contain an array partially or completely filled withnull
values (at least in OpenJDK implementation). This seems very unlikely to happen (the critical re-ordering has to occur and may depend on the processor architecture), but is still a possibility.There is another interaction between
fetchAllStructureDefinitions()
andinvalidateCaches()
that is problematic: IfinvalidateCaches()
is called, the flagmyHaveFetchedAllStructureDefinitions
is set tofalse
. If, after this instruction,fetchAllStructureDefinitions()
is called concurrently, all StructureDefinitions are cached andmyHaveFetchedAllStructureDefinitions
is set totrue
. WheninvalidateCaches()
continues,myStructureDefinitionsByUrl
andmyStructureDefinitionsAsList
are cleared. This leavesmyHaveFetchedAllStructureDefinitions
set totrue
, but the collections are empty. WhenfetchAllStructureDefinitions()
is called again, an empty list is returned.To Reproduce
I created a simple test case, that tries to trigger the condition (by running in the loop until the race condition is met) and the failure occurs on every run on my machine.
Expected behavior
A thread safe
ValidationSupportChain
.Environment (please complete the following information):
Additional context
I find it much easier to wrap my head around thread safety of multiple values (that have to be kept in a consistent state) by using immutable state that is completely swapped out leveraging compare-and-swap instructions (using
AtomicReference
). This makes it impossible to read inconsistent state.In an branch of mine, I refactored the code of
ValidationSupportChain
to use these principles and extracted the logic for management of the cached StructureDefinitions into a separate class.You can have a look at the commit here, to see how this may look like:
master...Boereck:hapi-fhir:feature/cas_structdef_cache
In theory the lockfree copy-on-write approach can be more expensive on writing and more performant when reading, compared to the current appoach using
synchronized
blocks. However, in a few local tests I did with JMH, performing some operations onValidationSupportChain
showed no measurable difference (differences are in the margin of error).
Another advantage of factoring the caching logic into a seperate class is that the caching logic can be tested in isolation, as done on the branch.
The proposed API has method names aligned to naming from java collections, so it feels familiar. The API can, of course, be changed if alignment with the collections API is not valued as much. E.g. methods names starting with
compute
could have afetch
prefix instead. ThecomputeIfAbsent
fetching function takes the URL as parameter, which is alligned to thecompute
methods fromjava.util.Map
and is implemented this way to potentially avoid lambdas capturing the key variable from the context (since creation of capturing lambdas is more expensive). However, the calls inValidationSupportChain
do capture thethis
reference anyway, so we could turn theFunction
into aSupplier
and just capture the URL parameter.Let me know if you consider this a good way to pursue, if I should open a PR, or if the APIs need some change.
The text was updated successfully, but these errors were encountered: