Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JUnit parallel executor running too many tests with a fixed policy. #3108

Open
1 task
OPeyrusse opened this issue Dec 15, 2022 · 58 comments · May be fixed by #3685
Open
1 task

JUnit parallel executor running too many tests with a fixed policy. #3108

OPeyrusse opened this issue Dec 15, 2022 · 58 comments · May be fixed by #3685

Comments

@OPeyrusse
Copy link

JUnit only parallel executor service relies on a ForkJoinPool. Unfortunately, this does not play well with code also using ForkJoinPools.

The observed issue is that, when activating parallel tests, JUnit uses ForkJoinPoolHierarchicalTestExecutorService. However, our tests are also using ForkJoinPools and ForkJoinTasks. The orchestration of the test awaits for the completion of those tasks before moving on in the tests.
But the issue is that ForkJoinTask and ForkJoinWorkerThread are capable of detecting the use of the FJP framework (here for example) and react to it. As JUnit tasks and the project tasks are in different ForkJoinPools, they cannot help each other. This only results in more tests being started by already running and incomplete tests.

This can be an issue when tests are resource sensitive. For example, it may not be possible to open too many connections to a Database.
Tough not explicitly illustrated in this project, we also faced StackOverflowError because of recursive test executions in a single worker. In the following logs, produced by the reproducing project, we can see that two workers are recursively starting tests before finishing any:

ForkJoinPool-1-worker-1 starting a new test. Running: 2
ForkJoinPool-1-worker-2 starting a new test. Running: 1
ForkJoinPool-1-worker-1 starting a new test. Running: 3
ForkJoinPool-1-worker-1 starting a new test. Running: 4
ForkJoinPool-1-worker-1 starting a new test. Running: 5
ForkJoinPool-1-worker-2 starting a new test. Running: 6
ForkJoinPool-1-worker-1 starting a new test. Running: 7
ForkJoinPool-1-worker-1 starting a new test. Running: 8
ForkJoinPool-1-worker-1 starting a new test. Running: 9
ForkJoinPool-1-worker-2 starting a new test. Running: 10
ForkJoinPool-1-worker-1 starting a new test. Running: 11
ForkJoinPool-1-worker-1 starting a new test. Running: 12
ForkJoinPool-1-worker-1 starting a new test. Running: 14
ForkJoinPool-1-worker-2 starting a new test. Running: 13
ForkJoinPool-1-worker-1 starting a new test. Running: 15
ForkJoinPool-1-worker-1 starting a new test. Running: 16
ForkJoinPool-1-worker-1 starting a new test. Running: 17
ForkJoinPool-1-worker-1 starting a new test. Running: 18
ForkJoinPool-1-worker-2 starting a new test. Running: 19
ForkJoinPool-1-worker-1 starting a new test. Running: 20

worker-1 started 15 tests, worker-2 started 5 tests

In actual code, given the location of the point of respawn, this can generate very large stacks.

An alternative implementation of the executor service, as shown in this PR, using a standard Thread Executor, would not show similar issues, at the expense of not ideally orchestrating multiple executions.

Let's note that this is a very sneaky issue. Even in this project, we can see that the call to ForkJoinTask#get is not visible in the JUnit method. And it can be as deep as we want in the stack, even hidden to some users as it is happening in a used framework or tool. It may not be always possible for a user to detect that pattern.

Steps to reproduce

See this project https://github.com/OPeyrusse/junitforkjoinpool
After building the project, run the test class TestCalculator (or look at the README if changed since opening this issue).

Context

  • Used versions (Jupiter/Vintage/Platform): 5.8.2, with the Jupiter runner
  • Build Tool/IDE: Build with Intellij, failing in Intellij and when running it with Maven
  • Maven Surefire version: 3.0.0-M5

Deliverables

  • At least one standard test executor not relying on the ForkJoinPool
@OPeyrusse
Copy link
Author

Some additional thoughts:
Looking at the code and being used to work with ForkJoinPool, I can guess some benefits of using the ForkJoin framework. It would naturally focus on sub-tasks to complete the execution of a given class or parameterized tests before moving to other classes.
However, IMO, the mistake is that a ForkJoinTask should only run CPU intensive tasks. JUnit cannot know the content of tests. Some may be real unit tests, only depending on the CPU. Some may be real unit tests but reading the disk. Some may be more integration tests, making network calls using whatever framework. In those cases, for defensive measures, users of the ForkJoin framework have to rely on ManagedBlock. The drawback of Managed blocker is that they start new threads.

Yet another solution could be to use a ForkJoinPool for scheduling of the tests and a standard executor for the execution of the tests.

@OPeyrusse
Copy link
Author

There is also a short snippet reproducing this behavior here.
However, the current choice was to say that the doc will be update to say the configuration limits the number of threads, not the number of tests. This issue focuses on the number of tests being run, as it is the actual problem at stake, leading to StackOverflowErrors.

@OPeyrusse
Copy link
Author

To add to this, it is not currently possible to have parallel tests without being tricked by JUnit current implementation.
It is either sequential tests or parallel tests that must take care of what they do.

@marcphilipp
Copy link
Member

@OPeyrusse Do the features released as part of 5.9.2 help in your case?

@OPeyrusse
Copy link
Author

Hello @marcphilipp , nope it will not help. The feature advertised was already one we looked at, when it was at the stage of PR.
If you look again, you will see that only two threads from the ForkJoinPool are being used.
Tough not tested, I would even say that with only one thread, you can reproduce the problem above. As stated in my description, it comes from the FJ framework detecting the use of ForkJoinPool in a ForkJoinThread, leading it to attempt some work instead of waiting.
So limiting the number of threads in the pool or its saturation will not resolve the issue. Here, the issue is that too many tests are started.

I can give you an analogy with the following story:
There is a factory that can manufacture cars. It takes time, more or less depending on the car models. Building a car is the analogy to running a single test. The factory has enough workers to build several cars concurrently.
There are also clerks coming to the factory with orders to manufacture cars. Those are JUnit ForkJoinTasks, produced after discovering that tests must be run.
When a clerk arrives at the factory, it delivers the order to the factory, that immediately starts working on it.
The factory has chosen, for company reasons, to operate using the FJP methodology (they are smart). The weird thing is that the clerk sees that, and because the clerk company is also using the FJP methodology, the clerk decides not to stay idle and do something while waiting for the car to be made. Obviously, the clerk cannot join the factory workers - it is not its job - so to do something, the clerk picks the next task available to him, which is go back to the office to pick another order.
Because clerks make faster trips to the office than the factory can produce cars, the factory becomes overflowed with orders - which, in the computer world, translates to memory starvation, stack overflow, etc

I hope this helps understand the logic triggering the issue 😃

@marcphilipp
Copy link
Member

Nice analogy. I wish the car factory wouldn't leak implementation details... 😉

Yet another solution could be to use a ForkJoinPool for scheduling of the tests and a standard executor for the execution of the tests.

I think we can consider that. Would you like to give it a shot?

@OPeyrusse
Copy link
Author

Sure, that would be interesting. If you have some guidance on how to write it - and mostly how to write the associated tests - I am ready to try it.

@marcphilipp
Copy link
Member

@OPeyrusse In ParallelExecutionIntegrationTests we have tests that check that tests are run on different threads by collecting the names of those threads (see the nested ThreadReporter class). I think for this, all we'd need to verify is that they are not ForkJoinWorkerThread.

Generally, I think we should make this new behavior opt-in via a configuration parameter since it introduces additional overhead.

@OPeyrusse
Copy link
Author

Thanks for the tips. I will look at it in the coming days.

@OPeyrusse
Copy link
Author

Just a quick update: I have not abandoned nor forgotten this issue. It is simply that I am currently taking care of my children during the holiday season here. I already have ideas on how to proceed and will start on this next week 😄

@asardaes
Copy link

asardaes commented May 6, 2023

If I may ask, do you think it would be possible to allow custom HierarchicalTestExecutorService implementations? I imagine it would be possible to simply extend the configuration of the dynamic strategy to allow choosing between predefined implementations, but I would personally like more control over the thread pool in a programmatic way. I think the custom strategy could then check if junit.jupiter.execution.parallel.config.custom.class is an instance of ParallelExecutionConfigurationStrategy or HierarchicalTestExecutorService, or something like that.

@marcphilipp
Copy link
Member

@asardaes That's an interesting idea. Could you please raise a new issue for it?

@asardaes
Copy link

asardaes commented May 7, 2023

@marcphilipp since you mentioned @OPeyrusse's implementation should be opt-in, I think it's worth discussing in the context of this issue, killing 2 birds with 1 stone. I gave it some more thought, let me explain what I have in mind.

A new configuration parameter could be added, say junit.jupiter.execution.parallel.executor.class. If this is not set, it defaults to either org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService or org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService depending on the value of junit.jupiter.execution.parallel.enabled. With this new parameter, users could either choose the "standard executor" implemented by @OPeyrusse or a custom one they themselves implement; if the users specify a custom one, they could decide whether to ignore junit.jupiter.execution.parallel.enabled or not (well, a custom one can basically ignore all parameters and hard-code everything anyway, but that's up to the user).

To instantiate junit.jupiter.execution.parallel.executor.class, all implementations would need some convention that could be used by JupiterTestEngine (and any other custom engine that wants to support different executors). I imagine it would be easiest to require them to have a no-arg constructor, and then adding a default configure(ConfigurationParameters) method to HierarchicalTestExecutorService. This would allow custom implementations to read custom configuration parameters, but it would mean the executors included out of the box in JUnit would have to check if junit.jupiter.execution.parallel.config.strategy=custom and instantiate junit.jupiter.execution.parallel.config.custom.class themselves.

I'm not so sure about the instantiation/configuration convention. Given that both included executor services are marked as stable, I imagine they need to keep existing constructors, but adding a new one and having a default configure method in the interface should be fine, no?

And for completeness, the logic in a test engine would have to be roughly:

if (junit.jupiter.execution.parallel.executor.class is set) {
    // instantiate it and call its configure method
} else {
    // check junit.jupiter.execution.parallel.enabled
    // and instantiate an included Executor,
    // possibly with a custom junit.jupiter.execution.parallel.config.custom.class
}

@OPeyrusse
Copy link
Author

@asardaes do you have a special use-case in mind, driving your question?

I am asking because my idea for this issue was to have two different ways of running the tests: one inlining the execution withing the ForkJoinPool, and one submitting a task to an external Executor and waiting for the result.
Behind this, though I don't know the historical reasons for creating a test executor on top of the ForkJoinPool, I assume that it comes from its work-stealing feature. It can benefit to test execution when you launch a group of tests - often those under a class - as the executor should focus on executing those tests in parallel instead of starting the tests of the next class.
I want to keep this hypothetical design choice. Therefore, my task is only to separate the way one individual test is running.
If we wanted some extension, I could open this to load a user-selected class, from its name, as you suggested. But it would only affect individual test execution, not the general scheduling of all the tests.
Would that suits your needs?

@asardaes
Copy link

asardaes commented May 9, 2023

@OPeyrusse no, I don't think that would fit what I had in mind, but if your implementation doesn't need a completely new HierarchicalTestExecutorService but rather does something else inside an existing one, then I guess my idea really would require a separate issue.

Personally I'm just experimenting with kotlin coroutines, so I don't need a custom test engine, I just want a custom executor.

@OPeyrusse
Copy link
Author

@asardaes If you say so. Not being a core developer of JUnit, I cannot decide what to do. I will let @marcphilipp decide what to do and will start going forward with my plan, that is not creating a new executor.

@igogoka
Copy link

igogoka commented Feb 9, 2024

I can also confirm again like I just did in my last comment. Setting the max pool size and saturation does not work. The only way I could mitigate was to make sure the actual test execution is run in a separate thread pool that is not a ForkJoinPool and that has the max threads set to parallelism, so that the test scheduling can use the work-stealing, but the test execution is not.

In my Spock-using project I have a Spock extension that creates a thread pool using threadPool = Executors.newFixedThreadPool(runnerConfiguration.parallel.parallelExecutionConfiguration.parallelism) and adds an iteration interceptor that does

spec.allFeatures*.addIterationInterceptor {
    threadPool.submit(it::proceed).get()
}

and with that it behaves as expected, effectively doing the solution suggested here above.

You can most probably do something similar for Jupiter tests. Looking through the Jupiter docs I found https://junit.org/junit5/docs/current/user-guide/#extensions-intercepting-invocations where it is demonstrated how to do the actual test execution in the Swing EDT. This is exactly what you also need here for the work-around I used in Spock, so probably something like

@Override
public void interceptTestMethod(Invocation<Void> invocation,
        ReflectiveInvocationContext<Method> invocationContext,
        ExtensionContext extensionContext) throws Throwable {
    threadPool.submit(invocation::proceed).get();
}

Please, write an example how I can set your sollution

@Vampire
Copy link

Vampire commented Feb 9, 2024

You just quoted the example?

@igogoka
Copy link

igogoka commented Feb 9, 2024

You just quoted the example?
Where I should put this code "threadPool = Executors.newFixedThreadPool(runnerConfiguration.parallel.parallelExecutionConfiguration.parallelism)" to run fixed thread pools

@Vampire
Copy link

Vampire commented Feb 9, 2024

Wherever you need it.
Depends on how you adapt the work-around, for example which test engine you are using.
In my Spock project I have it in the start() method of a global extension.

@igogoka
Copy link

igogoka commented Feb 10, 2024

I use groovyVersion = '3.0.19', spockVersion = '2.4-M1-groovy-3.0', gebVersion = '6.0', seleniumVersion = '4.17.0'. In SpockConfig.groovy :
runner { parallel { enabled true fixed 4 } }
But "fixed 4" doesn't work. So you override start() method from IGlobalExtension? Can you show how did you that with all imports, please.

@asardaes asardaes linked a pull request Feb 10, 2024 that will close this issue
6 tasks
@igogoka
Copy link

igogoka commented Feb 14, 2024

Wherever you need it. Depends on how you adapt the work-around, for example which test engine you are using. In my Spock project I have it in the start() method of a global extension.

Can you show how you Override start() method?

@Vampire
Copy link

Vampire commented Feb 14, 2024

class FixJunit3108Extension implements IGlobalExtension {
    private static ExecutorService threadPool

    private final RunnerConfiguration runnerConfiguration

    FixJunit3108Extension(RunnerConfiguration runnerConfiguration) {
        this.runnerConfiguration = runnerConfiguration
    }

    @Override
    void start() {
        threadPool = Executors.newFixedThreadPool(runnerConfiguration.parallel.parallelExecutionConfiguration.parallelism)
    }

    @Override
    void visitSpec(SpecInfo spec) {
        // work-around for https://github.com/junit-team/junit5/issues/3108
        spec.allFeatures*.addIterationInterceptor {
            threadPool.submit(it::proceed).get()
        }
    }

    @Override
    void stop() {
        threadPool?.shutdown()
    }
}

@mufasa1
Copy link

mufasa1 commented Feb 17, 2024

IGlobalExten
class FixJunit3108Extension implements IGlobalExtension {
    private static ExecutorService threadPool

    private final RunnerConfiguration runnerConfiguration

    FixJunit3108Extension(RunnerConfiguration runnerConfiguration) {
        this.runnerConfiguration = runnerConfiguration
    }

    @Override
    void start() {
        threadPool = Executors.newFixedThreadPool(runnerConfiguration.parallel.parallelExecutionConfiguration.parallelism)
    }

    @Override
    void visitSpec(SpecInfo spec) {
        // work-around for https://github.com/junit-team/junit5/issues/3108
        spec.allFeatures*.addIterationInterceptor {
            threadPool.submit(it::proceed).get()
        }
    }

    @Override
    void stop() {
        threadPool?.shutdown()
    }
}

Hi, could you tell me where I can get this interface 'IGlobalExtension'? Is it in junit 5 anther issue branch?

@Vampire
Copy link

Vampire commented Feb 17, 2024

This is for Spock. I described above how you probably can do the same for Jupiter.

@igogoka
Copy link

igogoka commented Feb 28, 2024

class FixJunit3108Extension implements IGlobalExtension {
    private static ExecutorService threadPool

    private final RunnerConfiguration runnerConfiguration

    FixJunit3108Extension(RunnerConfiguration runnerConfiguration) {
        this.runnerConfiguration = runnerConfiguration
    }

    @Override
    void start() {
        threadPool = Executors.newFixedThreadPool(runnerConfiguration.parallel.parallelExecutionConfiguration.parallelism)
    }

    @Override
    void visitSpec(SpecInfo spec) {
        // work-around for https://github.com/junit-team/junit5/issues/3108
        spec.allFeatures*.addIterationInterceptor {
            threadPool.submit(it::proceed).get()
        }
    }

    @Override
    void stop() {
        threadPool?.shutdown()
    }
}

Thanks. I use your solution, but in Ui tests with Geb and Spock I have problem, that every feature (test in Spec)create new thread. How I can rewrite this method to create a new thread only for Spec ?

 @Override
    void visitSpec(SpecInfo spec) {
        spec.allFeatures*.addIterationInterceptor {
            threadPool.submit(it::proceed).get()
        }

    } 

@Vampire
Copy link

Vampire commented Feb 28, 2024

It does not create a new thread for every features.
It uses a thread pool with a fixed size of which the threads are reused.

@igogoka
Copy link

igogoka commented Feb 28, 2024

image
How I can set defaultExecutionMode = SAME_THREAD, beause it doesn't work from SpockConfig

@Vampire
Copy link

Vampire commented Feb 28, 2024

You are even more deviating from the topic here.
This is not a Spock-support channel and for every message you write, every watcher gets an e-mail. ;-)

@mufasa1
Copy link

mufasa1 commented Mar 4, 2024

@Vampire

This is for Spock. I described above how you probably can do the same for Jupiter.

Hi, I tried the description your wrote, but it didn't work. I would greatly appreciate it if you had a validated solution..

@Vampire
Copy link

Vampire commented Mar 4, 2024

I provided a full-blown example above, what more do you need?

@mufasa1
Copy link

mufasa1 commented Mar 4, 2024

I provided a full-blown example above, what more do you need?

I mean for Junit, not for Spock. I tried your description like this
'@OverRide
public void interceptTestMethod(Invocation invocation,
ReflectiveInvocationContext invocationContext,
ExtensionContext extensionContext) throws Throwable {
threadPool.submit(invocation::proceed).get();
}'

but it didn't work. What I need is a solution that has been validated.

@Vampire
Copy link

Vampire commented Mar 4, 2024

Works perfectly fine for me, except that in Jupiter proceed throws an exception, so you have to handle it.
But once you made the code compilable and ensured it is used, it does exactly what is intended.

@marcphilipp marcphilipp linked a pull request Oct 16, 2024 that will close this issue
6 tasks
alecgrieser added a commit to alecgrieser/fdb-record-layer that referenced this issue Dec 24, 2024
This begins the process of allowing for parallel test execution. A few issues were noticed that required some changes:

* Some tests were taking up a bunch of time and benefitted from parallelized execution of tests within the fixture. Those have been updated to CONCURRENT execution mode
* A known issue with JUnit (junit-team/junit5#3108) means that if one of the tests involves a future waiting, that can look to the `ForkJoinPool` like the thread is available for work stealing, so too many tests can end up being executed at once. A new test extension was added that has a semaphore, and that appears to be enough to stop extra tests from being executed
* The server was running out of batch GRV transactions, which resulted in tests failing with "batch GRV transactions exhausted". This mainly affected indexing tests, and I was able to resolve this by upping the transaction priority, for better or worse

There are still some issues:

* A number of tests were hitting deadline exceeded exceptions. It looked like some kind of weird concurrency stuff may be going on, because there were things happening like key space path resolution while creating a record store would include stack traces from closing the test key space path manager, which seems like things were sharing objects that shouldn't have been. This affected both the `:fdb-record-layer-core:test` and `:fdb-lucene:test` tasks
* The relational layer tests are not structured to allocate unique key spaces for each test, so they immediatetly hit concurrency problems when run in a parallelized manner
alecgrieser added a commit to alecgrieser/fdb-record-layer that referenced this issue Jan 21, 2025
This begins the process of allowing for parallel test execution. A few issues were noticed that required some changes:

* Some tests were taking up a bunch of time and benefitted from parallelized execution of tests within the fixture. Those have been updated to CONCURRENT execution mode
* A known issue with JUnit (junit-team/junit5#3108) means that if one of the tests involves a future waiting, that can look to the `ForkJoinPool` like the thread is available for work stealing, so too many tests can end up being executed at once. A new test extension was added that has a semaphore, and that appears to be enough to stop extra tests from being executed
* The server was running out of batch GRV transactions, which resulted in tests failing with "batch GRV transactions exhausted". This mainly affected indexing tests, and I was able to resolve this by upping the transaction priority, for better or worse

There are still some issues:

* A number of tests were hitting deadline exceeded exceptions. It looked like some kind of weird concurrency stuff may be going on, because there were things happening like key space path resolution while creating a record store would include stack traces from closing the test key space path manager, which seems like things were sharing objects that shouldn't have been. This affected both the `:fdb-record-layer-core:test` and `:fdb-lucene:test` tasks
* The relational layer tests are not structured to allocate unique key spaces for each test, so they immediatetly hit concurrency problems when run in a parallelized manner
alecgrieser added a commit to alecgrieser/fdb-record-layer that referenced this issue Jan 27, 2025
This begins the process of allowing for parallel test execution. A few issues were noticed that required some changes:

* Some tests were taking up a bunch of time and benefitted from parallelized execution of tests within the fixture. Those have been updated to CONCURRENT execution mode
* A known issue with JUnit (junit-team/junit5#3108) means that if one of the tests involves a future waiting, that can look to the `ForkJoinPool` like the thread is available for work stealing, so too many tests can end up being executed at once. A new test extension was added that has a semaphore, and that appears to be enough to stop extra tests from being executed
* The server was running out of batch GRV transactions, which resulted in tests failing with "batch GRV transactions exhausted". This mainly affected indexing tests, and I was able to resolve this by upping the transaction priority, for better or worse

There are still some issues:

* A number of tests were hitting deadline exceeded exceptions. It looked like some kind of weird concurrency stuff may be going on, because there were things happening like key space path resolution while creating a record store would include stack traces from closing the test key space path manager, which seems like things were sharing objects that shouldn't have been. This affected both the `:fdb-record-layer-core:test` and `:fdb-lucene:test` tasks
* The relational layer tests are not structured to allocate unique key spaces for each test, so they immediatetly hit concurrency problems when run in a parallelized manner
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

7 participants