
[fix][meta] Use ForkJoinPool.commonPool to handle Metadata operations #22542

Closed

Conversation

heesung-sn
Contributor

Motivation

Observed that the metadata store executor thread can be blocked when handling chained future operations (e.g. while running handleGetResult). In particular, when a chained future operation needs to wait for another metadata access, the pending metadata get/put operations can deadlock: flush() is never called, because the executor thread itself is blocked.
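The deadlock described above can be sketched with a single-threaded executor standing in for the metadata store executor (hypothetical names, not Pulsar code): a callback that blocks waiting on a second metadata read never finishes, because completing that second read needs the very thread the callback is occupying.

```java
import java.util.concurrent.*;

public class MetadataDeadlockSketch {
    static String run() throws Exception {
        // Single-threaded executor standing in for the metadata store executor.
        ExecutorService metadataExecutor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> first =
                    CompletableFuture.supplyAsync(() -> "value-1", metadataExecutor);
            CompletableFuture<String> chained = first.thenApplyAsync(v -> {
                // The second "get" is queued behind this (now blocked) callback.
                CompletableFuture<String> second =
                        CompletableFuture.supplyAsync(() -> "value-2", metadataExecutor);
                try {
                    return second.get(500, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    return "DEADLOCK"; // second never completes
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }, metadataExecutor);
            return chained.get();
        } finally {
            metadataExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints DEADLOCK
    }
}
```

The 500 ms timeout is only there so the sketch terminates; without it the callback would block forever.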

Modifications

  • Use ForkJoinPool.commonPool threads to execute the metadata operations, instead of the metadata executor thread.
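A minimal sketch of the change above (the method name is a hypothetical stand-in, not Pulsar's actual code): the `*Async` variants of CompletableFuture run dependent actions on `ForkJoinPool.commonPool()` by default, leaving the metadata executor thread free for the next operation.

```java
import java.util.concurrent.*;

public class CommonPoolSketch {
    static CompletableFuture<String> handleGetResult(CompletableFuture<String> raw) {
        // thenApplyAsync without an explicit executor runs the callback on
        // ForkJoinPool.commonPool() instead of the metadata executor thread.
        return raw.thenApplyAsync(String::toUpperCase);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleGetResult(
                CompletableFuture.completedFuture("metadata")).get()); // METADATA
    }
}
```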

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: heesung-sn#65

@heesung-sn heesung-sn changed the title [fix][broker] Use ForkJoinPool.commonPool to handle Metadata operations [fix][meta] Use ForkJoinPool.commonPool to handle Metadata operations Apr 19, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 19, 2024
@lhotari
Member

lhotari commented Apr 19, 2024

I haven't reviewed this in detail yet. I think something like this has come up before. This might have an impact on correctness, since ordering guarantees would be lost.
It's worth postponing merging the PR until there's a review from @merlimat.

@heesung-sn heesung-sn force-pushed the metadata-handler-complete-async branch from b153708 to 8629e9d Compare April 19, 2024 20:24
@heesung-sn heesung-sn force-pushed the metadata-handler-complete-async branch from 8629e9d to d832553 Compare April 19, 2024 20:31
return store.get(prefix + "/e1").thenApply((ignore2) -> {
    verify.run();
    return null;
}).get();
Contributor

The problem here is a mix of sync and async calls, e.g. blocking inside a callback thread.

Using ForkJoinPool only moves the problem to a different place. It avoids deadlocking in a simple 1:1 situation, but once the ForkJoinPool thread count is maxed out, we are still subject to deadlock.

No matter how we handle this, it's going to be problematic, hence we have always had to be careful to avoid blocking calls to metadata.
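The point above can be illustrated with a small fixed pool standing in for a saturated ForkJoinPool: once every worker blocks on work that can only run on the same pool, the whole pool stops making progress. (A real ForkJoinPool spawns compensation threads for managed blockers, but only up to its maximumPoolSize; past that cap the same situation applies.)

```java
import java.util.concurrent.*;

public class PoolExhaustionSketch {
    static String run() throws Exception {
        int workers = 2;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch stuck = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                // Each worker blocks on a task that is queued behind all
                // the (blocked) workers, so it can never run.
                Future<String> inner = pool.submit(() -> "never runs");
                try {
                    inner.get(300, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    stuck.countDown(); // this worker is wedged
                } catch (Exception ignored) {
                }
            });
        }
        boolean deadlocked = stuck.await(5, TimeUnit.SECONDS);
        pool.shutdownNow();
        return deadlocked ? "DEADLOCK" : "ok";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints DEADLOCK
    }
}
```

The timeouts exist only so the demonstration terminates; in production the threads would simply stay blocked.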

@mattisonchao
Member

Questions:

  1. This looks like it just moves the issue to the common pool; we should avoid blocking calls for metadata.
  2. Did you consider that multi-threading may change the request order?

@heesung-sn
Contributor Author

heesung-sn commented Apr 20, 2024

Although the ForkJoinPool thread count can grow very large (max 32767; commonPool's maximumPoolSize is 256 by default), so exhaustion is unlikely (it should exceed the number of caller threads), I agree that we should not chain heavy operations (in this case, synchronous metadata calls) after a metadata operation.

I will instead fix the caller code so it does not trigger the synchronous metadata call.
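The fix described here can be sketched by composing the second metadata read asynchronously instead of blocking with get() inside the callback. The `Store` interface and the `"/e2"` path are hypothetical stand-ins for the test code quoted earlier.

```java
import java.util.concurrent.*;

public class NonBlockingChain {
    interface Store {
        CompletableFuture<String> get(String path);
    }

    static CompletableFuture<Void> chained(Store store, String prefix, Runnable verify) {
        return store.get(prefix + "/e1")
                // thenCompose chains the second read without ever blocking
                // a callback thread on get().
                .thenCompose(ignore -> store.get(prefix + "/e2"))
                .thenRun(verify);
    }

    static String run() throws Exception {
        StringBuilder out = new StringBuilder();
        Store store = path -> CompletableFuture.completedFuture("v:" + path);
        chained(store, "/test", () -> out.append("verified")).get();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints verified
    }
}
```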
