[Bug] Key_Shared consumer with backlog causes the Pulsar broker to read a huge amount of data from BookKeeper but dispatch only a little, exhausting network bandwidth #23514

Open
2 of 3 tasks
mawenyu opened this issue Oct 25, 2024 · 1 comment
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments


mawenyu commented Oct 25, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

linux x64, java17, pulsar 3.0.4, pulsar c++ client 3.4.2

Minimal reproduce step

  1. Stop the key_shared consumer for 5 minutes.
  2. Start the key_shared consumer again.

What did you expect to see?

  1. The rate at which the broker dispatches data to the consumer is close to the rate at which it reads data from the bookie.
  2. The consumer finishes reading the backlog within a couple of minutes.

What did you see instead?

  1. The broker reads a lot of data from BookKeeper but dispatches only a little.

On the node that runs only the broker, network receive is very high, but only a little data is transmitted to the consumer.
[Screenshot: Snipaste_2024-10-25_14-55-27]

On the node that runs only BookKeeper, network transmit is 23 Gb/s, and the network bandwidth reaches the hardware limit.

  2. The consumer cannot catch up.

Anything else?

The BookKeeper disk read rate is very low, so the data is being served from the BookKeeper read cache. I think a bug may be causing the Pulsar broker to keep reading the same entries from BookKeeper, which exhausts the network bandwidth.
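
One rough way to quantify the symptom is the ratio of bytes the broker reads from BookKeeper to bytes it dispatches to consumers. The sketch below is only an illustration: `read_amplification` is a hypothetical helper, not part of Pulsar, and the sample numbers are made up rather than taken from the screenshots. A ratio near 1 would match the expected behavior, and a much larger ratio would be consistent with the same entries being read repeatedly.

```python
def read_amplification(bytes_read_from_bookie: float, bytes_dispatched: float) -> float:
    """Return how many bytes the broker read per byte it dispatched.

    Hypothetical helper for illustration; the inputs would come from whatever
    network or broker metrics are available over the same time window.
    """
    if bytes_read_from_bookie < 0 or bytes_dispatched < 0:
        raise ValueError("byte counts must be non-negative")
    if bytes_dispatched == 0:
        # Nothing was dispatched: amplification is unbounded if anything was read.
        return float("inf") if bytes_read_from_bookie > 0 else 0.0
    return bytes_read_from_bookie / bytes_dispatched


if __name__ == "__main__":
    # Made-up sample values for one measurement window (not measured data).
    read_bytes = 2_000_000_000
    dispatched_bytes = 50_000_000
    print(f"read amplification: {read_amplification(read_bytes, dispatched_bytes):.1f}x")
```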

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@mawenyu mawenyu added the type/bug The PR fixed a bug or issue reported a bug label Oct 25, 2024
@mawenyu mawenyu changed the title [Bug] key_shared consume with backlog cause pulsar broker read a huge amount data from bookkeeper but dispatch a little, cause network bandwith exhausted [Bug] key_shared consumer with backlog cause pulsar broker read a huge amount data from bookkeeper but dispatch a little, cause network bandwith exhausted Oct 25, 2024
Member

lhotari commented Oct 25, 2024

@mawenyu Please upgrade to Pulsar 3.0.7. You are facing a known issue that has been fixed by #22245 and #22533.

That will address the most severe issue. However, not all challenges are solved by that fix. Some further problems, such as #23200, have been addressed in Pulsar 4.0.0 together with many other improvements to Key_Shared subscriptions.

I'd recommend upgrading to Pulsar 4.0.0, where Key_Shared subscriptions have been significantly improved. There are details in the StreamNative blog post about Pulsar 4.0 and in the updated Pulsar documentation for Key_Shared subscription ordering guarantees and troubleshooting.
