Implement PscMetadataClient #44

jeffxiang · 2024-08-28T22:38:34Z

Implement PscMetadataClient, which allows backend-agnostic metadata queries and operations, similar to Kafka's AdminClient. This implementation does not yet achieve API parity with KafkaAdminClient; it implements only the methods needed for the full Flink 1.15 upgrade and integration. The remaining API gaps can be filled in over time.

Backend-agnostic behavior is achieved with a design similar to that of PscConsumer and PscProducer: at runtime, a call to one of the APIs (e.g. listOffsets()) creates a backend-specific implementation of the metadata client (PscBackendMetadataClient) if one is not already registered in the PscMetadataClient instance for that particular backend and cluster. As such, each public API in PscMetadataClient accepts a TopicUri that can be "incomplete", i.e. it excludes the topic and includes everything up to and including the cluster. This is enough for the ServiceDiscoveryManager to identify the correct backend service and endpoints to connect to.
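To illustrate the lazy-creation pattern described above, here is a minimal sketch. It is not the code in this PR: LazyBackendClientRegistry, BackendClientSketch, and the URI parsing are hypothetical stand-ins, and the field positions are an assumption read off the sample URI format used in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical registry, not PSC's actual class: keeps one backend client
// per (backend, cluster) pair and creates it on first use.
class LazyBackendClientRegistry {
    private final Map<String, BackendClientSketch> clients = new ConcurrentHashMap<>();
    // Counts how many backend clients were actually constructed.
    final AtomicInteger creations = new AtomicInteger();

    // Derives a "backend:cluster" key from a URI shaped like
    // protocol:/rn:backend:env:region::cluster:[topic]. The field positions
    // are an assumption, not PSC's real parsing logic.
    static String backendAndCluster(String uri) {
        int sep = uri.indexOf(":/");
        if (sep < 0) {
            throw new IllegalArgumentException("missing protocol: " + uri);
        }
        String[] fields = uri.substring(sep + 2).split(":", -1);
        if (fields.length != 7 || !"rn".equals(fields[0])) {
            throw new IllegalArgumentException("unexpected URI shape: " + uri);
        }
        return fields[1] + ":" + fields[5];
    }

    // Returns the client registered for the URI's backend and cluster,
    // creating and registering it only if none exists yet.
    BackendClientSketch getOrCreate(String uri) {
        return clients.computeIfAbsent(backendAndCluster(uri), key -> {
            creations.incrementAndGet();
            BackendClientSketch client = () -> "backend client for " + key;
            return client;
        });
    }

    public static void main(String[] args) {
        LazyBackendClientRegistry registry = new LazyBackendClientRegistry();
        // An "incomplete" URI (no topic) and a full URI for the same cluster.
        BackendClientSketch first =
                registry.getOrCreate("plaintext:/rn:kafka:env:cloud_region::cluster:");
        BackendClientSketch second =
                registry.getOrCreate("plaintext:/rn:kafka:env:cloud_region::cluster:some_topic");
        System.out.println(first.describe());
        System.out.println("same instance: " + (first == second));
    }
}

// Hypothetical stand-in for a backend-specific metadata client.
interface BackendClientSketch {
    String describe();
}
```

In this sketch the topic component never takes part in the lookup key, which is why a URI that stops at the cluster is sufficient to pick the client.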

Unit and integration tests were added to verify the behavior and correctness of the APIs in PscMetadataClient under different scenarios.

vahidhashemian · 2024-08-30T21:59:12Z

psc/src/main/java/com/pinterest/psc/metadata/client/PscMetadataClient.java

+     * @throws InterruptedException
+     * @throws TimeoutException
+     */
+    public List<TopicRn> listTopicRns(TopicUri clusterUri, long timeout, TimeUnit timeUnit) throws ExecutionException, InterruptedException, TimeoutException {


Should we make a ClusterUri class, maybe as a subclass of TopicUri to avoid confusion? The current contract is that a topic URI has to have a valid topic as the last component, which I think we shouldn't change.

Plus, a variable named clusterUri of type TopicUri is rather confusing.

jeffxiang · 2024-09-11

Just took a stab at this to see how much work would be involved. It is non-trivial. Introducing a ClusterUri class will require major refactoring of existing service discovery classes and their logic. The current logic in TopicUri and its subclasses does not require a valid topic as the last component (surprisingly), and the service discovery mechanism works out of the box without the topic in TopicUri.

I tried two approaches to introduce ClusterUri: ClusterUri extends BaseTopicUri, and ClusterUri implements TopicUri. Both approaches still required a decent amount of effort to refactor some commonly used classes such as service discovery, and both introduce the risk of someone erroneously supplying a ClusterUri instance to the regular consumer/producer APIs that require a TopicUri (since TopicUri is now a supertype of ClusterUri).

A third approach is to create separate Cluster* classes equivalent to TopicRn, TopicUri, BaseTopicUri, and KafkaTopicUri. This will still require major refactoring of service discovery and is more work.

I do think that we should refactor at some point to introduce the concept of ClusterUri, but given the effort I think the refactoring itself deserves a separate PR.
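The subtyping risk mentioned for the first two approaches can be shown with a small sketch. Every name below (TopicUriSketch, ClusterUriSketch, subscribe) is a hypothetical stand-in rather than a PSC class; the point is only that the compiler accepts a cluster-level URI wherever the supertype is expected.

```java
// All names here are hypothetical stand-ins, not PSC classes.
class ClusterUriHazardSketch {
    // Stand-in for a consumer/producer API that expects a URI ending in a real topic.
    static String subscribe(TopicUriSketch uri) {
        return "subscribing to topic '" + uri.topic() + "'";
    }

    public static void main(String[] args) {
        TopicUriSketch topicUri =
                new TopicUriSketch("plaintext:/rn:kafka:env:cloud_region::cluster:", "my_topic");
        ClusterUriSketch clusterUri =
                new ClusterUriSketch("plaintext:/rn:kafka:env:cloud_region::cluster:");

        System.out.println(subscribe(topicUri));
        // Compiles without complaint because ClusterUriSketch is a subtype,
        // even though there is no topic to subscribe to.
        System.out.println(subscribe(clusterUri));
    }
}

// Stand-in for a topic-level URI: a cluster prefix plus a topic name.
class TopicUriSketch {
    private final String prefix;
    private final String topic;

    TopicUriSketch(String prefix, String topic) {
        this.prefix = prefix;
        this.topic = topic;
    }

    String prefix() {
        return prefix;
    }

    String topic() {
        return topic;
    }
}

// Stand-in for a cluster-level URI modeled as a subtype with an empty topic.
class ClusterUriSketch extends TopicUriSketch {
    ClusterUriSketch(String prefix) {
        super(prefix, "");
    }
}
```

Separate, unrelated Cluster* types (the third approach) would turn the second call into a compile error, at the cost of the larger refactoring described above.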

vahidhashemian · 2024-09-11

Thanks for looking into this. If too much effort is involved, we can leave it in the backlog.
For clusterUri, what is a sample input we'd be passing?

jeffxiang · 2024-09-11

A sample input for clusterUri would be:

plaintext:/rn:kafka:env:cloud_region::cluster:

Note that this is the same as getTopicUriPrefix() in BaseTopicUri: https://github.com/pinterest/psc/blob/3.2/psc/src/main/java/com/pinterest/psc/common/BaseTopicUri.java#L124-L126

which is basically constructed via the protocol + getTopicRnPrefixString(): https://github.com/pinterest/psc/blob/3.2/psc/src/main/java/com/pinterest/psc/common/BaseTopicUri.java#L37

and getTopicRnPrefixString() is just a regular TopicRn but without the topic: https://github.com/pinterest/psc/blob/main/psc/src/main/java/com/pinterest/psc/common/TopicRn.java#L43-L45
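As a rough illustration of how such a prefix is put together (protocol, then the resource name without the topic), here is a standalone sketch. It does not call PSC; the class, the method names, and the parameter names (service, environment, cloudRegion, classifier, cluster) are assumptions inferred from the sample string above.

```java
// Hypothetical helper, not part of PSC: rebuilds the sample cluster URI from its parts.
final class ClusterUriPrefixSketch {
    private ClusterUriPrefixSketch() {
    }

    // Joins the resource-name fields and leaves the trailing ':' where the topic would go.
    // The field names are guesses based on the sample string in this thread.
    static String topicRnPrefix(String service, String environment, String cloudRegion,
                                String classifier, String cluster) {
        return String.join(":", "rn", service, environment, cloudRegion, classifier, cluster) + ":";
    }

    // Prepends the protocol using the ":/" separator seen in the sample.
    static String clusterUri(String protocol, String topicRnPrefix) {
        return protocol + ":/" + topicRnPrefix;
    }

    public static void main(String[] args) {
        String rnPrefix = topicRnPrefix("kafka", "env", "cloud_region", "", "cluster");
        System.out.println(clusterUri("plaintext", rnPrefix));
    }
}
```

Appending a topic name to the result would give a complete topic URI under the same assumptions.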

vahidhashemian · 2024-09-11

Thanks! That makes sense. I was trying to see if we could rename clusterUri to better associate it with TopicUri as its type. But that's not a big concern. We can leave it as is.


jeffxiang added 2 commits August 27, 2024 17:10

WIP metadataClient impl

1d34288

WIP metadataClient API impl; finished listOffsets

961db58

jeffxiang requested a review from a team as a code owner August 28, 2024 22:38

jeffxiang added 4 commits August 28, 2024 18:58

Minor code cleanups

00562d5

Add test for listConsumerGroupOffsets

7b83ee8

Try to fix test

201a417

Add javadocs

f044efe

vahidhashemian reviewed Aug 30, 2024


Address comments

f033886

vahidhashemian approved these changes Sep 11, 2024


jeffxiang merged commit ce370c3 into 3.2 Sep 11, 2024
1 check passed

jeffxiang deleted the metadata_client branch September 11, 2024 19:41
