Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds: introduce generic xds clients xDS and LRS Client API signatures #8042

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

purnesh42H
Copy link
Contributor

@purnesh42H purnesh42H commented Jan 27, 2025

This is the next part of generic xds clients to be usable outside of grpc which builds on top of #8024. This change is adding the user API signatures for xDS and LRS client (without implementation) to communicate with xDS management server.

POC
Internal Design

RELEASE NOTES: None

@purnesh42H purnesh42H changed the title Generic xds client 2 interface xds: introduce xDS and LRS Client API signatures Jan 27, 2025
@purnesh42H purnesh42H requested review from dfawley and easwars January 27, 2025 18:50
@purnesh42H purnesh42H changed the title xds: introduce xDS and LRS Client API signatures xds: introduce generic xds client xDS and LRS Client API signatures Jan 27, 2025
Copy link

codecov bot commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 0% with 26 lines in your changes missing coverage. Please review.

Project coverage is 82.25%. Comparing base (c7db760) to head (a1fc72b).
Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
xds/internal/clients/lrsclient/load_store.go 0.00% 12 Missing ⚠️
xds/internal/clients/xdsclient/xdsclient.go 0.00% 8 Missing ⚠️
xds/internal/clients/lrsclient/lrsclient.go 0.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8042      +/-   ##
==========================================
+ Coverage   82.23%   82.25%   +0.01%     
==========================================
  Files         387      390       +3     
  Lines       38947    38971      +24     
==========================================
+ Hits        32028    32054      +26     
- Misses       5601     5609       +8     
+ Partials     1318     1308      -10     
Files with missing lines Coverage Δ
xds/internal/clients/config.go 97.61% <ø> (+4.43%) ⬆️
xds/internal/clients/lrsclient/lrsclient.go 0.00% <0.00%> (ø)
xds/internal/clients/xdsclient/xdsclient.go 0.00% <0.00%> (ø)
xds/internal/clients/lrsclient/load_store.go 0.00% <0.00%> (ø)

... and 30 files with indirect coverage changes

@purnesh42H purnesh42H changed the title xds: introduce generic xds client xDS and LRS Client API signatures xds: introduce generic xds clients xDS and LRS Client API signatures Jan 27, 2025
@purnesh42H purnesh42H added Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Feature New features or improvements in behavior labels Jan 27, 2025
@purnesh42H purnesh42H added this to the 1.71 Release milestone Jan 27, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch 3 times, most recently from 042b236 to f8f7af5 Compare January 30, 2025 13:56
@dfawley dfawley assigned purnesh42H and unassigned easwars and dfawley Jan 30, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from f8f7af5 to 415b2ca Compare January 31, 2025 16:14
@purnesh42H purnesh42H requested a review from dfawley February 3, 2025 07:56
@purnesh42H purnesh42H assigned dfawley and unassigned purnesh42H Feb 3, 2025
@easwars easwars self-assigned this Feb 3, 2025
@purnesh42H
Copy link
Contributor Author

purnesh42H commented Feb 4, 2025

@dfawley i did the godoc review and made following changes (see the last commit)

  • removed the helper methods on struct that are not being used to avoid exporting them. we can add later during implementation.
  • removed square brackets from docstrings because actual param and struct field already has the link to respective object definition.

Comment on lines 38 to 39
// Servers specifies a list of xDS servers to connect to. This field should
// be used only for old-style names without an authority.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave "old" and "style" out of the docstrings and describe what it is or what it is used for in specific terms instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have modified the documentation. One more thing i have added in both Authority and here about order of servers in the list for precedence. Let me know if thats okay. I remember there was a question recently about how the fallback servers are chosen. Should we mention the gRFC as well?

@dfawley dfawley assigned purnesh42H and unassigned dfawley Feb 6, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from 20e26f8 to c3e96f7 Compare February 7, 2025 10:14
}

// PerCluster returns the PerClusterReporter for the given cluster and service.
func (s *LoadStore) PerCluster(clusterName, serviceName string) PerClusterReporter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit. Maybe the receiver name can be ls instead of s. I don't know if we would have more methods on this type, but if we do, we would want to use the same receiver name on all of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@easwars
Copy link
Contributor

easwars commented Feb 21, 2025

There are a few open questions here:

  • How should the client(s) and the streams (watch or load) be closed? We need to finalize on an approach that is consistent between the clients (if possible), and one that would work with an underlying reference counted implementation.
    • In the existing xDS client code, the function to create the client returns a cancel function to close the client. And the method to create the watch and the load report also return a cancel function which closes the watch and the load report respectively.
    • In the current PR, the New function returns a client which has a Close method.
      • The WatchResource method on the client returns a cancel function to cancel the watch.
      • The ReportLoad method on the client returns a LoadStore, but there is no way to cancel a load reporting stream.
  • Name of the parameter in the watch callbacks to be invoked by the user to indicate completion of local processing. (minor)
  • How and where will type assertions happen on the WatchResource method?
    • Finalize the plan on how this API will be used. As I mentioned previously, in the current codebase, we have convenience methods for each type of watch, and users don't generally directly call the generic WatchResource method.

@easwars easwars assigned purnesh42H and unassigned easwars Feb 21, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from fd7d9d4 to 07c5597 Compare February 21, 2025 09:10
@purnesh42H purnesh42H requested a review from easwars February 21, 2025 13:27
@purnesh42H
Copy link
Contributor Author

There are a few open questions here:

  • How should the client(s) and the streams (watch or load) be closed? We need to finalize on an approach that is consistent between the clients (if possible), and one that would work with an underlying reference counted implementation.

    • In the existing xDS client code, the function to create the client returns a cancel function to close the client. And the method to create the watch and the load report also return a cancel function which closes the watch and the load report respectively.

    • In the current PR, the New function returns a client which has a Close method.

      • The WatchResource method on the client returns a cancel function to cancel the watch.
      • The ReportLoad method on the client returns a LoadStore, but there is no way to cancel a load reporting stream.
  • Name of the parameter in the watch callbacks to be invoked by the user to indicate completion of local processing. (minor)

  • How and where will type assertions happen on the WatchResource method?

    • Finalize the plan on how this API will be used. As I mentioned previously, in the current codebase, we have convenience methods for each type of watch, and users don't generally directly call the generic WatchResource method.
  1. Keep the returned cancel function in xDS client watcher. Have Stop() method in LoadStore to close the LRS stream.
  2. Keep a single name like 'done' for function and document that way
  3. Figure out if we can avoid user requiring type assertion in watcher callbacks. Try if we can use generics. If we can't then current way is okay.

@dfawley @easwars do these look good to you?

@easwars
Copy link
Contributor

easwars commented Feb 24, 2025

Yes for 1 and 2.

And as discussed offline, using generics for 3 might not be worth the effort. It might be a fun side project to play with generics and to see how far you can get, but for the purpose of this effort, we are OK with having type assertion in the watcher.

@purnesh42H purnesh42H assigned easwars and dfawley and unassigned purnesh42H Feb 25, 2025
// indicates that resource deletion errors can be ignored and cached
// resource data can be used.
// indicates that resource deletion errors from xDS management servers can
// be ignored and cached resource data can be used.
//
// This will be removed in the future once we implement gRFC A88
// and two new fields FailOnDataErrors and
// ResourceTimerIsTransientError will be introduced.
IgnoreResourceDeletion bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still bothering me.

The transport shouldn't need this, so it shouldn't be in the thing we pass to it. It's specific to the xds client. So it should really be more like:

package clients

type ServerIdentifier /* ? open to bikeshedding this name */ struct {
	URI string
	Extensions any
}

...

package xdsclient

type ServerConfig struct {
	ServerIdentifier clients.ServerIdentifier
	IgnoreResourceDeletion bool
}

If we don't do this now, then eventually we'll need to add all server configuration flags into this top-level struct for all types of clients.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong to me. Are you suggesting ServerIdentifier to be common struct between lrs and xds client? Because I remember @markdroth mentioning that even the flags are part of server hash. We do the same in internal client implementation.

Comment on lines -153 to 126
// ClientFeatures is a list of xDS features supported by this client.
// clientFeatures is a list of xDS features supported by this client.
// These features are set within the xDS client, but may be overridden only
// for testing purposes.
clientFeatures []string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also strikes me as "detail of xdsclient itself, so shouldn't be here", but I'm OK with leaving it for now...let's see how the rest of the code looks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah these will be populated by xds client implementation.

@@ -0,0 +1,57 @@
//revive:disable:unused-parameter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:(

Let's avoid annotations for tools polluting our code wherever possible.

For this one, can you not make the unused parameters nameless or _?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for this file and xdsclient because the params like locality are informational but I have removed from lrsclient.go


package lrsclient

// LoadStore keep track of the loads for multiple clusters and services that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"A LoadStore keeps track of"

And how about "collects" or "aggregates" instead of "keeps track of"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregates seems better

}

// PerCluster returns the PerClusterReporter for the given cluster and service.
func (ls *LoadStore) PerCluster(clusterName, serviceName string) PerClusterReporter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like ReporterForCluster?

Copy link
Contributor Author

@purnesh42H purnesh42H Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

// TransportBuilder is used to connect to the xDS management server.
TransportBuilder clients.TransportBuilder

// ResourceTypes is a map from resource type URLs to resource type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we delete TypeURL from ResourceType? (I feel like this came up during design review, too, but I'm not remembering if we decided to remove it, or keep it and why.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We pass around the ResourceType object and xdsclient implementation use typeURL while sending and receiving messages. So, we have to persist in ResourceType instance or pass around everywhere. I think having in ResourceType instance is more convenient.

// implementations. Each resource type URL uniquely identifies a specific
// kind of xDS resource, and the corresponding resource type implementation
// provides logic for parsing, validating, and processing resources of that
// type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really helpful to add an example here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean example of mapping like below?

listenerType = xdsclient.ResourceType{
		TypeURL:                    "type.googleapis.com/envoy.service.discovery.v3.Resource",
		TypeName:                   ListenerResourceTypeName,
		AllResourcesRequiredInSotW: true,
		Decoder:                    listenerDecoder{},
	}
resourceTypes["url"] = ListenerResourceType


// Authority contains configuration for an xDS control plane authority.
//
// See: https://github.com/grpc/grpc/blob/master/doc/grpc_xds_bootstrap_format.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link doesn't seem that helpful to me -- at least, it's a big-ish doc with a bunch of other stuff in it, and doesn't link to a section about the topic.

It doesn't seem like anyone actually uses federation, because I can't find any good documentation on it, either.

This is about all I can find in Envoy's docs, too:

https://github.com/cncf/xds/blob/main/proposals/TP1-xds-transport-next.md

https://www.envoyproxy.io/docs/envoy/latest/xds/core/v3/resource_locator.proto#xds-core-v3-resourcelocator

Copy link
Contributor Author

@purnesh42H purnesh42H Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have put https://www.envoyproxy.io/docs/envoy/latest/xds/core/v3/resource_locator.proto#xds-core-v3-resourcelocator as that makes more sense as someone who is using authority map would be familiar with xdstp

Comment on lines 64 to 71
// The order of the servers in this list reflects the order of preference
// of the data returned by those servers. xDS client use the first
// available server from the list.
//
// See gRFC A71 for more details on fallback behavior when the primary
// xDS server is unavailable.
//
// gRFC A71: https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The order of the servers in this list reflects the order of preference
// of the data returned by those servers. xDS client use the first
// available server from the list.
//
// See gRFC A71 for more details on fallback behavior when the primary
// xDS server is unavailable.
//
// gRFC A71: https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md
// See Config.Servers for more details.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// gRFC A71: https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md
XDSServers []clients.ServerConfig

// Extensions can be populated with arbitrary authority-specific data to be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

What happens when an Authority isn't used for the resource?

Copy link
Contributor Author

@purnesh42H purnesh42H Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to think of any business use case for extensions in authority but in the design discussion we had agreed on having structs for all config types. For grpc name resolver we need ClientListenerResourceNameTemplate in authority.

@dfawley dfawley assigned purnesh42H and unassigned dfawley Feb 25, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from 6091dd8 to a1fc72b Compare February 26, 2025 08:56
@purnesh42H purnesh42H requested a review from dfawley February 26, 2025 15:57
@purnesh42H purnesh42H assigned dfawley and unassigned purnesh42H Feb 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants