Provide ability for local cluster topology query #808
base: unstable
Conversation
Some applications might need to query a node's cluster slot ownership frequently. Take, for example, a cluster-wide scan of keys: the application needs to perform the scan on each cluster node and, as scans complete, keep track of which slots were already covered. This requires an extra call to CLUSTER SLOTS/SHARDS/NODES after each scan. While we have some optimizations for some of these commands (e.g. caching the CLUSTER SLOTS reply), that may not be a good enough solution for clients, which would still have to parse a potentially very long CLUSTER SLOTS response.

This PR suggests the following modifications:

1. CLUSTER NODES will accept an optional extra parameter 'node-id'. The user can also provide 'myself'. When node-id is provided, the reply will contain information ONLY for the specified node. Example:

```
> cluster nodes myself
"fcc9d096ad7dfa9f365c917941201a5bf037eb20 127.0.0.1:7000@17000 myself,slave 7cd62de6bb802699dad0c33961b2542c88afbeca 0 0 7 connected"
```

2. CLUSTER SLOTS will accept an optional extra parameter 'node-id'. The user can also provide 'myself'. When node-id is provided, the reply will contain information ONLY for the specified node. If the node is a replica (owns no slots), the information is taken from the node's current primary. Note that since a small cluster might have few nodes but a very large slots response (e.g. a fragmented slots range), we will also separately cache the local cluster slots response. Example:

```
> cluster slots myself
1) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "127.0.0.1"
      2) (integer) 7001
      3) "7330c7461582a447520745518f2d3a53ea2a5b0c"
      4) (empty array)
   4) 1) "127.0.0.1"
      2) (integer) 7005
      3) "f8b7e3188b0ec5bc6ea1a52342733dbb3dde2680"
      4) (empty array)
```

3. CLUSTER SHARDS will accept an optional extra parameter token 'myself'. When myself is provided, only the node's shard data will be returned as part of the reply. Example:

```
> cluster shards myself
1) 1) "slots"
   2) (empty array)
   3) "nodes"
   4) 1)  1) "id"
          2) "7330c7461582a447520745518f2d3a53ea2a5b0c"
          3) "port"
          4) (integer) 7001
          5) "ip"
          6) "127.0.0.1"
          7) "endpoint"
          8) "127.0.0.1"
          9) "role"
         10) "master"
         11) "replication-offset"
         12) (integer) 2716
         13) "health"
         14) "online"
      2)  1) "id"
          2) "f8b7e3188b0ec5bc6ea1a52342733dbb3dde2680"
          3) "port"
          4) (integer) 7005
          5) "ip"
          6) "127.0.0.1"
          7) "endpoint"
          8) "127.0.0.1"
          9) "role"
         10) "replica"
         11) "replication-offset"
         12) (integer) 2716
         13) "health"
         14) "online"
```

Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: ranshid <[email protected]>
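For illustration, here is a minimal sketch of the motivating cluster-wide scan built on the proposed `CLUSTER SLOTS myself` form. It assumes redis-py's generic `execute_command` returning the raw nested-array reply; the node list and helper name are illustrative only, not part of this PR:

```python
# Minimal sketch, assuming redis-py and the "CLUSTER SLOTS myself"
# form proposed in this PR. NODES and scan_whole_cluster are
# illustrative, not part of any library.
import redis

NODES = [("127.0.0.1", 7000), ("127.0.0.1", 7001), ("127.0.0.1", 7002)]

def scan_whole_cluster(pattern="*"):
    covered = set()  # (start, end) slot ranges already scanned
    keys = []
    for host, port in NODES:
        r = redis.Redis(host=host, port=port)
        # One cheap per-node call instead of parsing the full
        # cluster-wide CLUSTER SLOTS reply after every scan.
        # Assumes the raw nested-array reply: [[start, end, ...], ...]
        reply = r.execute_command("CLUSTER", "SLOTS", "myself")
        ranges = {(int(entry[0]), int(entry[1])) for entry in reply}
        if ranges <= covered:
            # Per the PR, a replica reports its primary's slots, so
            # this also skips replicas of already-scanned primaries.
            continue
        covered |= ranges
        cursor = 0
        while True:
            cursor, batch = r.scan(cursor=cursor, match=pattern)
            keys.extend(batch)
            if cursor == 0:
                break
    return keys
```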
Codecov Report
Attention: Patch coverage is

Additional details and impacted files
@@ Coverage Diff @@
## unstable #808 +/- ##
============================================
+ Coverage 70.41% 70.65% +0.24%
============================================
Files 112 112
Lines 61509 61577 +68
============================================
+ Hits 43309 43505 +196
+ Misses 18200 18072 -128
Signed-off-by: ranshid <[email protected]>
Signed-off-by: ranshid <[email protected]>
```
@@ -834,22 +834,19 @@ void clusterCommand(client *c) {

     if (c->argc == 2 && !strcasecmp(c->argv[1]->ptr, "help")) {
         clusterCommandHelp(c);
-    } else if (!strcasecmp(c->argv[1]->ptr, "nodes") && c->argc == 2) {
+    } else if (!strcasecmp(c->argv[1]->ptr, "nodes") && c->argc <= 3) {
```
NOTE to self: are all these checks really needed, given that we have command arity and schema validations?
…oduce-cluster-nodes-myself
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
@asafpamzn @barshaul Can you give feedback on what you think about the implementation?
As you mentioned, Ran, most clients don't store the 'node-id' and instead use 'host:port' as the identifier. So adding the 'node-id' alone wouldn't be particularly useful, as it would require client-side code changes. However, including the 'myself' flag makes it compatible without requiring changes to existing clients. So overall LGTM, and I support the follow-up host:port option, as it is more client- and user-friendly. Aside from executing a cluster-wide SCAN, are there other cases where you found this addition could be beneficial? @avifenesh worked on the cluster scan in GLIDE; maybe you have some insights to add?
Speaking from the perspective of a cluster-wide scan only, this will be very beneficial, and could potentially decrease the cost to almost that of a "regular" SCAN.
Personally, I think we should instead actually do #33.
Beyond the feature itself, I'm concerned about the current state of the cluster topology API.
I feel like maintaining all three APIs is going to be a pain in the longer term. I think we should figure out which API (slots, shards, or nodes) has the right interface and flexibility to keep improving in the future, and then make changes to that particular API. I felt like we were all aligning towards CLUSTER SLOTS.
Overall, this should make client developers' lives simpler (they don't need to figure out which one is better) and also force them to pick the one that has all the latest improvements.
I feel like the three have different roles and return different amounts of information.
@ranshid, I'm not entirely sure how common this specific use case will be, but I'm open to the idea. That said, if we're going to introduce it, I think we should avoid assuming the position of new arguments. Using key-value pairs for filters, like this:
CLUSTER SLOTS FILTER shard_id=<id> node_id=<id> ip=<ip> port=<port>
would allow us to add more filtering options later without worrying about argument order or compatibility.
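For contrast, here is a sketch of the two argument shapes from a client's point of view. Both calls are hypothetical: the key-value FILTER form is only a proposal from this thread, the 'myself' token is only proposed in this PR, and redis-py's generic `execute_command` is assumed:

```python
# Hypothetical invocations only; neither form below exists in a
# released Valkey. They sketch positional vs. key-value filters.
import redis

r = redis.Redis(host="127.0.0.1", port=7000)

# Positional form (this PR): the meaning of the extra argument
# depends entirely on its position.
r.execute_command("CLUSTER", "SLOTS", "myself")

# Key-value filter form (proposed above): filters are self-describing,
# so new ones can be added later and combined in any order without
# breaking existing callers.
r.execute_command("CLUSTER", "SLOTS", "FILTER", "ip=127.0.0.1", "port=7001")
```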
I feel like we learned a very painful lesson from
The typical way to add these types of arguments would be like:
Typically, all arguments are space delimited.
@valkey-io/core-team So, reading through this again, right now I don't think we should move forward with this proposal, since the use case isn't all that clear. For the SCAN use case, I think we have the alternative proposal in #33, which I think is a bit more practical for solving cluster scan.
Let's move to #33. Can we close this PR now?
Note about using node-id as a parameter
In this initial PR I suggested using the node-id to select which cluster node the data is requested for.
This allows clients to send the request to any of the cluster nodes at random, or to target a specific node without needing to know exactly which node it is (using myself).
However, clients usually do not use the cluster node-id, and most clients would probably prefer a host+port flavor of these subcommands.
I think it would be fairly easy to add a host/port flavor, e.g.:
cluster nodes 127.0.0.1 6397
or
cluster nodes some-node-hostname.com 7000
but the implementation would need to search for the node by ip/hostname, which will require some extra work; I think we can add it as a followup.
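Until such a host+port flavor exists, a client could bridge the gap itself: resolve host:port to a node-id once from the full CLUSTER NODES output, then use the cheap node-id form afterwards. A minimal sketch, assuming redis-py; node_id_for is a hypothetical helper and the addresses are illustrative:

```python
# Client-side workaround sketch, assuming redis-py and the node-id
# form proposed in this PR. node_id_for is a hypothetical helper,
# not part of any client library.
import redis

def node_id_for(r, host, port):
    # CLUSTER NODES lines look like:
    # "<id> <ip>:<port>@<cport> <flags> <primary> ... <link-state> <slots>"
    # Assumes the raw bulk-string reply.
    raw = r.execute_command("CLUSTER", "NODES")
    text = raw.decode() if isinstance(raw, bytes) else str(raw)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) > 1 and fields[1].split("@")[0] == f"{host}:{port}":
            return fields[0]
    return None

r = redis.Redis(host="127.0.0.1", port=7000)
nid = node_id_for(r, "127.0.0.1", 7001)
if nid is not None:
    # Single-node reply using the node-id flavor from this PR:
    print(r.execute_command("CLUSTER", "SLOTS", nid))
```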
TODO: