
SentinelConnector sets KnownNodes empty during Redis failover, which causes the app to lose access to Redis #3017

Open
jiajunsu opened this issue Oct 16, 2024 · 8 comments
Labels
for: team-attention · status: help-wanted · type: feature

Comments


jiajunsu commented Oct 16, 2024

Bug Report

Current Behavior

In Redis Sentinel mode, Lettuce may refresh the topology to an empty node list if the connection to the Redis Sentinel is closed just after Lettuce receives a TopologyRefreshMessage. When that happens, Lettuce cannot recover until it receives the next TopologyRefreshMessage.

We have two Redis nodes, and Redis is running in Sentinel mode.
Assume the Redis nodes are named redis1 and redis2, and redis1 is the master at the beginning. We inject errors as below:

  1. ifconfig down the network interface on redis1, and wait for the failover to redis2 to finish
  2. ifconfig up the network interface on redis1, and wait for the failover back to redis1
  3. ifconfig down the network interface on redis2

At step 3, Redis Sentinel sends +sdown sentinel ... to the Lettuce client, which triggers Lettuce to execute SentinelConnector::getTopologyRefreshRunnable. At the same time, the connection between the Lettuce client and the Redis Sentinel is closed, and the following error occurs.

Lettuce logs below:

[WARN] io.lettuce.core.RedisClient:605 Cannot connect Redis Sentinel at redis://*:*: java.util.concurrent.CompletionException: io.netty.channel.StacklessClosedChannelException
[WARN] io.lettuce.core.masterreplica.SentinelConnector:117 Topology refresh returned no nodes from redis-sentinel://**@*:*,*:*?sentinelMasterId=mymaster

Since then, the app could not handle any read or write operation because knownNodes is empty.

org.springframework.data.redis.RedisSystemException: Redis exception; nested exception is io.lettuce.core.RedisException: Cannot determine a node to read (Known nodes: []) while setting io.lettuce.core.ReadFromImpl$ReadFromUpstreamPreferred

org.springframework.data.redis.RedisSystemException: Redis exception; nested exception is io.lettuce.core.RedisException: Master is currently unknown: []

By reviewing the git log, we found that this problem has existed since the commit "Do not fail Master/Slave topology refresh if a node is not available", and it still exists on the master branch.

When Requests#getRequest returns null or future.isDone() returns false for every node, the refresh emits an empty list, which clears MasterReplicaConnectionProvider#knownNodes, and the Lettuce client cannot recover until it receives the next TopologyRefreshMessage or the client process is restarted.

src/main/java/io/lettuce/core/masterreplica/Requests.java
    protected void onEmit(Emission<List<RedisNodeDescription>> emission) {

        List<RedisNodeDescription> result = new ArrayList<>();

        Map<RedisNodeDescription, Long> latencies = new HashMap<>();

        for (RedisNodeDescription node : nodes) {

            TimedAsyncCommand<String, String, String> future = getRequest(node.getUri());

            if (future == null || !future.isDone()) {  
                // if this expression is true for all nodes, it'll clear MasterReplicaConnectionProvider#knownNodes
                continue;
            }

            RedisNodeDescription redisNodeDescription = findNodeByUri(nodes, node.getUri());
            latencies.put(redisNodeDescription, future.duration());
            result.add(redisNodeDescription);
        }

        SortAction sortAction = SortAction.getSortAction();
        sortAction.sort(result, new LatencyComparator(latencies));

        emission.success(result);
    }

Environment

  • Lettuce version(s): 6.1.10.RELEASE
  • Redis version: 6.2.7
  • Spring Boot version: 2.7.18
  • JRE version: 11.0.8

Possible Solution

We've reviewed the Lettuce code; the code in SentinelConnector::getTopologyRefreshRunnable could be improved to avoid setting knownNodes to an empty list.

src/main/java/io/lettuce/core/masterreplica/SentinelConnector.java
    private Runnable getTopologyRefreshRunnable(MasterReplicaTopologyRefresh refresh,
            MasterReplicaConnectionProvider<K, V> connectionProvider) {

        return () -> {
            try {

                LOG.debug("Refreshing topology");
                refresh.getNodes(redisURI).subscribe(nodes -> {

                    EventRecorder.getInstance().record(new MasterReplicaTopologyChangedEvent(redisURI, nodes));

                    if (nodes.isEmpty()) {
                        LOG.warn("Topology refresh returned no nodes from {}", redisURI);
                        return;  // Add return here, to avoid setKnownNodes with empty nodes.
                    }

                    LOG.debug("New topology: {}", nodes);
                    connectionProvider.setKnownNodes(nodes);

                }, t -> LOG.error("Error during background refresh", t));

            } catch (Exception e) {
                LOG.error("Error during background refresh", e);
            }
        };
    }

jiajunsu commented Oct 18, 2024

Here is the modification to reproduce this issue.

01f96a6

Stack trace
[INFO] Running io.lettuce.core.masterreplica.MyTest
2024-10-18 17:11:17 [ERROR] [main] TEST: getNodes called count 1 (MasterReplicaTopologyRefresh:60)
2024-10-18 17:11:18 [ERROR] [lettuce-epollEventLoop-4-1] TEST: refresh getNodes initialNodes [RedisMasterReplicaNode [redisURI=redis://localhost:6482?timeout=10s, role=UPSTREAM], RedisMasterReplicaNode [redisURI=redis://localhost:6483?timeout=10s, role=REPLICA]] (MasterReplicaTopologyRefresh:65)
2024-10-18 17:11:19 [ERROR] [lettuce-epollEventLoop-4-2] TEST: nodeList after ping [RedisMasterReplicaNode [redisURI=redis://localhost:6483?timeout=10s, role=REPLICA], RedisMasterReplicaNode [redisURI=redis://localhost:6482?timeout=10s, role=UPSTREAM]] (MasterReplicaTopologyRefresh:85)
2024-10-18 17:11:19 [ERROR] [main] TEST: +sdown 1 (MyTest:47)
2024-10-18 17:11:24 [ERROR] [lettuce-eventExecutorLoop-1-4] TEST: getNodes called count 2 (MasterReplicaTopologyRefresh:60)
2024-10-18 17:11:24 [ERROR] [lettuce-epollEventLoop-4-6] TEST: refresh getNodes initialNodes [RedisMasterReplicaNode [redisURI=redis://localhost:6482?timeout=10s, role=UPSTREAM], RedisMasterReplicaNode [redisURI=redis://localhost:6483?timeout=10s, role=REPLICA]] (MasterReplicaTopologyRefresh:65)
2024-10-18 17:11:24 [ERROR] [lettuce-epollEventLoop-4-8] TEST: inject requestPing failing (MasterReplicaTopologyRefresh:75)
2024-10-18 17:11:24 [ERROR] [lettuce-epollEventLoop-4-8] TEST: Requests node redis://localhost:6482?timeout=10s set future to null  (Requests:63)
2024-10-18 17:11:24 [ERROR] [lettuce-epollEventLoop-4-8] TEST: Requests node redis://localhost:6483?timeout=10s set future to null  (Requests:63)
2024-10-18 17:11:24 [ERROR] [lettuce-epollEventLoop-4-8] TEST: nodeList after ping [] (MasterReplicaTopologyRefresh:85)
2024-10-18 17:11:24 [WARN ] [lettuce-epollEventLoop-4-8] Topology refresh returned no nodes from redis-sentinel://127.0.0.1,127.0.0.1:26380?sentinelMasterId=mymaster&timeout=10s (SentinelConnector:126)
2024-10-18 17:11:29 [ERROR] [main] TEST: +sdown 2 (MyTest:50)
2024-10-18 17:11:30 [ERROR] [main] TEST: +sdown 3 (MyTest:53)
2024-10-18 17:11:34 [ERROR] [lettuce-eventExecutorLoop-1-1] TEST: getNodes called count 3 (MasterReplicaTopologyRefresh:60)
2024-10-18 17:11:34 [ERROR] [lettuce-epollEventLoop-4-9] TEST: refresh getNodes initialNodes [RedisMasterReplicaNode [redisURI=redis://localhost:6482?timeout=10s, role=UPSTREAM], RedisMasterReplicaNode [redisURI=redis://localhost:6483?timeout=10s, role=REPLICA]] (MasterReplicaTopologyRefresh:65)
2024-10-18 17:11:34 [ERROR] [lettuce-epollEventLoop-4-10] TEST: inject requestPing failing (MasterReplicaTopologyRefresh:75)
2024-10-18 17:11:34 [ERROR] [lettuce-epollEventLoop-4-10] TEST: Requests node redis://localhost:6482?timeout=10s set future to null  (Requests:63)
2024-10-18 17:11:34 [ERROR] [lettuce-epollEventLoop-4-10] TEST: Requests node redis://localhost:6483?timeout=10s set future to null  (Requests:63)
2024-10-18 17:11:34 [ERROR] [lettuce-epollEventLoop-4-10] TEST: nodeList after ping [] (MasterReplicaTopologyRefresh:85)
2024-10-18 17:11:34 [WARN ] [lettuce-epollEventLoop-4-10] Topology refresh returned no nodes from redis-sentinel://127.0.0.1,127.0.0.1:26380?sentinelMasterId=mymaster&timeout=10s (SentinelConnector:126)
2024-10-18 17:11:40 [ERROR] [main] TEST: known nodes [] (MyTest:58)
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.33 s <<< FAILURE! -- in io.lettuce.core.masterreplica.MyTest
[ERROR] io.lettuce.core.masterreplica.MyTest.refreshEmptyNodes -- Time elapsed: 24.42 s <<< ERROR!
io.lettuce.core.RedisException: Master is currently unknown: []
        at io.lettuce.core.masterreplica.MasterReplicaConnectionProvider.getMaster(MasterReplicaConnectionProvider.java:305)
        at io.lettuce.core.masterreplica.MyTest.refreshEmptyNodes(MyTest.java:60)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)


tishun commented Oct 18, 2024

Hey @jiajunsu,

Thanks for the detailed analysis. Before we jump into the way the Lettuce driver works, let's first analyse the scenario and environment you are working with and why Sentinel returns an empty list of nodes in the first place.

How many sentinel processes are you using to test this scenario? Can you describe your complete environment?

tishun added the status: waiting-for-feedback label on Oct 18, 2024

jiajunsu commented Oct 21, 2024

@tishun

  1. With 2 sentinel nodes, 1 upstream node, and 1 replica node running.
  2. The empty node list is not returned by Sentinel. I've tested that possibility and found that an empty list cannot come from Sentinel (even if Sentinel responded with an empty list, the Lettuce code could not reach the WARN log "Topology refresh returned no nodes").
Code in src/main/java/io/lettuce/core/masterreplica/MasterReplicaTopologyRefresh.java:
    public Mono<List<RedisNodeDescription>> getNodes(RedisURI seed) {
        CompletableFuture<List<RedisNodeDescription>> future = topologyProvider.getNodesAsync();

        Mono<List<RedisNodeDescription>> initialNodes = Mono.fromFuture(future).doOnNext(nodes -> {
            applyAuthenticationCredentials(nodes, seed);
        });
        return initialNodes.map(this::getConnections)  // if initialNodes is empty, map `getConnections` will be skipped
                .flatMap(asyncConnections -> asyncConnections.asMono(seed.getTimeout(), eventExecutors))
                ...
    }

The empty list comes from connections.requestPing(). In detail:

  1. Sentinel sends +sdown to the Lettuce client.
  2. Lettuce fetches the node list from Sentinel successfully.
  3. Lettuce tries to PING the nodes, and at that moment the Redis nodes become unreachable over the network.
  4. As a result, we get an empty list in SentinelConnector#getTopologyRefreshRunnable.

This can be reproduced by running MyTest in my forked repo and commit. Just prepare the test environment with make prepare and make start; the test case uses the Sentinels from the Lettuce unit-test environment running on ports 26379 and 26380, while the upstream/replica ports are 6482 and 6483.


tishun commented Oct 21, 2024

Oh, okay, let me address these one at a time:

With 2 sentinel nodes, 1 upstream node, and 1 replica node running.

This is not a recommended setup for Redis Sentinel, as you need at least 3 sentinel nodes to achieve master promotion and avoid split-brain scenarios. See Example 1 and the explanation of why this is a broken setup.

The empty node list is not returned by Sentinel. I've tested that possibility and found that an empty list cannot come from Sentinel (even if Sentinel responded with an empty list, the Lettuce code could not reach the WARN log "Topology refresh returned no nodes").

My bad, I confused the topology refresh with the master/replica refresh. There are two types of topology refresh (check the SentinelConnector:connectAsync() method):

  1. MasterReplicaTopologyRefresh - handles the connection to the current master node and monitors if a replica node is promoted to a master node
  2. SentinelTopologyRefresh - handles the overall nodes in the current Sentinel topology (all masters and replicas)

The empty list comes from connections.requestPing(). In detail:

  1. Sentinel sends +sdown to the Lettuce client.
  2. Lettuce fetches the node list from Sentinel successfully.
  3. Lettuce tries to PING the nodes, and at that moment the Redis nodes become unreachable over the network.
  4. As a result, we get an empty list in SentinelConnector#getTopologyRefreshRunnable.

I think you are simulating an invalid state of the system, which the driver cannot (reasonably) handle. The fix in #3019 is also not valid, because we would end up with a list of nodes that we know we can't connect to. At the moment the refresh happens there is basically no valid node to connect to, as the masters returned from the NODES command are not replying to our PING. The only thing the driver can do is wait until the next refresh attempt. What else could it do? If it attempts to send commands to the last known nodes, they are sure to fail, as those nodes are not reachable.

At this point I think this is an invalid scenario to consider.

jiajunsu commented

This is not a recommended setup for Redis Sentinel, as you need at least 3 sentinel nodes to achieve master promotion and avoid split-brain scenarios. See Example 1 and the explanation of why this is a broken setup.

Sorry for the misleading response. In prod we do have 3 sentinel nodes. The 2-sentinel setup is only in my test environment for reproducing this issue.

At the moment the refresh happens there is basically no valid node to connect to, as the masters returned from the NODES command are not replying to our PING. The only thing the driver can do is wait until the next refresh attempt. What else could it do? If it attempts to send commands to the last known nodes, they are sure to fail, as those nodes are not reachable.

IMHO, a better approach could be:

  1. If some nodes (but not all) do not reply to PING, we could remove them from knownNodes; those nodes are probably faulty.
  2. If no node replies to PING, we could simply do nothing rather than clear the knownNodes list. It may not be a problem with the Redis nodes; the network of the host running Lettuce may be broken. After we clear knownNodes, if the sentinel doesn't send another psub message, the Lettuce client can never recover. (A minimal sketch of this policy follows below.)
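
A minimal sketch of the policy above, using plain strings instead of Lettuce's RedisNodeDescription; KnownNodesPolicy and nextKnownNodes are made-up names for illustration only, not part of the Lettuce API:

    import java.util.List;

    // Hypothetical helper that only illustrates the proposed policy.
    final class KnownNodesPolicy {

        // Decides what the connection provider should keep after a refresh.
        // current   = nodes currently in knownNodes
        // reachable = nodes that answered PING during this refresh
        static List<String> nextKnownNodes(List<String> current, List<String> reachable) {
            if (reachable.isEmpty()) {
                // Case 2: no node answered PING; keep the previous topology instead of
                // clearing it, since the outage may be on the client side.
                return current;
            }
            // Case 1: some nodes answered; adopt the verified list, which already
            // excludes the nodes that failed to reply.
            return reachable;
        }
    }

With this policy, a refresh where every node is unreachable becomes a no-op, which has the same effect as the return added in the SentinelConnector change proposed above.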

I also compared the Lettuce features between Redis Cluster and Redis Sentinel; maybe we could support an enablePeriodicRefresh feature in Sentinel mode, just like Cluster mode does. That could handle this scenario in a better way.


tishun commented Oct 22, 2024

Sorry for the misleading response. In prod we do have 3 sentinel nodes. The 2-sentinel setup is only in my test environment for reproducing this issue.

I understand. I do - however - feel this would not happen if you use 3 sentinel nodes.
Have you tried reproducing the problem this way?

IMHO, a better approach could be:

  1. If some nodes (but not all) do not reply to PING, we could remove them from knownNodes; those nodes are probably faulty.
  2. If no node replies to PING, we could simply do nothing rather than clear the knownNodes list. It may not be a problem with the Redis nodes; the network of the host running Lettuce may be broken. After we clear knownNodes, if the sentinel doesn't send another psub message, the Lettuce client can never recover.

  1. We are in agreement about point 1; this is also how the logic should work right now.
  2. Again, the question here is WHY there is a situation where no nodes are returned, and I believe this is an invalid state of the system. I disagree that we should leave the existing knownNodes as they are. If we do that, it would mean we assume the knownNodes will eventually come back up, and we have no way of knowing that.

Here is the modification to reproduce this issue.

01f96a6

This test scenario simply simulates what would happen if a PING command fails. I am not sure why the Sentinel would return a node that is unreachable. Could you verify that somehow? Could it be that the master is never promoted?

I also compared the Lettuce features between Redis Cluster and Redis Sentinel; maybe we could support an enablePeriodicRefresh feature in Sentinel mode, just like Cluster mode does. That could handle this scenario in a better way.

Assuming we add this periodic refresh - which node should we ask for what type of information?
Currently we depend on the Sentinel to push data about all master nodes, and on the master nodes to give us master/replica status. If we are to believe your research, the Sentinel pushes a topology refresh giving us a master node, and when we attempt to connect to it we get a list of nodes that we can't PING.


jiajunsu commented Oct 22, 2024

I do - however - feel this would not happen if you use 3 sentinel nodes.
Have you tried reproducing the problem this way?

The original case happened in our production environment, with 3 sentinel nodes. I tried to reproduce it by dropping the replica and 2 sentinel nodes, and it did not happen in about 100 attempts. That means the trigger condition is too extreme to reproduce in a real environment, so I had to inject exceptions into the Lettuce code and test it in the unit-test environment.

WHY is there a situation where no nodes are returned, and I believe this is an invalid state of the system. I disagree that we should leave the existing knownNodes as they are. If we do that, it would mean we assume the knownNodes will eventually come back up, and we have no way of knowing that.

Because I failed to reproduce it in a real environment, and without debug logs, it's hard to find the exact reason why the PING command fails. After reading and testing the Lettuce code, this seems to be the only plausible cause of that WARN log "Topology refresh returned no nodes". Considering the network latency between the Lettuce client and the Redis sentinel/master, a likely scenario is that the network between Lettuce and the sentinel/master went down just after the psub message was sent from the sentinel. But that's just a guess.
I also checked the Redis sentinel logs, and they record that the psub messages were sent at that time.

redis sentinel logs

This log is from sentinel-node3, in the same availability zone as the master node:

22:10:51.299 # +sdown sentinel sentinel-node1  @ mymaster 
22:10:51.371 # +sdown slave slave-node  @ mymaster 
22:10:51.587 # +sdown sentinel sentinel-node2  @ mymaster 

I agree that leaving unreachable nodes in knownNodes is not a good idea. Clearing knownNodes and waiting for the next psub message to refresh the topology data is not a good choice either. So maybe supporting periodicRefresh is a better option.

Assuming we add this periodic refresh - which node should we ask for what type of information?

We could make it work the same way as the callback for the psub message, SentinelConnector#getTopologyRefreshRunnable, getting master and replica nodes from the sentinel. Instead of only receiving psub messages from the sentinel passively, we could also refresh knownNodes periodically. That means that if this extreme scenario happened, the Lettuce client would get back to work as soon as the network recovered, without any SRE intervention (like restarting the client process, which is what we had to do in this case).
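
A rough illustration of that idea, assuming the scheduler simply reuses the runnable that the psub callback triggers today; the class name, method names, and the fixed period are made up for this sketch and are not existing Lettuce API:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Hypothetical sketch: run the same topology refresh that the psub callback uses,
    // but also on a fixed schedule, so a missed psub message no longer leaves the
    // client stuck with stale or empty knownNodes.
    final class PeriodicSentinelTopologyRefresh {

        private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // Schedules the given refresh runnable every periodSeconds seconds.
        void start(Runnable topologyRefreshRunnable, long periodSeconds) {
            scheduler.scheduleAtFixedRate(topologyRefreshRunnable, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        }

        // Stops the periodic refresh, e.g. when the Sentinel connection is closed.
        void stop() {
            scheduler.shutdownNow();
        }
    }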

tishun added the type: feature label and removed the status: waiting-for-feedback label on Oct 22, 2024

tishun commented Oct 22, 2024

This is a fair point. We could improve the resiliency of the driver if we schedule some action in this particular scenario.

I think we could model it in a similar manner to the ClusterTopologyRefreshScheduler.

This is no trivial task, however; I will have to think about its priority given our backlog and the fact that it is a bit of an exotic issue. Meanwhile, contributions in that direction are welcome.

tishun added the status: help-wanted and for: team-attention labels on Oct 22, 2024