-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BatchGet doesn't wait for BasePolicy.TotalTimeout #354
Comments
|
@khaf thanks for your quick reply. For batchCommand.Execute(), the cmd.conn, err = ifc.getConnection(policy)
if err != nil {
isClientTimeout = true
if err == ErrConnectionPoolEmpty {
// if the connection pool is empty, we still haven't tried
// the transaction to increase the iteration count.
iterations--
}
Logger.Debug("Node " + cmd.node.String() + ": " + err.Error())
continue
} |
It does. In the new commands, a new node will be selected (you never know why connection pool is empty, but it should mostly be due to a faulty node). The new command will be retried on a new node. |
@khaf The case for empty connection pool here is that we don't maintain min idle connection pool, because of historical limitation. So the new BatchGet after long time always needs waiting for a connection. Besides that, burst traffic always needs waiting for new connections. The question is new BatchGet doesn't wait for the BasePolicy.TotalTimeout. Here is the debug log below. So is there a way to wait until TotalTimeout for BatchGet?
I0607 07:36:39.613429 14051 command.go:1872] [aerospike] Node BB955E61AF14F0A 10.x.y.1:3000: Connection pool is empty. This happens when either all connection are in-use already, or no connections were available
I0607 07:36:39.614553 14051 command.go:1872] [aerospike] Node BB9D3C58669050A 10.x.y.2:3000: Connection pool is empty. This happens when either all connection are in-use already, or no connections were available
I0607 07:36:39.615713 14051 command.go:1872] [aerospike] Node BB955E61AF14F0A 10.x.y.1:3000: Connection pool is empty. This happens when either all connection are in-use already, or no connections were available
I0607 07:36:39.616863 14051 ... Get response: records=[nil] For aerospike client version: 3.1.1, and it seems latest code in master branch has the same logic. |
@khaf any comment? |
Sorry I missed your comment. While a new replica should be selected, the problem is that if that replica is also connection starved, you'll end up with errors. I'll have to think about how to deal with this situation. Will get back to you later this week. ps. Why are you so worried about using MinIdleConns? |
@khaf thanks for your comment. MinIdleConns is great feature. Most import, even enabling MinIdleConns, the conns will NOT be enough for traffic peak(10X compared to normal), because of so many app instances. |
@khaf any update from your side? |
I'm still working on this, trying multiple different approaches. The current workaround would be increasing |
After working on this issue for a few days, I have come to the conclusion that the best scalable way to handle this use case for now would be tweaking the value of the following policy attributes:
Any other solution adds a bottleneck for a different use case. The retry mechanism should handle these cases well, given enough time and retries. |
@khaf sounds good. thanks for the work |
For waiting for new connection,
aerospike-client-go/command.go
Line 2079 in 2a68420
For BatchGet, why doesn't it wait until BasePolicy.TotalTimeout?
The text was updated successfully, but these errors were encountered: