After restarting a node that previously created K/V ensembles, `riak_ensemble` starts those ensemble peers before `riak_kv` itself has come up. The peers then all crash when they try to access the AAE information ETS table as part of the `riak_kv_ensemble_backend:sync` logic. Not only does this spam the log, but in at least one observed case the crashes happened fast enough to force the `riak_ensemble_peer` supervisor to reach maximum restart intensity and bring the Riak node down.
We should ensure that ensemble peers are not started before the relevant application has started.
The suggested solution is to extend the `riak_ensemble_backend` behavior with a `ready_to_start` (or similarly named) callback that `riak_ensemble_manager` checks to gate the call to `riak_ensemble_peer_sup:start_peer`. This lets each backend define its own ready conditions.
For `riak_kv_ensemble_backend`, this new callback should be implemented to return `true` only when the `riak_kv` service is ready/up.
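A rough sketch of the proposed gating, to make the idea concrete. This is not the actual `riak_ensemble` or `riak_kv` code: the module name `ready_gate_sketch`, the function `maybe_start_peer/2`, and passing the list of running services as an argument are all assumptions made so the example is self-contained. A real backend would have to query the running node for its services instead.

```erlang
-module(ready_gate_sketch).
-export([ready_to_start/1, maybe_start_peer/2]).

%% Hypothetical backend callback: report ready only once riak_kv appears in
%% the list of services that are up on this node. The list is passed in here
%% (an assumption) so the sketch does not depend on a running Riak node.
-spec ready_to_start([atom()]) -> boolean().
ready_to_start(UpServices) ->
    lists:member(riak_kv, UpServices).

%% Hypothetical manager-side gate: StartFun stands in for the call to
%% riak_ensemble_peer_sup:start_peer and is only invoked when ReadyFun
%% (the backend's readiness check) returns true.
-spec maybe_start_peer(fun(() -> boolean()), fun(() -> term())) ->
          {started, term()} | not_ready.
maybe_start_peer(ReadyFun, StartFun) ->
    case ReadyFun() of
        true  -> {started, StartFun()};
        false -> not_ready
    end.
```

With this shape, the manager would simply skip the peer while the backend reports not ready and try again on a later pass, so the peer supervisor never sees the crash loop described above.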
/cc basho/riak#536