Add operation on ShuffleSharding to filter READONLY ingesters #6517
+243
−51
What this PR does:
While testing the new READONLY state, we found an issue when a tenant has ingestion_tenant_shard_size lower than the number of ACTIVE ingesters, or when the ring has a high number of READONLY ingesters. For example:
Failed Push
Let's assume we have a ring with:
10 ACTIVE ingesters
50 READONLY ingesters
tenantA ingestion_tenant_shard_size of 20
The current subRing of this tenant can be created with only READONLY ingesters. In this case, DoBatch will fail because there are no healthy ingesters to send data to.
Early throttle
Let's assume we have a ring with:
80 ACTIVE ingesters
20 READONLY ingesters
tenantA ingestion_tenant_shard_size of 20
The current subRing can be created as a mix of ACTIVE and READONLY ingesters. This can produce a subRing of size 20 with only, say, 15 ACTIVE ingesters. The localLimit for each ingester is calculated using 20 as the shard size, but only 15 ingesters receive all the data, so the tenant is throttled before it reaches its configured limit. With this change, the subRing is built from just the 80 ACTIVE ingesters.
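The arithmetic behind the early throttle can be sketched as follows. This is a simplified illustration, not the actual limiter code: `perIngesterLimit` and `effectiveLimit` are hypothetical helpers, the global limit of 3000 is an arbitrary example value, and the replication factor is ignored.

```go
package main

import "fmt"

// perIngesterLimit is a hypothetical helper that splits a tenant-wide
// limit evenly across the ingesters in the tenant's shard.
func perIngesterLimit(globalLimit, shardSize int) int {
	if shardSize <= 0 {
		return globalLimit
	}
	return globalLimit / shardSize
}

// effectiveLimit is the total the tenant can actually use when only
// `writable` ingesters of the shard receive data.
func effectiveLimit(globalLimit, shardSize, writable int) int {
	return perIngesterLimit(globalLimit, shardSize) * writable
}

func main() {
	const globalLimit = 3000 // arbitrary example value

	// Shard of 20 with 5 READONLY members: only 15 ingesters take writes.
	fmt.Println(effectiveLimit(globalLimit, 20, 15))

	// Shard of 20 built from ACTIVE ingesters only.
	fmt.Println(effectiveLimit(globalLimit, 20, 20))
}
```

Under these assumptions, the mixed shard lets the tenant use only 15/20 of its configured limit before being throttled.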
This PR introduces a ShuffleShard ring filter by operation. We then create a new WriteShard operation to filter READONLY ingesters out of shuffle sharding subRings. The sharding cache is cleared as usual when the ring changes or when a READONLY status changes in the ring.
In the first scenario, we create a subRing with only the 10 ACTIVE ingesters, avoiding 5xx errors on Push.
In the last scenario, we create a subRing that takes into consideration only the 80 ACTIVE ingesters, preventing READONLY ingesters from skewing the limit calculation.
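The idea of filtering a shuffle shard by operation can be illustrated with a toy model of the first scenario. Everything here is an assumption made for illustration: the `State`, `Instance`, and `Operation` types, the `unfilteredOp` and `writeShardOp` values, and the `shuffleShard` function are hypothetical and do not mirror the real ring API. The selection simply walks the ring in order, whereas the real implementation picks instances pseudo-randomly per tenant.

```go
package main

import "fmt"

// State is a simplified ingester state (hypothetical type).
type State int

const (
	Active State = iota
	ReadOnly
)

// Instance is a simplified ring member (hypothetical type).
type Instance struct {
	ID    string
	State State
}

// Operation is a hypothetical stand-in for a ring operation: it lists
// which instance states are eligible for a shard.
type Operation struct {
	eligible map[State]bool
}

var (
	// unfilteredOp models the old behaviour: any state may join the shard.
	unfilteredOp = Operation{eligible: map[State]bool{Active: true, ReadOnly: true}}
	// writeShardOp models a write-path filter: READONLY instances are skipped.
	writeShardOp = Operation{eligible: map[State]bool{Active: true}}
)

// shuffleShard returns up to `size` instances eligible for op, in ring order.
func shuffleShard(ring []Instance, size int, op Operation) []Instance {
	if size <= 0 {
		return nil
	}
	shard := make([]Instance, 0, size)
	for _, inst := range ring {
		if len(shard) == size {
			break
		}
		if op.eligible[inst.State] {
			shard = append(shard, inst)
		}
	}
	return shard
}

// countActive reports how many shard members can accept writes.
func countActive(shard []Instance) int {
	n := 0
	for _, inst := range shard {
		if inst.State == Active {
			n++
		}
	}
	return n
}

// buildRing places READONLY instances first to model the worst case.
func buildRing(readOnly, active int) []Instance {
	ring := make([]Instance, 0, readOnly+active)
	for i := 0; i < readOnly; i++ {
		ring = append(ring, Instance{ID: fmt.Sprintf("ro-%d", i), State: ReadOnly})
	}
	for i := 0; i < active; i++ {
		ring = append(ring, Instance{ID: fmt.Sprintf("active-%d", i), State: Active})
	}
	return ring
}

func main() {
	ring := buildRing(50, 10) // 50 READONLY, 10 ACTIVE; shard size 20
	old := shuffleShard(ring, 20, unfilteredOp)
	filtered := shuffleShard(ring, 20, writeShardOp)
	fmt.Println(len(old), countActive(old))
	fmt.Println(len(filtered), countActive(filtered))
}
```

In this model, the unfiltered shard can end up with 20 members and no ACTIVE ingester, while the filtered shard contains only the 10 ACTIVE ingesters, so a push always has somewhere to go.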
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]