Limit data sent after dht differences #94
Merged
Applies the same logic that we have for "new ops" to the op ids returned by the DHT diff process. I believe this is necessary regardless of whether it's a complete solution to the "new agent joining an existing network" problem.
By leaving this unlimited, we'd just return the entire data set in one batch during the first gossip round with new peers. That's too much load on the peer you happen to hit. As I'm writing, I'm wondering if we should enforce a maximum value for this that peers can request...
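To make the idea concrete, here is a minimal sketch of capping the op ids returned from a diff by a byte budget. It is not the code in this PR: `OpRef`, `take_within_budget`, and both budget parameters are hypothetical names, and the `max_budget` clamp only illustrates the "maximum value that peers can request" idea floated above.

```rust
/// Hypothetical pairing of an op id with the size of the op data it refers to.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OpRef {
    pub op_id: u64,
    pub size_bytes: u32,
}

/// Select op refs in order until the next one would exceed the byte budget.
///
/// `requested_budget` is what the requesting peer asked for (assumption).
/// `max_budget` is a responder-side cap on that request (assumption).
/// Returns the selected refs and the unused part of the budget.
pub fn take_within_budget(
    ops: &[OpRef],
    requested_budget: u32,
    max_budget: u32,
) -> (Vec<OpRef>, u32) {
    // Never honour more than the responder is willing to serve.
    let mut remaining = requested_budget.min(max_budget);
    let mut selected = Vec::new();

    for op in ops {
        if op.size_bytes > remaining {
            // Stop at the first op that does not fit; later rounds pick up the rest.
            break;
        }
        remaining -= op.size_bytes;
        selected.push(op.clone());
    }

    (selected, remaining)
}
```

With three 40-byte ops and a budget of 100 bytes, this returns the first two ops and leaves 20 bytes unused. The third op is left for a later gossip round.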
By simply limiting how much data we get in one go, we avoid anything too complicated, like trying to decide which sectors or which time slices we want data from. Both are sparse, so we'd get unpredictable results that way. With a plain limit, we do our best to sync in each round. Then we have to hope that between one gossip round and the next, the other peer will fetch enough to reduce the number of sectors/slices that produce a diff on the next round. Even if it doesn't happen that quickly, that is the effect over time.
As soon as we stop having a diff for a time slice or a sector, we'll be able to progress to new areas of the DHT in a natural way. There will be some duplicate requests for op ids, but I think that's a reasonable thing to have happen. It's part of learning about the network.
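The convergence argument can be illustrated with a toy model. This is only a sketch under simplifying assumptions: op ids are plain `u64` values, the "diff" is a set difference against a single remote peer, the limit counts ops rather than bytes, and `rounds_until_synced` is a hypothetical name. It does not model sectors, time slices, or duplicate requests.

```rust
use std::collections::BTreeSet;

/// Repeatedly diff against `remote`, fetching at most `limit` missing ops per
/// round, until no diff remains. Returns the number of rounds that fetched
/// something, or `None` if `limit` is zero (no progress would be possible).
pub fn rounds_until_synced(
    local: &mut BTreeSet<u64>,
    remote: &BTreeSet<u64>,
    limit: usize,
) -> Option<usize> {
    if limit == 0 {
        return None;
    }

    let mut rounds = 0;
    loop {
        // One gossip round: find what we are missing, capped at `limit`.
        let batch: Vec<u64> = remote.difference(&*local).take(limit).copied().collect();
        if batch.is_empty() {
            // No diff left, so gossip can move on to other areas.
            return Some(rounds);
        }
        local.extend(batch);
        rounds += 1;
    }
}
```

In this model, a new peer missing 10 ops with a limit of 3 per round is fully synced after 4 rounds, so each round strictly shrinks the diff instead of pulling everything at once.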
This change really needs tests that exercise the 3 different paths.
But adding those tests requires modifying the testing in the same ways as #85.