Scripts for estimating validation bit rate #1240

Draft

nfrisby wants to merge 2 commits into main
Conversation

@nfrisby (Contributor) commented Sep 6, 2024

No description provided.

@nfrisby changed the title from "Scripts for estimating validation rate" to "Scripts for estimating validation bit rate" on Sep 6, 2024
@nfrisby (Contributor, Author) commented Sep 6, 2024

This PR is a draft.

  • I'm about to open a PR that adds block size to the benchmark-ledger-ops output, which will nicely simplify this script.
  • I should actually script-ify it, though the db-analyser run takes many hours.

@nfrisby (Contributor, Author) commented Sep 6, 2024

Here's what the resulting plot looks like for me, today.
[attached: plot image]

github-merge-queue bot pushed a commit that referenced this pull request on Sep 9, 2024:

    …k-header-size (#1241)

    For example, the analysis in PR #1240 can be simplified via this PR.
@nfrisby (Contributor, Author) commented Sep 9, 2024

Esgen also has some matplotlib plots that slice the data slightly differently and are a bit more information-dense.

I suggest we use this PR to decide on one presentation and merge only that.

@nfrisby (Contributor, Author) commented Sep 12, 2024

@amesgen My scripts divide time into 10-second chunks. If I recall correctly, your Python script instead slides a window of X blocks, for a few values of X.

I think using proper time is closer to the information we want, but it introduces the wrinkle of deciding how to account for blocks whose validation spans a boundary of the sliding window.

In my script, a 10-second window includes exactly those blocks that began validating within it, even if the last such block took 100 seconds to validate. In fact, if some block took more than 2×10 seconds to validate, there would necessarily be at least one 10-second chunk containing no blocks whatsoever, since validation is sequential and such a block's validation interval spans at least one entire chunk in which no block can begin. 🤔

My script uses the total validation time of those blocks as the denominator; it doesn't unsoundly assume it took exactly 10 seconds to validate them. This means the "windows" in my analysis are actually of varying size, as determined by the endpoints of the blocks' validation intervals, which is quite similar to yours, actually!
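For reference, here's a minimal sketch of that chunking, assuming blocks arrive as (start_seconds, validation_seconds, size_bytes) triples; the names and input format are illustrative, not the actual scripts in this PR:

```python
# A minimal sketch of the chunking described above; the input format
# and names are assumptions, not this PR's actual scripts.
from collections import defaultdict

CHUNK = 10.0  # seconds

def chunked_bit_rates(blocks):
    """Map each 10-second chunk to a bit rate.

    A block belongs to the chunk in which its validation *began*, and
    the denominator is the total validation time of the chunk's blocks,
    not the nominal 10 seconds.
    """
    sizes = defaultdict(int)
    durations = defaultdict(float)
    for start, dur, size in blocks:
        chunk = int(start // CHUNK)
        sizes[chunk] += size
        durations[chunk] += dur
    return {
        c: 8 * sizes[c] / durations[c]  # bits per second
        for c in sizes
        if durations[c] > 0
    }
```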

The following approach might be excessively accurate, but I think it would be most appropriate (at least for our current purposes) to partition a block's size across multiple (sliding) windows in proportion to how much of the block's total validation duration overlaps with each window. That's a relatively straightforward calculation, despite making the sliding-window logic unusually dynamic/awkward.
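To make that concrete, a minimal sketch of the proportional attribution, under the same assumed (start, duration, size) input format; names are illustrative:

```python
# A sketch of the proportional attribution suggested above; the window
# boundaries and field names are illustrative assumptions.
def overlap(a0, a1, b0, b1):
    """Length of the intersection of the intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))

def window_bytes(blocks, w0, w1):
    """Bytes attributed to the window [w0, w1]: each block contributes
    its size in proportion to how much of its validation interval
    overlaps the window."""
    total = 0.0
    for start, dur, size in blocks:
        if dur > 0:
            total += size * overlap(start, start + dur, w0, w1) / dur
    return total
```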

Edit: Having boldly written "most appropriate", I couldn't help but immediately start considering alternatives. I'll share one in my next comment.

@nfrisby (Contributor, Author) commented Sep 13, 2024

This is similar to my first idea, except I divided time into 10-second chunks 100 times over, each division offset by 0.1 seconds. Then, for each set of 100 "aligned" windows, I kept only the one with the maximum bit rate.

[attached: plot image]
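A minimal sketch of this offset-and-maximize variant, under the same assumed (start_seconds, validation_seconds, size_bytes) input format as before (illustrative, not this PR's actual scripts):

```python
# A sketch of the offset-and-maximize variant just described; the
# input format and names are assumptions.
from collections import defaultdict

def max_rate_per_window(blocks, chunk=10.0, offsets=100):
    """For each aligned set of `offsets` shifted windows, keep the
    maximum bit rate; shifts are chunk/offsets seconds apart (0.1 s here)."""
    best = {}
    step = chunk / offsets
    for i in range(offsets):
        shift = i * step
        sizes = defaultdict(int)
        durations = defaultdict(float)
        for start, dur, size in blocks:
            window = int((start - shift) // chunk)
            sizes[window] += size
            durations[window] += dur
        for window, size in sizes.items():
            if durations[window] > 0:
                rate = 8 * size / durations[window]  # bits per second
                best[window] = max(best.get(window, 0.0), rate)
    return best
```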

Something is still slightly off: the ratio I'm calculating is as if I needed to download the very blocks I'm validating while validating them. That's not quite right: I need to download the "next" blocks, not the ones I'm currently validating. I'll give this a bit more thought tomorrow.

@nfrisby (Contributor, Author) commented Sep 13, 2024

After my morning call with Esgen, we developed the plot generated by my most recent commit. Here are the two files that come out. You'll want to view them in a dedicated browser tab so that you can zoom to 100% height and then scroll along the x-axis.

See the README.md for details. Summary: each data point is the bit rate necessary to download blocks [X+B+1, X+2B] while blocks [X+1, X+B] are being validated. There's one data point per block, as the two B-sized windows slide.

plot-1.png (2.2 megabytes)
plot-2.png (3.2 megabytes)
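For concreteness, here's a minimal sketch of the calculation summarized above, assuming blocks in chain order as (validation_seconds, size_bytes) pairs; B and the input format are assumptions here, and the README.md in this PR remains the authoritative description:

```python
# A minimal sketch of the dual-window calculation summarized above;
# B and the input format are assumptions.
def required_bit_rates(blocks, B):
    """Data point X: the bit rate needed to download blocks
    X+B+1 .. X+2B during the time spent validating blocks X+1 .. X+B."""
    rates = []
    for x in range(len(blocks) - 2 * B + 1):
        validate = sum(dur for dur, _ in blocks[x : x + B])
        download = sum(size for _, size in blocks[x + B : x + 2 * B])
        if validate > 0:
            rates.append(8 * download / validate)  # bits per second
    return rates
```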

It's not trivial to relate this buffer size to the current code. The code explicitly buffers up to 10 blocks between the BlockFetch client and the ChainSel logic, but there is also an additional "buffer" of the bytes in flight with the BlockFetch peer. For a strong connection, that might be enough bytes to fit several max-size blocks; for example:
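A back-of-the-envelope for that in-flight buffer; the bandwidth, round-trip time, and max block size below are illustrative assumptions, not measurements from this PR:

```python
# Illustrative assumptions only, not measurements from this PR.
bandwidth_bps = 50e6        # assume a 50 Mbit/s connection
rtt_seconds = 0.3           # assume a 300 ms round trip
max_block_bytes = 90_112    # assume the ~88 KiB mainnet max block size

in_flight_bytes = bandwidth_bps / 8 * rtt_seconds     # ~1.9 MB in flight
print(in_flight_bytes / max_block_bytes)              # ~20 max-size blocks
```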
