[wip] perf: cache block related data in ChainStore
#13006
base: master
Conversation
chain/chain/src/store/mod.rs
block_cache: RefCell<HashMap<CryptoHash, Block>>,
block_header_cache: RefCell<HashMap<CryptoHash, BlockHeader>>,
block_hash_by_height_cache: RefCell<HashMap<BlockHeight, CryptoHash>>,
FWIW I think it might make more sense to maintain a view of the most recent blocks in memory (pending analysis of access patterns), rather than caching separate queries and trying to implement something like an LRU-evicted, access-based cache here.

Among other things, this would potentially make cache maintenance more straightforward and easier to reason about. First, because the only place where block information is written is ChainStoreUpdate::finalize, so that is where we'd update the cache too. And second, because a stray query for a block from yesterday (which can happen, since these queries depend on user input) wouldn't affect the happy path.
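To illustrate the shape of that idea, here is a minimal sketch (not code from this PR) of a rolling in-memory view of recent blocks that is only mutated on the write path; the types are simplified stand-ins for the real near_primitives types.

```rust
use std::collections::VecDeque;

// Simplified stand-ins for the real near_primitives types, just to keep the
// sketch self-contained.
type CryptoHash = [u8; 32];

#[derive(Clone)]
struct Block {
    hash: CryptoHash,
}

/// A rolling window over the most recent blocks. It is only mutated where
/// blocks are written (ChainStoreUpdate::finalize in this PR), so a stray
/// read for an old block never disturbs it.
struct RecentBlocksView {
    capacity: usize,
    blocks: VecDeque<Block>, // oldest at the front, newest at the back
}

impl RecentBlocksView {
    fn new(capacity: usize) -> Self {
        Self { capacity, blocks: VecDeque::with_capacity(capacity) }
    }

    /// Called from the single write path when a block is written.
    fn on_block_written(&mut self, block: Block) {
        if self.blocks.len() == self.capacity {
            self.blocks.pop_front();
        }
        self.blocks.push_back(block);
    }

    /// Read-only lookup; on a miss the real code would fall back to the DB.
    fn get_block(&self, hash: &CryptoHash) -> Option<&Block> {
        self.blocks.iter().rev().find(|b| &b.hash == hash)
    }
}
```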
> rather than caching separate queries
After making ChainStoreUpdate::finalize update the cache, I tried to return cache.block(h).header() from get_block_header. That made a lot of tests fail; it turns out some columns get special treatment in finalize or garbage collection. So the safest way might be to have a separate cache for each DBCol, even if it's a bit of a pain to handle. See the comment on ChainStoreCache.
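For illustration only, a rough sketch of what one cache per column could look like; the stand-in types and the gc_* method names are made up for the example, not taken from the PR.

```rust
use std::collections::HashMap;

// Stand-ins for the real types; only the shape matters here.
type CryptoHash = [u8; 32];
#[derive(Clone)]
struct Block;
#[derive(Clone)]
struct BlockHeader;

/// One map per DB column, so each cache can mirror whatever special treatment
/// its column receives in finalize or garbage collection.
#[derive(Default)]
struct ChainStoreCache {
    blocks: HashMap<CryptoHash, Block>,        // mirrors DBCol::Block
    headers: HashMap<CryptoHash, BlockHeader>, // mirrors DBCol::BlockHeader
}

impl ChainStoreCache {
    /// GC of DBCol::Block only touches the block map ...
    fn gc_block(&mut self, hash: &CryptoHash) {
        self.blocks.remove(hash);
    }

    /// ... while DBCol::BlockHeader entries follow their own lifecycle.
    fn gc_block_header(&mut self, hash: &CryptoHash) {
        self.headers.remove(hash);
    }
}
```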
The apply-range command doesn't trigger ChainStoreUpdate::finalize. I suppose finding a way to do that manually, or to otherwise populate the caches for apply-range, shouldn't be too hard. As a temporary workaround I'm inserting cache-missed data into the cache in get_block, .... More details in the comments.
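As a concrete picture of that workaround, a minimal insert-on-miss get_block might look like the sketch below (stand-in types, stubbed RocksDB read; not the PR's actual implementation).

```rust
use std::cell::RefCell;
use std::collections::HashMap;

type CryptoHash = [u8; 32];
#[derive(Clone)]
struct Block;

struct ChainStore {
    // Interior mutability only because get_block takes &self; see the note
    // about apply-range not calling finalize.
    block_cache: RefCell<HashMap<CryptoHash, Block>>,
}

impl ChainStore {
    // Stub standing in for the actual RocksDB read.
    fn read_block_from_db(&self, _hash: &CryptoHash) -> Option<Block> {
        Some(Block)
    }

    /// Temporary workaround: on a cache miss, fall back to the database and
    /// insert the result so later lookups for the same hash hit memory.
    fn get_block(&self, hash: &CryptoHash) -> Option<Block> {
        if let Some(block) = self.block_cache.borrow().get(hash) {
            return Some(block.clone());
        }
        let block = self.read_block_from_db(hash)?;
        self.block_cache.borrow_mut().insert(*hash, block.clone());
        Some(block)
    }
}
```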
Codecov Report

All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master   #13006      +/-   ##
==========================================
+ Coverage   69.73%   69.91%   +0.18%
==========================================
  Files         859      859
  Lines      175933   176279     +346
  Branches   175933   176279     +346
==========================================
+ Hits       122683   123250     +567
+ Misses      48086    47870     -216
+ Partials     5164     5159       -5

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
// Interior mutability will not be required anymore once the apply-range command takes
// care of populating the cache. See comments below regarding apply-range not calling finalize.
Hm, yeah, apply-range can't finalize, as that would advance the chain, which is in direct opposition to what we need it to do when benchmarking. I'd say let's add a method that just fills the caches, which gets called from finalize and then separately by apply-range if it's in benchmark mode.
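A tiny sketch of that split, with stand-in types and a hypothetical fill_block_data name (the real API would live on ChainStoreCache or ChainStore and may look different).

```rust
use std::collections::HashMap;

type CryptoHash = [u8; 32];
struct Block;
struct BlockHeader;

#[derive(Default)]
struct ChainStoreCache {
    blocks: HashMap<CryptoHash, Block>,
    headers: HashMap<CryptoHash, BlockHeader>,
}

impl ChainStoreCache {
    /// The single place that fills the caches. ChainStoreUpdate::finalize would
    /// call this on the normal write path; in benchmark mode apply-range would
    /// call it directly, without finalizing (and so without advancing the chain).
    fn fill_block_data(&mut self, hash: CryptoHash, block: Block, header: BlockHeader) {
        self.blocks.insert(hash, block);
        self.headers.insert(hash, header);
    }
}
```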
blocks: RefCell<HashMap<CryptoHash, Block>>,              // DBCol::Block
block_headers: RefCell<HashMap<CryptoHash, BlockHeader>>, // DBCol::BlockHeader
Thought: it might make sense to look at the callers of get_block etc. to see how frequently they are paired with get_block_header. If that's a common occurrence (it definitely happens in check_transaction_validity IIRC), then we can reduce the lookup count by holding HashMap<CryptoHash, (Block, BlockHeader)> and providing a method that looks up both at once.

For storage I would suggest something similar to VecDeque<(CryptoHash, Block, BlockHeader)>. Of course this would require that the cache's view of the chain is always contiguous.

If ChainStoreCache is made to hold the BlockHeight of the first block in the VecDeque, then block_hashes_by_height becomes a super efficient deque[index + offset].0, while lookups by hash at least:

- always go through the same Map<CryptoHash, BlockHeight> before hitting the VecDeque;
- whenever there's a lookup for get_block, pull the related types into a cache line as well, for related lookups that can happen in nearby code paths.

All that said, this is quite unlikely to make a difference either way in terms of performance (though I would still default to BTreeMap), and the only really important thing we need to resolve here is getting rid of RefCell.
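For concreteness, a sketch of the contiguous layout described above; the types are simplified stand-ins and the name ContiguousChainCache is made up for the example.

```rust
use std::collections::{BTreeMap, VecDeque};

type CryptoHash = [u8; 32];
type BlockHeight = u64;
#[derive(Clone)]
struct Block;
#[derive(Clone)]
struct BlockHeader;

/// Contiguous view of the most recent blocks: entry i corresponds to height
/// first_height + i, so the cached range must have no gaps.
struct ContiguousChainCache {
    first_height: BlockHeight,
    entries: VecDeque<(CryptoHash, Block, BlockHeader)>,
    // Hash -> height index, so lookups by hash always go through the same map
    // before hitting the deque.
    heights: BTreeMap<CryptoHash, BlockHeight>,
}

impl ContiguousChainCache {
    fn index_of(&self, height: BlockHeight) -> Option<usize> {
        height
            .checked_sub(self.first_height)
            .map(|offset| offset as usize)
            .filter(|&i| i < self.entries.len())
    }

    /// block_hashes_by_height becomes a plain indexed access.
    fn block_hash_by_height(&self, height: BlockHeight) -> Option<&CryptoHash> {
        self.index_of(height).map(|i| &self.entries[i].0)
    }

    /// Lookup by hash returns block and header together, which helps callers
    /// that need both (e.g. transaction validity checks).
    fn block_and_header(&self, hash: &CryptoHash) -> Option<(&Block, &BlockHeader)> {
        let height = *self.heights.get(hash)?;
        let i = self.index_of(height)?;
        let (_, block, header) = &self.entries[i];
        Some((block, header))
    }
}
```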
ChainStore queries currently go to RocksDB, as described in #12997. In particular, the queries triggered by check_transaction_validity_period account for 21% in the profile.

Currently this PR is a proof of concept to validate and measure the performance improvements of adding a cache. On my local workstation I observed the following improvements for apply-range with native transfers (vs the recent commit on master this PR is based on):

Next steps
- apply-range performance
- Have ChainStoreUpdate::finalize trigger cache cleanup (and likewise gc_col)
- Other getters (get_chunk_extra, get_light_client_block, ...): if profiles show they take up a lot of time, consider caching them too
- Make the apply-range command populate the cache on ChainStore construction (it doesn't call ChainStoreUpdate::finalize); see the sketch below
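As a rough illustration of that last item, warming the cache at construction time might look something like this; the types are stand-ins and recent_blocks_from_db is a hypothetical stub for reading the chain tail from RocksDB.

```rust
use std::collections::HashMap;

type CryptoHash = [u8; 32];
#[derive(Clone)]
struct Block;

#[derive(Default)]
struct ChainStoreCache {
    blocks: HashMap<CryptoHash, Block>,
}

// Hypothetical stub: the real code would read the tail of DBCol::Block
// (and the other cached columns) from RocksDB.
fn recent_blocks_from_db(_depth: usize) -> Vec<(CryptoHash, Block)> {
    Vec::new()
}

impl ChainStoreCache {
    /// Populate the cache up front so apply-range, which never calls
    /// ChainStoreUpdate::finalize, still gets cache hits.
    fn warmed(depth: usize) -> Self {
        let mut cache = ChainStoreCache::default();
        for (hash, block) in recent_blocks_from_db(depth) {
            cache.blocks.insert(hash, block);
        }
        cache
    }
}
```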