
[wip]perf: cache block related data in ChainStore #13006

Draft · wants to merge 10 commits into master

Conversation

@mooori (Contributor) commented Feb 27, 2025

ChainStore queries currently go to rocksdb, as described in #12997. In particular, the queries triggered by check_transaction_validity_period account for 21% of the profile.

Currently this PR is a proof of concept to validate and measure the performance improvement of adding a cache. On my local workstation I observed the following improvements for apply-range with native transfers (vs. the master commit this PR is based on):

  • With 100 accounts: 29% increase from 60k to 77.5k TPS
  • With 10k accounts: 6% increase from 17.1k to 18.2k TPS
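
In essence this is a read-through cache in front of the rocksdb columns: look the value up in memory first, and on a miss read it from disk and memoize the result. A minimal sketch of that pattern (placeholder types stand in for nearcore's CryptoHash and Block; the actual diff keeps one map per DBCol):

use std::cell::RefCell;
use std::collections::HashMap;

// Simplified stand-ins for nearcore's CryptoHash and Block types.
type CryptoHash = [u8; 32];
#[derive(Clone)]
struct Block;

struct ChainStore {
    block_cache: RefCell<HashMap<CryptoHash, Block>>,
}

impl ChainStore {
    fn get_block(&self, hash: &CryptoHash) -> Option<Block> {
        // Fast path: repeated queries are served from memory.
        if let Some(block) = self.block_cache.borrow().get(hash) {
            return Some(block.clone());
        }
        // Slow path: fall back to the rocksdb column and memoize the result.
        let block = self.read_block_from_db(hash)?;
        self.block_cache.borrow_mut().insert(*hash, block.clone());
        Some(block)
    }

    fn read_block_from_db(&self, _hash: &CryptoHash) -> Option<Block> {
        None // placeholder for the existing DBCol::Block read
    }
}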

Next steps

  • Verify it improves apply-range performance
  • Check native-transfer benchmark performance. Due to other bottlenecks it might not lead to an improvement, but at least performance shouldn't degrade.
  • Add metrics for: distance of queried heights to head, cache hits/misses.
  • Run it for a while under synthetic and under mainnet traffic to observe metrics.
  • Based on observed metrics, pick a cache size/range. Make ChainStoreUpdate::finalize trigger cache cleanup (a sketch of this follows the list).
  • Integrate with garbage collection.
    • Add plumbing to clear caches automatically in gc_col
  • Add/extend tests.
  • Check if there are other scenarios in which block related data may be updated on disk. If any, make them update/clear the cache.
  • There are more get operations that could use caching (get_chunk_extra, get_light_client_block, ...). If profiles show they take up a lot of time, consider caching them too.
  • Make the apply-range command populate the cache on ChainStore construction (it doesn't call ChainStoreUpdate::finalize).
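
For the cache-cleanup item above, a hedged sketch of a height-bounded eviction triggered from ChainStoreUpdate::finalize (the names and the MAX_CACHED_HEIGHTS bound are illustrative, not part of this PR):

use std::collections::BTreeMap;

type CryptoHash = [u8; 32];
type BlockHeight = u64;

// Illustrative bound; the actual value would be picked from the observed metrics.
const MAX_CACHED_HEIGHTS: BlockHeight = 128;

struct ChainStoreCache {
    // Keyed by height so that evicting old entries is a cheap range split.
    block_hashes_by_height: BTreeMap<BlockHeight, CryptoHash>,
}

impl ChainStoreCache {
    // Called from ChainStoreUpdate::finalize after the head advances.
    fn evict_below_head(&mut self, head_height: BlockHeight) {
        let cutoff = head_height.saturating_sub(MAX_CACHED_HEIGHTS);
        // split_off keeps everything at or above the cutoff and drops the rest.
        self.block_hashes_by_height = self.block_hashes_by_height.split_off(&cutoff);
    }
}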

Comment on lines 290 to 292
block_cache: RefCell<HashMap<CryptoHash, Block>>,
block_header_cache: RefCell<HashMap<CryptoHash, BlockHeader>>,
block_hash_by_height_cache: RefCell<HashMap<BlockHeight, CryptoHash>>,
@nagisa (Collaborator) commented Feb 27, 2025

FWIW I think it might make more sense to maintain a view of the most recent blocks in memory (pending analysis into access patterns), rather than caching separate queries and trying to implement something like LRU-evicted access-based cache here.

This would among other things potentially make the cache maintenance somewhat more straightforward and easier to reason about. First, because the only place where block information is written is ChainStoreUpdate::finalize, so that is where we'd update the cache too. And second, because some stray query (which can happen because these queries depend on user input) for a block from yesterday wouldn't affect the happy path.
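
A hedged sketch of that idea (names are illustrative): the window is written only where finalize writes block data, and a miss simply falls through to rocksdb without touching the cache:

use std::collections::VecDeque;

type CryptoHash = [u8; 32];
struct Block;

struct RecentBlocks {
    // The most recent blocks, pushed only from ChainStoreUpdate::finalize.
    window: VecDeque<(CryptoHash, Block)>,
    capacity: usize,
}

impl RecentBlocks {
    // Write path: only finalize calls this, so the contents mirror the chain head.
    fn on_finalize(&mut self, hash: CryptoHash, block: Block) {
        if self.window.len() == self.capacity {
            self.window.pop_front();
        }
        self.window.push_back((hash, block));
    }

    // Read path: a miss (e.g. a stray query for yesterday's block) returns None
    // and the caller reads rocksdb directly, leaving the window untouched.
    fn get(&self, hash: &CryptoHash) -> Option<&Block> {
        self.window.iter().find(|(h, _)| h == hash).map(|(_, b)| b)
    }
}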

@mooori (Contributor, Author) replied:

rather than caching separate queries

After making ChainStoreUpdate::finalize update the cache, I tried to return cache.block(h).header() from get_block_header. That made a lot of tests fail, and it turns out some columns get special treatment in finalize or garbage collection. So the safest way might be to have a separate cache for each DBCol, even if it's a bit of a pain to handle. See the comment on ChainStoreCache.

The apply-range command doesn't trigger ChainStoreUpdate::finalize. I suppose finding a way to call it manually, or to otherwise populate the caches for apply-range, shouldn't be too hard. As a temporary workaround, I'm inserting cache-missed data into the cache in get_block, etc. More details in the comments.

codecov bot commented Feb 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.91%. Comparing base (0c499cb) to head (b27b806).
Report is 12 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13006      +/-   ##
==========================================
+ Coverage   69.73%   69.91%   +0.18%     
==========================================
  Files         859      859              
  Lines      175933   176279     +346     
  Branches   175933   176279     +346     
==========================================
+ Hits       122683   123250     +567     
+ Misses      48086    47870     -216     
+ Partials     5164     5159       -5     
Flag Coverage Δ
backward-compatibility 0.36% <0.00%> (-0.01%) ⬇️
db-migration 0.36% <0.00%> (-0.01%) ⬇️
genesis-check 1.43% <0.00%> (-0.01%) ⬇️
linux 69.45% <100.00%> (+0.34%) ⬆️
linux-nightly 69.47% <100.00%> (+0.17%) ⬆️
pytests 1.74% <0.00%> (-0.01%) ⬇️
sanity-checks 1.55% <0.00%> (-0.01%) ⬇️
unittests 69.75% <100.00%> (+0.18%) ⬆️
upgradability 0.36% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@mooori (Contributor, Author) commented Feb 28, 2025

ChainStoreUpdate::finalize now updates the cache and garbage collection clears items if required. The performance numbers mentioned in the PR description remain basically unchanged. The tasks for the next steps have been updated.

Comment on lines +295 to +296
// Interior mutability will not be required anymore once the apply-range command takes
// care of populating the cache. See comments below regarding apply-range not calling finalize.
Collaborator commented:

Hm, yeah, apply-range can't finalize as that would advance the chain, which is in direct opposition to what we need it to do when benchmarking. I'd say let's add a method that just fills the caches, which gets called from finalize and then separately by apply-range if it's in benchmark mode.
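
A hedged sketch of that split (update_caches is an illustrative name, not an existing method):

use std::collections::HashMap;

type CryptoHash = [u8; 32];
#[derive(Clone)]
struct Block;

struct ChainStoreUpdate {
    // Blocks this update is about to commit to rocksdb.
    new_blocks: Vec<(CryptoHash, Block)>,
}

impl ChainStoreUpdate {
    // Cache maintenance only; does not advance the chain, so apply-range can
    // call it directly when running in benchmark mode.
    fn update_caches(&self, cache: &mut HashMap<CryptoHash, Block>) {
        for (hash, block) in &self.new_blocks {
            cache.insert(*hash, block.clone());
        }
    }

    // The normal node path: refresh the cache as part of committing the update.
    fn finalize(self, cache: &mut HashMap<CryptoHash, Block>) {
        self.update_caches(cache);
        // ... existing finalize work: write to rocksdb, advance the head, ...
    }
}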

Comment on lines +305 to +306
blocks: RefCell<HashMap<CryptoHash, Block>>, // DBCol::Block
block_headers: RefCell<HashMap<CryptoHash, BlockHeader>>, // DBCol::BlockHeader
Collaborator commented:

Thought: might make sense to look at the callers of get_block etc. to see how frequently they are paired with get_block_header. If that's a common occurrence (definitely happens in check_transaction_validity IIRC) then we can reduce the lookup count by holding HashMap<CryptoHash, (Block, BlockHeader)> and providing a method that looks up both at once.

For storage I would suggest something similar to VecDeque<(CryptoHash, Block, BlockHeader)>. Of course this would require that the view of the chain is always contiguous for the cache.

If ChainStoreCache is made to hold the BlockHeight of the first block in the VecDeque, then block_hashes_by_height becomes a super efficient deque[height - first_height].0, while other lookups by hash at least:

  1. always make the same lookup in the same Map<CryptoHash, BlockHeight> before hitting the VecDeque;
  2. whenever there's a lookup for get_block, it would pull the related types into the same cache lines as well, for related lookups that can happen in nearby code paths.

All that said this is quite unlikely to make a difference either way in terms of performance (though I would still default to BTreeMap), and the only really important thing we need to resolve here is getting rid of RefCell.
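
A hedged sketch of that layout (illustrative, not code from this PR): a contiguous deque of recent blocks plus a single hash-to-height index, where a height lookup is plain index arithmetic:

use std::collections::{HashMap, VecDeque};

type CryptoHash = [u8; 32];
type BlockHeight = u64;
struct Block;
struct BlockHeader;

struct ChainStoreCache {
    // Height of the entry at index 0; the deque is contiguous by height.
    first_height: BlockHeight,
    blocks: VecDeque<(CryptoHash, Block, BlockHeader)>,
    // The one indirection shared by all hash-keyed lookups.
    height_by_hash: HashMap<CryptoHash, BlockHeight>,
}

impl ChainStoreCache {
    fn block_hash_by_height(&self, height: BlockHeight) -> Option<&CryptoHash> {
        let index = height.checked_sub(self.first_height)? as usize;
        self.blocks.get(index).map(|(hash, _, _)| hash)
    }

    // Block and header live in the same entry, so callers that need both
    // (e.g. transaction validity checks) pay for a single lookup.
    fn block_and_header(&self, hash: &CryptoHash) -> Option<(&Block, &BlockHeader)> {
        let height = *self.height_by_hash.get(hash)?;
        let index = height.checked_sub(self.first_height)? as usize;
        self.blocks.get(index).map(|(_, block, header)| (block, header))
    }
}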
