Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip]perf: cache block related data in ChainStore #13006

Draft
wants to merge 10 commits into
base: master
Choose a base branch
from
40 changes: 37 additions & 3 deletions chain/chain/src/store/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
use std::cell::RefCell;
use std::collections::hash_map::Entry;
use std::collections::{BTreeMap, HashMap, HashSet};
use std::io;
Expand Down Expand Up @@ -282,6 +283,13 @@
save_trie_changes: bool,
/// The maximum number of blocks for which a transaction is valid since its creation.
pub(super) transaction_validity_period: BlockHeightDelta,

// TODO make caches evict oldest blocks when growing above threashold.

Check warning on line 287 in chain/chain/src/store/mod.rs

View workflow job for this annotation

GitHub Actions / spellcheck

Unknown word (threashold)
// Refcell for inner mutability, has traits require getters to take immutable `&self`.
/// Blockheaders are indexed by their hash.

Check warning on line 289 in chain/chain/src/store/mod.rs

View workflow job for this annotation

GitHub Actions / spellcheck

Unknown word (Blockheaders)
block_cache: RefCell<HashMap<CryptoHash, Block>>,
block_header_cache: RefCell<HashMap<CryptoHash, BlockHeader>>,
block_hash_by_height_cache: RefCell<HashMap<BlockHeight, CryptoHash>>,
Copy link
Collaborator

@nagisa nagisa Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I think it might make more sense to maintain a view of the most recent blocks in memory (pending analysis into access patterns), rather than caching separate queries and trying to implement something like LRU-evicted access-based cache here.

This would among other things potentially make the cache maintenance somewhat more straightforward and easier to reason about. First, because the only place where the block information is written is in ChainStoreUpdate::finalize, so this is where we'd update the cache too. And second is because some stray query (which can happen due to these queries being dependent on the user input) for a block from yesterday ago wouldn't affect the happy path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than caching separate queries

After making ChainStoreUpdate::finalize update the cache, I tried to return cache.block(h).header() from get_block_header. That made a lot of tests fail and turns out some columns get special treatment in finalize or garbage collection. So the safest way might be to have a separate cache for each DBCol, even if it's a bit of a pain to handle. See the comment on ChainStoreCache.

The apply-range command doesn't trigger ChainStoreUpdate::finalize. I suppose finding a way to do it manually or to otherwise populate the caches for apply-range shouldn't be too hard. As a temporary workaround I'm inserting cache-missed data in get_block, ... into the cache. More details in the comments.

}

impl Deref for ChainStore {
Expand Down Expand Up @@ -314,6 +322,9 @@
latest_known: once_cell::unsync::OnceCell::new(),
save_trie_changes,
transaction_validity_period,
block_cache: RefCell::new(Default::default()),
block_header_cache: RefCell::new(Default::default()),
block_hash_by_height_cache: RefCell::new(Default::default()),
}
}

Expand Down Expand Up @@ -936,7 +947,14 @@

/// Get full block.
fn get_block(&self, h: &CryptoHash) -> Result<Block, Error> {
ChainStoreAdapter::get_block(self, h)
if let Some(block) = self.block_cache.borrow().get(&h) {
return Ok(block.clone());
}
let res = ChainStoreAdapter::get_block(self, h);
if let Ok(ref block) = res {
self.block_cache.borrow_mut().insert(h.clone(), block.clone());
}
res
}

/// Get full chunk.
Expand Down Expand Up @@ -987,12 +1005,28 @@

/// Get block header.
fn get_block_header(&self, h: &CryptoHash) -> Result<BlockHeader, Error> {
ChainStoreAdapter::get_block_header(self, h)
if let Some(header) = self.block_header_cache.borrow().get(h) {
return Ok(header.clone());
}
// Query block_cache and if there's a hit, return the block's header?
// Check if that can be done despite `Block` and `BlockHeader` being different `DBCol`.
let res = ChainStoreAdapter::get_block_header(self, h);
if let Ok(ref header) = res {
self.block_header_cache.borrow_mut().insert(h.clone(), header.clone());
}
res
}

/// Returns hash of the block on the main chain for given height.
fn get_block_hash_by_height(&self, height: BlockHeight) -> Result<CryptoHash, Error> {
ChainStoreAdapter::get_block_hash_by_height(self, height)
if let Some(hash) = self.block_hash_by_height_cache.borrow().get(&height) {
return Ok(hash.clone());
}
let res = ChainStoreAdapter::get_block_hash_by_height(self, height);
if let Ok(ref hash) = res {
self.block_hash_by_height_cache.borrow_mut().insert(height, hash.clone());
}
res
}

fn get_next_block_hash(&self, hash: &CryptoHash) -> Result<CryptoHash, Error> {
Expand Down
Loading