Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArchivalMutatorSet: Fix get_mutator_set_update_to_tip #227

Merged
merged 1 commit into from
Nov 6, 2024

Conversation

Sword-Smith
Copy link
Member

@Sword-Smith Sword-Smith commented Nov 6, 2024

Fix a problem in get_mutator_set_update_to_tip where the wrong removal records were returned.

The function attempts to find a MutatorSetUpdate that transforms the mutator set from a state B to a later state A where B and A can be separated by an arbitrary number of blocks. The problem was the the removal records were collecting from all blocks between B and A. But the MutatorSetUpdate data structure needs the removal records valid for block B.

To fix this, we make ephemeral changes to the archival mutator set to assemble the required information. And then roll those ephemeral changes back once the information is gathered.

This closes #225.

A solution where a "snapshot" of the archival mutator set was returned -- an archival mutator set that could read from the databases but only write to cache, not databases -- would be a more elegant and robust solution, I think. But that seems like a rather big rewrite.

Since a write-lock is held over the ArchivalMutatorSet throughout this operation, I believe that the operation is safe and should not lead to data corruption.

Prior to this commit, the test models::state::archival_state::archival_state_tests::update_mutator_set_rollback_many_blocks_multiple_inputs_outputs_test was flaky. With this commit, that is no longer the case.

Copy link
Collaborator

@dan-da dan-da left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm overall.

I requested one change to remove clear_cache() from StorageVecBase trait, where I don't think it belongs. I didn't review the actual mutator-set mods in great detail as I'm not clear on how all that works anyway, so I trust your judgement on that.

I worry any time we do a lot of work inside an &mut self method, for concurrency reasons. at least we have the log-slow-write-lock feature now we can check for any serious problems.

src/models/state/wallet/rusty_wallet_database.rs Outdated Show resolved Hide resolved
src/util_types/mutator_set/archival_mmr.rs Outdated Show resolved Hide resolved
src/util_types/mutator_set/chunk_dictionary.rs Outdated Show resolved Hide resolved
@@ -62,6 +62,9 @@ pub trait StorageVecBase<T: Send> {
/// never an intermediate state.
async fn set_many(&mut self, key_vals: impl IntoIterator<Item = (Index, T)> + Send);

/// Clear non-persisted values
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having a clear_cache() method in the trait feels gross/wrong because an internal cache is an implementation detail of certain implementors of the trait.

no other StorageVecBase trait method refers to a cache, so if one looks only at StorageVecBase in isolation, the clear_cache() method sticks out like a sore thumb and one is left wondering "what cache?".

I think its better to remove this from the trait and just have it as a pub fn in impl DbtVec that actually uses a cache. OrdinaryVec for example does not have a cache, so the method doesn't make sense for it.

This is a somewhat pedantic point because DbtVec is really the main reason this trait exists, but still I'd like to keep the trait api as clean as we can for understandability, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding this method on the trait, means I have to add this impl block in archival_mmr, which I'm OK with.

impl ArchivalMmr<DbtVec<Digest>> {
    /// Delete ephemeral (cache) values, without persisting them.
    ///
    /// Can be used to roll-back ephemeral values to a persisted state.
    pub(crate) async fn delete_cache(&mut self) {
        self.digests.delete_cache().await;
    }
}

@@ -718,13 +720,12 @@ impl ArchivalState {
/// max search depth, as it loads all the blocks in the search path into
/// memory. A max search depth of 0 means that only the tip is checked.
pub(crate) async fn get_mutator_set_update_to_tip(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. perhaps consider changing the name of this fn. It seems odd to have a fn prefixed with get_ that mutates self.

  2. just a note regarding the change to &mut self. any time we perform a lot of db read operations inside a write-lock, it is potentially bad for concurrency as all readers must wait including rpc calls. Hopefully we can eventually refactor such methods into separate read and write methods where the read method only acquires read lock for gathering updates and the write-lock is held as briefly as possible to write out the updates. The tricky part though is atomicity over the entire read+write. (not proposing a change here at this time)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes a &mut self but does not actually mutate it -- as in it's the same mutator set before and after the call. It does persist any cached values but from a mathematical perspective, it's the same mutator set.

@Sword-Smith Sword-Smith force-pushed the _225-fix-get_mutator_set_update_to_tip branch from de24f7a to 17c628e Compare November 6, 2024 22:09
Fix a problem in get_mutator_set_update_to_tip where the wrong removal
records were returned.

The function attempts to find a `MutatorSetUpdate` that transforms the
mutator set from a state B to a later state A where B and A can be
separated by an arbitrary number of blocks. The problem was the the
removal records were collecting from all blocks between B and A. But the
MutatorSetUpdate data structure needs the removal records valid for
block B.

To fix this, we make ephemeral changes to the archival mutator set to
assemble the required information. And then roll those ephemeral changes
back once the information is gathered.

This commit includes a test that the function `get_mutator_set_update_to_tip`
does not mutate the archival mutator set.

This closes #225.
@Sword-Smith Sword-Smith force-pushed the _225-fix-get_mutator_set_update_to_tip branch from 17c628e to 9a9c76b Compare November 6, 2024 22:20
@Sword-Smith Sword-Smith merged commit e506ea8 into master Nov 6, 2024
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug in get_mutator_set_update_to_tip
2 participants