Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chain: don't skip early transaction validity checks in state witnesses #12627

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

nagisa
Copy link
Collaborator

@nagisa nagisa commented Dec 16, 2024

The invalid transactions are still simply ignored, rather than fully invalidating the chunk in the relaxed chunk validation setup.

The invalid transactions are still simply ignored, rather than fully
invalidating the chunk in the relaxed chunk validation setup.
@@ -67,6 +67,9 @@ impl MainTransition {
pub struct PreValidationOutput {
pub main_transition_params: MainTransition,
pub implicit_transition_params: Vec<ImplicitTransitionParams>,
/// List of the transactions that are valid and should be validated. False transactions get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean "and should be executed"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess. I was thinking of "validated" as in "validation" step of the whole protocol thing, but I guess that makes the sentence pretty odd to read, doesn't it?

Comment on lines 380 to 385
let mut check = chain.transaction_validity_check(last_chunk_block.header().clone());
state_witness
.transactions
.iter()
.map(|t| check(t))
.collect::<Vec<_>>()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be the previous block.
AFAIU when checking validity for transactions included in chunk in block X, the validity check expects to receive the block before X.
Chunk witness proves the execution of chunk included in last_chunk_block, witness.chunk_header is the next chunk that will be applied in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. I was going to say I copied this from another callsite which was generating previous block header, but apparently my clipboard lost the nuance in between ^C and ^V :D

@@ -67,6 +67,9 @@ impl MainTransition {
pub struct PreValidationOutput {
pub main_transition_params: MainTransition,
pub implicit_transition_params: Vec<ImplicitTransitionParams>,
/// List of the transactions that are valid and should be validated. False transactions get
/// ignored.
pub transaction_validity: Vec<bool>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to have a more specific name, maybe is_transaction_within_validity_period. With the current one someone could think that this is a complete validity check, not only the validity period.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somewhat intentionally meant for this to not be super precise in order for this to make sense if it is extended with additional checks. But I guess people can rename the field when it comes to that 🤷

Copy link
Contributor

@pugachAG pugachAG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, just the issue with the parent block

"protocol_feature_relaxed_chunk_validation",
RelaxedChunkValidation,
current_protocol_version
) {
let mut check = chain.transaction_validity_check(last_chunk_block.header().clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_chunk_block is not the block we should use here.
on the chunk producer side prepare_transactions is called with:
&mut chain.transaction_validity_check(prev_block.header().clone())
so we should use parent block the way we do in validate_prepared_transactions:
let parent_block = chain.chain_store().get_block(chunk_header.prev_block_hash())?;

I suggest adding parent block to StateWitnessBlockRange, it seems like it belongs there

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to avoid touching StateWitnessBlockRange for now as something like

store.get_block_header(last_chunk_block.header().prev_hash())

is fallible, and I don't love the idea of introducing a hard to spot breaking change to the protocol :)

In principle this same sort of statement is already used later within the function. I similarly pondered moving it out to avoid duplicate work but decided against it due to a similar breaking change risk.

The code should be otherwise good now.

Comment on lines +389 to +390
if last_chunk_block.header().is_genesis() {
vec![true; state_witness.transactions.len()]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to add this to make test that start at the genesis to pass. There is no "prev_block_header" in this instance. The code further below that relies on the previous block has a similar special case for the genesis.

This vector currently ends up not being used, but if it did, it seemed to me that considering all transactions "valid" is reasonable enough: the transactions can't be much older than any useful transaction_validity_period. On the other hand if there are additional checks added to transaction_validity_check in the future, this might no longer hold… So this is not a particularly resilient piece of code in my eyes.

}
valid
});

let chunk_header = new_chunk_data.chunk_header.clone();
let NewChunkResult { apply_result: mut main_apply_result, .. } = apply_new_chunk(
ApplyChunkReason::ValidateChunkStateWitness,
Copy link
Contributor

@jancionear jancionear Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one more concern - this modifies the code for applying a chunk during state witness validation, but a chunk can also be applied in other ways. Nodes that track this shard will fetch the chunk and apply it independently, but I'm not sure if they'll filter the transactions in the same way. Applying a chunk with a different set of transactions could produce different results and the consensus could diverge.
Maybe it would be better to filter transactions in process_transactions during chunk application?

Copy link
Collaborator Author

@nagisa nagisa Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to implement exactly that at first, but this code is very much removed from the Chain data which is required to validate the transaction validity period.

Do you think it would be better to thread chain all the way to the bowels of the transaction runtime?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would be better to thread chain all the way to the bowels of the transaction runtime?

It'd be more foolproof, but I don't know how hard that is to do. We already have EpochInfoProvider in the runtime, maybe we could add a ChainInfoProvider as well?

Copy link
Collaborator Author

@nagisa nagisa Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I experimented with this over the past week -- it is pretty darn difficult to implement something similar to EpochInfoProvider. First complicating factor is that in many cases with our current code structure the processing requires : Send + Sync or 'static on this conceptual ChainProvider. To get the minimal required such ChainProvider be 'static, we could put the Chain::chain_store into an Arc and move that around. But then this Chain::chain_store is accessed both mutably and immutably. So it also requires a Mutex.

From there a lot of our code is written assuming a) cheap access to ChainStore; b) to hold relatively long-lived references to ChainStore. So what would seem like initially a straightforward change suddenly becomes a big snowball rolling off a hillside.

I have a half-done prototype here: master...nagisa:nearcore:chain-provider-style-tx-validation with the main missing pieces being: a) updates to ChainStoreUpdate to work with Arc<Mutex<ChainStore>>; b) actual invocation of the newly added validation methods; c) some compile errors here and there, some todo!()s in places not related to state witness today which need additional threading through of data.

Now, I largely just steamrolled through the change here -- perhaps there is some refactor to chain that would be broadly beneficial and enable this change as well. But I wasn't able to spot anything of the sort with my limited knowledge of the chain code. Maybe you see a reason why it would be better to go with something like this still?

Copy link
Collaborator Author

@nagisa nagisa Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the PR above to cargo check successfully. This means that I have threaded the ChainProvider through to the transaction runtime (Runtime::apply) from all callers.

There are five separate major call-points in regular code flow where transaction runtime has forced to thread ChainProvider through.

  • Chain::get_update_shard_job -- internally validates transactions (calls validate_chunk_transactions;)
  • stateless validation -- addressed here;
  • Chain::set_state_finalize -- does not operate on untrusted chunks(?);
  • catchup/state sync -- did not verify transaction validity before state witnesses and continues not validating them.
  • tests.

Out of these the catchup code smells most prone to problems here. Other than that everything else is in test code, utilities and tools.

@pugachAG pugachAG self-requested a review December 20, 2024 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants