Interop: make op-node play back old data to sync op-supervisor #12784
Comments
This is fixed via https://github.com/ethereum-optimism/optimism/pull/12818/files
Step 1: Reproduction of Bug
Upon startup, we see that the supervisor is unable to proceed:
(Making this note because I forgot the particulars of this issue and had to rediscover them.) The issue here has nothing to do with logs, which sync and continue to sync without issue when the supervisor is restarted. So the chain_processor as a whole is not faulty, as far as I can tell. Rather, this is about marking the
I expanded the logging during this error, since we were only emitting the
Pulling the data out: New Block Attempt:
Existing Block:
This makes sense, unfortunately. It's because the L1 derivation information has advanced, and the
The Supervisor should be able to reconstruct and repair this data when we detect this error. It can achieve this by:
This should recover the supervisor by forcing a backfill of data for the L1 blocks which don't contain L2 updates.
@protolambda this suggestion seems very destructive to the node, but seeing as the Supervisor only has outbound connections to the Execution Engine, it seems the L1<>L2 relationship must be driven by the node. Are there ways we can redrive this data without resetting the node's derivation?
Ok, in response to my own query from yesterday ("this seems destructive, is there no other way?"): the "other way" would be a Safety Index of some kind. Here is a document describing the need for a Safety Index in the
So, in the short term, resetting the derivation pipeline back to the last safe head known to the Supervisor should work. Nodes would then play forward and report the correct data.
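As a rough illustration of that short-term approach, the sketch below (Python, not op-node code) picks the reset target and lists which L1-to-L2 relations the node would report again as it plays forward. `DerivedPair`, `choose_reset_target`, and `pairs_to_replay` are hypothetical names invented for this example, and reporting one pair per traversed L1 block is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DerivedPair:
    """Hypothetical record: the latest L2 block derived as of a given L1 block."""
    l1_number: int
    l2_number: int


def choose_reset_target(node_safe_l2: int, supervisor_safe_l2: int) -> Optional[int]:
    """Return the L2 block number to reset derivation to, or None if no reset is needed."""
    if supervisor_safe_l2 >= node_safe_l2:
        # The supervisor is not behind the node, so there is nothing to replay.
        return None
    return supervisor_safe_l2


def pairs_to_replay(history: List[DerivedPair], supervisor_last: DerivedPair) -> List[DerivedPair]:
    """List the pairs the node would report again after the reset.

    L1 blocks that produced no new L2 block are included (the L2 number repeats),
    since the backfill comment above identifies those as the missing data.
    """
    return [p for p in history if p.l1_number > supervisor_last.l1_number]


# Toy history: L1 blocks 11 and 13 derive no new L2 block.
history = [DerivedPair(10, 100), DerivedPair(11, 100), DerivedPair(12, 101), DerivedPair(13, 101)]
supervisor_last = DerivedPair(11, 100)  # last relation the supervisor still has

target = choose_reset_target(node_safe_l2=101, supervisor_safe_l2=supervisor_last.l2_number)
replayed = pairs_to_replay(history, supervisor_last)
print(target, [(p.l1_number, p.l2_number) for p in replayed])
```

In this toy run the node resets to L2 block 100 and re-reports the relations for L1 blocks 12 and 13, including block 13, which carries no new L2 block.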
The above PR was merged, so closing this. We'll continue to track how the op-node and op-supervisor interact, so that sync works well.
When the op-supervisor is out of sync, due to e.g. a DB wipe, we need the op-node to reproduce the local-safe relations that the op-supervisor needs.
I.e. the op-node needs to resync derivation to help catch up the op-supervisor.
Alternatively/additionally we could make the op-supervisor start from a non-zero block. (We'll need to fix one edge case, where the parent block is missing for the first block entry, causing a panic.) Then the op-supervisor could assume everything is cross-safe / finalized starting at that point, and error if there's a user query for data before the sync anchor point.
The op-supervisor will still need to sync historical receipts data, to be able to verify cross-chain message dependencies of new blocks.
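The non-zero starting block idea could look roughly like the following sketch (Python for brevity, not the op-supervisor implementation). `AnchoredChain`, `BeforeAnchorError`, and the method names are hypothetical. The sketch takes one reading of the proposal: the anchor block is trusted rather than verified, the first entry is accepted without a stored parent, and queries before the sync anchor point are rejected.

```python
from typing import Dict, Optional


class BeforeAnchorError(LookupError):
    """Raised when a query targets a block before the sync anchor (hypothetical error type)."""


class AnchoredChain:
    """Toy block index that starts at a non-zero anchor block instead of genesis."""

    def __init__(self, anchor_number: int) -> None:
        self.anchor_number = anchor_number
        self._hashes: Dict[int, str] = {}

    def add_block(self, number: int, block_hash: str, parent_hash: str) -> None:
        if not self._hashes:
            # First entry: its parent is not stored, so skip the parent check
            # instead of failing on the missing parent.
            if number != self.anchor_number:
                raise ValueError("first block must be the anchor block")
        else:
            stored_parent: Optional[str] = self._hashes.get(number - 1)
            if stored_parent is None:
                raise ValueError(f"parent of block {number} is not indexed")
            if stored_parent != parent_hash:
                raise ValueError(f"parent hash mismatch at block {number}")
        self._hashes[number] = block_hash

    def is_assumed_safe(self, number: int) -> bool:
        """True only for the anchor block, which is trusted rather than verified."""
        self._check_range(number)
        return number == self.anchor_number

    def block_hash(self, number: int) -> str:
        self._check_range(number)
        return self._hashes[number]

    def _check_range(self, number: int) -> None:
        if number < self.anchor_number:
            raise BeforeAnchorError(
                f"block {number} is before sync anchor {self.anchor_number}")


chain = AnchoredChain(anchor_number=500)
chain.add_block(500, "0xaa", parent_hash="0x99")  # parent 499 is unknown, still accepted
chain.add_block(501, "0xbb", parent_hash="0xaa")  # normal parent check applies from here on
```

Blocks after the anchor still go through the parent check, and a lookup such as `chain.block_hash(499)` raises `BeforeAnchorError` instead of returning partial data.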