Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(security): fix concurrency issues in tree key formats, and CPU usage in genesis tree roots #7392

Merged
merged 31 commits into from
Oct 19, 2023
Merged
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
9e70437
Add tree key format and cached root upgrades
teor2345 Oct 11, 2023
8cac3df
Document the changes in the upgrades
teor2345 Oct 11, 2023
b0f7213
Remove unnecessary clippy::unwrap_in_result
teor2345 Oct 11, 2023
0f4ab54
Fix database type
teor2345 Oct 11, 2023
6e629d5
Bump state version
teor2345 Oct 11, 2023
84a1491
Skip some checks if the database is empty
teor2345 Oct 11, 2023
84a9847
Fix tests for a short state upgrade
teor2345 Oct 11, 2023
5c6267c
Disable format checks in some tests
teor2345 Oct 11, 2023
2b4b0be
Document state performance issues
teor2345 Oct 11, 2023
988ec96
Clarify upgrade behaviour
teor2345 Oct 11, 2023
7407e95
Clarify panic messages
teor2345 Oct 11, 2023
7f222bc
Delete incorrect genesis trees write code
teor2345 Oct 11, 2023
64a4296
Fix metrics handling for genesis
teor2345 Oct 11, 2023
0669f85
Remove an unused import
teor2345 Oct 11, 2023
432138d
Explain why genesis anchors are ok
teor2345 Oct 11, 2023
a15bdb5
Update snapshots
teor2345 Oct 11, 2023
82ddb73
Debug a failing test
teor2345 Oct 11, 2023
def9973
Fix some tests
teor2345 Oct 11, 2023
62b07d2
Fix missing imports
teor2345 Oct 11, 2023
5cdaead
Move the state check in a test
teor2345 Oct 12, 2023
af5f97d
Merge branch 'main' into fix-genesis-tree-root
teor2345 Oct 12, 2023
c87bfe9
Fix comment and doc typos
teor2345 Oct 17, 2023
dd3138b
Clarify what a long upgrade is
teor2345 Oct 17, 2023
8dd083b
Rename unused function arguments
teor2345 Oct 18, 2023
bbd929b
Merge branch 'main' into fix-genesis-tree-root
teor2345 Oct 18, 2023
9fb4a5b
Add all_unordered log regex matching methods
teor2345 Oct 18, 2023
852e558
Fix timing issues with version upgrades and other logs
teor2345 Oct 18, 2023
29eaa8b
Merge branch 'main' into fix-genesis-tree-root
teor2345 Oct 18, 2023
58924d6
Fix argument name in docs
teor2345 Oct 18, 2023
b5916ad
Explain match until first for all regexes behaviour better
teor2345 Oct 18, 2023
3ff1dd7
Merge branch 'main' into fix-genesis-tree-root
teor2345 Oct 18, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 25 additions & 2 deletions book/src/dev/state-db-upgrades.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,28 @@ This means that:

If there is an upgrade failure, it can panic and tell the user to delete their cached state and re-launch Zebra.

### Performance Constraints

Some column family access patterns can lead to very poor performance.

Known performance issues include:
- using an iterator on a column family which also deletes keys
- creating large numbers of iterators
- holding an iterator for a long time

See the performance notes and bug reports in:
- https://github.com/facebook/rocksdb/wiki/Iterator#iterating-upper-bound-and-lower-bound
- https://tracker.ceph.com/issues/55324
- https://jira.mariadb.org/browse/MDEV-19670

But we need to use iterators for some operations, so our alternatives are (in preferred order):
1. Minimise the number of keys we delete, and how often we delete them
2. Avoid using iterators on column families where we delete keys
3. If we must use iterators on those column families, set read bounds to minimise the amount of deleted data that is read

Currently only UTXOs require key deletion, and only `utxo_loc_by_transparent_addr_loc` requires
deletion and iterators.

### Implementation Steps

- [ ] update the [database format](https://github.com/ZcashFoundation/zebra/blob/main/book/src/dev/state-db-upgrades.md#current) in the Zebra docs
Expand Down Expand Up @@ -87,7 +109,7 @@ We use the following rocksdb column families:
| *Sprout* | | | |
| `sprout_nullifiers` | `sprout::Nullifier` | `()` | Create |
| `sprout_anchors` | `sprout::tree::Root` | `sprout::NoteCommitmentTree` | Create |
| `sprout_note_commitment_tree` | `block::Height` | `sprout::NoteCommitmentTree` | Delete |
| `sprout_note_commitment_tree` | `()` | `sprout::NoteCommitmentTree` | Update |
teor2345 marked this conversation as resolved.
Show resolved Hide resolved
| *Sapling* | | | |
| `sapling_nullifiers` | `sapling::Nullifier` | `()` | Create |
| `sapling_anchors` | `sapling::tree::Root` | `()` | Create |
Expand All @@ -99,7 +121,7 @@ We use the following rocksdb column families:
| `orchard_note_commitment_tree` | `block::Height` | `orchard::NoteCommitmentTree` | Create |
| `orchard_note_commitment_subtree` | `block::Height` | `NoteCommitmentSubtreeData` | Create |
| *Chain* | | | |
| `history_tree` | `block::Height` | `NonEmptyHistoryTree` | Delete |
| `history_tree` | `()` | `NonEmptyHistoryTree` | Update |
| `tip_chain_value_pool` | `()` | `ValueBalance` | Update |

Zcash structures are encoded using `ZcashSerialize`/`ZcashDeserialize`.
Expand Down Expand Up @@ -131,6 +153,7 @@ Amounts:

Derived Formats:
- `*::NoteCommitmentTree`: `bincode` using `serde`
- stored note commitment trees always have cached roots
- `NonEmptyHistoryTree`: `bincode` using `serde`, using `zcash_history`'s `serde` implementation


Expand Down
4 changes: 2 additions & 2 deletions zebra-state/src/constants.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,11 +49,11 @@ pub(crate) const DATABASE_FORMAT_VERSION: u64 = 25;
/// - adding new column families,
/// - changing the format of a column family in a compatible way, or
/// - breaking changes with compatibility code in all supported Zebra versions.
pub(crate) const DATABASE_FORMAT_MINOR_VERSION: u64 = 2;
pub(crate) const DATABASE_FORMAT_MINOR_VERSION: u64 = 3;

/// The database format patch version, incremented each time the on-disk database format has a
/// significant format compatibility fix.
pub(crate) const DATABASE_FORMAT_PATCH_VERSION: u64 = 2;
pub(crate) const DATABASE_FORMAT_PATCH_VERSION: u64 = 0;

/// Returns the highest database version that modifies the subtree index format.
///
Expand Down
2 changes: 1 addition & 1 deletion zebra-state/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ pub use response::GetBlockTemplateChainInfo;
pub use service::{
arbitrary::{populated_state, CHAIN_TIP_UPDATE_WAIT_LIMIT},
chain_tip::{ChainTipBlock, ChainTipSender},
finalized_state::MAX_ON_DISK_HEIGHT,
finalized_state::{DiskWriteBatch, MAX_ON_DISK_HEIGHT},
init_test, init_test_services, ReadStateService,
};

Expand Down
48 changes: 37 additions & 11 deletions zebra-state/src/service/check/tests/anchors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,9 @@ use zebra_chain::{
amount::Amount,
block::{Block, Height},
primitives::Groth16Proof,
sapling,
serialization::ZcashDeserializeInto,
sprout::JoinSplit,
sprout::{self, JoinSplit},
transaction::{JoinSplitData, LockTime, Transaction, UnminedTx},
};

Expand All @@ -18,7 +19,7 @@ use crate::{
write::validate_and_commit_non_finalized,
},
tests::setup::{new_state_with_mainnet_genesis, transaction_v4_from_coinbase},
SemanticallyVerifiedBlock, ValidateContextError,
DiskWriteBatch, SemanticallyVerifiedBlock, ValidateContextError,
};

// Sprout
Expand All @@ -31,20 +32,30 @@ fn check_sprout_anchors() {

let (finalized_state, mut non_finalized_state, _genesis) = new_state_with_mainnet_genesis();

// Bootstrap a block at height == 1.
// Delete the empty anchor from the database
let mut batch = DiskWriteBatch::new();
batch.delete_sprout_anchor(
&finalized_state,
&sprout::tree::NoteCommitmentTree::default().root(),
);
finalized_state
.write_batch(batch)
.expect("unexpected I/O error");

// Create a block at height 1.
let block_1 = zebra_test::vectors::BLOCK_MAINNET_1_BYTES
.zcash_deserialize_into::<Block>()
.expect("block should deserialize");

// Bootstrap a block just before the first Sprout anchors.
// Create a block just before the first Sprout anchors.
let block_395 = zebra_test::vectors::BLOCK_MAINNET_395_BYTES
.zcash_deserialize_into::<Block>()
.expect("block should deserialize");

// Add initial transactions to [`block_1`].
let block_1 = prepare_sprout_block(block_1, block_395);

// Bootstrap a block at height == 2 that references the Sprout note commitment tree state
// Create a block at height == 2 that references the Sprout note commitment tree state
// from [`block_1`].
let block_2 = zebra_test::vectors::BLOCK_MAINNET_2_BYTES
.zcash_deserialize_into::<Block>()
Expand Down Expand Up @@ -74,10 +85,13 @@ fn check_sprout_anchors() {
)
});

assert!(matches!(
check_unmined_tx_anchors_result,
Err(ValidateContextError::UnknownSproutAnchor { .. })
));
assert!(
matches!(
check_unmined_tx_anchors_result,
Err(ValidateContextError::UnknownSproutAnchor { .. }),
),
"unexpected result: {check_unmined_tx_anchors_result:?}",
);

// Validate and commit [`block_1`]. This will add an anchor referencing the
// empty note commitment tree to the state.
Expand Down Expand Up @@ -182,7 +196,17 @@ fn check_sapling_anchors() {

let (finalized_state, mut non_finalized_state, _genesis) = new_state_with_mainnet_genesis();

// Bootstrap a block at height == 1 that has the first Sapling note commitments
// Delete the empty anchor from the database
let mut batch = DiskWriteBatch::new();
batch.delete_sapling_anchor(
&finalized_state,
&sapling::tree::NoteCommitmentTree::default().root(),
);
finalized_state
.write_batch(batch)
.expect("unexpected I/O error");

// Create a block at height 1 that has the first Sapling note commitments
let mut block1 = zebra_test::vectors::BLOCK_MAINNET_1_BYTES
.zcash_deserialize_into::<Block>()
.expect("block should deserialize");
Expand Down Expand Up @@ -227,7 +251,7 @@ fn check_sapling_anchors() {

let block1 = Arc::new(block1).prepare();

// Bootstrap a block at height == 2 that references the Sapling note commitment tree state
// Create a block at height == 2 that references the Sapling note commitment tree state
// from earlier block
let mut block2 = zebra_test::vectors::BLOCK_MAINNET_2_BYTES
.zcash_deserialize_into::<Block>()
Expand Down Expand Up @@ -315,3 +339,5 @@ fn check_sapling_anchors() {
Ok(())
);
}

// TODO: create a test for orchard anchors
28 changes: 25 additions & 3 deletions zebra-state/src/service/finalized_state.rs
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,10 @@ pub use disk_format::{OutputIndex, OutputLocation, TransactionLocation, MAX_ON_D

pub(super) use zebra_db::ZebraDb;

pub(super) use disk_db::DiskWriteBatch;
#[cfg(any(test, feature = "proptest-impl"))]
pub use disk_db::DiskWriteBatch;
#[cfg(not(any(test, feature = "proptest-impl")))]
use disk_db::DiskWriteBatch;

/// The finalized part of the chain state, stored in the db.
///
Expand Down Expand Up @@ -88,14 +91,33 @@ pub struct FinalizedState {
}

impl FinalizedState {
/// Returns an on-disk database instance for `config` and `network`.
/// Returns an on-disk database instance for `config`, `network`, and `elastic_db`.
/// If there is no existing database, creates a new database on disk.
pub fn new(
config: &Config,
network: Network,
#[cfg(feature = "elasticsearch")] elastic_db: Option<elasticsearch::Elasticsearch>,
) -> Self {
let db = ZebraDb::new(config, network, false);
Self::new_with_debug(
config,
network,
false,
#[cfg(feature = "elasticsearch")]
elastic_db,
)
}

/// Returns an on-disk database instance with the supplied production and debug settings.
/// If there is no existing database, creates a new database on disk.
///
/// This method is intended for use in tests.
pub(crate) fn new_with_debug(
config: &Config,
network: Network,
debug_skip_format_upgrades: bool,
#[cfg(feature = "elasticsearch")] elastic_db: Option<elasticsearch::Elasticsearch>,
) -> Self {
let db = ZebraDb::new(config, network, debug_skip_format_upgrades);

#[cfg(feature = "elasticsearch")]
let new_state = Self {
Expand Down
2 changes: 1 addition & 1 deletion zebra-state/src/service/finalized_state/disk_db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ pub struct DiskDb {
// and make them accessible via read-only methods
#[must_use = "batches must be written to the database"]
#[derive(Default)]
pub(crate) struct DiskWriteBatch {
pub struct DiskWriteBatch {
/// The inner RocksDB write batch.
batch: rocksdb::WriteBatch,
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,10 @@ expression: empty_column_families
[
"balance_by_transparent_addr: no entries",
"history_tree: no entries",
"orchard_anchors: no entries",
"orchard_note_commitment_subtree: no entries",
"orchard_nullifiers: no entries",
"sapling_anchors: no entries",
"sapling_note_commitment_subtree: no entries",
"sapling_nullifiers: no entries",
"sprout_anchors: no entries",
"sprout_nullifiers: no entries",
"tip_chain_value_pool: no entries",
"tx_loc_by_transparent_addr_loc: no entries",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,10 @@ expression: empty_column_families
[
"balance_by_transparent_addr: no entries",
"history_tree: no entries",
"orchard_anchors: no entries",
"orchard_note_commitment_subtree: no entries",
"orchard_nullifiers: no entries",
"sapling_anchors: no entries",
"sapling_note_commitment_subtree: no entries",
"sapling_nullifiers: no entries",
"sprout_anchors: no entries",
"sprout_nullifiers: no entries",
"tip_chain_value_pool: no entries",
"tx_loc_by_transparent_addr_loc: no entries",
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
---
source: zebra-state/src/service/finalized_state/disk_format/tests/snapshot.rs
expression: cf_data
---
[
KV(
k: "ae2935f1dfd8a24aed7c70df7de3a668eb7a49b1319880dde2bbd9031ae5d82f",
v: "",
),
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
---
source: zebra-state/src/service/finalized_state/disk_format/tests/snapshot.rs
expression: cf_data
---
[
KV(
k: "ae2935f1dfd8a24aed7c70df7de3a668eb7a49b1319880dde2bbd9031ae5d82f",
v: "",
),
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
---
source: zebra-state/src/service/finalized_state/disk_format/tests/snapshot.rs
expression: cf_data
---
[
KV(
k: "fbc2f4300c01f0b7820d00e3347c8da4ee614674376cbc45359daa54f9b5493e",
v: "",
),
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
---
source: zebra-state/src/service/finalized_state/disk_format/tests/snapshot.rs
expression: cf_data
---
[
KV(
k: "fbc2f4300c01f0b7820d00e3347c8da4ee614674376cbc45359daa54f9b5493e",
v: "",
),
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
---
source: zebra-state/src/service/finalized_state/disk_format/tests/snapshot.rs
expression: cf_data
---
[
KV(
k: "d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
---
source: zebra-state/src/service/finalized_state/disk_format/tests/snapshot.rs
expression: cf_data
---
[
KV(
k: "d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ expression: cf_data
---
[
KV(
k: "000000",
k: "",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ expression: cf_data
---
[
KV(
k: "000001",
k: "",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ expression: cf_data
---
[
KV(
k: "000002",
k: "",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ expression: cf_data
---
[
KV(
k: "000000",
k: "",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ expression: cf_data
---
[
KV(
k: "000001",
k: "",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ expression: cf_data
---
[
KV(
k: "000002",
k: "",
v: "0001d7c612c817793191a1e68652121876d6b3bde40f4fa52bc314145ce6e5cdd259",
),
]
Loading
Loading