Replace IAVL+ with SMT #6

liamsi · 2021-01-04T19:13:26Z

Use https://github.com/lazyledger/smt/ instead of https://github.com/cosmos/iavl which is the default state tree in the cosmos sdk.

Ideally, this would be done in a way that makes upstreaming this to the Cosmos-SDK easy - either as an additional optional store, or, as default that replaces IAVL. The Cosmos-SDK team seems open to the idea and might want to collaborate with them on this.

ref: cosmos#7100

adlerjohn · 2021-01-06T16:40:10Z

There are two concerns with using an SMT over the IAVL+ tree:

Pruning of old state. This happens naturally with the SMT, so there's no work that needs to be done here.
Keeping old state. Necessary for archive nodes (ones that hold on to enough information to easily serve the state at any block height). One approach requires keeping track of all previous state deltas (i.e. changes with proofs from the state at one block height to the next). Coincidentally, these state deltas are used as part of the current immediate execution proposal.

musalbas · 2021-01-07T10:26:38Z

Keeping old state. Necessary for archive nodes (ones that hold on to enough information to easily serve the state at any block height). One approach requires keeping track of all previous state deltas (i.e. changes with proofs from the state at one block height to the next). Coincidentally, these state deltas are used as part of the current immediate execution proposal.

Summarising discussion from Slack: in other words, this could involve keeping an older version of the tree on disk, and storing new nodes in memory/cache. The new nodes are then persisted to disk, and all obsolete nodes pruned. Once this persisting happens, the older version of the state is no longer available on disk (and would have to be re-computed from the chain to be recovered).

liamsi · 2021-01-07T11:06:26Z

the older version of the state is no longer available on disk

If you do not keep around at least some older versions (e.g. every X heights), you would need start from genesis to recompute any state that isn't the most recent.

Also, @ethanfrey mentioned (on discord), that at least some past states might need to be efficiently queriable:

Support for persistent historical queries. At least allowing the last 10 or so blocks to be queried. I am pretty sure that this is required by ibc. In any case removing all historical queries is a major feature reduction that should be discussed carefully.

musalbas · 2021-01-07T11:19:18Z

What's the use case for historical queries of old state?

liamsi · 2021-01-07T11:47:01Z

@cwgoes @jackzampolin can you help us understand the requirements regarding querying past state in the context of IBC? Ethan Frey mentioned on discord that you (Jack) ran into some issues regarding past queries in the relayer?

ethanfrey · 2021-01-07T12:01:29Z

I took a brief look at https://github.com/lazyledger/smt/blob/master/go.mod and you don't seem to import any databases? What do you use to write to disk? (There is a whole lot of work in dbs to handle atomicity, crash recovery, etc that is not in the go standard library).

Update: I see you accept a generic MapStore to handle disk access. This doesn't allow batch atomic writes or any more advanced db functionality, and will need to be expanded to provide proper sdk support

ethanfrey · 2021-01-07T12:07:52Z

Here is my full discord comments related to committing to disk, historical versions:

Support for persistent historical queries. At least allowing the last 10 or so blocks to be queried. I am pretty sure that this is required by ibc. In any case removing all historical queries is a major feature reduction that should be discussed carefully.

All data atomically committed to disk upon abci commit message. There have been issues in the past of some substores with different heights when a node was shut down during commit phase. Also, not writing every block to disk was done in 0.38.0 and caused major headaches on restarts and any attempts to upgrade. Commit means commit.

I also think this historical query issue could be handled much better by low level dB functionality, like rockdb snapshots, which use os level CoW tricks to be highly efficient, rather than building this manually on top of then merklized store. This could remove all pruning logic and maintain historical queries, but only if the underlying kv store supports snapshots (or we try lvm snapshotting tricks, but that is too os dependent I think)

ethanfrey · 2021-01-07T12:07:56Z

I add this in particular to refute a suggestion above:

Summarising discussion from Slack: in other words, this could involve keeping an older version of the tree on disk, and storing new nodes in memory/cache. The new nodes are then persisted to disk, and all obsolete nodes pruned. Once this persisting happens, the older version of the state is no longer available on disk (and would have to be re-computed from the chain to be recovered).

This was tried in some iavl optimizations in 0.38.0 (flush to disk every 10 or 100 blocks) and caused huge problems. One poor implementation would never restart properly. Even after that was solved, it remained near impossible to properly upgrade nodes. For that we need everyone to shut down their nodes v1 at height H and either: dump full state at H to new genesis file, or start node v2 with upgrade handler to process block H+1. Neither of those are feasible if state at height H was not properly synced.

ethanfrey · 2021-01-07T12:11:02Z

We should be able to atomically commit each block. All in-progress work on txs inside a block is strictly in memory and only flushed to disk on abci Commit command, which must be safe. Currently, when substores use different backing dbs there is no way to guarantee atomicity (one substore may be updated, other not, the mutlistore table of "latest version" may be off). Bez had some ideas how to improve that, but for now, we should just plan one backing db for all state, with atomic commits.

As I said above, I think using db or os level solutions for serving older snapshots could be much easier and more efficient than the current iavl solution.

musalbas · 2021-01-07T12:16:47Z

This was tried in some iavl optimizations in 0.38.0 (flush to disk every 10 or 100 blocks) and caused huge problems.

The "cache" could be on-disk too, I was just considering it from an algorithms perspective, in terms of how to implement pruning.

Support for persistent historical queries. At least allowing the last 10 or so blocks to be queried. I am pretty sure that this is required by ibc. In any case removing all historical queries is a major feature reduction that should be discussed carefully.

What's the use case for historical queries of old state? Why does the SDK require this?

By "allowing the last 10 or so blocks to be queried" do you mean querying the data within those blocks, or the state of the key/value store historically?

ethanfrey · 2021-01-07T12:30:05Z

By "allowing the last 10 or so blocks to be queried" do you mean querying the data within those blocks, or the state of the key/value store historically?

Allowing the state of the key/value store to be historically queried. All block data is generally available back to genesis and handled by tendermint (not the sdk app)

I cannot answer why this is important, just that is was considered important and does exist (with considerable work to make it work properly) and cannot be tossed without a deep discussion with the sdk team (or prior sdk team that build this)

musalbas · 2021-01-07T12:33:38Z

cc @zmanian, any insight on the above? I'm not sure why the Cosmos SDK apps or IBC would need to be able to query recently stale state from the kv store. Even standard Ethereum nodes do not do this AFAIK. If apps need any "historical" data, it should use event logging.

alexanderbez · 2021-01-07T14:30:07Z

Ethereum and the SDK are drastically different (SDK doesn't have event logging). Historical queries are a critical part of an SDK-based app. Historical queries are used for many things (e.g. fetching old block and tx data, viewing account balances and rewards over time, performing analysis on distribution related data, etc...). Almost all clients (e.g. explorers, wallets, relayers, etc...) rely on historical queries in some capacity.

To summarize, we need the following from a logical store:

basic CRUD operations
historical queries with proofs (i.e. give me X at height/version H)
batching/atomic writes
(reverse) prefix iteration
inclusion proofs
exclusion proofs? (I'm not totally sure about this one)
range proofs

musalbas · 2021-01-07T14:39:46Z

Ethereum and the SDK are drastically different (SDK doesn't have event logging). Historical queries are a critical part of an SDK-based app. Historical queries are used for many things (e.g. fetching old block and tx data, viewing account balances and rewards over time, performing analysis on distribution related data, etc...). Almost all clients (e.g. explorers, wallets, relayers, etc...) rely on historical queries in some capacity.

So I think there's two things to consider (to be clear):

Cosmos SDK apps that need to access historical key->value data within the app's state machine itself
Clients that need historical key->value data, to query old app data, e.g. block explorers

It's clear why (2) is necessary, and this is what would be achieved by e.g. an Ethereum archive node or a Bitcoin full node with indexing enabled. But is (1) the case? That would be similar to, for example, Ethereum requiring you to run an archive node to be able to validate transactions, which would be very unpractical.

Support for persistent historical queries is definitely doable (and is currently already the case for the current SMT implementation, as it doesn't support pruning at all currently, so you can simply query via old roots) - but I'm wondering why this is a requirement for e.g. IBC, as ideally this is something that only clients observing the chain for various purposes (e.g. explorers) should do.

alexanderbez · 2021-01-07T14:49:29Z

No, (1) is not a valid use-case. Modules do not use historical data.

musalbas · 2021-01-07T14:57:19Z

I see, that makes things a lot clearer, thanks.

If it's infrastructure projects that need to query historical state, rather than the Cosmos SDK app's state machine itself, then maybe it would be feasible to run an instance of the Cosmos SDK app in "archival mode" to e.g. instruct the SMT to keep the historical data (basically, disable pruning). Will need to look more into how this currently works in the SDK/IAVL...

alexanderbez · 2021-01-07T15:10:11Z

maybe it would be feasible to run an instance of the Cosmos SDK app in "archival mode" to e.g. instruct the SMT to keep the historical data (basically, disable pruning).

Yes, as long as this is possible, then great! It would need to be configurable (e.g. disable/enable pruning). Just for reference, the way this currently works in the SDK is that we call an API that loads a versioned tree in memory during a query and if that tree doesn't exist, we gracefully return an error.

ethanfrey · 2021-01-07T19:08:07Z

Nice summary @alexanderbez and thanks for chiming in. Just two points for the proofs:

exclusion proofs? (I'm not totally sure about this one)

range proofs

Exclusion proofs are critical to IBC to allow timeouts (this packet was provably not received by height H).

Range proofs are not used nor required as far as I know. Even the IBC spec only refers to batch proofs, allowing us to prove eg. 20 items at once and compress a lot of the redundancy in the proofs, so it may be the size of maybe 3 or 4 single proofs.

In ICS23, I implemented inclusion, exclusion, and batch proofs. Batch proofs may also be provably range proofs if all keys are stored in order, but also work in many other circumstances (or for non-adjacent items)

alexanderbez · 2021-01-07T19:39:05Z

It has come to my attention that multiple stakeholders are in need of replacing the current usage of IAVL with a SMT or some equivalent (e.g. LL, Foam, Nomic, and Regen).

It is my suggestion that we all team up and try to flesh out an ADR on a minimal viable logical datastore interface that multiple projects/stakeholders can implement s.t. we can easily plug-n-play different implementations into the SDK's multi-store layer.

cwgoes · 2021-01-11T19:19:08Z

Nice summary @alexanderbez and thanks for chiming in. Just two points for the proofs:

exclusion proofs? (I'm not totally sure about this one)

range proofs

Exclusion proofs are critical to IBC to allow timeouts (this packet was provably not received by height H).

Range proofs are not used nor required as far as I know. Even the IBC spec only refers to batch proofs, allowing us to prove eg. 20 items at once and compress a lot of the redundancy in the proofs, so it may be the size of maybe 3 or 4 single proofs.

In ICS23, I implemented inclusion, exclusion, and batch proofs. Batch proofs may also be provably range proofs if all keys are stored in order, but also work in many other circumstances (or for non-adjacent items)

Only inclusion and exclusion proofs are strictly required for the current incarnation of IBC. Batching might be nice in the future.

hxrts · 2021-01-12T09:33:50Z

Continuing discussion here for anyone who may have missed the comment on cosmos#7100

liamsi mentioned this issue Jan 4, 2021

Keep track of and return intermediate state roots to be included in the block #8

Closed

liamsi added the C:ORU This issue will be necessary for both ORUs and the LL-abci app label Jan 4, 2021

tzdybal self-assigned this Jan 6, 2021

hxrts mentioned this issue Jan 8, 2021

What to do about IAVL? cosmos/cosmos-sdk#7100

Closed

evan-forbes mentioned this issue Jan 16, 2021

Look into fixing duplicate protobuf type registry celestiaorg/celestia-core#145

Closed

adlerjohn mentioned this issue Sep 6, 2021

Fraud provable staking #30

Open

5 tasks

liamsi added this to Celestia Node Dec 5, 2021

liamsi moved this to TODO in Celestia Node Dec 5, 2021

evan-forbes added the ice-box label May 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace IAVL+ with SMT #6

Replace IAVL+ with SMT #6

liamsi commented Jan 4, 2021 •

edited

Loading

adlerjohn commented Jan 6, 2021 •

edited

Loading

musalbas commented Jan 7, 2021 •

edited

Loading

liamsi commented Jan 7, 2021

musalbas commented Jan 7, 2021

liamsi commented Jan 7, 2021 •

edited

Loading

ethanfrey commented Jan 7, 2021 •

edited

Loading

ethanfrey commented Jan 7, 2021

ethanfrey commented Jan 7, 2021

ethanfrey commented Jan 7, 2021

musalbas commented Jan 7, 2021 •

edited

Loading

ethanfrey commented Jan 7, 2021

musalbas commented Jan 7, 2021 •

edited

Loading

alexanderbez commented Jan 7, 2021 •

edited

Loading

musalbas commented Jan 7, 2021 •

edited

Loading

alexanderbez commented Jan 7, 2021 •

edited

Loading

musalbas commented Jan 7, 2021

alexanderbez commented Jan 7, 2021

ethanfrey commented Jan 7, 2021

alexanderbez commented Jan 7, 2021 •

edited

Loading

cwgoes commented Jan 11, 2021

hxrts commented Jan 12, 2021

Replace IAVL+ with SMT #6

Replace IAVL+ with SMT #6

Comments

liamsi commented Jan 4, 2021 • edited Loading

adlerjohn commented Jan 6, 2021 • edited Loading

musalbas commented Jan 7, 2021 • edited Loading

liamsi commented Jan 7, 2021

musalbas commented Jan 7, 2021

liamsi commented Jan 7, 2021 • edited Loading

ethanfrey commented Jan 7, 2021 • edited Loading

ethanfrey commented Jan 7, 2021

ethanfrey commented Jan 7, 2021

ethanfrey commented Jan 7, 2021

musalbas commented Jan 7, 2021 • edited Loading

ethanfrey commented Jan 7, 2021

musalbas commented Jan 7, 2021 • edited Loading

alexanderbez commented Jan 7, 2021 • edited Loading

musalbas commented Jan 7, 2021 • edited Loading

alexanderbez commented Jan 7, 2021 • edited Loading

musalbas commented Jan 7, 2021

alexanderbez commented Jan 7, 2021

ethanfrey commented Jan 7, 2021

alexanderbez commented Jan 7, 2021 • edited Loading

cwgoes commented Jan 11, 2021

hxrts commented Jan 12, 2021

liamsi commented Jan 4, 2021 •

edited

Loading

adlerjohn commented Jan 6, 2021 •

edited

Loading

musalbas commented Jan 7, 2021 •

edited

Loading

liamsi commented Jan 7, 2021 •

edited

Loading

ethanfrey commented Jan 7, 2021 •

edited

Loading

musalbas commented Jan 7, 2021 •

edited

Loading

musalbas commented Jan 7, 2021 •

edited

Loading

alexanderbez commented Jan 7, 2021 •

edited

Loading

musalbas commented Jan 7, 2021 •

edited

Loading

alexanderbez commented Jan 7, 2021 •

edited

Loading

alexanderbez commented Jan 7, 2021 •

edited

Loading