UTxO-HD targeting main #1267

Open
wants to merge 51 commits into
base: main
51 commits
fbf2bd8
Update to an unreleased version of ledger with mempack usage
lehins Sep 17, 2024
127a68a
UTXO-HD
jasagredo Sep 19, 2024
c6fbbc6
Code review changes
jasagredo Nov 11, 2024
711bff0
Remove `HandleRegistry` from the `BackingStore` lockstep tests
jorisdral Nov 20, 2024
876900f
Resolve PR comments for `BackingStore` lockstep tests
jorisdral Nov 20, 2024
b55d9ff
Don't use `ltcollapse` like it is a fold
jorisdral Nov 21, 2024
508392a
Update DiffSeq haddocks
jorisdral Nov 25, 2024
62abd95
Rework SOP code on HardForkCombinator
jasagredo Nov 28, 2024
5094b5b
Code-review changes
jasagredo Nov 26, 2024
69b80a8
Reorganize LedgerDB
jasagredo Nov 29, 2024
055bb3b
Code-review changes
jasagredo Dec 2, 2024
21bd266
Formatting
jasagredo Dec 9, 2024
bfcc953
consensus: simplify some UTxO HD SOP code
nfrisby Dec 9, 2024
74bbbd9
consensus: abstract some query logic over UTxO HD footprints
nfrisby Dec 9, 2024
2b230a0
Fix rebase
jasagredo Dec 11, 2024
c2d6ecd
Some leftovers
jasagredo Dec 12, 2024
68ba21e
Some more leftovers
jasagredo Dec 12, 2024
6e3c614
Fix rebase
jasagredo Dec 13, 2024
56ad516
Further implement snapshot CRC features
jasagredo Dec 16, 2024
992b94e
Use mempack in ouroboros-consensus
jasagredo Dec 20, 2024
9ffa290
Use mempack in ouroboros-consensus-cardano
jasagredo Dec 20, 2024
6b6a7f0
Use mempack in tests
jasagredo Dec 20, 2024
49eeaff
Fix some snapshots minor issues
jasagredo Dec 30, 2024
32f6e5b
Update index-states and flakes
jasagredo Dec 31, 2024
7bfc107
Docs and changelogs
jasagredo Jan 9, 2025
fef4eb9
Make GHC 8.10.7 happy in CI
jasagredo Jan 9, 2025
54ac1b2
Fix linting and formatting in CI
jasagredo Jan 10, 2025
5525700
Translate ledger tables on pushDiffs
jasagredo Jan 17, 2025
73e493a
Provide CanUpgradeLedgerTables instances for tests
jasagredo Jan 17, 2025
d4e6a85
Update golden files
jasagredo Jan 23, 2025
8fef4b7
Upgrade tables also on V1 InMemory
jasagredo Jan 21, 2025
74d9a4c
Adapt tests to V1 InMemory also upgrading tables
jasagredo Jan 21, 2025
67b6855
Some suggested changes:
jorisdral Jan 22, 2025
d05dd6a
Implement IndexedMemPack and use it in V1 OnDisk
jasagredo Jan 21, 2025
fde3c5d
Adapt tests to use IndexedMemPack
jasagredo Jan 21, 2025
78454a3
Fix leftover cabal file formatting
jasagredo Jan 22, 2025
bfaf8dc
Some suggested changes
jorisdral Jan 22, 2025
53a77c8
Some cleanup and code-review comments
jasagredo Jan 27, 2025
275112b
Small changes
jorisdral Jan 27, 2025
bea6c97
deriving via Void instance IndexedMempack
jorisdral Jan 27, 2025
9409c69
MemPackIdx: fully integrate IndexedMemPack into ledger table combinators
jorisdral Jan 27, 2025
37c08aa
Re-introduce -Wno-orphans
jorisdral Jan 27, 2025
f7359ad
Remove redundant LANGUAGE pragmas
jasagredo Jan 28, 2025
f4a52de
Take snapshots as they fit
jasagredo Jan 31, 2025
5302b89
db-analyser: support V2 LedgerDB
amesgen Jan 11, 2025
dfda138
Promote cardano translations to a CAF
jasagredo Jan 31, 2025
069eb1f
Do not compute diffs in mempool snapshot
jasagredo Feb 6, 2025
c020514
Remove ShelleyTxIn
jasagredo Feb 6, 2025
e2ce936
Fixup tests, format code
jasagredo Feb 7, 2025
b0f07c7
Don't accumulate thunks in deserialization of snapshots
jasagredo Feb 7, 2025
d5232aa
Move a couple of mempool operations to work on ledgerstates instead
jasagredo Feb 7, 2025
5 changes: 5 additions & 0 deletions .github/workflows/ci.yml
@@ -76,6 +76,11 @@ jobs:
cabal clean
cabal update

- name: Install lmdb
run: |
sudo apt update
sudo apt install liblmdb-dev

# We create a `dependencies.txt` file that can be used to index the cabal
# store cache.
#
2 changes: 2 additions & 0 deletions .gitignore
@@ -5,6 +5,8 @@
/docs/website/build/
/ouroboros-consensus/docs/haddocks/

haddocks/

# GHC
.ghcid
.ghc.environment.*
35 changes: 35 additions & 0 deletions CONTRIBUTING.md
@@ -132,6 +132,41 @@ cabal test ouroboros-consensus:test:consensus-test --test-show-details=direct
Note the second one cannot be used when we want to provide CLI arguments to the
test-suite.

# Generating documentation and setting up hoogle

The documentation contains some [tikz](https://tikz.net) figures that require
some preprocessing for them to be displayed. To do this, use the documentation
script:

```bash
./scripts/docs/haddocks.sh
```

Unless [`cabal-docspec`](https://github.com/phadej/cabal-extras/tree/master/cabal-docspec)
is already in your `PATH` (e.g. when in a Nix shell), the script first installs
it from a binary. It then builds the haddocks for the project.
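
To see in advance whether anything will need to be installed, you can check
which tools are already on your `PATH`. The following is only a minimal sketch
and is not taken from `haddocks.sh`: the helper name `have_tool` is
hypothetical, and the check relies solely on the POSIX `command -v` builtin.

```bash
# Hypothetical helper (not part of the repository's scripts):
# succeeds if the named program can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the documentation tools are already available.
for tool in cabal cabal-docspec; do
  if have_tool "$tool"; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not on PATH"
  fi
done
```

If `cabal-docspec` is reported as missing, the documentation script is expected
to take care of installing it, as described above.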

It is often useful to have a
[`hoogle`](https://github.com/ndmitchell/hoogle) server at hand that indexes the
project's packages and their dependencies. We suggest installing
[`cabal-hoogle`](https://github.com/kokobd/cabal-hoogle) from GitHub:

```bash
git clone git@github.com:kokobd/cabal-hoogle
cd cabal-hoogle
cabal install exe:cabal-hoogle
```

and then run `cabal-hoogle`:

```bash
cabal-hoogle generate
cabal-hoogle run -- server --local
```

This will start a `hoogle` server at http://localhost:8080/ with the local
packages and their dependencies.
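
The server may take a moment to come up after `cabal-hoogle run` is launched.
A small polling helper such as the sketch below can tell you when it is
reachable. It assumes that `curl` is installed and that the server answers over
plain HTTP on port 8080; the function name `wait_for_server` is hypothetical
and is not part of `hoogle` or `cabal-hoogle`.

```bash
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_server URL [ATTEMPTS]   (ATTEMPTS defaults to 30)
wait_for_server() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failures,
    # -o /dev/null: discard the body, --max-time: bound each request.
    if curl -sf -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example (run in a second terminal while the server is starting):
#   wait_for_server http://localhost:8080/ && echo "hoogle is up"
```

The function returns a non-zero status if the server never responds, so it can
also be used in scripts that should fail when the server is not running.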

# Contributing to the code

The following sections contain some guidelines that should be followed when
48 changes: 46 additions & 2 deletions cabal.project
@@ -14,7 +14,7 @@ repository cardano-haskell-packages
-- update either of these.
index-state:
-- Bump this if you need newer packages from Hackage
, hackage.haskell.org 2024-12-10T16:20:07Z
, hackage.haskell.org 2025-01-14T03:16:27Z
-- Bump this if you need newer packages from CHaP
, cardano-haskell-packages 2025-01-04T13:50:25Z

@@ -46,4 +46,48 @@ if(os(windows))
bitvec -simd

-- https://github.com/ulidtko/cabal-doctest/issues/85
constraints: Cabal < 3.13
constraints:
quickcheck-lockstep <0.6.0

-- mempack support
source-repository-package
type: git
location: https://github.com/IntersectMBO/cardano-base.git
tag: fb9b71f3bc33f8de673c6427736f09bf7972e81f
--sha256: sha256-ExQ497FDYlmQyZaXOTddU+KraAUHnTAqPiyt055v0+M=
subdir:
cardano-crypto-class

-- mempack support
source-repository-package
type: git
location: https://github.com/IntersectMBO/cardano-ledger
tag: 5e9799940b05af8b04812bc828b50a4848e17c93
--sha256: sha256-G0pz2z1hvg32OBnrrhEgAomAs3liPrOsaslpSGR9nWM=
subdir:
eras/allegra/impl
eras/alonzo/impl
eras/alonzo/test-suite
eras/babbage/impl
eras/babbage/test-suite
eras/conway/impl
eras/conway/test-suite
eras/mary/impl
eras/shelley/impl
eras/shelley/test-suite
eras/shelley-ma/test-suite
libs/cardano-ledger-api
libs/cardano-ledger-core
libs/cardano-ledger-binary
libs/cardano-protocol-tpraos
libs/non-integral
libs/small-steps
libs/cardano-data
libs/set-algebra
libs/vector-map
eras/byron/chain/executable-spec
eras/byron/ledger/executable-spec
eras/byron/ledger/impl
eras/byron/ledger/impl/test
eras/byron/crypto
eras/byron/crypto/test
92 changes: 1 addition & 91 deletions docs/tech-reports/report/chapters/storage/ledgerdb.tex
@@ -1,98 +1,8 @@
\chapter{Ledger Database}
\label{ledgerdb}

The Ledger DB is responsible for the following tasks:

\begin{enumerate}
\item \textbf{Maintaining the ledger state at the tip}: Maintaining the ledger
state corresponding to the current tip in memory. When we try to extend our
chain with a new block fitting onto our tip, the block must first be validated
using the right ledger state, i.e., the ledger state corresponding to the tip.
The current ledger state is needed for various other purposes.

\item \textbf{Maintaining the past $k$ ledger states}: As discussed in
\cref{consensus:overview:k}, we might roll back up to $k$ blocks when
switching to a more preferable fork. Consider the example below:
%
\begin{center}
\begin{tikzpicture}
\draw (0, 0) -- (50pt, 0) coordinate (I);
\draw (I) -- ++(20pt, 20pt) coordinate (C1) -- ++(20pt, 0) coordinate (C2);
\draw (I) -- ++(20pt, -20pt) coordinate (F1) -- ++(20pt, 0) coordinate (F2) -- ++(20pt, 0) coordinate (F3);
\node at (I) {$\bullet$};
\node at (C1) {$\bullet$};
\node at (C2) {$\bullet$};
\node at (F1) {$\bullet$};
\node at (F2) {$\bullet$};
\node at (F3) {$\bullet$};
\node at (I) [above left] {$I$};
\node at (C1) [above] {$C_1$};
\node at (C2) [above] {$C_2$};
\node at (F1) [below] {$F_1$};
\node at (F2) [below] {$F_2$};
\node at (F3) [below] {$F_3$};
\draw (60pt, 50pt) node {$\overbrace{\hspace{60pt}}$};
\draw (60pt, 60pt) node[fill=white] {$k$};
\draw [dashed] (30pt, -40pt) -- (30pt, 45pt);
\end{tikzpicture}
\end{center}
%
Our current chain's tip is $C_2$, but the fork containing blocks $F_1$, $F_2$,
and $F_3$ is more preferable. We roll back our chain to the intersection point
of the two chains, $I$, which must be not more than $k$ blocks back from our
current tip. Next, we must validate block $F_1$ using the ledger state at
block $I$, after which we can validate $F_2$ using the resulting ledger state,
and so on.

This means that we need access to all ledger states of the past $k$ blocks,
i.e., the ledger states corresponding to the volatile part of the current
chain.\footnote{Applying a block to a ledger state is not an invertible
operation, so it is not possible to simply ``unapply'' $C_1$ and $C_2$ to
obtain $I$.}

Access to the last $k$ ledger states is not only needed for validating candidate
chains, but also by the:
\begin{itemize}
\item \textbf{Local state query server}: To query any of the past $k$ ledger
states (\cref{servers:lsq}).
\item \textbf{Chain sync client}: To validate headers of a chain that
intersects with any of the past $k$ blocks
(\cref{chainsyncclient:validation}).
\end{itemize}

\item \textbf{Storing on disk}: To obtain a ledger state for the current tip of
the chain, one has to apply \emph{all blocks in the chain} one-by-one to the
initial ledger state. When starting up the system with an on-disk chain
containing millions of blocks, all of them would have to be read from disk and
applied. This process can take tens of minutes, depending on the storage and
CPU speed, and is thus too costly to perform on each startup.

For this reason, a recent snapshot of the ledger state should be periodically
written to disk. Upon the next startup, that snapshot can be read and used to
restore the current ledger state, as well as the past $k$ ledger states.
\end{enumerate}

Note that whenever we say ``ledger state'', we mean the
\lstinline!ExtLedgerState blk! type described in \cref{storage:extledgerstate}.

The above duties are divided across the following modules:

\begin{itemize}
\item \lstinline!LedgerDB.InMemory!: this module defines a pure data structure,
named \lstinline!LedgerDB!, to represent the last $k$ ledger states in memory.
Operations to validate and append blocks, to switch to forks, to look up
ledger states, \ldots{} are provided.
\item \lstinline!LedgerDB.OnDisk!: this module contains the functionality to
write a snapshot of the \lstinline!LedgerDB! to disk and how to restore a
\lstinline!LedgerDB! from a snapshot.
\item \lstinline!LedgerDB.DiskPolicy!: this module contains the policy that
determines when a snapshot of the \lstinline!LedgerDB! is written to disk.
\item \lstinline!ChainDB.Impl.LgrDB!: this module is part of the Chain DB, and
is responsible for maintaining the pure \lstinline!LedgerDB! in a
\lstinline!StrictTVar!.
\end{itemize}

We will now discuss the modules listed above.
THIS PART WAS PORTED TO THE HADDOCKS

\section{In-memory representation}
\label{ledgerdb:in-memory}
28 changes: 28 additions & 0 deletions docs/website/contents/for-developers/utxo-hd/Overview.md
@@ -0,0 +1,28 @@
# High-level overview of UTxO-HD

UTxO-HD is an internal rework of the Consensus layer that features a hybrid
database for Ledger State data. UTxOs are stored in a separate database that
can be backed either by an on-disk database or by an in-memory implementation.

Each of these backends has specific behaviors and implications, so we will
refer to them individually as `InMemory` and `OnDisk`.

End-users of the `InMemory` backend (the default one) should not notice any
major difference in behavior or performance with respect to a pre-UTxO-HD
node.

End-users of the `OnDisk` backend will observe a regression in performance. For
now, the `OnDisk` backend is implemented via LMDB and is not optimal in terms of
performance, but we plan to make use of the LSM trees library that Well-Typed
is developing for much better performance. In particular, operations that need
UTxOs (applying blocks or transactions) incur the overhead of a trip to disk
storage plus some computation to bring the values read from disk up to date
with the tip of the chain.

In exchange for that performance regression, a Cardano node using the `OnDisk`
backend can run with much more modest memory requirements than a pre-UTxO-HD
node.

In terms of functionality, both backends are complete.

For a more extensive description of UTxO-HD, see [the full documentation](./utxo-hd-in-depth).