Resource consumption bottlenecks #11385
turbolay started this conversation in General & Publications
Replies: 2 comments
-
I'm pretty sure this is not the root of the problem, but the first thing that comes to my mind whenever I see this kind of lookup in a hot code path is to use a dictionary and reduce the O(n) lookups to O(1). Just one improvement that can be made among many others.
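As a minimal sketch of that suggestion (the collection and names here are made up for illustration, not the actual Wasabi types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical coin list, keyed informally by an outpoint string.
var coins = new List<(string OutPoint, long Amount)>
{
    ("tx1:0", 5000), ("tx2:1", 12000), ("tx3:0", 800)
};

// O(n) on every call: fine once, painful inside a hot loop.
long FindSlow(string outPoint) => coins.First(c => c.OutPoint == outPoint).Amount;

// One O(n) indexing pass up front, then every lookup is O(1).
var byOutPoint = coins.ToDictionary(c => c.OutPoint, c => c.Amount);

Console.WriteLine(FindSlow("tx2:1"));   // 12000
Console.WriteLine(byOutPoint["tx2:1"]); // 12000
```

The trade-off is the memory of the index itself, which is usually negligible next to repeated scans over a large list.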
-
Bottlenecks in Bird's Eye View:
1. Wallet Startup. Goal: instant.
2. Wallet Load. Goal: large wallet load to be instant.
3. Wallet Memory Usage. Goal: large wallet to use under 1 GB of memory.
-
Background
It's simple: we currently keep everything in memory (except filters) and create potentially insanely big data structures that we access through O(N) or even O(N²) algorithms. I will take the coordinator's wallet for my examples because it's perfect: it participated in every coinjoin (18.500 according to dumplings). Using Wasabi for such big wallets is really annoying for various reasons: long synchronization, constant lag, insane memory usage... This note aims to identify the bottlenecks that create heavy resource consumption and see what we can do about them. It's more or less ordered from bigger to lower impact. I will edit as I discover new ones.

BuildHistorySummary
This method is a computation and memory monster. It creates new objects for every input and every output of every transaction. It has no pagination, no "checkpoints". It reorders all txs at every iteration, and everything is recomputed at every update. The easy way to see its impact on a wallet such as the coordinator's is to run it first on the Daemon and check how "fast" the wallet is usable and its memory usage (around 6 GB), then run the same wallet with the GUI and see how it's a nightmare to launch and check memory usage (around 13 GB). It single-handedly makes the wallet close to unusable. I would suggest creating pagination of some sort. It also makes the GUI unresponsive during the whole time of its execution.
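As a sketch of what an incremental update could look like instead of re-sorting and recomputing the whole history on every new tx (types simplified to tuples; purely illustrative, not the real BuildHistorySummary):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// History kept permanently sorted by (Height, TxId); tuples compare elementwise.
var history = new List<(int Height, string TxId)> { (100, "a"), (200, "b"), (300, "c") };

// Insert one new item at its sorted position instead of rebuilding everything.
void InsertIncremental(List<(int Height, string TxId)> sorted, (int Height, string TxId) item)
{
    int idx = sorted.BinarySearch(item);
    if (idx < 0) idx = ~idx;   // bitwise complement of BinarySearch = insertion point
    sorted.Insert(idx, item);  // one O(log n) search + one insert, no full re-sort
}

InsertIncremental(history, (250, "d"));
Console.WriteLine(string.Join(",", history.Select(h => h.TxId))); // a,b,d,c
```

Pagination would then be a cheap `Skip`/`Take` over the already-sorted list, so the UI only materializes the page it displays.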
Look at this case: I'm trying to open my wallet. The history took a whopping 6 minutes to build (!!!), and once it finally opened, it was already invalidated by a filter that I received while it was building 😆
![CleanShot 2023-08-28 at 01 32 54@2x](https://private-user-images.githubusercontent.com/16029533/263629357-eb39f39e-743a-482c-a8f7-538dd2d363f3.png)
In general, it's updated all. the. time. Even several times in parallel. It consumes all my CPU and bloats everything else in the wallet.
![CleanShot 2023-08-28 at 01 38 15@2x](https://private-user-images.githubusercontent.com/16029533/263631311-5ee08828-a20f-4530-abdf-3a171e8f519e.png)
I wouldn't be surprised if the difference in memory usage from v2.0.3 to v2.0.4 lies here.
AllTransactionsStore
18.500 coinjoin txs, each with a whopping 250 inputs and 250 outputs, for which we store absolutely every piece of information (the SmartTransaction data structure). The serialized data in the file system is about 700 MB, so about 37.8 kB per tx. Once deserialized, a whopping ~3.5 GB is used for this data structure, about 190 kB per tx. Therefore, every month, the coordinator wallet takes ~300 MB more memory to run, accounting for this data structure alone.

In many cases we are accessing or updating a single tx, but let's see when exactly we use all the txs:
- UpdateLabels, to get every label in the LabelEntryDialogViewModel
- Initial transaction processing, to loop through all our stored txs in Wallet.LoadWalletState
- ReleaseToMempoolFromBlock
- TransactionStore.InitializeTransactionsNoLockAsync, where we write all the transactions to the file system

Well, all of these cases have something in common: they don't justify having all transactions in memory. In fact, it's really similar to what we are doing with the filters, and an SQLite database would be a perfect way to store all of these transactions.
We could use the same serialization that is already used for the file system, with a column for each `:`-separated field. All cases would be easily covered: getting all labels is trivial as they would have their own column, and the same goes for ReleaseToMempoolFromBlock. Getting all txs for initial transaction processing, the transaction history, or tests could be done with a yield similar to what is done for filters.

Getting rid of this object wouldn't instantly free all the memory it uses, because other data structures, such as Coins, still reference the transactions.
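A rough sketch of the SQLite streaming idea, assuming a hypothetical `transactions` table (schema and names are mine, not an existing Wasabi table), using Microsoft.Data.Sqlite:

```csharp
using System.Collections.Generic;
using Microsoft.Data.Sqlite;

// Stream transactions one row at a time via yield, mirroring what is
// already done for filters, instead of materializing them all in memory.
IEnumerable<string> StreamRawTransactions(string dbPath)
{
    using var conn = new SqliteConnection($"Data Source={dbPath}");
    conn.Open();
    using var cmd = conn.CreateCommand();
    // In the real schema there would be one column per ':'-separated field,
    // so e.g. labels could be queried without touching the raw tx at all.
    cmd.CommandText = "SELECT raw FROM transactions ORDER BY block_height";
    using var reader = cmd.ExecuteReader();
    while (reader.Read())
    {
        yield return reader.GetString(0); // one row at a time, never the whole set
    }
}
```

The enumerator holds only the current row, so initial processing and history building would no longer require the full ~3.5 GB working set.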
AsAllCoinsView
This data structure is not that big: it has one entry per UTXO the wallet has ever received. The problem is the complexity. Each time it's called, it's enumerated twice. It's called twice for every tx processed, and after each call there is an extra enumeration, so two more. Of course, every tx is processed every time we open the wallet. Therefore, while opening the coordinator's wallet, AsAllCoinsView is enumerated ~18.500 * 2 * 2 * 2 = ~150.000 times. This takes a lot of time. I won't expand too much on this because #11298 does a great job fixing it.

CoinsByOutpoint
This is a Dictionary<OutPoint, HashSet<SmartCoin>>. The keys are every Prevout of every Input of every Tx incoming to our wallet, and the values are HashSets of every Output belonging to the wallet in these transactions (so every UTXO that we have is present `number of inputs in its creating tx` times in this cache).

Let's take a coinjoin with 250 inputs and 5 outputs belonging to my wallet. This will create 250 KeyValuePairs in the dictionary, each holding a HashSet of 5 elements. So 18.500 coinjoins of 250 inputs = 4.625.000 KeyValuePairs and the same number of HashSets. This is an ever-growing cache, as we only remove entries in case of double spends.

So one might say: "Ok, but these are only references." True, they are only references to already existing objects. But what about the overhead? It's huge. This object alone takes ~1.2 GB of memory for that number of coinjoins. Also, creating all these objects takes quite an extensive amount of time.
One solution is to stop checking for replacements for transactions that are already mature, which dramatically reduces the data stored in this object. This is the approach of #11374: the idea is that nodes should reject double spends anyway, so the client shouldn't have to take care of it. This can have consequences I'm not sure about.
Another solution is to try to reduce the size of this cache, and specifically the overhead of the hash sets, by using the same HashSet for each prevout of the same transaction. This is the approach of #11384.