Getting bitcoin balances efficiently and with privacy is still hard in 2025.
Btc core:
- indexing only for wallets; wallets have to be pre-defined, otherwise a rescan happens
- a rescan needs the old blocks, so it can only happen on a full node
- => no way to use a pruned node
- (-blockfilterindex=1 might help in 2026...)
Electrs:
- needs full node...
- cannot work just from the UTXOs
Solution 0:
- pruned btc core + dumptxoutset + sqlite generated by python
- works, see deprecated/, has nice tricks in it, but sqlite+python is overkill
- overall solution slow and ugly
- tried to improve on it with leveldb, that was better, but we are on unix!
- these DB based solutions were all blowing up the data size to 7-8GB
Solution 1a:
- talk to (pruned) btc core and ask for all UTXOs
  ../cli.sh dumptxoutset $PWD/txoutset
  time on @2024-11-01: 6m (TODO check again!)
- record only the ones relevant (bc1...)
- merge multiple coins into indexed balances during collection
- write out sorted text file
- use /usr/bin/look for query (binary search)
- query: 1ms
- data size: 1GB (cannot be compressed, because look needs the plain sorted file for binary search)
- generation time once dumptxoutset ready (or streamed): 2m
- git hash: a9a87e3
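
The merge, sort, and lookup steps of Solution 1a can be sketched in Python as below. This is only an illustration, not what parse.c does: the "script_hex amount" line format, the satoshi unit, and all function names are assumptions, and bisect stands in for the binary search that /usr/bin/look performs on the sorted file.

```python
import bisect
from collections import defaultdict

def merge_coins(coins):
    """Sum (script_hex, sats) pairs into one balance per script."""
    balances = defaultdict(int)
    for script, sats in coins:
        balances[script] += sats
    return balances

def sorted_lines(balances):
    """Render 'script_hex sats' lines in sorted order (assumed file format)."""
    return sorted(f"{script} {sats}" for script, sats in balances.items())

def lookup(lines, script):
    """Binary-search the sorted lines; return the balance or None."""
    key = script + " "
    i = bisect.bisect_left(lines, key)
    if i < len(lines) and lines[i].startswith(key):
        return int(lines[i].split()[1])
    return None

if __name__ == "__main__":
    coins = [("0014bb", 5), ("0014aa", 10), ("0014bb", 7)]
    lines = sorted_lines(merge_coins(coins))
    print(lines)
    print(lookup(lines, "0014bb"))
```

Because every script ends up on exactly one line, a single binary search answers a query, which is why the lookup cost stays flat as the file grows.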
Solution 1b:
- write out .zst files
- query is simply "zstdcat xxxxxx.zst | grep ^script"
- data can be compressed (only 500M @2024-11-01 for all bc1*)
- query is still fast (10 ms)
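- zstd splitting: awk '{print $0 > substr($0,1,6)}' ; zstd ?????? ; rm ??????
  (substr is 1-based: with a start of 0, gawk and mawk return only 5 characters, which the ?????? glob would not match)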
Solution 2:
- same as 1a, but no need for in-memory merging
- simply put all utxos on stdout as is
- use unix sort after
- data size: 3g (compressed 500m)
- generation time: 1.5m
- sort time: 1.5m
- query: /usr/bin/look, 1 ms
- parse.c is super simple now, no glibc tsearch
Solution 3:
- same as solution 2, but no need to sort if we split and zstd anyway
- ../parse <../txoutset | awk '{print $0 > substr($0,1,6)}' ; zstd ?????? ; rm ??????
- generation time (including zstd): 1m50s
- data size: 666m compressed (bigger than 500m, because: scripts not sorted)
- query: zstdcat 0014f9*.zst | grep '^0014f9...'
- query time: 10 ms
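
The split-and-scan idea of Solution 3 can be sketched in Python as below. This is a sketch under assumptions: the "script_hex amount" line format and the function names are hypothetical, compression is left out (zstd would be applied per bucket file), and the query matches the whole script exactly, whereas the grep above is a prefix match. Since the coins are neither sorted nor merged, the query has to sum every matching line in the bucket.

```python
from collections import defaultdict

PREFIX_LEN = 6  # same width as the ?????? glob

def split_by_prefix(lines):
    """Group lines by their first PREFIX_LEN characters, keeping input order."""
    buckets = defaultdict(list)
    for line in lines:
        buckets[line[:PREFIX_LEN]].append(line)
    return buckets

def query(buckets, script):
    """Scan only the bucket of this script and sum all of its coins."""
    total = 0
    for line in buckets.get(script[:PREFIX_LEN], []):
        fields = line.split()
        if fields[0] == script:
            total += int(fields[1])
    return total

if __name__ == "__main__":
    lines = ["0014f9aa 5", "5120ab01 3", "0014f9aa 7", "0014f9bb 1"]
    buckets = split_by_prefix(lines)
    print(sorted(buckets))
    print(query(buckets, "0014f9aa"))
```

The linear scan is limited to one bucket, so the query cost depends on the bucket size rather than on the whole UTXO set.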
Document how to run btc core with minimal settings and how to sync it fast (utxo snapshot).
Ideas: bitcoind -datadir=somewhere -server=1 -txindex=0 -blocksonly=1 -blockfilterindex=1 -disablewallet -prune=4096
Even starting the dump takes 3 minutes. What if we always used a btrfs reflink copy + a leveldb dump instead?
That would save the initial wait. We would still have to iterate over all utxos, as utxos are not indexed or sorted by script.
Talk to btc core and get all transactions from blocks higher than the base_height in output/metadata.json.
Write out the script balance changes caused by the UTXOs spent and the UTXOs newly created after that block.
Write query tool that takes this into account.
This way, we only have to run cron during the night and during the day we get incremental quick updates.
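
The incremental idea above can be sketched in Python as below. Everything here is an assumption for illustration: the base balances are taken to be a script-to-satoshi mapping from the nightly run, the spent and created outputs are given as (script_hex, sats) pairs, and how the scripts and amounts of spent outputs are obtained from btc core is left open.

```python
from collections import defaultdict

def balance_deltas(spent, created):
    """Net balance change per script from outputs spent and created after base_height."""
    deltas = defaultdict(int)
    for script, sats in spent:
        deltas[script] -= sats
    for script, sats in created:
        deltas[script] += sats
    # Drop scripts whose changes cancel out.
    return {script: d for script, d in deltas.items() if d != 0}

def current_balance(base, deltas, script):
    """Balance at the chain tip: nightly snapshot value plus the daytime delta."""
    return base.get(script, 0) + deltas.get(script, 0)

if __name__ == "__main__":
    base = {"0014aa": 100}
    deltas = balance_deltas(
        spent=[("0014aa", 40)],
        created=[("0014bb", 40), ("0014aa", 5)],
    )
    print(current_balance(base, deltas, "0014aa"))
    print(current_balance(base, deltas, "0014bb"))
```

The delta set only covers the blocks since the last nightly run, so it stays small and can be rebuilt quickly.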
How hard is it to rewrite script2addr.py in C, and would it be fast enough to be part of the generation?
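
To gauge the porting effort, here is a minimal sketch of what such a conversion involves. The contents of script2addr.py are not shown in these notes, so this assumes it maps a segwit scriptPubKey hex string to a bc1 address using bech32 (BIP173) for witness version 0 and bech32m (BIP350) for later versions. The function names are hypothetical.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1             # BIP173, witness version 0
BECH32M_CONST = 0x2bc830a3   # BIP350, witness version 1+

def polymod(values):
    """BCH checksum over 5-bit values, as specified in BIP173."""
    gen = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        top = chk >> 25
        chk = (chk & 0x1ffffff) << 5 ^ v
        for i in range(5):
            chk ^= gen[i] if ((top >> i) & 1) else 0
    return chk

def to_5bit(data):
    """Regroup 8-bit bytes into 5-bit values, zero-padding the last group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out

def script_to_address(script_hex, hrp="bc"):
    """Return the segwit address for a scriptPubKey, or None if it is not segwit."""
    script = bytes.fromhex(script_hex)
    if len(script) < 4 or script[1] != len(script) - 2:
        return None
    op = script[0]
    if op == 0x00:
        version = 0
    elif 0x51 <= op <= 0x60:  # OP_1 .. OP_16
        version = op - 0x50
    else:
        return None
    program = script[2:]
    if not 2 <= len(program) <= 40:
        return None
    if version == 0 and len(program) not in (20, 32):
        return None
    data = [version] + to_5bit(program)
    const = BECH32_CONST if version == 0 else BECH32M_CONST
    hrp_part = [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]
    pm = polymod(hrp_part + data + [0] * 6) ^ const
    checksum = [(pm >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)

if __name__ == "__main__":
    # P2WPKH example script from the BIP173 test vectors.
    print(script_to_address("0014751e76e8199196d454941c45d1b3a323f1433bd6"))
```

The whole conversion is integer shifts, XORs, and table lookups with no big-number arithmetic, so a C port would mostly be a mechanical translation. Whether it is fast enough for the generation step would still have to be measured.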