More search tx optimizations (internal_commands_aggregated, zkapp_commands_aggregated) #63

Merged: 6 commits merged into main from more-search-tx-optimizations on Nov 8, 2024


piotr-iohk

Follow up after #58.

Performance Test Results

| Query | Rosetta | mina-mesh serve | mina-mesh serve --use-search-tx-optimizations (#58) | mina-mesh serve --use-search-tx-optimizations (this PR) |
|---|---|---|---|---|
| "address":"B62qiburnzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzmp7r7UN6X", "limit":5000 | 12-16s | 12-14s | 3.8-4.1s | 3.2s |
| "address":"B62qpXXYbzeZkXrpa3EuZcXgqFSsBsSWrrvi16GJnXLhaqELBSfbnGF" | 10-12s | 10-12s | 3.1s | 2.33s |
| "address":"B62qowpMhZ2Ww7b8xQxcK7rrpfsL5Nt5Yz5uxaizUBKqpeZUqBETa31","status":"applied","limit":100 | 10-12s | 11-14s | 3.1s | 2.39s |
| "max_block":394837,"status":"failed","limit":1000 | 2.4s | 2.7s | 0.9s | 0.83s |
| "transaction_identifier":{"hash":"5JvFj6DJh1dnMnLPki9ZnmgbxcgfNZCc6hRs8FhvVhaautt84EpY"} | 0.46s | 0.47s | 0.1s | 0.03s |
| "address":"B62qq3TQ8AP7MFYPVtMx5tZGF3kWLJukfwG1A1RGvaBW1jfTPTkDBW6","limit":500 | 9.92s | 12.72s | 3.24s | 2.36s |
| "address":"B62qjwDWxjf4LtJ4YWJQDdTNPqZ69ZyeCzbpAFKN7EoZzYig5ZRz8JE","limit":500 | 10.21s | 13.78s | 3.12s | 2.94s |

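For a rough sense of scale, the speedup can be derived from any row of the results. The sketch below uses the "limit":500 row for address B62qq3TQ8AP7MFYPVtMx5tZGF3kWLJukfwG1A1RGvaBW1jfTPTkDBW6 (9.92 / 12.72 / 3.24 / 2.36), assuming all four figures are seconds and that the decimal commas are decimal points. The helper name `speedup` is illustrative only.

```python
def speedup(baseline_s, optimized_s):
    """Return how many times faster the optimized timing is than the baseline."""
    return baseline_s / optimized_s


# Timings in seconds, read from the results above (assumed unit: seconds).
rosetta_s = 9.92
pr_58_s = 3.24
this_pr_s = 2.36

vs_rosetta = speedup(rosetta_s, this_pr_s)
vs_pr_58 = speedup(pr_58_s, this_pr_s)

print(f"vs Rosetta: {vs_rosetta:.2f}x, vs #58: {vs_pr_58:.2f}x")
```

On that row this works out to roughly a 4x improvement over Rosetta and about 1.4x over #58.
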
Storage Overhead

The storage overhead with all additional tables:

  • in postgres
SELECT pg_size_pretty(pg_total_relation_size('user_commands_aggregated')) AS total_size;
"3126 MB"

SELECT pg_size_pretty(pg_total_relation_size('internal_commands_aggregated')) AS total_size;
"426 MB"

SELECT pg_size_pretty(pg_total_relation_size('zkapp_commands_aggregated')) AS total_size;
"19 MB"

  • on database dump:
$ du -h archive-*
3,1G	archive-no-optimizations.sql
5,3G	archive-optimizations.sql
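
As a quick sanity check on the figures above, the three relation sizes can be summed and compared with the growth of the dump. This is plain arithmetic on the reported numbers, with the `du` decimal commas read as decimal points; the variable names are illustrative only.

```python
# Sizes reported by pg_total_relation_size above, in MB.
table_sizes_mb = {
    "user_commands_aggregated": 3126,
    "internal_commands_aggregated": 426,
    "zkapp_commands_aggregated": 19,
}
total_mb = sum(table_sizes_mb.values())

# Dump sizes reported by `du -h`, in GB.
dump_growth_gb = 5.3 - 3.1

print(f"in PostgreSQL: {total_mb} MB, dump growth: {dump_growth_gb:.1f} GB")
```

The in-database total (3571 MB) is larger than the dump growth (about 2.2 GB). One possible reason is that pg_total_relation_size also counts indexes, while a plain SQL dump stores only their definitions.
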

@piotr-iohk piotr-iohk force-pushed the more-search-tx-optimizations branch from c52e919 to 52593dc Compare November 7, 2024 09:11

@joaosreis joaosreis left a comment


It seems to be looking good!
Just some comments for further improvements:

  • Would it produce meaningful performance improvements if we cache the coinbase_receiver and account_creation_fee fields as well?
  • I see queries with the addresses are still taking some seconds. What's the impact (overhead, performance) of adding three more tables linking *_commands to the accounts involved in them?

@piotr-iohk

Thanks @joaosreis! I'll try your suggestions in a separate PR.

@piotr-iohk piotr-iohk merged commit c30b90e into main Nov 8, 2024
6 checks passed
@piotr-iohk piotr-iohk deleted the more-search-tx-optimizations branch November 8, 2024 09:56
@piotr-iohk

@joaosreis thanks again for the suggestions. I've looked into them and put some notes below:

  • Would it produce meaningful performance improvements if we cache the coinbase_receiver and account_creation_fee fields as well?

I tried adding coinbase_receiver_info as an additional table and using it in the query, and from what I see the performance gain is not really noticeable. That is perhaps because the table is not very big on mainnet (only 11k records). Since the table involves a considerable number of INNER JOINs, the updating function would be a bit complex, and with no real performance gain I think we can drop it for now.

  • I see queries with the addresses are still taking some seconds. What's the impact (overhead, performance) of adding three more tables linking *_commands to the accounts involved in them?

For that I tried adding the INNER JOIN public_keys condition to user_commands_aggregated, but given the nature of the condition:

INNER JOIN public_keys AS pk ON u.fee_payer_id=pk.id
  OR (
    buc.status='applied'
    AND (
      u.source_id=pk.id
      OR u.receiver_id=pk.id
    )
  )

It actually doubles the size of user_commands_aggregated, so that it sits at 6.3 GB. The performance gain, though, is just another 0.3-0.4s, so I suppose it is not quite worth it.
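
The row multiplication caused by that join condition can be reproduced on a toy dataset. The sketch below uses SQLite from the Python standard library as a stand-in for PostgreSQL, with a heavily simplified, hypothetical schema (only the columns named in the condition); it is not the real archive schema or the actual aggregated-table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical tables: only the columns used by the join condition.
CREATE TABLE public_keys (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE user_commands (
    id INTEGER PRIMARY KEY,
    fee_payer_id INTEGER,
    source_id INTEGER,
    receiver_id INTEGER
);
CREATE TABLE blocks_user_commands (user_command_id INTEGER, status TEXT);

INSERT INTO public_keys VALUES (1, 'sender'), (2, 'receiver');
-- Two commands with the same participants: fee payer = source = 1, receiver = 2.
INSERT INTO user_commands VALUES (10, 1, 1, 2), (11, 1, 1, 2);
INSERT INTO blocks_user_commands VALUES (10, 'applied'), (11, 'failed');
""")

rows = conn.execute("""
SELECT u.id, pk.value
FROM user_commands AS u
INNER JOIN blocks_user_commands AS buc ON buc.user_command_id = u.id
INNER JOIN public_keys AS pk ON u.fee_payer_id = pk.id
  OR (
    buc.status = 'applied'
    AND (
      u.source_id = pk.id
      OR u.receiver_id = pk.id
    )
  )
ORDER BY u.id, pk.id
""").fetchall()

# The applied command (10) matches both keys; the failed one (11) only its fee payer.
for row in rows:
    print(row)
```

In this toy case each applied command with a distinct receiver yields two rows, which is consistent with the aggregated table roughly doubling in size.
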

If you have any further thoughts or ideas please share.

@joaosreis

For the second suggestion, if we add an INNER JOIN to the user_commands_aggregated table, then we are multiplying the number of records of that table by 2 (assuming source and fee_payer are the same), which agrees with the sizes you posted.
I was considering creating a new table with columns user_command_id and public_key, holding one row for each public key mentioned in a command, with a composite primary key. By my calculations (they can be wrong 😄), this table should occupy ~950MB, which is about 1/3 of the 3GB overhead of adding the INNER JOIN to user_commands_aggregated. But then we would have to JOIN this other table when computing the indexer query, which may reduce the already modest 0.3-0.4s improvement 🤔
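
A minimal sketch of such a link table, again using SQLite from the Python standard library as a stand-in for PostgreSQL. The table name `user_command_public_keys`, the `public_key_id` column (an id rather than the key text), and the helper `commands_for` are all assumptions made for illustration. The sketch also ignores the rule that source and receiver only count for applied commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical source table.
CREATE TABLE user_commands (
    id INTEGER PRIMARY KEY,
    fee_payer_id INTEGER,
    source_id INTEGER,
    receiver_id INTEGER
);
INSERT INTO user_commands VALUES (10, 1, 1, 2), (11, 3, 3, 1);

-- One row per (command, public key) pair, with a composite primary key.
CREATE TABLE user_command_public_keys (
    user_command_id INTEGER NOT NULL,
    public_key_id INTEGER NOT NULL,
    PRIMARY KEY (user_command_id, public_key_id)
);

-- UNION removes duplicates, e.g. when fee payer and source are the same key.
INSERT INTO user_command_public_keys
SELECT id, fee_payer_id FROM user_commands
UNION SELECT id, source_id FROM user_commands
UNION SELECT id, receiver_id FROM user_commands;
""")

link_count = conn.execute(
    "SELECT COUNT(*) FROM user_command_public_keys"
).fetchone()[0]


def commands_for(public_key_id):
    """Return the ids of all commands that mention the given public key."""
    cur = conn.execute(
        "SELECT user_command_id FROM user_command_public_keys "
        "WHERE public_key_id = ? ORDER BY user_command_id",
        (public_key_id,),
    )
    return [row[0] for row in cur]


print(link_count, commands_for(1), commands_for(2))
```

The narrow table keeps only two integers per row, which is why it would be much smaller than duplicating full aggregated rows. The cost is the extra JOIN back to the command data at query time.
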
