Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testnet indexer request takes multiple seconds #4516

Closed
lzpap opened this issue Dec 17, 2024 · 5 comments · Fixed by #4586
Closed

Testnet indexer request takes multiple seconds #4516

lzpap opened this issue Dec 17, 2024 · 5 comments · Fixed by #4586
Assignees
Labels
devops-admin Issues related to the DevOps team infrastructure Issues related to the Infrastructure Team

Comments

@lzpap
Copy link
Member

lzpap commented Dec 17, 2024

Bug description

I noticed that transactions are loaded pretty slowly on the explorer, one has to wait usually multiple seconds (5-10-15) for them to appear on the bottom of either the Pacakge/Object view page or on the Checkpoint page.

Looking at the monitoring dashboard, the cause of the slow loading is the huge response time from the indexer instance. On the attached screenshot you can see the spike that occured when this explorer page was opened:
Image

  • The red line marks the indexer metric.
  • The blue line is the access node with JSON RPC, not relevant for this problem.
  • What takes time during the explorer page load is loading transactions that are part of the checkpoint. As you can see, these are simple consensus txs, nothing fancy, and there are only 4, so I suspect that the problem is either some setup/config on the machines or the lack of pruning.
@lzpap lzpap added infrastructure Issues related to the Infrastructure Team devops-admin Issues related to the DevOps team labels Dec 17, 2024
@kodemartin
Copy link
Contributor

A first inquiry suggests the following:

  • iotax_queryTransactionBlocks (filters txs by checkpoint id) performs consistently with what is reported here (~14,15 sec)
  • A naive 1+N query (iota_getCheckpoint + iota_getTransactionBlocks) performs much better (~4s) but still has large latency.
  • GraphQL responds rapidly
    {
      checkpoint(id: {sequenceNumber: 10843159}) {
        transactionBlocks {
          nodes {
            digest
            sender {
              address
            }
            gasInput {
              gasPrice
              gasBudget
            }
            effects {
              status
              errors
              timestamp
              objectChanges {
                edges {
                  node {
                    idCreated
                    idDeleted
                  }
                }
              }
            }
          }
        }
      }
    }    

So it seems that we can seek to optimize the query in the indexer JSON-RPC server, following the call in the graphQL server.

@tomxey
Copy link
Contributor

tomxey commented Dec 18, 2024

Replicated the slowness on local setup by artificially increasing indexer DB size to contain ~20M records (I believe it's a bit above 40M on testnet now) in transactions table.

Calling:

curl 'localhost:9005/'   -H 'content-type: application/json'   --data-raw '{
      "jsonrpc":"2.0",
      "id":4,
      "method":"iotax_queryTransactionBlocks",
      "params":[{"filter":{"Checkpoint":"10"},"options":{"showEffects":true,"showInput":true}},null,20,true]}'

Makes diesel generate and execute a query similar to the following:

SELECT *
FROM transactions
WHERE checkpoint_sequence_number = 10
ORDER by tx_sequence_number DESC;

which is slow (~40s) because of missing index on checkpoint_sequence_number (the only index is tx_sequence_number).

I didn't deeply investigate what graphql does internally, but since the entrypoint of the query is the checkpoint object, I assume that it may start looking at the checkpoints table, get transaction ids from there and then go for transactions table.
Such approach would be very fast, even without additional indexes.

The following query takes only 0.1s, producing the same results as the query above.

SELECT *
FROM transactions
WHERE tx_sequence_number between (
	select min_tx_sequence_number from checkpoints where sequence_number = 10
) and (
	select max_tx_sequence_number from checkpoints where sequence_number = 10
)
ORDER BY tx_sequence_number DESC;

Creating the index:

CREATE INDEX transactions_checkpoint_sequence_number ON public.transactions USING btree (checkpoint_sequence_number)

makes the original query perform really fast (<0.01s) and also speeds up iotax_queryTransactionBlocks endpoint to around 0.3s.

@kodemartin
Copy link
Contributor

kodemartin commented Dec 18, 2024

I didn't deeply investigate what graphql does internally, but since the entrypoint of the query is the checkpoint object, I assume that it may start looking at the checkpoints table, get transaction ids from there and then go for transactions table.
Such approach would be very fast, even without additional indexes.

Ideally we would like to avoid adding a new index, as it will affect insertion time and hence transaction effects latency with the current setting.

So this solution seems like a good candidate. At any case, I would suggest that you resolve the exact query used by GraphQL. I would assume to be the optimal solution.

@kodemartin
Copy link
Contributor

kodemartin commented Dec 18, 2024

@tomxey this upstream patch might be relevant too.

@tomxey
Copy link
Contributor

tomxey commented Dec 19, 2024

What graphql is doing is:

First it gets the range of transaction ids for given checkpoint via query:

SELECT network_total_transactions
FROM checkpoints
WHERE sequence_number IN (16, 17)
ORDER BY network_total_transactions ASC;

It queries for network_total_transactions of checkpoint that is being queried, and one checkpoint before it.

Which is followed by a second query:

SELECT *
FROM transactions
WHERE tx_sequence_number >= 63
  AND tx_sequence_number < 67
ORDER BY tx_sequence_number ASC
LIMIT 22;

which is using values from the previous one to filter tx_sequence_number.

I wonder why it's not using the min_tx_sequence_number and max_tx_sequence_number, but otherwise the approach seems to be very similar to the one suggested in #4516 (comment) and in the upstream patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops-admin Issues related to the DevOps team infrastructure Issues related to the Infrastructure Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants