fast catchup (goal node catchup) hangs when running with EnableP2P: true #6265

Open
algorandskiy opened this issue Feb 28, 2025 · 1 comment
Assignees: algorandskiy
Labels: bug, p2p

Comments

@algorandskiy
Contributor

goal node  -d mainnet-p2p catchup
Fast catchup to 47620000#AJQDWWQWWJ6D6Q74AIH26HCYZLZRT2AV2K6RCI54DQI43R3OPIWQ is about to start.
Using external catchpoints is not a secure practice and should not be done for consensus participating nodes.
Type 'yes' to accept the risk and continue: yes

Config:

{
        "EnableP2P": true,
        "BaseLoggerDebugLevel": 5,
        "EnableDHTProviders": true
}

Observed behavior:

  1. The node is unresponsive: goal node -d mainnet-p2p status does not work, either while goal node catchup is still pending in a console or after quitting it.
  2. node.log indicates that the node is catching up.
algorandskiy self-assigned this on Feb 28, 2025
algorandskiy added the bug and p2p labels on Feb 28, 2025
@algorandskiy
Contributor Author

Intermediate analysis:

  1. algod collects available p2p nodes with archival=true from the DHT, since no p2p archival nodes are published via DNS.
  2. Some of these nodes accept connections on tcp/4190 but fail to respond to identify traffic. The libp2p code then continues with its new-stream logic and hangs indefinitely.

Here is a log excerpt from one such node (it is non-archival, but it illustrates the issue well):

export GOLOG_LOG_LEVEL=debug
export GOLOG_OUTPUT=stdout
go run ./cmd/catchpointdump net -n mainnet.algorand.network -p /dns4/r-pn.algorand-mainnet.network/tcp/4190/p2p/12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM -r 47750000

downloading catchpoint from /dns4/r-pn.algorand-mainnet.network/tcp/4190/p2p/12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM /v1/mainnet-v1.0/ledger/sfg4w
2025-03-05T10:40:33.925-0500    DEBUG    basichost       basic/basic_host.go:805 host 12D3KooWNVdvivtrsbdyuLJjM1gvH2z8zSD7F4egbiGszz2nMwjZ dialing 12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM
2025-03-05T10:40:33.971-0500    DEBUG    swarm2  swarm/swarm_dial.go:593 12D3KooWNVdvivtrsbdyuLJjM1gvH2z8zSD7F4egbiGszz2nMwjZ swarm dialing 12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM /ip4/167.235.65.180/tcp/4190
2025-03-05T10:41:04.459-0500    INFO     net/identify    identify/id.go:457      failed negotiate identify protocol with peer    {"peer": "12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM", "error": "i/o deadline reached"}
2025-03-05T10:41:04.459-0500    WARN     net/identify    identify/id.go:431      failed to identify 12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM: i/o deadline reached
2025-03-05T10:41:04.460-0500    DEBUG    basichost       basic/basic_host.go:822 host 12D3KooWNVdvivtrsbdyuLJjM1gvH2z8zSD7F4egbiGszz2nMwjZ finished dialing 12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM
2025-03-05T10:41:04.460-0500    DEBUG    swarm2  swarm/swarm.go:476      [12D3KooWNVdvivtrsbdyuLJjM1gvH2z8zSD7F4egbiGszz2nMwjZ] opening stream to peer [12D3KooWAaPiWHx1CZcpZ4rQSDcpzsfMN5ecReLpWh5ZuReoxcnM]
