Additional CCQ Logs #4128

Open
corpocott opened this issue Oct 2, 2024 · 7 comments
Labels
bug Something isn't working guardian-support

Comments

@corpocott

Description

Looking for additional logging to be added to the HTTPS call to the CCQ proxy that submits the payload. We are running into an issue where our submission is not making it to the other guardians, and Liston and I are not able to identify the point of failure. It would be helpful to have a debug log for the HTTPS call status and, if it is non-200, the error. At the moment we only see the following, with no errors:

2024-10-01T15:42:32.321Z INFO root.p2p published signed query response {"component": "ccqp2p", "requestSignature": "c025a38804cc2147aa2e445551567bd5c142329823eca8a91db49871a619ee9f1032925fccada70118ad77f08c6139fb911fb11714caa9d68ec610003875e97800", "query_response": {"Request":{"query_request":"xxx","signature":"xxx"},"PerChainResponses":[{"ChainId":2,"Response":{"BlockNumber":20871609,"Hash":"0x31c85dae667b8df8835b2069547959bd91870bd27cfef659f6cbb084a0b94836","Time":"2024-10-01T15:42:23Z","Results":["AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADVdyYXBwZWQgRXRoZXIAAAAAAAAAAAAAAAAAAAAAAAAA"]}}]}, "signature": "xxx"}
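
For readers checking what the guardian actually published: the `Results` entry in the log line above looks like base64, and its visible ASCII payload is `Wrapped Ether`. The sketch below assumes (this is not confirmed anywhere in the thread) that the entry is the return data of a call returning a single ABI-encoded string. `decodeABIString` and `encodeABIString` are hypothetical helper names, not part of the guardian code; the sample input is built locally rather than copied from the log.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
)

// decodeABIString base64-decodes one Results entry and reads it as a single
// ABI-encoded dynamic string: a 32-byte offset word, a 32-byte length word,
// then the string bytes padded to a 32-byte boundary.
// Assumption: the entry holds exactly one string return value.
func decodeABIString(b64 string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return "", fmt.Errorf("base64 decode: %w", err)
	}
	if len(raw) < 64 {
		return "", errors.New("payload shorter than the offset and length words")
	}
	// Only the low 8 bytes of each 32-byte word are read here.
	offset := binary.BigEndian.Uint64(raw[24:32])
	if offset > uint64(len(raw))-32 {
		return "", errors.New("offset out of range")
	}
	length := binary.BigEndian.Uint64(raw[offset+24 : offset+32])
	start := offset + 32
	if length > uint64(len(raw))-start {
		return "", errors.New("length out of range")
	}
	return string(raw[start : start+length]), nil
}

// encodeABIString is the inverse helper, used only to build sample input.
func encodeABIString(s string) string {
	padded := (len(s) + 31) / 32 * 32
	raw := make([]byte, 64+padded)
	raw[31] = 32 // offset of the length word
	binary.BigEndian.PutUint64(raw[56:64], uint64(len(s)))
	copy(raw[64:], s)
	return base64.StdEncoding.EncodeToString(raw)
}

func main() {
	sample := encodeABIString("Wrapped Ether")
	decoded, err := decodeABIString(sample)
	fmt.Println(decoded, err)
}
```

If the `Results` value from a real log decodes cleanly this way, that suggests the response was built correctly on the guardian, which points the investigation at delivery rather than at query execution.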

Recommendation

Log the call status and, if it is non-200, the error.

@corpocott corpocott added bug Something isn't working guardian-support labels Oct 2, 2024
@evan-gray
Contributor

@corpocott, these responses are published via a libp2p pubsub topic using UDP QUIC v1, not HTTPS. The default port is 8996. libp2p debug messages should be enabled when --logLevel=debug. These are quite chatty but should contain some subscriber and messaging information. In many cases, this comes down to a firewall or networking issue which may not be possible to debug via these logs. I'm not sure there are more logs to provide beyond this.

If you are trying to debug Queries, I would recommend using the ccqlistener tool. First, follow the command to listen for responses from any guardian, then run it again to listen for responses from your guardian. If the first succeeds and the second fails, try running the tool inside the same network (or better yet, the same box) as your guardian. If it then succeeds on both, you likely have a firewall or networking issue preventing your responses from egressing. If it still fails for your guardian, you can try overriding the --bootstrap to your guardian address. You can see examples of bootstrap strings here.
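
The procedure above can be written down as a small decision function, which may help when recording what each ccqlistener run showed. This is only a sketch of the reasoning in this comment: `listenResult`, `diagnose`, and the message strings are hypothetical names, and the "no responses from any guardian" and "inconclusive" branches are assumptions that the comment does not cover.

```go
package main

import "fmt"

// listenResult records what one pair of ccqlistener runs observed.
type listenResult struct {
	anyGuardian bool // responses seen when listening to any guardian
	ownGuardian bool // responses seen when listening to your own guardian
}

const (
	diagListener   = "no responses from any guardian: check the listener setup first"
	diagHealthy    = "responses from your guardian are reaching the network"
	diagRerun      = "rerun the tool inside the guardian's network or on the same box"
	diagEgress     = "likely a firewall or networking issue blocking egress"
	diagBootstrap  = "try overriding --bootstrap with your guardian address"
	diagIncomplete = "inconclusive: repeat both runs inside the network"
)

// diagnose maps the observations to the next step. Pass inside as nil
// until the tool has been run from inside the guardian's network.
func diagnose(outside listenResult, inside *listenResult) string {
	switch {
	case !outside.anyGuardian:
		return diagListener // assumption: not covered by the comment above
	case outside.ownGuardian:
		return diagHealthy
	case inside == nil:
		return diagRerun
	case inside.anyGuardian && inside.ownGuardian:
		return diagEgress
	case !inside.ownGuardian:
		return diagBootstrap
	default:
		return diagIncomplete // assumption: not covered by the comment above
	}
}

func main() {
	outside := listenResult{anyGuardian: true, ownGuardian: false}
	fmt.Println(diagnose(outside, nil))
	fmt.Println(diagnose(outside, &listenResult{anyGuardian: true, ownGuardian: true}))
}
```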

@corpocott
Author

corpocott commented Oct 3, 2024

Oh alright, I was told it was being submitted over HTTPS. I will see if I can get another guardian to run in debug and check whether they are receiving our heartbeats. It sounds like the p2p gossip is working as expected but ccq isn't, even though they use the same library? Our egress firewall is totally open, so I'm not sure why it isn't getting through. I would think that if it were a firewall issue, our p2p would be broken too.

@corpocott
Author

Any ideas on why p2p gossip would work as expected but ccq wouldn't? If they use the same library, I'm not sure why one would work and the other wouldn't.

The debug logs include heartbeats, which are pretty helpful for confirming that we are receiving heartbeats from other guardians and that they can see ours. Any chance of adding something similar to the ccq code?

@lrogana

lrogana commented Oct 24, 2024

Hey @evan-gray, how is ccq traffic different from the gossip p2p traffic?

@corpocott
Author

To add some context, we swapped the p2p gossip and ccq ports so that we could use the heartbeats to debug ingress/egress on that specific port. p2p gossip seemed to work as expected using 8996: we were receiving heartbeats and the dashboard was able to see our node's heartbeats. So if both protocols use the same library to communicate, I'm wondering how p2p gossip works on 8996 but ccq does not. Knowing any differences would help me dig further into the problem on our side.

@evan-gray
Contributor

Agree that sounds like a good test. Both protocols do use the same library. As far as I understand, p2p and ccq_p2p use the exact same parameters except for the port and peers.

ccq := newCcqRunP2p(logger, params.ccqAllowedPeers, params.components)
if err := ccq.run(ctx, params.priv, params.guardianSigner, params.networkID, params.ccqBootstrapPeers, params.ccqPort, params.signedQueryReqC, params.queryResponseReadC, ccqErrC); err != nil {

ccq_p2p uses the same NewHost method.

ccq.h, err = NewHost(ccq.logger, ctx, networkID, bootstrapPeers, components, priv)

One thing it does differently is to only allow peer connections from the allowed peers and the guardians, but if you are seeing inbound requests from those peers, I don't have a reason to believe anything here would restrict outbound requests.

if _, found := ccq.allowedPeers[peerID.String()]; found {
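
To make the point about direction concrete, here is a minimal sketch of an allow-list gate of the kind described above. It is not the guardian's code: `peerGate`, `allowInbound`, and the peer ID strings are hypothetical, and the only behavior taken from this thread is that inbound connections are accepted from allowed peers and guardians while nothing here filters outbound publishing.

```go
package main

import "fmt"

// peerGate holds the two sets of peer IDs that may connect inbound.
type peerGate struct {
	allowedPeers  map[string]struct{} // configured allowed peers
	guardianPeers map[string]struct{} // known guardian peers
}

// allowInbound reports whether an inbound connection from peerID passes.
// There is deliberately no outbound counterpart: in this sketch, publishing
// a response never consults the gate.
func (g peerGate) allowInbound(peerID string) bool {
	if _, found := g.allowedPeers[peerID]; found {
		return true
	}
	_, found := g.guardianPeers[peerID]
	return found
}

func main() {
	gate := peerGate{
		allowedPeers:  map[string]struct{}{"peer-proxy": {}},
		guardianPeers: map[string]struct{}{"peer-guardian-1": {}},
	}
	for _, id := range []string{"peer-proxy", "peer-guardian-1", "peer-unknown"} {
		fmt.Printf("%s inbound allowed: %v\n", id, gate.allowInbound(id))
	}
}
```

Under this model, a misconfigured allow list would show up as missing inbound requests, so seeing requests arrive is a reasonable sign that the list itself is not the problem.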

@corpocott
Author

corpocott commented Oct 24, 2024

Anything else come to mind that I should be checking? Is there any way to tell on the ccq proxy if our responses are being rejected for some reason? Here is a stripped-down example of our logs:

root received a query request
root forwarded query request to watcher
root.solana-finalized_watch CONCURRENT: processing query request
root.solana-finalized_watch received a sol_account query 
root.solana-finalized_watch CONCURRENT: finished processing query request 
root.solana-finalized_watch minimum context slot has not been reached, will retry shortly
root.solana-finalized_watch initiating fast retry 
root.solana-finalized_watch published query response to handler
root received final per chain query response, ready to publish
root forwarded query response to p2p 
root.p2p published signed query response

It appears as though we are publishing without any errors.
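
A quick way to confirm this across many requests is to check that each expected stage appears in order. The sketch below uses only the message texts from the log lines above; `stages` and `firstMissingStage` are hypothetical names, and treating these six messages as the complete local pipeline is an assumption. Note that passing this check only shows the response was handed to p2p locally, not that any peer received it.

```go
package main

import (
	"fmt"
	"strings"
)

// stages lists the root-level messages in the order they appear in the
// stripped-down log. Assumption: these cover the whole local pipeline.
var stages = []string{
	"received a query request",
	"forwarded query request to watcher",
	"published query response to handler",
	"received final per chain query response, ready to publish",
	"forwarded query response to p2p",
	"published signed query response",
}

// firstMissingStage returns the first stage that does not appear, in order,
// in the given log lines, or "" when every stage is present.
func firstMissingStage(lines []string) string {
	next := 0
	for _, stage := range stages {
		found := false
		// Start at the last matching line so two messages fused onto a
		// single line are still both recognized.
		for i := next; i < len(lines); i++ {
			if strings.Contains(lines[i], stage) {
				next = i
				found = true
				break
			}
		}
		if !found {
			return stage
		}
	}
	return ""
}

func main() {
	logs := []string{
		"root received a query request",
		"root forwarded query request to watcher",
		"root.solana-finalized_watch published query response to handler",
		"root received final per chain query response, ready to publish",
		"root forwarded query response to p2p",
		"root.p2p published signed query response",
	}
	if missing := firstMissingStage(logs); missing == "" {
		fmt.Println("all local stages present")
	} else {
		fmt.Println("first missing stage:", missing)
	}
}
```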


3 participants