[Bug] GethInstance dropping connections with the underlying node #2091

Open
gnapoli23 opened this issue Feb 21, 2025 · 5 comments · May be fixed by #2104
Labels
bug Something isn't working c-node-bindings Pertaining to the node-bindings implementations such as anvil, reth and geth P-normal Normal Priority

Comments

@gnapoli23
Contributor

gnapoli23 commented Feb 21, 2025

Component

node-bindings

What version of Alloy are you on?

v0.11.1

Operating System

Linux

Describe the bug

Description

I have a test suite that needs to set up two sets of wallets (each set has both EOAs and Smart Contract Accounts) after generating their related seeds.

The 1st set of wallets is created in a certain amount of time X, and if a seed belongs to a Smart Contract Account, the account is deployed through a GethInstance right after the seed is generated.

The 2nd set of wallets requires an amount of time Y > X, since the seed generation process needs more computation time, but the procedure then follows the same logic as the previous case (if SCA, deploy).

These sets of wallets are generated (and, where applicable, deployed) sequentially, starting with set 1 and then moving on to set 2.

While everything goes smoothly for set 1, for set 2 I always get the following error message from the underlying GethInstance as soon as I reach the point where I need to deploy an SCA:

 Unable to send raw request: server returned an error response: error code -32002: request timed out

This error comes from checking the status of the underlying GethInstance by sending a raw request to it for the net_listening RPC method. Basically, I create the connection to the GethInstance and I do a first check as mentioned, then I start generating the seeds (which requires ~20s) and then I do further checks with net_listening each time I need to deploy a SCA.

I ran the same test flow with both an AnvilInstance and a GethInstance: with Anvil it always runs to completion, while with Geth it doesn't.
Moreover, I have my own implementation of a geth command binding (let's call it GethD) that spins up the geth instance and performs the operations mentioned before, and with this one it doesn't fail at all either.

To me, it looks like the first connection to the GethInstance works fine during the init phase, but after waiting for the seed generation to finish the connection drops for some reason, and so that error is returned.

Details

For all the 3 cases mentioned (AnvilInstance, GethInstance and my own GethD binding) I have the following setup:

  • Dev mode
  • chain_id = 1337 (ofc)
  • block_time = 1s
@gnapoli23 gnapoli23 added the bug Something isn't working label Feb 21, 2025
@mattsse
Member

mattsse commented Feb 21, 2025

do you have any logs of the gethinstance?

it could be useful to launch the geth instance separately and take a look at the output

@gnapoli23
Contributor Author

gnapoli23 commented Feb 21, 2025

do you have any logs of the gethinstance?

it could be useful to launch the geth instance separately and take a look at the output

No, but I will find a way to log its output. geth logs to stderr, right? Is there any simple way to do so?
I was thinking of taking the ChildStderr out of it while the part in which it hangs is executing, and letting a separate task log whatever happens to a file.
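A minimal sketch of that idea, using only the Rust standard library: take the child's `ChildStderr` and let a background thread copy it line by line into a file. The `sh` command is just a stand-in for geth, and `log_stderr_to_file`, `run_and_log`, and the log path are hypothetical names chosen for illustration. With a `GethInstance`, the stream would presumably come from `geth.stderr()` instead of `child.stderr.take()`.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;
use std::process::{ChildStderr, Command, Stdio};
use std::thread::{self, JoinHandle};

/// Copies every line from `stderr` into `file` on a background thread.
/// The thread returns the number of lines written once the stream hits EOF.
fn log_stderr_to_file(stderr: ChildStderr, mut file: File) -> JoinHandle<io::Result<usize>> {
    thread::spawn(move || -> io::Result<usize> {
        let mut lines = 0;
        for line in BufReader::new(stderr).lines() {
            writeln!(file, "{}", line?)?;
            lines += 1;
        }
        Ok(lines)
    })
}

/// Runs `script` under `sh` (a stand-in for geth) and logs its stderr to `log_path`.
fn run_and_log(script: &str, log_path: &Path) -> io::Result<usize> {
    let file = File::create(log_path)?;
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(script)
        .stderr(Stdio::piped())
        .spawn()?;
    // Taking the handle moves ownership of the pipe's read end to the logger.
    let stderr = child.stderr.take().expect("stderr was piped");
    let logger = log_stderr_to_file(stderr, file);
    child.wait()?;
    logger.join().expect("logger thread panicked")
}

fn main() -> io::Result<()> {
    let n = run_and_log(
        "echo 'INFO one' >&2; echo 'INFO two' >&2",
        Path::new("child-stderr.log"),
    )?;
    println!("logged {n} lines");
    Ok(())
}
```

A plain thread is used rather than an async task because reading from `ChildStderr` is a blocking call.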

@gnapoli23
Contributor Author

gnapoli23 commented Feb 21, 2025

@mattsse

Ok, I went for this and pulled out the ChildStderr at the exact point where it was failing. Here's the file with geth's output for that section.

Now, surprise: if I pull out that stream (and so consume it), it doesn't fail; indeed, the logfile does not have a single ERROR entry. I ran it many times, and it always succeeded. As soon as I stop consuming that stream, the error comes back.

geth.txt
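One plausible explanation, which nothing in this thread confirms for geth specifically, is ordinary pipe back-pressure: a pipe has a bounded buffer (64 KiB by default on Linux), and a process writing to a full pipe blocks until someone reads from it. The sketch below shows that effect with a stand-in child rather than geth. It assumes a Unix-like system with `sh` and `head`; the function name `undrained_child_stalls` is hypothetical.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread;
use std::time::Duration;

/// Spawns a child that writes `bytes` zero bytes to stderr, leaves the pipe
/// unread for `grace`, and records whether the child was still running then.
/// Afterwards it drains the pipe and returns how many bytes were read.
fn undrained_child_stalls(bytes: usize, grace: Duration) -> io::Result<(bool, usize)> {
    let script = format!("head -c {bytes} /dev/zero >&2");
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(&script)
        .stderr(Stdio::piped())
        .spawn()?;
    let mut stderr = child.stderr.take().expect("stderr was piped");

    // Nobody reads the pipe during this window.
    thread::sleep(grace);
    let still_running = child.try_wait()?.is_none();

    // Draining the pipe unblocks the writer, so the child can finish.
    let mut sink = Vec::new();
    let drained = stderr.read_to_end(&mut sink)?;
    child.wait()?;
    Ok((still_running, drained))
}

fn main() -> io::Result<()> {
    let grace = Duration::from_millis(500);
    let (stalled, drained) = undrained_child_stalls(1 << 20, grace)?;
    println!("1 MiB on stderr: still running before drain = {stalled}, drained {drained} bytes");
    let (stalled, drained) = undrained_child_stalls(16, grace)?;
    println!("16 B on stderr: still running before drain = {stalled}, drained {drained} bytes");
    Ok(())
}
```

If the same mechanism applies here, a node that logs steadily to an unread stderr would stall once the buffer fills, which would fit both the delay before the first failure and the fact that consuming the stream makes the error disappear.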

@gnapoli23
Contributor Author

@mattsse

Ok, I was able to reproduce it.

The 1st test (test_gethinstance) fails: at a certain point it panics with

thread 'tests::test_gethinstance' panicked at src/main.rs:46:18:
called `Result::unwrap()` on an `Err` value: ErrorResp(ErrorPayload { code: -32002, message: "request timed out", data: None })

The 2nd test (test_gethinstance_stderr), in which I consume the ChildStderr, works fine.

Here is the snippet:

#[cfg(test)]
mod tests {
    use tracing::info;

    use std::{
        io::{BufRead, BufReader},
        time::Duration,
    };

    use alloy::{
        node_bindings::Geth,
        providers::{Provider, ProviderBuilder},
    };
    use tracing_test::traced_test;

    #[tokio::test]
    #[traced_test]
    async fn test_gethinstance() {
        // Create GethInstance
        info!("Create geth instance");
        let geth = Geth::new().dev().block_time(1).spawn();

        // Get provider
        let provider = ProviderBuilder::new().on_http(geth.endpoint_url());

        info!("Check connection with `net_listening`");
        let resp = provider
            .raw_request::<_, bool>("net_listening".into(), ())
            .await
            .unwrap();
        assert!(resp);

        info!("Simulate seed generation");
        tokio::time::sleep(Duration::from_secs(20)).await;

        // Interact with node by checking each time `net_listening`
        for i in 0..=999 {
            info!("Attempt #{i}");
            let resp = provider
                .raw_request::<_, bool>("net_listening".into(), ())
                .await
                .unwrap();
            assert!(resp);
        }
    }

    #[tokio::test]
    #[traced_test]
    async fn test_gethinstance_stderr() {
        // Create GethInstance
        info!("Create geth instance");
        let mut geth = Geth::new().dev().block_time(1).spawn();

        // Get provider
        let provider = ProviderBuilder::new().on_http(geth.endpoint_url());

        info!("Check connection with `net_listening`");
        let resp = provider
            .raw_request::<_, bool>("net_listening".into(), ())
            .await
            .unwrap();
        assert!(resp);

        info!("Simulate seed generation");
        tokio::time::sleep(Duration::from_secs(20)).await;

        info!("Pulling out stderr from GethInstance");
        let mut stderr = geth.stderr().unwrap();
        let _ = std::thread::spawn(move || {
            let mut buf = String::new();
            let mut reader = BufReader::new(&mut stderr);

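            loop {
                buf.clear();
                // Keep draining geth's stderr; `read_line` returns 0 at EOF,
                // so stop there instead of spinning forever.
                if reader.read_line(&mut buf).unwrap() == 0 {
                    break;
                }
            }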
        });

        // Interact with node by checking each time `net_listening`
        for i in 0..=999 {
            info!("Attempt #{i}");
            let resp = provider
                .raw_request::<_, bool>("net_listening".into(), ())
                .await
                .unwrap();
            assert!(resp);
        }
    }
}

@mattsse mattsse linked a pull request Feb 22, 2025 that will close this issue
@mattsse
Copy link
Member

mattsse commented Feb 22, 2025

ah I think this is the same problem as #1985

@yash-atreya yash-atreya added P-normal Normal Priority c-node-bindings Pertaining to the node-bindings implementations such as anvil, reth and geth labels Feb 28, 2025