feat: add initial healthcheck endpoint (ampd) #14

Closed
wants to merge 16 commits into from

@eloylp (Member) commented Apr 3, 2024

Description

At Eiger we are working on the Axelar Solana implementation. During this work, we realised that ampd has no defined way to check the liveness of the process from an external perspective. This is especially useful when deploying workloads in, e.g., Kubernetes, as the cluster can query the process to check whether it is alive.

This PR proposes adding an HTTP /status endpoint to the ampd process, which external systems can use to determine whether the process is alive.

Closes #12

Todos

  • Unit tests
  • Manual tests (see the "Manual testing" comment in this conversation)
  • Documentation
  • Connect epics/issues

Steps to Test

Manual

  1. Start the ampd server.
  2. Query the endpoint: $ curl localhost:3000/status

Unit tests

$ cd ampd && cargo test

Expected Behaviour

Once ampd is running, an HTTP request to the /status endpoint should return a 200 OK response in JSON format:

{
 "ok": true
}

Other Notes

This is just a dummy HTTP health check endpoint that only provides a liveness probe for the process. It doesn't query any subsystem to check its internal health. That option could be revisited in the future.
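
As a minimal sketch of what such a liveness-only endpoint amounts to, the following uses only the Rust standard library rather than whatever HTTP stack the PR itself relies on. The names `response_for` and `serve` are hypothetical, and the empty 404 body is an assumption made for illustration; only the `/status` path and the `{"ok":true}` body come from this PR.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpListener;

/// Builds a complete HTTP/1.1 response for a request line.
/// Only `GET /status` gets a 200; anything else gets an empty 404 (assumption).
fn response_for(request_line: &str) -> String {
    let mut parts = request_line.split_whitespace();
    let (status, body) = match (parts.next(), parts.next()) {
        (Some("GET"), Some("/status")) => ("200 OK", "{\"ok\":true}"),
        _ => ("404 Not Found", ""),
    };
    format!(
        "HTTP/1.1 {}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        status,
        body.len(),
        body
    )
}

/// Answers connections one at a time. No subsystem is consulted:
/// if the process can accept and reply, it is considered alive.
/// Any I/O error ends the loop and is returned to the caller.
fn serve(listener: TcpListener) -> std::io::Result<()> {
    for stream in listener.incoming() {
        let mut stream = stream?;
        let mut request_line = String::new();
        {
            let mut reader = BufReader::new(&stream);
            reader.read_line(&mut request_line)?;
            // Drain the remaining headers so the client is not reset on close.
            let mut header = String::new();
            loop {
                header.clear();
                let n = reader.read_line(&mut header)?;
                if n == 0 || header == "\r\n" || header == "\n" {
                    break;
                }
            }
        }
        stream.write_all(response_for(&request_line).as_bytes())?;
    }
    Ok(())
}
```

Calling `serve` blocks the calling thread, so a daemon would presumably run it on a dedicated thread or task next to its main work.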

@eloylp eloylp changed the base branch from starknet to main April 3, 2024 04:43
@eloylp eloylp changed the title from "Add initial healthcheck endpoint (ampd)" to "feat: Add initial healthcheck endpoint (ampd)" Apr 3, 2024
@eloylp eloylp force-pushed the implement-basic-healthcheck branch from 6e76637 to 14a53b4 Compare April 3, 2024 21:41
.change_context(HealthCheckError::Error(format!(
"Failed binding to addr: {}",
bind_addr
)))?,

Normally I do not like to use new() for things like binding sockets. For now, though, I consider this approach acceptable, taking into account how the rest of the codebase interacts with this server.
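
To illustrate the trade-off mentioned in this comment, here is a hedged, std-only sketch of a constructor that binds its socket. The `Server` and `BindError` types are hypothetical and are not the types used in this PR (the diff above uses `HealthCheckError` with `change_context`). The point is only that binding in `new()` makes construction fallible, so bind failures surface at the call site rather than later at run time.

```rust
use std::fmt;
use std::net::{SocketAddr, TcpListener};

/// Hypothetical error type carrying the address that could not be bound.
#[derive(Debug)]
struct BindError {
    addr: SocketAddr,
    source: std::io::Error,
}

impl fmt::Display for BindError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Failed binding to addr: {}: {}", self.addr, self.source)
    }
}

/// Hypothetical server that owns an already-bound listener.
struct Server {
    listener: TcpListener,
}

impl Server {
    /// Binds eagerly: a returned `Server` is guaranteed to hold a live socket,
    /// at the cost of `new` performing I/O and being fallible.
    fn new(bind_addr: SocketAddr) -> Result<Self, BindError> {
        let listener = TcpListener::bind(bind_addr).map_err(|source| BindError {
            addr: bind_addr,
            source,
        })?;
        Ok(Self { listener })
    }

    /// Reports the address actually bound (useful when port 0 was requested).
    fn local_addr(&self) -> std::io::Result<SocketAddr> {
        self.listener.local_addr()
    }
}
```

The alternative design would keep `new()` infallible and move the bind into a separate `run` or `bind` step, which is the pattern the comment says it normally prefers.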

@eloylp eloylp changed the title from "feat: Add initial healthcheck endpoint (ampd)" to "feat: add initial healthcheck endpoint (ampd)" Apr 3, 2024
@eloylp eloylp marked this pull request as ready for review April 3, 2024 21:48
@eloylp eloylp closed this Apr 3, 2024
@eloylp eloylp reopened this Apr 3, 2024
@eloylp eloylp linked an issue Apr 3, 2024 that may be closed by this pull request
@eloylp eloylp force-pushed the implement-basic-healthcheck branch from 14a53b4 to c1c2795 Compare April 3, 2024 23:08
@eloylp eloylp self-assigned this Apr 3, 2024
@eloylp eloylp added the enhancement New feature or request label Apr 3, 2024
ampd/src/health_check.rs (outdated, resolved)
@eloylp (Member, Author) commented Apr 10, 2024

Manual testing

To gain some confidence that the healthcheck endpoint does not disturb other components, the ampd daemon was tested as follows:

  1. Bring up tofnd locally (a prerequisite). We used the docker-compose setup, sharing the port with the host.

  2. Configure the ~/.ampd/config.toml file with the following content:

    tm_jsonrpc = "https://axelar-testnet-rpc.qubelabs.io:443"
    tm_grpc = "http://axelar-testnet-grpc.qubelabs.io:9090"
    event_buffer_capacity = 1000
    health_check_bind_addr = "0.0.0.0:3000"
    
    [service_registry]
    cosmwasm_contract="axelar1hrv2zf8xsfsey2umewng68c8pwsyktq4sm8u6k5sch4gqkn0qxrsg28xnx"
    
    [broadcast]
    batch_gas_limit="10000000"
    broadcast_interval="1s"
    chain_id="devnet-amplifier"
    gas_adjustment="2"
    gas_price="0.00005uamplifier"
    queue_cap="1000"
    tx_fetch_interval="600ms"
    tx_fetch_max_retries="10"
    
    [tofnd_config]
    key_uid="axelar"
    party_uid="ampd"
    url="http://127.0.0.1:50051"
    
    [[handlers]]
    type = 'MultisigSigner'
    cosmwasm_contract = 'axelar1lz2hevr93dwa3l86n7hnyy36age39jsk0e8nsj804xr3n8hh958q79m2sq'
  3. Execute cd ampd && cargo run:

    ❯ cargo run
        Finished dev [unoptimized + debuginfo] target(s) in 1.19s
         Running `/home/eloylp/projects/eiger/axelar-amplifier/target/debug/ampd`
    2024-04-10T13:15:35.305195Z  INFO ampd: found config file /home/eloylp/.ampd/config.toml
    2024-04-10T13:15:35.308031Z  INFO ampd: starting daemon args=Args { config: ["~/.ampd/config.toml", "config.toml"], state: "~/.ampd/state.json", output: Output::Text, cmd: () }
    2024-04-10T13:15:35.308152Z  INFO ampd::state: loading state from disk
    2024-04-10T13:15:37.850389Z  INFO ampd::health_check: Starting health check server at: 0.0.0.0:3000
    2024-04-10T13:15:40.417094Z  INFO ampd::state: state updated handler="multisig-signer" height=12887365

    The log line Starting health check server at: 0.0.0.0:3000 indicates that the healthcheck endpoint should be reachable at this point.

  4. Query the health endpoint:

    ❯ curl localhost:3000/status
    {"ok":true}
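
The curl check in step 4 could also be automated. The following is a rough sketch of an external probe using only the Rust standard library; the function name `is_alive` and the 2-second timeout are assumptions, while the `/status` path and the expected `{"ok":true}` body are taken from the output above. It assumes the server honours `Connection: close` and sends the body unchunked.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Sends `GET /status` to `addr` (for example "localhost:3000") and reports
/// whether the reply is a 200 with the body `{"ok":true}`.
fn is_alive(addr: &str) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect(addr)?;
    // Assumed timeout so a hung process does not block the probe forever.
    stream.set_read_timeout(Some(Duration::from_secs(2)))?;
    stream.write_all(b"GET /status HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")?;

    // Read until the server closes the connection.
    let mut response = String::new();
    stream.read_to_string(&mut response)?;

    // The status code is the second token of the first line.
    let status_ok = response
        .lines()
        .next()
        .and_then(|line| line.split_whitespace().nth(1))
        == Some("200");
    // The body follows the first blank line.
    let body = response.split("\r\n\r\n").nth(1).unwrap_or("");
    Ok(status_ok && body.trim() == "{\"ok\":true}")
}
```

A connection failure is returned as an `Err`, which a caller would normally treat the same as "not alive".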

@eloylp eloylp force-pushed the implement-basic-healthcheck branch from be8095a to 198838a Compare April 11, 2024 13:57
…ld (axelarnetwork#336)

* feat(minor-multisig-prover)!: allow dynamic update of signing threshold
@eloylp eloylp force-pushed the implement-basic-healthcheck branch from 0933b08 to b683961 Compare April 11, 2024 18:02
Use enum variant and make all errors transparent to application.
@eloylp eloylp force-pushed the implement-basic-healthcheck branch from d8f8e64 to a541b94 Compare April 15, 2024 13:48
@eloylp (Member, Author) commented Apr 15, 2024

The work in this PR is being upstreamed at axelarnetwork#344. We are waiting for feedback there, so I am closing this one.

@eloylp eloylp closed this Apr 15, 2024
Development

Successfully merging this pull request may close these issues.

Create an HTTP /status endpoint for healthcheck probing (ampd)