Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State Sync Health Check #1892

Closed
wants to merge 21 commits into from
Closed

Conversation

Elvis339
Copy link

@Elvis339 Elvis339 commented Jan 24, 2025

Closes: #1878

This patch set introduces a new health check mechanism for the Snow VM, which monitors VM readiness and tracks vacuously verified blocks.

@Elvis339 Elvis339 marked this pull request as ready for review January 27, 2025 17:44
snow/block.go Outdated Show resolved Hide resolved
snow/vm.go Outdated Show resolved Hide resolved
snow/vm.go Outdated
Comment on lines 444 to 451
stateSync, err := v.stateSyncHealthChecker().HealthCheck(ctx)
if err != nil {
return nil, fmt.Errorf("state sync health check failed: %w", err)
}

return map[string]interface{}{
"stateSync": stateSync,
}, nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will ignore any health checks registered through AddHealthCheck.

Could we maintain a map of named health checkers, include all of the results when HealthCheck is called, and return errors.Join(...) of any errors that occur?

snow/vm.go Outdated
Comment on lines 514 to 516
func (v *VM[I, O, A]) AddHealthCheck(name string, healthChecker health.Checker) {
v.healthCheckers[name] = healthChecker
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a lock around healthCheckers. HealthCheck may be called async by AvalancheGo, so we need to support concurrent R/W.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return an error if there's a name collision?

snow/vm.go Outdated
Comment on lines 534 to 542
func (v *VM[I, O, A]) stateSyncHealthChecker() *stateSyncHealthChecker[I, O, A] {
if checker, ok := v.healthCheckers[StateSyncHealthCheckerNamespace].(*stateSyncHealthChecker[I, O, A]); ok {
return checker
}
newChecker := newStateSyncHealthChecker(v)
v.healthCheckers[StateSyncHealthCheckerNamespace] = newChecker

return newChecker
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this function, or could we just register this check when we transition from dynamic state sync to normal operation and not provide a helper to access it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed, removed. Thanks!

"sync"
)

const StateSyncHealthCheckerNamespace = "stateSyncHealthChecker"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we update the string to snowStateSync ?

// accepted during state sync will eventually be rejected in favor of valid blocks.
type stateSyncHealthChecker[I, O, A Block] struct {
vm *VM[I, O, A]
unresolvedBlocks map[ids.ID]struct{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use set.Set[ids.ID] here? This is just a convenience that we commonly use, so we can keep the code consistent.

return s
}

func (s *stateSyncHealthChecker[I, O, A]) HealthCheck(_ context.Context) (interface{}, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: health check will be called async periodically from AvalancheGo, so this may be racy

snow/health_check.go Outdated Show resolved Hide resolved
snow/health_check.go Outdated Show resolved Hide resolved
@Elvis339 Elvis339 closed this Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add State Sync Transition Health Check to Snow VM
2 participants