Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(parachain): send requests from subsystems #4475

Open
wants to merge 11 commits into
base: feat/parachain
Choose a base branch
from

Conversation

haikoschol
Copy link
Contributor

Changes

This PR adds support for sending requests of request/response network protocols via overseer/network bridge from parachain subsystems.

Tests

go test -run TestSendRequests ./dot/parachain/network-bridge

Issues

#4448

@haikoschol haikoschol self-assigned this Jan 16, 2025
@haikoschol
Copy link
Contributor Author

As we agreed on a recent call, this approach is using SendRequests messages passed through overseer to the network bridge. Although the plan was to propose types/structure before implementing, I had to code it to figure whether it would actually work. One issue I ran into repeatedly was cyclic imports. To get unstuck, I moved a few things into dot/parachain/network-bridge/messages that don't really belong there. That's a minor thing that can be changed later of course.

I've used the following existing types from the network package for the implementation:

  • Message for requests
  • ResponseMessage for responses
  • the RequestMaker interface
  • RequestResponseProtocol returned by network.Service.GetRequestResponseProtocol

I've added the following types:

// ReqProtocolMessage is a network message that can be sent over a request response protocol.
type ReqProtocolMessage interface {
	network.Message
	// Response returns an instance of the response type for this message, for the purpose of decoding into it.
	Response() network.ResponseMessage
	Protocol() ReqProtocolName
}

// ReqRespResult is the result of sending a request over a request response protocol. It contains either a response
// message or an error.
type ReqRespResult struct {
	Response network.ResponseMessage
	Error    error
}

// OutgoingRequest contains all data required to send a request over a request response protocol and receive the result.
type OutgoingRequest struct {
	Recipient peer.ID // TODO use a type that can contain either a peer ID or an authority ID
	Payload   ReqProtocolMessage
	Result    chan ReqRespResult
}

type IfDisconnectedBehavior int

const (
	TryConnect     IfDisconnectedBehavior = iota
	ImmediateError                        // TODO not implemented
)

// SendRequests is a subsystem message for sending requests over a request response protocol.
type SendRequests struct {
	Requests       []*OutgoingRequest
	IfDisconnected IfDisconnectedBehavior
}

Usage from a subsystem looks roughly as follows:

request := networkbridgemessages.NewOutgoingRequest(
	"recipient",
	networkbridgemessages.ChunkFetchingRequest{
		CandidateHash: parachaintypes.CandidateHash{Value: common.Hash{1}},
		Index:         42,
	})

sendRequests := networkbridgemessages.SendRequests{
	Requests:       []*networkbridgemessages.OutgoingRequest{request},
	IfDisconnected: networkbridgemessages.TryConnect,
}

ss.subsystemToOverseer <- sendRequests

result := <- request.Result

There are a few downsides/issues to discuss in this approach:

  1. Concurrency

The goroutine in NetworkBridgeSender that handles overseer messages currently handles requests sequentially and blocks on each request until error/timeout/response. It's easy enough to start a new goroutine for each request, but unclear how cancellation/shutdown can be handled properly. RequestResponseProtocol itself does not have a mechanism for aborting requests.

  1. Parameterization

RequestResponseProtocol requires setting a value for timeouts and the maximum response size. Right now, this is set to hard-coded values in NetworkBridgeSender. Ideally, subsystems would set these values according to their requirements. This could be accomplished by adding these parameters to the OutgoingRequest struct

  1. Efficiency

With the current implementation, an instance of RequestResponseProtocol is created for each request. This causes a large buffer to be allocated every time, which was supposed to be avoided by keeping this buffer in the objects state.

@haikoschol haikoschol marked this pull request as ready for review January 23, 2025 09:03
Copy link
Member

@EclesioMeloJunior EclesioMeloJunior left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a small observation on context creation

Copy link
Member

@EclesioMeloJunior EclesioMeloJunior left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just few comments

dot/parachain/network-bridge/sender.go Outdated Show resolved Hide resolved

// PoV is probably the largest message and is currently set at 5MB, but will likely be increased to 10MB in the future.
// see: https://github.com/paritytech/polkadot-sdk/issues/5334
// Maybe message types should have a MaxSize() method instead of using the same value for all messages.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I would say you can attach to the protocol string the max response size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question then becomes what the values for requests other than PoVFetchingV1 should be. Would you be ok with deferring this change to a later PR, given that using smaller values here amounts to a potential performance optimization?


func requireClosed(t *testing.T, ch chan networkbridgemessages.ReqRespResult) {
select {
case <-ch:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to add an extra layer, when the channel is closed ok is false. Actually, you don't need the select when you retrieve a tuple from the channel (even if the channel is not closed)

Suggested change
case <-ch:
_, ok := <-ch
require.False(t, ok)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the channel is empty but not closed, the receive will block though. The ok merely indicates whether the received value is the zero value for the type of the channel, i.e. whether the channel was empty.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants