Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc42: add credit based flow control for channels #427

Merged
merged 2 commits into from
Oct 25, 2024

Conversation

garlick
Copy link
Member

@garlick garlick commented Sep 24, 2024

Problem: rfc42 does not specify anything about channel flow control, so it is difficult to design clients and servers that operate without dropping data.

Add credit based flow control to the protocol.

The client receives credit for writes to the subprocess via the exec add-credit response.

The server receives credit for output to the client via the add-credit RPC.

@chu11
Copy link
Member

chu11 commented Oct 11, 2024

just so i don't forget later, gotta add a "forwarding flag" for credits here

  .. object:: flags                                                                                                                         
                                                                                                                                            
    (*integer*, REQUIRED) A bitfield comprised of zero or more flags:                                                                       
                                                                                                                                            
    stdout (1)                                                                                                                              
      Forward standard output to the client.                                                                                                
                                                                                                                                            
    stderr (2)                                                                                                                              
      Forward standard error to the client.                                                                                                 
                                                                                                                                            
    channel (4)                                                                                                                             
      Forward auxiliary channel output to the client.                                                                                       

@chu11
Copy link
Member

chu11 commented Oct 11, 2024

in my prototyping I've determined there's two "styles" of using credits

  • the user doesn't want to write anything to stdin/channel until it receives its initial credits. Will likely write to stdin via the "credit callback".
  • The other style is that the user wants to write data right away without waiting receiving credits. This makes sense b/c initial credits are known ahead of time. So we do not want to send the subprocess to send initial credits.

This will necessitate sending a flag to the remote subprocess to tell it to send/not-send the initial credits.

Any opinions on how to add this to the exec request.

A) new flag that specifies subprocess flags (could also add NO_SETPGRP, FORK_EXEC, etc. flags in there)
B) add it into to the current flags we have here.

My anal retentiveness likes to keep this stuff separate and wants to do 'A'. But that is more bytes in the payload.

@garlick
Copy link
Member Author

garlick commented Oct 11, 2024

It seems like the server should always send credits to the client and the client should wait until it gets that response to send any data. What I was on about earlier was that the server should be able to send that response immediately rather than wait for the local reactor loop to tell it the initial value, since the initial value should already be known to the server.

It would be worrisome to have the client make any assumptions about the size of the server's buffer IMHO.

@chu11
Copy link
Member

chu11 commented Oct 11, 2024

What I was on about earlier was that the server should be able to send that response immediately rather than wait for the local reactor loop to tell it the initial value, since the initial value should already be known to the server.

So for a remote subprocess I did have to add a flag for the server to tell its local subprocess not to send the "initial credits", since the server can handle that themselves.

It would be worrisome to have the client make any assumptions about the size of the server's buffer IMHO.

We could certainly leave this off until a time it becomes necessary. Atleast w/ my flux-exec prototype, this style was not needed.

@garlick
Copy link
Member Author

garlick commented Oct 11, 2024

Maybe we should establish a minimum buffer size of 4K or so and say that if a subprocess client wants to ignore credits (not register the callback) they can behave as before, but if they send more than 4K of data, it will not be reliable?

I hate to make every client register a credit callback.

@chu11
Copy link
Member

chu11 commented Oct 11, 2024

I hate to make every client register a credit callback.

They way I initially implemented things, a client does require credits to call flux_subprocess_write(), it's basically used as "information" for the client. They can choose to ignore it if they want and the API is just like it was before. If they don't specify the credit callback, everything stays the same in the API as it was before.

@chu11
Copy link
Member

chu11 commented Oct 15, 2024

the rune to allow me to push to this branch probably isn't setup, but this is so tiny, perhaps just easiest to manually add this to PR?

diff --git a/spec_42.rst b/spec_42.rst
index ed4d649..37412ab 100644
--- a/spec_42.rst
+++ b/spec_42.rst
@@ -123,6 +123,9 @@ are defined as follows:
     channel (4)
       Forward auxiliary channel output to the client.
 
+    credit (8)
+      Forward channel credits to the client.
+
 Several response types are distinguished by the type key:
 
 .. object:: exec started response

@garlick
Copy link
Member Author

garlick commented Oct 15, 2024

Oh so that's how you got the sdexec tests to pass :-)

This seems reasonable though there may need to be some text added to say what happens with and without this flag. I'll play with it.

@garlick
Copy link
Member Author

garlick commented Oct 15, 2024

Playing devils advocate: is there a reason not to keep it simple and always send the add-credit responses, and let the client ignore them if it doesn't care rather than requiring it to declare its intentions with a flag?

@garlick
Copy link
Member Author

garlick commented Oct 15, 2024

OK, I added the flag renaming it to write-credit (anticipating a flag for reads).

I also got rid of the read direction language since we are not there yet.

Finally, I reworded some of the text concerning credits in the write description, indicating that credit flow control is optional.

@chu11
Copy link
Member

chu11 commented Oct 15, 2024

Playing devils advocate: is there a reason not to keep it simple and always send the add-credit responses, and let the client ignore them if it doesn't care rather than requiring it to declare its intentions with a flag?

It's a lot of processing and extra traffic that serves no purpose. I could argue why not do that with the output too? If the caller doesn't want the stdout just drop it?

chu11
chu11 previously approved these changes Oct 15, 2024
Copy link
Member

@chu11 chu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this LGTM given flux-framework/flux-core#6353 work. Not approving at the moment, we probably want to hold off merging until we review that PR a little bit more.

@chu11 chu11 dismissed their stale review October 15, 2024 21:25

oops, meant just a comment

@garlick
Copy link
Member Author

garlick commented Oct 15, 2024

It's a lot of processing and extra traffic that serves no purpose. I could argue why not do that with the output too? If the caller doesn't want the stdout just drop it?

This allows us to reject an sdexec request with this flag for now, which is appropriate while it's unsupported. That seems like a good enough reason to me for :-) If that weren't the case I'd probably argue that you ignore these messages at your peril, unless you're not using stdin or you know you are writing less than the buffer size, in which case it won't be very many messages wasted.

Problem: rfc42 does not specify anything about channel flow control,
so it is difficult to design clients and servers that operate without
dropping data.

Add credit based flow control to the protocol in the write direction.

The client receives credit for writes to the subprocess via the
exec add-credit response.
Problem: clients cannot send data to a remote subprocess before
receiving the first exec add-credit response unless they know how
big the remote buffer is.

State that the minimum default buffer size is 4096 bytes.
The default may be greater, but an eager client that doesn't know the
remote buffer size should know it can safely send that much data
initially without receiving credit.
@garlick
Copy link
Member Author

garlick commented Oct 23, 2024

Just pushed the following changes

  • moved input flow control description to a subsection under the write RPC
  • described borrowing credit before the initial add-credit response is received
  • add JSON example add-credit response
  • stated a minimum default input buffer size (4K), so at least that amount can be borrowed if the channel buffer size is not requested to be something different. The actual buffer size can still be less than that (useful in test) on request.

Copy link
Member

@chu11 chu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only thing is do we want to say buffer size shall be >= 4096 in the general sense (i.e. input & output). Or we can deal with that tweak when we begin to implement the stdout/output side of this.

@garlick
Copy link
Member Author

garlick commented Oct 25, 2024

Thanks! I vote we cross that bridge when we come to it. Setting MWP.

@mergify mergify bot merged commit 1602d14 into flux-framework:master Oct 25, 2024
7 checks passed
@garlick garlick deleted the rfc42_flowcontrol branch October 30, 2024 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants