
Proposal: Pushing blob chunks could have a max chunk size hint #485

Open

gabivlj opened this issue Oct 30, 2023 · 28 comments

@gabivlj commented Oct 30, 2023

Description

Hi! We've encountered a use case we'd love to discuss here. We're proposing a new header that could be useful for registries behind proxies. Those proxies usually buffer requests, so large layers degrade the server's performance and memory usage. It would be nice if the registry could hint the preferred maximum chunk size to the client, either on upload creation (POST /v2/:name/blobs/uploads/) or in some other way.

For example, I've seen places that document the header OCI-Chunk-Min-Length, whose purpose is to hint a minimum chunk size to clients. It would be amazing to see its counterpart, OCI-Chunk-Max-Length.
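For illustration, here is a minimal Go sketch of a client creating an upload session and reading both hints. The max header is only what this proposal suggests (not part of any current spec), and the registry/repository names are made up:

    package main

    import (
        "fmt"
        "net/http"
        "strconv"
    )

    // startUpload opens a blob upload session and returns the upload URL plus
    // the optional chunk-size hints (0 means "no hint advertised").
    // OCI-Chunk-Max-Length is the header proposed in this issue.
    func startUpload(registry, name string) (location string, minChunk, maxChunk int64, err error) {
        resp, err := http.Post(registry+"/v2/"+name+"/blobs/uploads/", "application/octet-stream", nil)
        if err != nil {
            return "", 0, 0, err
        }
        defer resp.Body.Close()

        location = resp.Header.Get("Location")
        minChunk, _ = strconv.ParseInt(resp.Header.Get("OCI-Chunk-Min-Length"), 10, 64)
        maxChunk, _ = strconv.ParseInt(resp.Header.Get("OCI-Chunk-Max-Length"), 10, 64)
        return location, minChunk, maxChunk, nil
    }

    func main() {
        loc, minHint, maxHint, err := startUpload("https://registry.example.com", "myorg/myimage")
        if err != nil {
            panic(err)
        }
        fmt.Println(loc, minHint, maxHint)
    }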

Is this something worth adding to https://distribution.github.io/distribution/spec/api/?

I feel like this would be incredibly useful for registries that have limited resources.

@sudo-bmitch (Contributor)

Does this affect Docker's streaming method for a blob push (not defined by OCI yet, but since it's used by Docker it's well supported by registries), non-chunked pushes (used by a lot of other tools), or the final chunk size?

@gabivlj (Author) commented Oct 31, 2023

This mainly affects the POST and PATCH methods on chunked uploads. It would work the same way as OCI-Chunk-Min-Length, but would define a maximum chunk size to send in each PATCH so the server isn't overwhelmed.

I sent a PR to the distribution repository showing what this change would look like.
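As a rough sketch of the client side (hypothetical helper, not the actual code from that PR), a client that receives the hint could split the blob into PATCH-sized pieces, finishing with the usual PUT:

    package blobpush

    import (
        "bytes"
        "fmt"
        "net/http"
    )

    // uploadChunked PATCHes a blob in pieces no larger than maxChunk bytes.
    // Minimal sketch: a real client also needs auth, retries, and handling
    // for the 416 range errors defined by the spec.
    func uploadChunked(client *http.Client, uploadURL string, blob []byte, maxChunk int64) (string, error) {
        for offset := int64(0); offset < int64(len(blob)); {
            end := offset + maxChunk
            if end > int64(len(blob)) {
                end = int64(len(blob))
            }

            // http.NewRequest derives Content-Length from the bytes.Reader.
            req, err := http.NewRequest(http.MethodPatch, uploadURL, bytes.NewReader(blob[offset:end]))
            if err != nil {
                return "", err
            }
            req.Header.Set("Content-Type", "application/octet-stream")
            req.Header.Set("Content-Range", fmt.Sprintf("%d-%d", offset, end-1))

            resp, err := client.Do(req)
            if err != nil {
                return "", err
            }
            resp.Body.Close()
            if resp.StatusCode != http.StatusAccepted {
                return "", fmt.Errorf("chunk upload failed: %s", resp.Status)
            }
            // The registry returns the URL for the next request in Location.
            if loc := resp.Header.Get("Location"); loc != "" {
                uploadURL = loc
            }
            offset = end
        }
        return uploadURL, nil // the caller completes the upload with PUT ?digest=...
    }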

@sudo-bmitch (Contributor)

Docker's streaming API uses a patch request without a content-length value. It's effectively the same as a post/put, but the digest is not known in advance, so they push the full blob with a single patch.

How do proxies deal with very large blobs uploaded with a post/put?

@gabivlj (Author) commented Oct 31, 2023

> Docker's streaming API uses a patch request without a content-length value. It's effectively the same as a post/put, but the digest is not known in advance, so they push the full blob with a single patch.
>
> How do proxies deal with very large blobs uploaded with a post/put?

Proxies, or platforms like Workers, buffer the request body. So it would be nice to tell the client (like Docker) to limit the request body to a certain size, as the server may have a limit on how many bytes it can buffer.

@sudo-bmitch (Contributor)

> Proxies, or platforms like Workers, buffer the request body. So it would be nice to tell the client (like Docker) to limit the request body to a certain size, as the server may have a limit on how many bytes it can buffer.

I think that makes my point. The proposed change fixes one of 4 blob upload methods, only 3 of which are defined by OCI, and it's the least used method due to poor support. So I'm unsure how this will solve the underlying issue for proxies.

@gabivlj (Author) commented Oct 31, 2023

> > Proxies, or platforms like Workers, buffer the request body. So it would be nice to tell the client (like Docker) to limit the request body to a certain size, as the server may have a limit on how many bytes it can buffer.
>
> I think that makes my point. The proposed change fixes one of 4 blob upload methods, only 3 of which are defined by OCI, and it's the least used method due to poor support. So I'm unsure how this will solve the underlying issue for proxies.

Please correct me if my understanding here is incorrect, but what I'm proposing is that we do this for all workflows that involve POST, then PATCH, then PUT uploads. I think that covers both chunked uploads and Docker's streaming upload, since we would also implement support in Docker for respecting the limit.

Monolithic uploads and POST-then-PUT uploads are the only ones we are not trying to support here.

It would benefit services that have this limitation, as there would at least be a way to upload layers that exceed the limit.

@sudo-bmitch (Contributor)

I believe ordered by popularity, there's:

  1. POST/PUT
  2. Docker's streaming PATCH
  3. Chunked POST/PATCH/PUT
  4. Single POST

The only method covered by this PR would be 3. OCI hasn't standardized 2, so I don't see how this would apply to that. And if we did define the streaming PATCH, I believe it would have to be excluded from this by its definition.

And 1, being one of the more popular methods, would continue to be an issue for proxies. Given the number of issues I had with registries when I implemented 3, I suspect it's a very small percentage of blob uploads (I'd guess less than 1%). I never tried to implement 4, but I'm guessing a lot of registries using an external blob store would have support issues with that.

Given that close to 100% of blob uploads wouldn't be affected by this change, I'm worried the change to the spec won't have the desired result for your use case.

@gabivlj (Author) commented Oct 31, 2023

> I believe ordered by popularity, there's:
>
>   1. POST/PUT
>   2. Docker's streaming PATCH
>   3. Chunked POST/PATCH/PUT
>   4. Single POST
>
> The only method covered by this PR would be 3. OCI hasn't standardized 2, so I don't see how this would apply to that. And if we did define the streaming PATCH, I believe it would have to be excluded from this by its definition.
>
> And 1, being one of the more popular methods, would continue to be an issue for proxies. Given the number of issues I had with registries when I implemented 3, I suspect it's a very small percentage of blob uploads (I'd guess less than 1%). I never tried to implement 4, but I'm guessing a lot of registries using an external blob store would have support issues with that.
>
> Given that close to 100% of blob uploads wouldn't be affected by this change, I'm worried the change to the spec won't have the desired result for your use case.

Interesting. I think this would still benefit us, as it would give us justification for a Docker PR supporting this functionality for the 2nd method (see the PR mentioned above). We are also fine with supporting only the 3rd method outside of Docker. We are not targeting the 1st and 4th at the moment.

@sudo-bmitch (Contributor)

For reference, the streaming patch is defined in the distribution project. In practice, this was implemented as a POST / single PATCH (no length or range header) / PUT. I believe this is done to pack the layer content on the fly: the size of the tar + compress output isn't known in advance, and they didn't want to write it to a temp file or buffer it in memory.

Supporting more than one PATCH in a stream request is interesting, and it looks like it's technically allowed in the distribution project spec, even if it was not previously implemented that way in practice. Since OCI hasn't defined that API yet, I'd be curious to hear from registry operators whether they have any concerns with supporting multiple PATCH requests on that API (trying to think ahead for if/when we do add it).

@gabivlj (Author) commented Oct 31, 2023

To clarify nomenclature: by "stream request", do you mean the way Docker pushes a layer in a single PATCH?

From what I've seen in the code, I think it's doable to have multiple PATCH calls from Docker if we add a limit reader, as in:
distribution/distribution#4144
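Roughly, the idea would look like this in Go (an illustrative sketch only, not the actual code in that PR): the unbounded stream gets split into several PATCH requests by reading at most maxChunk bytes at a time:

    package blobpush

    import (
        "bytes"
        "io"
        "net/http"
    )

    // patchStream sends an unbounded stream as a series of PATCH requests,
    // each capped at maxChunk bytes. Streaming PATCHes carry no Content-Range,
    // matching Docker's current single-PATCH behaviour, just split up.
    func patchStream(client *http.Client, uploadURL string, body io.Reader, maxChunk int64) (string, error) {
        buf := make([]byte, maxChunk)
        for {
            // ReadFull returns io.EOF when the stream ends on a chunk
            // boundary, and io.ErrUnexpectedEOF for a final, short chunk.
            n, err := io.ReadFull(body, buf)
            if err == io.EOF {
                break
            }
            if err != nil && err != io.ErrUnexpectedEOF {
                return "", err
            }

            req, rerr := http.NewRequest(http.MethodPatch, uploadURL, bytes.NewReader(buf[:n]))
            if rerr != nil {
                return "", rerr
            }
            resp, derr := client.Do(req)
            if derr != nil {
                return "", derr
            }
            resp.Body.Close()
            if loc := resp.Header.Get("Location"); loc != "" {
                uploadURL = loc
            }
            if err == io.ErrUnexpectedEOF {
                break // final chunk sent
            }
        }
        return uploadURL, nil
    }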

What I want here is simply a way for clients and registries to agree on a maximum request body size for more constrained environments. What I'm proposing here seems like a good place to put this:
#486

@sudo-bmitch (Contributor)

Yes, I believe Docker was using the API that I linked above. Changing the behavior of the Docker client or the distribution registry should be easy. Making sure that such a change doesn't impact any other registry implementation (ECR, GAR, ACR, Quay, Harbor, zot, etc.) is where I'm looking for feedback from registry operators. It also means OCI, if/when we add that API, would need to support a looser definition where multiple PATCH requests without a range or length are allowed.

@gabivlj (Author) commented Oct 31, 2023

I'm looking to document this and make it an optional field for registry operators to implement. AFAIK it wouldn't be a breaking change for existing registries, right?

@gabivlj (Author) commented Nov 1, 2023

I also want to add that this seems similar to #391. Who could chime in on this thread to review the change and bring more to the discussion?

Thanks for the comments, @sudo-bmitch, by the way. I appreciate the input!

@sudo-bmitch (Contributor)

The difference between the two PRs is the problem being solved with the feature, which APIs it applies to, and how that change will affect existing implementations.

#391 was focused on just the chunked upload, which is relatively unused, and only needed to fix a problem with how that API interacted with backends. Since it was so unused, there was little impact on clients, and the only reason it was noticed at all was that the conformance tests used unusually small chunk sizes.

This PR is trying to standardize a change on how every Docker push is run since it's including the streaming API, which isn't even part of the OCI spec. And it's trying to fix a problem with proxies, which will only be solved for those using specific APIs.

As a result, it's both a much larger change that may affect existing registries, and one where I expect the partial implementation to result in users complaining of broken tooling depending on which push method their specific tool uses (I've looked at more than a few reports from users complaining about broken tools where the issue was eventually tracked down to something on their network). So I believe waiting for feedback from the community would be valuable here.

@gabivlj (Author) commented Nov 1, 2023

I agree that a change like this should be methodical. However, this code path is only activated by the few registries that implement the OCI-Chunk-Max-Length header.
Docker will keep following the same path of uploading the layer in a single PATCH. If that is still not OK, I understand, but it's a code path that is only activated by registries that need it.
Maybe we should advise that it's an experimental feature, or something a registry should avoid if it can?
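To make the opt-in concrete, here's a hypothetical registry-side sketch that only advertises the hint when the operator has configured a limit (the handler and names are illustrative, not from any real registry):

    package blobpush

    import (
        "net/http"
        "strconv"
    )

    // uploadCreateHandler sketches a POST /v2/<name>/blobs/uploads/ handler.
    // maxChunkBytes is operator configuration; 0 means no limit, so the hint
    // is never sent and clients behave exactly as they do today.
    func uploadCreateHandler(maxChunkBytes int64) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Location", "/v2/example/blobs/uploads/session-id") // placeholder session URL
            if maxChunkBytes > 0 {
                // Only registries that need the limit opt in to sending it.
                w.Header().Set("OCI-Chunk-Max-Length", strconv.FormatInt(maxChunkBytes, 10))
            }
            w.WriteHeader(http.StatusAccepted)
        }
    }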

@sudo-bmitch (Contributor) commented Nov 1, 2023

Could it be activated by a user-controlled HTTP proxy on all outbound connections from their network going to a general cloud registry? Since it's configured for proxies, and I'm assuming by proxies, does the registry have any say in whether users enable it on their side?

@gabivlj (Author) commented Nov 1, 2023

The idea here is that this is mainly designed for serverless platforms like Workers, or registries hosted by the user behind a proxy that buffers requests.
These platforms have the limitation that the request body cannot be bigger than some size X. This is configured by the user who owns the registry, so the path is activated only by the user who maintains it. It shouldn't be the default in any case for normal registries that don't have these restrictions.

@sudo-bmitch (Contributor) commented Nov 2, 2023

How does this prevent networks with a forward proxy from adding this header to responses from third-party registries? Wouldn't those networks also want the ability to limit the size of blob chunks so they can inspect the content before it leaves their network? That's a fairly common network model for large organizations with security or regulatory requirements.

Be careful not to assume that users will only use the feature for a single use case. I doubt the original image spec authors thought we would hijack their media types for arbitrary artifacts. 😄

@gabivlj (Author) commented Nov 2, 2023

Sure! That's also a legitimate use case, and I don't see a problem with it. I feel like that's also opt-in behaviour from the users of the registries. What I mean is that the person using the registry activates this behaviour path, either by intercepting requests to add this header or by choosing to maintain the registry themselves.

@sudo-bmitch (Contributor)

The user with an arbitrary client tool, the network operator of the proxy, and the registry owner are typically 3 different groups in the environments I've seen. In those scenarios, the client tooling didn't opt into it, the registry operator didn't either, and the user will often report issues to both of them rather than work with their own network team.

@gabivlj (Author) commented Nov 2, 2023

If the client doesn't opt in (i.e., ignores the header), everything works as it always has: the proxy will still buffer the large request body. If it opts in (implements the header), it will chunk the request.
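In code terms, the fallback is just a header read with a default (hypothetical helper):

    package blobpush

    import (
        "net/http"
        "strconv"
    )

    // chunkSizeHint returns the advertised maximum chunk size, or 0 when the
    // client should keep today's behaviour (single-PATCH streaming). A client
    // that has never heard of the header simply never calls this and is
    // unaffected.
    func chunkSizeHint(resp *http.Response) int64 {
        v := resp.Header.Get("OCI-Chunk-Max-Length")
        if v == "" {
            return 0 // no hint: push the blob as before
        }
        maxChunk, err := strconv.ParseInt(v, 10, 64)
        if err != nil || maxChunk <= 0 {
            return 0 // malformed hint: safest to ignore
        }
        return maxChunk
    }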

@sudo-bmitch (Contributor)

I'm going to take a step back from this issue because I don't feel my concerns are being addressed and I don't think the people that should be addressing them are going to read the full discussion.

@gabivlj (Author) commented Nov 10, 2023

Hi, sorry to circle back on this. Are there any folks who could add more context to the discussion? Thank you!

@agahkarakuzu

I believe the issue I am encountering is relevant to this.

I am hosting a private registry and have tried two deployments: one with NGINX (I tried all possible configs) and one with Traefik. In both cases, the push of a large layer (e.g., 5GB) fails unless I turn off the Cloudflare proxy (free plan, 100MB upload limit). On the other hand, some other relatively large layers (e.g., 1.5GB) can be pushed without a problem.

I don't know how chunk sizes are determined, but I assume that they can sporadically be higher than a desired limit. Therefore, having something like the following (as mentioned here) would be useful for people who would like to take advantage of a free CDN plan for better performance:

    // maxRange is a way to control the maximum size of the chunk that
    // httpBlobUpload will send to the registry.

I'd be glad if you can confirm whether this enhancement pertains to and may resolve the problem I described. Thanks!

@gabivlj (Author) commented Feb 19, 2024

Hi @agahkarakuzu, this is one of the problems this proposal is trying to address. Thank you for adding more examples!
I think Docker doesn't really have a way to decide chunk sizes; in my previous tests, I've seen it try to push a layer in a single chunk no matter the size.

@agahkarakuzu

Thanks for the prompt response @gabivlj!

> in my previous tests, I've seen it try to push a layer in a single chunk no matter the size

I guess this addition would enforce a chunked upload? Is there an approximate timeline to merge it? For now, I guess my only option is to keep the proxy on and turn it off to save the day once in a while.

@harris2012

Consider someone pushing a very large image, say 1000 MB.

Without OCI-Chunk-Max-Length, any little network failure forces them to re-push the whole layer from 0%.

With OCI-Chunk-Max-Length, they can PATCH the layer in 10 MB chunks; if the network fails at 40%, they can resume from 40%.

@sudo-bmitch (Contributor)

> Consider someone pushing a very large image, say 1000 MB.
>
> Without OCI-Chunk-Max-Length, any little network failure forces them to re-push the whole layer from 0%.
>
> With OCI-Chunk-Max-Length, they can PATCH the layer in 10 MB chunks; if the network fails at 40%, they can resume from 40%.

That's supported today with chunked uploads. But few use chunked uploads (they are a lot slower with the added round trips, less supported by registries, and more complicated to implement). So this proposal wouldn't help anyone experiencing the issue you describe.
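For reference, a sketch of the resume that chunked uploads already allow, assuming the spec's behaviour where a GET on the upload session URL returns 204 with a Range: 0-<offset> header (the helper name is illustrative):

    package blobpush

    import (
        "fmt"
        "net/http"
    )

    // resumeOffset asks the registry how much of the upload it already has,
    // so the next PATCH can continue from there instead of restarting at 0.
    func resumeOffset(client *http.Client, uploadURL string) (int64, error) {
        resp, err := client.Get(uploadURL)
        if err != nil {
            return 0, err
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusNoContent {
            return 0, fmt.Errorf("unexpected status: %s", resp.Status)
        }
        var start, end int64
        if _, err := fmt.Sscanf(resp.Header.Get("Range"), "%d-%d", &start, &end); err != nil {
            return 0, err
        }
        return end + 1, nil // first byte the registry does not yet have
    }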
