Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ociclient.makeError obscures StatusCode in a basic stringified error unless registry response is specifically-crafted JSON #26

Closed
tianon opened this issue Feb 7, 2024 · 4 comments

Comments

@tianon
Copy link
Contributor

tianon commented Feb 7, 2024

I don't know if it's intentional (it seems like it's not, given how the function is structured), but ociclient.makeError does a lot of return fmt.Errorf(...) for many cases that seem like they should be returning a wrapped wireError, wireErrors, or ociregistry.Error instead so consumers can still access the original status code.

// makeError forms an error from a non-OK response.
func makeError(resp *http.Response) error {
if resp.Request.Method == "HEAD" {
// When we've made a HEAD request, we can't see any of
// the actual error, so we'll have to make up something
// from the HTTP status.
var err error
switch resp.StatusCode {
case http.StatusNotFound:
err = ociregistry.ErrNameUnknown
case http.StatusUnauthorized:
err = ociregistry.ErrUnauthorized
case http.StatusForbidden:
err = ociregistry.ErrDenied
case http.StatusTooManyRequests:
err = ociregistry.ErrTooManyRequests
case http.StatusBadRequest:
err = ociregistry.ErrUnsupported
default:
return fmt.Errorf("error response: %v", resp.Status)
}
return fmt.Errorf("error response: %v: %w", resp.Status, err)
}
if !isJSONMediaType(resp.Header.Get("Content-Type")) || resp.Request.Method == "HEAD" {
// TODO include some of the body in this case?
data, _ := io.ReadAll(resp.Body)
return fmt.Errorf("error response: %v; body: %q", resp.Status, data)
}
data, err := io.ReadAll(io.LimitReader(resp.Body, errorBodySizeLimit+1))
if err != nil {
return fmt.Errorf("%s: cannot read error body: %v", resp.Status, err)
}
if len(data) > errorBodySizeLimit {
// TODO include some part of the body
return fmt.Errorf("error body too large")
}
var errs wireErrors
if err := json.Unmarshal(data, &errs); err != nil {
return fmt.Errorf("%s: malformed error response: %v", resp.Status, err)
}
if len(errs.Errors) == 0 {
return fmt.Errorf("%s: no errors in body (probably a server issue)", resp.Status)
}
errs.httpStatusCode = resp.StatusCode
return &errs
}

I don't think there's anything in the distribution spec that mandates the structure of non-200 status codes from a registry, but I might be very wrong there! (from what I can see, this matches the implementation over in the server half of this library, but I'm using ociclient against non-cue-labs registries 😇)

The specific case I'm running into is a request from a registry that's returning a 404, but it becomes a basic fmt.Errorf("404 Not Found: ...") (with the added bug that I think my registry must be returning a "JSON-y" Content-Type: value because the ociclient.makeError code is trying to Unmarshal it 😅), so I can't identify/adjust my client behavior based on the 404 without doing a string match, which isn't ideal (and seems counter to the design of ociregistry.ErrNameUnknown and friends, and even the wireError/wireErrors wrappers that allow direct access to the HTTP status code).

At first, I thought the Errors slice on wrapErrors was generic enough that I would've proposed moving up the var errs wireErrors (and population of httpStatusCode on it) to the top of the function, and then every return fmt.Errorf(...) instead be something like errs.Errors = append(errs.Errors, fmt.Errorf(...)); return &errs or something (perhaps even the HEAD error translation code for the !isJSONMediaType(resp.Header.Get("Content-Type")) case, maybe with a second error added to the Errors slice if you wanted to still include the errant body), but I see it's not a generic errors wrapper (which is fair, since it then wouldn't Unmarshal correctly), so I'm not sure what to propose here except that it'd be really nice if there were some way to access StatusCode on errors from the registry in ociclient.

@rogpeppe
Copy link
Contributor

rogpeppe commented Feb 8, 2024

Yup. I've definitely been thinking that the errors support needs to be improved. Thanks for creating the issue!

Some random thoughts:

  • it should be straightforward for HTTP middleware to construct a well-formed OCI error
  • we should make it possible to access the original HTTP status code when applicable (e.g. when invoking ociclient)
  • the spec doesn't make it entirely clear what the behaviour should be when faced with arbitrary HTTP status codes but no JSON error body (or an incompatible JSON body)
  • perhaps it should be possible for a registry Interface implementation to return an error that will result in a particular HTTP status code when ociserver sees it. Although this might be considered to be abstraction-breaking: thought required.
  • it would be nice to share a uniform error type (including the wire representation) between ociclient and ociserver

@tianon
Copy link
Contributor Author

tianon commented Feb 14, 2024

I don't think there's anything in the distribution spec that mandates the structure of non-200 status codes from a registry, but I might be very wrong there!

Just to correct myself here -- this is related to opencontainers/distribution-spec#178 + opencontainers/distribution-spec#280 (ala opencontainers/distribution-spec#443).

If we go back in time a bit to before the spec got "cleaned up", we get:

https://github.com/opencontainers/distribution-spec/blob/a6e5b091b1468662730ab1e5be55c61838643ab4/spec.md#on-failure-bad-request

(Which clearly describes the structured errors that ociclient is reading/parsing 😄 ❤️)

All that got removed before 1.0 of the distribution-spec, for reasons that apparently weren't made clear in writing anywhere (as seen in issue 443 there).

@mvdan
Copy link
Contributor

mvdan commented Feb 14, 2024

Should the spec add those structured errors back, or should we remove support for them if the spec dropped them on purpose?

@tianon
Copy link
Contributor Author

tianon commented Feb 14, 2024

IMO the spec should add them back, even if only as informational/SHOULD/MAY since they are in common use and documenting common use for interoperability is the literal point of the spec existing in the first place, but I'm not a maintainer of the distribution spec (only image and runtime). 🙈

However, @jonjohnsonjr there is a maintainer of the distribution spec, and I don't think I'm putting words in his mouth if I say he does think the spec would have value in having things like those structured errors back (opencontainers/distribution-spec#496 being some pretty compelling evidence towards that -- presumably closed because he was just restoring previously in-the-spec content and didn't think it should have too many cycles spent nit-picking, but that's definitely me projecting / trying to read between the lines). ❤️

@myitcv myitcv moved this to Backlog in Modules Roadmap Feb 15, 2024
porcuepine pushed a commit that referenced this issue Mar 22, 2024
TODO more explanation

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 22, 2024
TODO more explanation

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 22, 2024
TODO more explanation

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 25, 2024
Currently the standard wire representation of an error is
duplicated across both client and server, and there is
no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
a specific HTTP error code for a error code that it doesn't
know about.

This change addresses that by moving the wire representation
into the top level `ociregistry` package, splitting `HTTPError`
into its own type, and making both `ociclient` and `ociserver`
aware of them.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 25, 2024
Currently the standard wire representation of an error is
duplicated across both client and server, and there is
no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
a specific HTTP error code for a error code that it doesn't
know about.

This change addresses that by moving the wire representation
into the top level `ociregistry` package, splitting `HTTPError`
into its own type, and making both `ociclient` and `ociserver`
aware of them.

One deliberate decision made here guards against some
scenarios when nesting an `ociclient` implementation inside
`ociserver`. There is a risk that, due to the fact we're using the
same `HTTPError` in `httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those
codes could propagate back through `ociserver`, causing that
to return inappropriate codes too.

So in `ociserver` we only use the `HTTPError` code if there
is no known HTTP status for the error code. This seems like
a reasonable half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error code for a error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error code for a error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error code for a error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error code for a error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error code for a error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error code for a error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 26, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error status for an error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
porcuepine pushed a commit that referenced this issue Mar 27, 2024
Currently the standard wire representation of an error is duplicated
across both client and server, and there is no way for:
- a client to access the actual HTTP status of a response
- an `Interface` implementation to cause the `ociserver` to return
  a specific HTTP error status for an error code that it doesn't know
  about.

This change addresses that by moving the wire representation into the
top level `ociregistry` package, splitting `HTTPError` into its own
type, and making both `ociclient` and `ociserver` aware of them.

One deliberate decision made here guards against some scenarios when
nesting an `ociclient` implementation inside `ociserver`. There is a
risk that, due to the fact we're using the same `HTTPError` in
`httpclient` and `httpserve, if `ociclient` talks to a misbehaving
server that returns inappropriate HTTP response codes, those codes could
propagate back through `ociserver`, causing that to return inappropriate
codes too.

So in `ociserver` we only use the `HTTPError` code if there is no known
HTTP status for the error code. This seems like a reasonable
half-way-house compromise.

We also change the `ociregistry.Error` interface so that `Detail` returns
`json.RawMessage` instead of `any`. This means that the error detail
is consistent no matter if the error has come from `ociclient` or from
a Go-implemented `ociregistry.Interface` implementation, and
means we can use `ociregistry.WireError` as the base error implementation
used by all the exported specific error values. This in turn means
that the error message changes because `WireError` includes the error
code (lower-cased) in the message, but that consistency seems like a
good thing.

Fixes #26.

Signed-off-by: Roger Peppe <[email protected]>
Change-Id: I5a79c1c6fec9f22c1f565830d73486b406dd181d
@github-project-automation github-project-automation bot moved this from Backlog to Done in Modules Roadmap Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants