Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

azure: retry HTTP requests on codes 404, 410, and 429 #1807

Merged
merged 1 commit into from
Feb 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/release-notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ nav_order: 9
### Bug fixes

- Fix failure when config only disables units already disabled
- Retry HTTP requests on Azure on status codes 404, 410, and 429


## Ignition 2.17.0 (2023-11-20)
Expand Down
11 changes: 10 additions & 1 deletion internal/providers/azure/azure.go
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,16 @@ func fetchFromIMDS(f *resource.Fetcher) ([]byte, error) {
headers := make(http.Header)
headers.Set("Metadata", "true")

data, err := f.FetchToBuffer(imdsUserdataURL, resource.FetchOptions{Headers: headers})
// Azure IMDS expects some codes <500 to still be retried...
// Here, we match the cloud-init set.
// https://github.com/canonical/cloud-init/commit/c1a2047cf291
// https://github.com/coreos/ignition/issues/1806
retryCodes := []int{
404, // not found
410, // gone
429, // rate-limited
Copy link

@SudoBrendan SudoBrendan Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idea: we should consider respecting the Retry-After header here so we don't cascade 429 errors (if this isn't already being done somewhere - I see a duration value that could be played with).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, yes. We currently do an exponential backoff which will help a lot here at least.

}
data, err := f.FetchToBuffer(imdsUserdataURL, resource.FetchOptions{Headers: headers, RetryCodes: retryCodes})
if err != nil {
return nil, fmt.Errorf("fetching to buffer: %w", err)
}
Expand Down
17 changes: 16 additions & 1 deletion internal/resource/http.go
Original file line number Diff line number Diff line change
Expand Up @@ -263,6 +263,21 @@ func (f *Fetcher) newHttpClient() error {
return nil
}

func shouldRetryHttp(statusCode int, opts FetchOptions) bool {
// we always retry 500+
if statusCode >= 500 {
return true
}

for _, retryCode := range opts.RetryCodes {
if statusCode == retryCode {
return true
}
}

return false
}

// httpReaderWithHeader performs an HTTP request on the provided URL with the
// provided request header & method and returns the response body Reader, HTTP
// status code, a cancel function for the result's context, and error (if any).
Expand Down Expand Up @@ -298,7 +313,7 @@ func (c HttpClient) httpReaderWithHeader(opts FetchOptions, url string) (io.Read

if err == nil {
c.logger.Info("%s result: %s", opts.HTTPVerb, http.StatusText(resp.StatusCode))
if resp.StatusCode < 500 {
if !shouldRetryHttp(resp.StatusCode, opts) {
return resp.Body, resp.StatusCode, cancelFn, nil
}
resp.Body.Close()
Expand Down
4 changes: 4 additions & 0 deletions internal/resource/url.go
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,10 @@ type FetchOptions struct {
// LocalPort is a function returning a local port used to establish the TCP connection.
// Most of the time, letting the Kernel choose a random port is enough.
LocalPort func() int

// List of HTTP codes to retry that usually would be considered as complete.
// Status codes >= 500 are always retried.
RetryCodes []int
}

// FetchToBuffer will fetch the given url into a temporary file, and then read
Expand Down
Loading