Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPA: improve error messages returned to the end user #796

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion atc/api/policychecker/policychecker_suite_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ import (
"testing"
)

func TestAccessor(t *testing.T) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems as copy pasta...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes. We contributing upstream, we can even split this PR in 2. The first one with this fix and the equivalent you did below. At least such PR should be accepted without problems.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan:

  • merge it to our fork as a single PR to our master (and release branch)... I will now read our instructions on release procedure
  • after we release and deploy I plan to prepare the PRs to upstream
  • then cherry pick specific commits and open 2 PRs upstream if your think its necessary to split this changes

func TestPolicyChecker(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "API PolicyChecker Suite")
}
Expand Down
2 changes: 1 addition & 1 deletion atc/policy/opa/agent_suite_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ import (
. "github.com/onsi/gomega"
)

func TestVault(t *testing.T) {
aliculPix4D marked this conversation as resolved.
Show resolved Hide resolved
func TestPolicyAgent(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "Policy Agent Suite")
}
8 changes: 4 additions & 4 deletions atc/policy/opa/opa.go
Original file line number Diff line number Diff line change
Expand Up @@ -60,23 +60,23 @@ func (c opa) Check(input policy.PolicyCheckInput) (policy.PolicyCheckResult, err
client.Timeout = c.config.Timeout
resp, err := client.Do(req)
if err != nil {
return nil, err
return nil, fmt.Errorf("OPA server: connecting: %w", err)
}
defer resp.Body.Close()

statusCode := resp.StatusCode
if statusCode != http.StatusOK {
return nil, fmt.Errorf("opa returned status: %d", statusCode)
return nil, fmt.Errorf("OPA server returned status: %d", statusCode)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to do something with the response body here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful e.g. when you configure wrong value for:
CONCOURSE_OPA_URL: http://opa:8181/v1/data/concourse/decision
and opa is there but the path is wrong or something?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine if we do as in other places and add also the body to the error.

}

body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("opa returned no response: %s", err.Error())
return nil, fmt.Errorf("OPA server: reading response body: %s", err)
}

result, err := ParseOpaResult(body, c.config)
if err != nil {
return nil, fmt.Errorf("parsing opa results: %w", err)
aliculPix4D marked this conversation as resolved.
Show resolved Hide resolved
return nil, fmt.Errorf("parsing OPA results: %w", err)
}

return result, nil
Expand Down
6 changes: 3 additions & 3 deletions atc/policy/opa/opa_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -177,7 +177,7 @@ var _ = Describe("OPA Policy Checker", func() {
It("should return error", func() {
result, err := agent.Check(policy.PolicyCheckInput{})
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(MatchRegexp("connection refused"))
Expect(err.Error()).To(MatchRegexp("OPA server: connecting: .* connection refused"))
Expect(result).To(BeNil())
})
})
Expand All @@ -190,7 +190,7 @@ var _ = Describe("OPA Policy Checker", func() {
It("should return error", func() {
result, err := agent.Check(policy.PolicyCheckInput{})
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(Equal("opa returned status: 404"))
Expect(err.Error()).To(Equal("OPA server returned status: 404"))
Expect(result).To(BeNil())
})
})
Expand All @@ -205,7 +205,7 @@ var _ = Describe("OPA Policy Checker", func() {
It("should return error", func() {
result, err := agent.Check(policy.PolicyCheckInput{})
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("invalid character 'h' looking for beginning of value"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so ironic I thought you invented it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I know it's even worse... The code from current master would craft this error msg:

parsing opa results: invalid character 'h' looking for beginning of value

and this is not the error msg end-user was getting... but instead end user saw only:

invalid character 'p' looking for beginning of value

because another JSON parser got the body starting with parsing...
:)

Unbelievable...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because another JSON parser got the body starting with parsing...

Ah. Can we fix also this properly ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did... 4th commit...

Expect(err.Error()).To(ContainSubstring("parsing JSON: invalid character 'h' looking for beginning of value"))
Expect(result).To(BeNil())
})
})
Expand Down
2 changes: 1 addition & 1 deletion atc/policy/opa/result.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ func ParseOpaResult(bytesResult []byte, opaConfig OpaConfig) (opaResult, error)
var results vars.StaticVariables
err := json.Unmarshal(bytesResult, &results)
if err != nil {
return opaResult{}, err
return opaResult{}, fmt.Errorf("parsing JSON: %w", err)
Copy link
Author

@aliculPix4D aliculPix4D Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I am still not 100% happy with the error msg user sees:

policy check error: parsing OPA results: parsing JSON: invalid character 'h' looking for beginning of value

for sure its better than before:

invalid character 'p' looking for beginning of value

because I am somehow lacking the information that this is because the response body from the OPA server is not a valid JSON.. or in other words that the problem is because OPA server being misconfigured and sending something unexpected...

Can we append something like: Hint: targeted OPA server seems to be misconfigured

Just as an example, I will give the error msg from the fly when Concourse is not reachable...

could not reach the Concourse server called developers:

    Get "https://concourse-web/api/v1/info": dial tcp XX.XX.XX.XX:443: i/o timeout

is the targeted Concourse running? better go catch it lol

e.g. this seems quite clear. Can we do something like that?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think about it, from the point of view of the end user (the dev setting the pipeline), nothing changes:

  • policy check error: parsing OPA results: parsing JSON: invalid character 'h' looking for beginning of value
  • policy check error: parsing OPA results: parsing JSON: invalid character 'h' looking for beginning of value. Hint: targeted OPA server seems to be misconfigured

In both cases, the user will ask for help on the channel: he cannot do anything about it and he is still confused.

I really resist this "Hint" because it is adding a patch to something that is broken already.

If we really want to give a better message to the user, I think we can, because we control the system (Concourse). This is different from cogito: the system is GitHub that we do not control.

So we should be able to rework some more logic how Concourse interacts with the OPA server and detect the real problem. Once we do that, then we could report something like

  • policy check error: OPA server misconfigured

Now, I don't even understand what we mean by "misconfigured".:

  • Do we mean that the OPA server configuration is wrong? (so: independently from Concourse).
  • Do we mean that the Concourse settings about the OPA server is wrong?

To summarize, I think that this PR already made the situation way better and we should merge and deploy.

I am OK if we want to create a ticket, with some of the discussion here, to go more at the root of the problem (assuming I am not misunderstanding something).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Said in another way: this PR is too big to follow in details, at least for me. All the discussions here are useful, but we can still do the secondary improvements in a separate ticket, I think.

}

var allowed, shouldBlock, ok bool
Expand Down
6 changes: 5 additions & 1 deletion go-concourse/concourse/configs.go
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,11 @@ func (team *team) CreateOrUpdatePipelineConfig(pipelineRef atc.PipelineRef, conf
var validationErr atc.SaveConfigResponse
err = json.Unmarshal(body, &validationErr)
if err != nil {
return false, false, []ConfigWarning{}, err
return false, false, []ConfigWarning{}, internal.UnexpectedResponseError{
Copy link
Author

@aliculPix4D aliculPix4D Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marco-m-pix4d I am curious to hear/read your opinion on this part...

if you follow closely:

  • before we were returning the err which is a useless error from the JSON parse on L97...
  • now we return internal.UnexpectedResponseError which I craft using response body and status code, but I never actually use the useless err anywhere... I think it is not needed to append this err.Error() because this error msg would always look the same invalid character.

Personally, I was uncomfortable just ignoring the err like that... We usually don't have this practice... but I don't think it would bring any value appending it.

What do you think?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the reasoning. As usual, the answer is that it depends.

The problem is that json.Unmarshal can, at least in theory, return any type of error, not only parse errors.
If the json package had an error sentinel, say ErrParse, then one could say:
I check for json.ErrParse. If it matches, then I do not pass it to internal.UnexpectedResponseError{}.
If on the other hand it doesn't match, then it can be helpful to know what has happened, so in this case I pass it to internal.UnexpectedResponseError{}.

Also if json.ErrParse existed, doing the reasoning just above would still be risky, because it is possible that a new version of the json package added another error sentinel, and so on...

So my approach is to almost always include errors. If I know like in this case that it can be confusing, I add it in parenthesis "(%s)" or similar...

StatusCode: response.StatusCode,
Status: response.Status,
Body: string(body),
}
}
return false, false, []ConfigWarning{}, InvalidConfigError{Errors: validationErr.Errors}
case http.StatusForbidden:
Expand Down
1 change: 1 addition & 0 deletions go-concourse/concourse/configs_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -378,6 +378,7 @@ var _ = Describe("ATC Handler Configs", func() {
It("returns an error", func() {
_, _, _, err := team.CreateOrUpdatePipelineConfig(pipelineRef, expectedVersion, expectedConfig, checkCredentials)
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("bad-json"))
})
})
})
Expand Down
5 changes: 4 additions & 1 deletion go-concourse/concourse/internal/connection.go
Original file line number Diff line number Diff line change
Expand Up @@ -204,7 +204,10 @@ func (connection *connection) populateResponse(response *http.Response, returnRe
}

if response.StatusCode == http.StatusForbidden {
return ErrForbidden
body, _ := io.ReadAll(response.Body)
return ForbiddenError{
Reason: string(body),
}
}

if response.StatusCode < 200 || response.StatusCode >= 300 {
Expand Down
30 changes: 29 additions & 1 deletion go-concourse/concourse/internal/connection_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -353,7 +353,7 @@ var _ = Describe("ATC Connection", func() {
})
})

Describe("403 response", func() {
Describe("HTTPAgent: 403 response", func() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they had a test for 403 response for what they call HTTPagent client, but not for what they call Connection client

BeforeEach(func() {
atcServer = ghttp.NewServer()

Expand Down Expand Up @@ -381,6 +381,34 @@ var _ = Describe("ATC Connection", func() {
})
})

Describe("Connection: 403 response", func() {
BeforeEach(func() {
atcServer = ghttp.NewServer()

connection = NewConnection(atcServer.URL(), nil, tracing)

atcServer.AppendHandlers(
ghttp.CombineHandlers(
ghttp.VerifyRequest("DELETE", "/api/v1/teams/main/pipelines/foo"),
ghttp.RespondWith(http.StatusForbidden, "problem"),
),
)
})

It("returns back ForbiddenError", func() {
err := connection.Send(Request{
RequestName: atc.DeletePipeline,
Params: rata.Params{
"pipeline_name": "foo",
"team_name": atc.DefaultTeamName,
},
}, nil)
fe, ok := err.(ForbiddenError)
Expect(ok).To(BeTrue())
Expect(fe.Reason).To(Equal("problem"))
})
})

Describe("404 response", func() {
Context("when the response does not contain JSONAPI errors", func() {
BeforeEach(func() {
Expand Down