Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go-concourse: SaveConfig endpoint: improve error message given to the end user #797

Merged
merged 1 commit into from
Oct 8, 2024

Conversation

aliculPix4D
Copy link

@aliculPix4D aliculPix4D commented Oct 2, 2024

go-concourse: configs: JSON parsing: improve error message given to the end user

Example:
BEFORE:

$ /home/alicul/go/bin/fly -t main set-pipeline -c pipeline.yml -p test
jobs:
  job job-1 has been added:
+ name: job-1
+ plan: null
  
pipeline name: test

apply configuration? [yN]: y

error: invalid character 'p' looking for beginning of value

NOW:

$ /home/alicul/go/bin/fly -t main set-pipeline -c pipeline.yml -p test
jobs:
  job job-1 has been added:
+ name: job-1
+ plan: null
  
pipeline name: test

apply configuration? [yN]: y
error: Unexpected Response
Status: 400 Bad Request
Body: < reason for the error from the server >

@aliculPix4D
Copy link
Author

@Pix4D/integration fishing for comments...

@aliculPix4D aliculPix4D changed the title go-concourse: configs: JSON parsing: improve error message given to the end user go-concourse: SaveConfig endpoint: improve error message given to the end user Oct 2, 2024
@marco-m-pix4d
Copy link

I understand what this PR is about because we discussed about it, but for somebody else, the PR description is not enough. You could add an example of what the user sees before and after the PR, for example.

@aliculPix4D
Copy link
Author

I understand what this PR is about because we discussed about it, but for somebody else, the PR description is not enough. You could add an example of what the user sees before and after the PR, for example.

It was already here but as a PR comment not the description.. I now updated also the PR comment.

@aliculPix4D aliculPix4D marked this pull request as ready for review October 3, 2024 08:04
@aliculPix4D aliculPix4D requested a review from a team as a code owner October 3, 2024 08:04
go-concourse/concourse/configs.go Outdated Show resolved Hide resolved
go-concourse/concourse/configs.go Outdated Show resolved Hide resolved
go-concourse/concourse/configs.go Outdated Show resolved Hide resolved
return false, false, []ConfigWarning{}, internal.UnexpectedResponseError{
StatusCode: response.StatusCode,
Status: response.Status,
Body: string(body),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this means it will try to convert the body to a string and as such assume the bytes in body to be a valid UTF-8 string. What if it's not the case? After all, this is not some application/json so there is no reason it be valid UTF-8 either.

Or does this generate a string representation of any byte sequence using some kind of escaping?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://go.dev/blog/strings#what-is-a-string

In Go, a string is in effect a read-only slice of bytes.

It’s important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.

Here is a string literal (more about those soon) that uses the \xNN notation to define a string constant holding some peculiar byte values. (Of course, bytes range from hexadecimal values 00 through FF, inclusive.)

    const sample = "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98"

Printing strings

Because some of the bytes in our sample string are not valid ASCII, not even valid UTF-8, printing the string directly will produce ugly output. The simple print statement

    fmt.Println(sample)

produces this mess (whose exact appearance varies with the environment):

��=� ⌘

Copy link

@marco-m-pix4d marco-m-pix4d Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to be concerned, nor to discover what is the encoding. If is not UTF-8, no harm is done, and is by definition a bug on the other end of the communication link.

For the history, the inventor of UTF-8 is the same person who invented Go and that wrote the blog post cited above, Rob Pike.

Copy link
Author

@aliculPix4D aliculPix4D Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is no chance this can panic string([]byte(30, 210, 0, 115, 171, 50, 85, 149, 235, 145))? and that is because the input is always a slice of bytes?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is impossible to panic due to conversion of a byte slice to a string.

if we had the possibility for i=1; string(i), that would be an issue?

that would be string([]byte{1}), a string like another.

I strongly suggest to read the blog post above.

There’s more. The %q (quoted) verb will escape any non-printable byte sequences in a string so the output is unambiguous.

    fmt.Printf("%q\n", sample)

This technique is handy when much of the string is intelligible as text but there are peculiarities to root out; it produces:

"\xbd\xb2=\xbc ⌘"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Thanks.

Just curious, does that mean that go has no native way to manipulate text? Or does the standard lib has at least a bunch of function to properly handle de/en-coding and glyphs count / manipulation operations?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From that blog post again:

To summarize, here are the salient points:

Go source code is always UTF-8.
A string holds arbitrary bytes.
A string literal, absent byte-level escapes, always holds valid UTF-8 sequences.
Those sequences represent Unicode code points, called runes.
No guarantee is made in Go that characters in strings are normalized.
...
there’s really only one way that Go treats UTF-8 specially, and that is when using a for range loop on a string.
...
A for range loop, by contrast, decodes one UTF-8-encoded rune on each iteration.

const nihongo = "日本語"
    for index, runeValue := range nihongo {
        fmt.Printf("%#U starts at byte position %d\n", runeValue, index)
    }

The output shows how each code point occupies multiple bytes:

U+65E5 '日' starts at byte position 0
U+672C '本' starts at byte position 3
U+8A9E '語' starts at byte position 6

[Exercise: Put an invalid UTF-8 byte sequence into the string. (How?) What happens to the iterations of the loop?]

See the Exercise! :-)

Go’s standard library provides strong support for interpreting UTF-8 text. If a for range loop isn’t sufficient for your purposes, chances are the facility you need is provided by a package in the library.

The most important such package is unicode/utf8, which contains helper routines to validate, disassemble, and reassemble UTF-8 strings.

Here is a program equivalent to the for range example above, but using the DecodeRuneInString function

const nihongo = "日本語"
    for i, w := 0, 0; i < len(nihongo); i += w {
        runeValue, width := utf8.DecodeRuneInString(nihongo[i:])
        fmt.Printf("%#U starts at byte position %d\n", runeValue, i)
        w = width
    }

To answer the question posed at the beginning: Strings are built from bytes so indexing them yields bytes, not characters. A string might not even hold characters. In fact, the definition of “character” is ambiguous and it would be a mistake to try to resolve the ambiguity by defining that strings are made of characters.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary I remember by heart:

Strings are built from bytes so indexing them yields bytes, not characters. A string might not even hold characters. In fact, the definition of “character” is ambiguous and it would be a mistake to try to resolve the ambiguity by defining that strings are made of characters.

go-concourse/concourse/configs_test.go Show resolved Hide resolved
@aliculPix4D aliculPix4D force-pushed the pci-3852-client-fix-save-config branch 2 times, most recently from c707e28 to 424e986 Compare October 4, 2024 08:13
@aliculPix4D aliculPix4D changed the base branch from master to release/7.11.x-pix4d October 4, 2024 14:08
@aliculPix4D aliculPix4D force-pushed the pci-3852-client-fix-save-config branch from 424e986 to d4f93a5 Compare October 4, 2024 14:41
@aliculPix4D
Copy link
Author

@marco-m-pix4d @odormond I had to add 1 more commit: 305ad9c because tests were failing...

I think we should extend also the quick tests here: https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12 to include them...

@marco-m-pix4d
Copy link

@marco-m-pix4d @odormond I had to add 1 more commit: 305ad9c because tests were failing...

I think we should extend also the quick tests here: https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12 to include them...

Could you please explain? If a test was failing, it means it has been ran, so why do we need to include it ?

@aliculPix4D
Copy link
Author

aliculPix4D commented Oct 7, 2024

Could you please explain? If a test was failing, it means it has been ran, so why do we need to include it ?

Ups sorry, my bad, I should have been clearer...

Its failing in: upstream-unit job because this job runs much more comprehensive set of tests... Our quick tests include only a small subset (only 2-3 subdirectories)... It turns out that the changes I did in this PR actually affect also what they call "fly integration test" were they compile fly and run fly binary against also the HTTP test server... but this time its fly binary not only the HTTP client that is tested...

So my question is: should I extend our quick-test.sh to include also these additional test?

@marco-m-pix4d
Copy link

Its failing in: upstream-unit job
...
So my question is: should I extend our quick-test.sh to include also these additional test?

I am still confused. This means that we will run the same tests twice...

@aliculPix4D
Copy link
Author

I am still confused. This means that we will run the same tests twice...

Yes, but we already do that...

Compare:
https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/upstream-unit.sh#L24
https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12

There will be an overlap between these two jobs... one runs almost all of them (upstream-tests) and one only due to the code changes we did...

@marco-m-pix4d
Copy link

There will be an overlap between these two jobs... one runs almost all of them (upstream-tests) and one only due to the code changes we did...

Now I understand. This is a tough one 🤔
The more we work on Concourse, the more the two tests will overlap. So at a certain point we could remove the quicktest job. But since upstream-unit invokes ginko, I do not want to wait 10 minutes to reproduce a failing test.
For the time being, I would leave the two test jobs as-is.
If you have a reason to change now, let's discuss offline this PR. I would like to see this PR merged.
Merci!

@@ -204,6 +205,10 @@ var _ = Describe("ATC Handler Configs", func() {

Expect(receivedConfig).To(Equal(expectedConfig))

if contentType != "" {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marco-m-pix4d In one comment you basically said to remove this check and just do:

w.Header().Set("Content-Type", "application/json")

this would require adding new separate tests for the case where content type is not set or set to something else...

I do understand your point and I agree it is clearer that way and more in line with how we test with go test, but this would lead to a lot of duplication and would go against how the tests are written now... I mean just look how they overwrite the returnBody or returnHeader in each test... The approach I took here is more in line what they currently have.. by overwriting a single parameter in each test.

I just wanted to confirm before committing (I already did the changes) that this is what we want:

  • all existing test will test the path of the code when content type header is set to application/json

  • remove 3 current tests with context when response contains bad JSON because we are not testing the JSON parser

  • create 3 new tests to test the path when content type header is left empty or set to some other value different than application/json

  • price to pay is mostly duplication and going against the current flow of how the these tests are written...

  • we gain more clarity by not putting to much stuff in a single test

I agree with you that this is the direction we should take, I just wanted to highlight the duplication part and going against the current flow...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your summary of my suggestion is correct.
As usual, it is a trade-off and also at least in part a question of taste.
I have only a high-level view of that file, I did not analyze in detail the amount of duplication. As we already discussed in the past, DRY for the sake of DRY can make things worse.

You have seen both approaches, you are in the best position to take a decision. As usual the criteria are:

  • if a test case fails, will the next person understand better what is happening with the current approach, or with separate tests?
  • if tomorrow somebody has to add another test case, will the addition be less error-prone, and in general easier, with the current approach or with separate tests?
  • is the duplication causing or risking to cause any problem?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marco-m-pix4d done... please have a final pass before I merge...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The length of the existing tests, and the nested Contexts, make my head spin :-/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aliculPix4D please try to squash some commits if possible.

@aliculPix4D aliculPix4D force-pushed the pci-3852-client-fix-save-config branch 2 times, most recently from c2486ba to d5a5639 Compare October 8, 2024 11:42
@marco-m-pix4d marco-m-pix4d self-requested a review October 8, 2024 12:42
@aliculPix4D aliculPix4D force-pushed the pci-3852-client-fix-save-config branch from d5a5639 to ac16027 Compare October 8, 2024 12:46
@aliculPix4D aliculPix4D merged commit 955cdea into release/7.11.x-pix4d Oct 8, 2024
8 checks passed
@aliculPix4D aliculPix4D deleted the pci-3852-client-fix-save-config branch October 8, 2024 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants