go-concourse: SaveConfig endpoint: improve error message given to the end user #797

aliculPix4D · 2024-10-02T12:16:09Z

go-concourse: configs: JSON parsing: improve error message given to the end user

Example:
BEFORE:

$ /home/alicul/go/bin/fly -t main set-pipeline -c pipeline.yml -p test
jobs:
  job job-1 has been added:
+ name: job-1
+ plan: null
  
pipeline name: test

apply configuration? [yN]: y

error: invalid character 'p' looking for beginning of value

NOW:

$ /home/alicul/go/bin/fly -t main set-pipeline -c pipeline.yml -p test
jobs:
  job job-1 has been added:
+ name: job-1
+ plan: null
  
pipeline name: test

apply configuration? [yN]: y
error: Unexpected Response
Status: 400 Bad Request
Body: < reason for the error from the server >

aliculPix4D · 2024-10-02T12:16:25Z

@Pix4D/integration fishing for comments...

go-concourse/concourse/configs.go

marco-m-pix4d · 2024-10-03T08:00:44Z

I understand what this PR is about because we discussed about it, but for somebody else, the PR description is not enough. You could add an example of what the user sees before and after the PR, for example.

aliculPix4D · 2024-10-03T08:02:27Z

I understand what this PR is about because we discussed about it, but for somebody else, the PR description is not enough. You could add an example of what the user sees before and after the PR, for example.

It was already here but as a PR comment not the description.. I now updated also the PR comment.

go-concourse/concourse/configs.go

odormond · 2024-10-03T13:56:26Z

go-concourse/concourse/configs.go

+			return false, false, []ConfigWarning{}, internal.UnexpectedResponseError{
+				StatusCode: response.StatusCode,
+				Status:     response.Status,
+				Body:       string(body),


I guess this means it will try to convert the body to a string and as such assume the bytes in body to be a valid UTF-8 string. What if it's not the case? After all, this is not some application/json so there is no reason it be valid UTF-8 either.

Or does this generate a string representation of any byte sequence using some kind of escaping?

https://go.dev/blog/strings#what-is-a-string

In Go, a string is in effect a read-only slice of bytes.

It’s important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.

Here is a string literal (more about those soon) that uses the \xNN notation to define a string constant holding some peculiar byte values. (Of course, bytes range from hexadecimal values 00 through FF, inclusive.)

const sample = "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98"

Printing strings

Because some of the bytes in our sample string are not valid ASCII, not even valid UTF-8, printing the string directly will produce ugly output. The simple print statement

fmt.Println(sample)

produces this mess (whose exact appearance varies with the environment):

��=� ⌘

There is no reason to be concerned, nor to discover what is the encoding. If is not UTF-8, no harm is done, and is by definition a bug on the other end of the communication link.

For the history, the inventor of UTF-8 is the same person who invented Go and that wrote the blog post cited above, Rob Pike.

So there is no chance this can panic string([]byte(30, 210, 0, 115, 171, 50, 85, 149, 235, 145))? and that is because the input is always a slice of bytes?

It is impossible to panic due to conversion of a byte slice to a string.

if we had the possibility for i=1; string(i), that would be an issue?

that would be string([]byte{1}), a string like another.

I strongly suggest to read the blog post above.

There’s more. The %q (quoted) verb will escape any non-printable byte sequences in a string so the output is unambiguous.

fmt.Printf("%q\n", sample)

This technique is handy when much of the string is intelligible as text but there are peculiarities to root out; it produces:

"\xbd\xb2=\xbc ⌘"

OK. Thanks.

Just curious, does that mean that go has no native way to manipulate text? Or does the standard lib has at least a bunch of function to properly handle de/en-coding and glyphs count / manipulation operations?

From that blog post again:

To summarize, here are the salient points:

Go source code is always UTF-8.
A string holds arbitrary bytes.
A string literal, absent byte-level escapes, always holds valid UTF-8 sequences.
Those sequences represent Unicode code points, called runes.
No guarantee is made in Go that characters in strings are normalized.
...
there’s really only one way that Go treats UTF-8 specially, and that is when using a for range loop on a string.
...
A for range loop, by contrast, decodes one UTF-8-encoded rune on each iteration.

const nihongo = "日本語" for index, runeValue := range nihongo { fmt.Printf("%#U starts at byte position %d\n", runeValue, index) }

The output shows how each code point occupies multiple bytes:

U+65E5 '日' starts at byte position 0 U+672C '本' starts at byte position 3 U+8A9E '語' starts at byte position 6

[Exercise: Put an invalid UTF-8 byte sequence into the string. (How?) What happens to the iterations of the loop?]

See the Exercise! :-)

Go’s standard library provides strong support for interpreting UTF-8 text. If a for range loop isn’t sufficient for your purposes, chances are the facility you need is provided by a package in the library.

The most important such package is unicode/utf8, which contains helper routines to validate, disassemble, and reassemble UTF-8 strings.

Here is a program equivalent to the for range example above, but using the DecodeRuneInString function

const nihongo = "日本語" for i, w := 0, 0; i < len(nihongo); i += w { runeValue, width := utf8.DecodeRuneInString(nihongo[i:]) fmt.Printf("%#U starts at byte position %d\n", runeValue, i) w = width }

To answer the question posed at the beginning: Strings are built from bytes so indexing them yields bytes, not characters. A string might not even hold characters. In fact, the definition of “character” is ambiguous and it would be a mistake to try to resolve the ambiguity by defining that strings are made of characters.

Summary I remember by heart:

Strings are built from bytes so indexing them yields bytes, not characters. A string might not even hold characters. In fact, the definition of “character” is ambiguous and it would be a mistake to try to resolve the ambiguity by defining that strings are made of characters.

go-concourse/concourse/configs_test.go

go-concourse/concourse/configs.go

go-concourse/concourse/configs_test.go

aliculPix4D · 2024-10-07T12:49:31Z

@marco-m-pix4d @odormond I had to add 1 more commit: 305ad9c because tests were failing...

I think we should extend also the quick tests here: https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12 to include them...

marco-m-pix4d · 2024-10-07T13:25:35Z

@marco-m-pix4d @odormond I had to add 1 more commit: 305ad9c because tests were failing...

I think we should extend also the quick tests here: https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12 to include them...

Could you please explain? If a test was failing, it means it has been ran, so why do we need to include it ?

aliculPix4D · 2024-10-07T13:29:51Z

Could you please explain? If a test was failing, it means it has been ran, so why do we need to include it ?

Ups sorry, my bad, I should have been clearer...

Its failing in: upstream-unit job because this job runs much more comprehensive set of tests... Our quick tests include only a small subset (only 2-3 subdirectories)... It turns out that the changes I did in this PR actually affect also what they call "fly integration test" were they compile fly and run fly binary against also the HTTP test server... but this time its fly binary not only the HTTP client that is tested...

So my question is: should I extend our quick-test.sh to include also these additional test?

marco-m-pix4d · 2024-10-07T13:36:45Z

Its failing in: upstream-unit job
...
So my question is: should I extend our quick-test.sh to include also these additional test?

I am still confused. This means that we will run the same tests twice...

aliculPix4D · 2024-10-07T13:40:22Z

I am still confused. This means that we will run the same tests twice...

Yes, but we already do that...

Compare:
https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/upstream-unit.sh#L24
https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12

There will be an overlap between these two jobs... one runs almost all of them (upstream-tests) and one only due to the code changes we did...

marco-m-pix4d · 2024-10-07T14:00:59Z

There will be an overlap between these two jobs... one runs almost all of them (upstream-tests) and one only due to the code changes we did...

Now I understand. This is a tough one 🤔
The more we work on Concourse, the more the two tests will overlap. So at a certain point we could remove the quicktest job. But since upstream-unit invokes ginko, I do not want to wait 10 minutes to reproduce a failing test.
For the time being, I would leave the two test jobs as-is.
If you have a reason to change now, let's discuss offline this PR. I would like to see this PR merged.
Merci!

aliculPix4D · 2024-10-08T07:36:09Z

go-concourse/concourse/configs_test.go

@@ -204,6 +205,10 @@ var _ = Describe("ATC Handler Configs", func() {

 						Expect(receivedConfig).To(Equal(expectedConfig))

+						if contentType != "" {


@marco-m-pix4d In one comment you basically said to remove this check and just do:

w.Header().Set("Content-Type", "application/json")

this would require adding new separate tests for the case where content type is not set or set to something else...

I do understand your point and I agree it is clearer that way and more in line with how we test with go test, but this would lead to a lot of duplication and would go against how the tests are written now... I mean just look how they overwrite the returnBody or returnHeader in each test... The approach I took here is more in line what they currently have.. by overwriting a single parameter in each test.

I just wanted to confirm before committing (I already did the changes) that this is what we want:

all existing test will test the path of the code when content type header is set to application/json

remove 3 current tests with context when response contains bad JSON because we are not testing the JSON parser

create 3 new tests to test the path when content type header is left empty or set to some other value different than application/json

price to pay is mostly duplication and going against the current flow of how the these tests are written...

we gain more clarity by not putting to much stuff in a single test

I agree with you that this is the direction we should take, I just wanted to highlight the duplication part and going against the current flow...

Your summary of my suggestion is correct.
As usual, it is a trade-off and also at least in part a question of taste.
I have only a high-level view of that file, I did not analyze in detail the amount of duplication. As we already discussed in the past, DRY for the sake of DRY can make things worse.

You have seen both approaches, you are in the best position to take a decision. As usual the criteria are:

if a test case fails, will the next person understand better what is happening with the current approach, or with separate tests?

if tomorrow somebody has to add another test case, will the addition be less error-prone, and in general easier, with the current approach or with separate tests?

is the duplication causing or risking to cause any problem?

@marco-m-pix4d done... please have a final pass before I merge...

Looks good to me. The length of the existing tests, and the nested Contexts, make my head spin :-/

@aliculPix4D please try to squash some commits if possible.

…he end user Signed-off-by: aliculPix4D <[email protected]>

aliculPix4D commented Oct 2, 2024

View reviewed changes

go-concourse/concourse/configs.go Outdated Show resolved Hide resolved

aliculPix4D commented Oct 2, 2024

View reviewed changes

go-concourse/concourse/configs.go Outdated Show resolved Hide resolved

aliculPix4D commented Oct 2, 2024

View reviewed changes

go-concourse/concourse/configs.go Show resolved Hide resolved

aliculPix4D changed the title ~~go-concourse: configs: JSON parsing: improve error message given to the end user~~ go-concourse: SaveConfig endpoint: improve error message given to the end user Oct 2, 2024

aliculPix4D marked this pull request as ready for review October 3, 2024 08:04

aliculPix4D requested a review from a team as a code owner October 3, 2024 08:04

aliculPix4D requested review from odormond, iAmoric, SpectreVert and marco-m-pix4d October 3, 2024 08:04