Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v0.10 backports #1285

Merged
merged 5 commits into from
Jul 28, 2023
Merged

v0.10 backports #1285

merged 5 commits into from
Jul 28, 2023

Conversation

@kkourt kkourt requested a review from a team as a code owner July 28, 2023 12:20
@kkourt kkourt requested review from olsajiri and removed request for a team July 28, 2023 12:20
[ upstream commit cba6fc7 ]

[ backporter's note: test had to be adjusted because DataEventDesc does not have Pad/Size ]

BPF/kernel strings are C strings which are a sequence of null-terminated
bytes. They may or may not be valid utf-8 strings.

For example, execve() can accept invalid utf-8 strings. Similarly,
filenames are not required to be utf-8 encoded.

This becomes an issue because we define fields like Binary, Arguments,
and CWD as strings in the proto descriptions.

According to the protobuf spec:
 "A string must always contain UTF-8 encoded or 7-bit ASCII text, and
  cannot be longer than 2^32."

As a result, when passing non-utf8 data as strings, protobuf clients
(e.g., the JSON writer and the gRPC client) cannot handle the event. For
example, running `tetra getevents` and executing a program with invalid
utf8 arguments leads the following error:
msg="Failed to receive events" error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8"

There are a number of different approaches we can use to address this
problem. One is that we can try to encode arbitrary bytes in the
string. In the past, we have used base64 to do that (specifically, for
bytes arguments). An alternative (better?) solution would be to quote
the string, using strconv.Quote() or similar.

Quoting, however, means that we need to always parse the data and quote
them, which has a performance cost. To avoid this cost this patch
uses strings.ToValidUTF8 instead, which has small overhead (minimal in
case where the data are actually valid utf8), to replace invalid runes
with "�". This means that we loose information (what the actual bytes
were) but we also keep backwards compatibility and remain close to the
existing behavior.

In the future, we can modify this behavior (e.g., via a command line
switch) to quote the arguments instead, so that all bytes are preserved
for non-utf8 data encoded as strings.

A more radical change (which seems like the right thing to do) is to
change the gRPC fields that are not actually strings to bytes, and let
the clients do the decoding.

The patch also adds a unit test to check for this case.

Reported-by: Гаврилов Иван Сергеевич <[email protected]>
Signed-off-by: Kornilios Kourtis <[email protected]>
[ upstream commit c32895c ]

handleGenericKprobeString returns "/" in case of an error. This happens
to accommodate issues when traversing paths. This patch modifies
handleGenericKprobeString so that it accepts a default value to return
as an error. The intention is to use it for other types, which be done
in the next patch.

No functional changes.

Signed-off-by: Kornilios Kourtis <[email protected]>
[ upstream commit 43db99b ]

Use handleGenericKprobeString for GenericFilenameType and
GenericStringType as well. The code is the same, so no functional changes.

Signed-off-by: Kornilios Kourtis <[email protected]>
[ upstream commit 7c7ed6a ]

Kprobe arguments that include strings are generated using
handleGenericKprobeString. Because protobufs only support strings that
are valid utf-8, we need to ensure that the strings we get from bpf are
valid utf-8.

Also, ensure that Path in loader is valid utf-8.

Signed-off-by: Kornilios Kourtis <[email protected]>
[ upstream commit eebb6ba ]

This code is seemingly not used. Remove it.

Signed-off-by: Kornilios Kourtis <[email protected]>
Suggested-by: Anastasios Papagiannis <[email protected]>
Signed-off-by: Kornilios Kourtis <[email protected]>
@kkourt kkourt force-pushed the pr/kkourt/v0.10-backports branch from a7d5642 to a423159 Compare July 28, 2023 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants