Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory: Swap probe_read to kernel or user version #2213

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

kevsecurity
Copy link
Contributor

@kevsecurity kevsecurity commented Mar 11, 2024

We should always use the probe_read_kernel or probe_read_user helpers over the probe_read helper (ditto for _str versions).

This series changes all probe_read to either probe_read_kernel or probe_read_user (ditto for _str versions).

Note: bpf/libbpf/bpf_core_read.h was updated. It fails some of our static tests which I think can be ignored (the last one did too – it stops checking if the file isn't in the PR).

@kevsecurity kevsecurity added the release-note/misc This PR makes changes that have no direct user impact. label Mar 11, 2024
@kevsecurity kevsecurity requested a review from a team as a code owner March 11, 2024 16:30
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch 3 times, most recently from 8c89a01 to 321abe4 Compare March 11, 2024 18:14
@jrfastab
Copy link
Contributor

jrfastab commented Mar 12, 2024

In general this seems the right direction. Does BTF report if its a user ptr or not? The kernel annotates these so we could learn it from the BTF type I think. Might need some BTF extensions to do I'm not sure off-hand.

Looks like it would require some BTF updates, but likely is the right solution IMO.

@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch from 321abe4 to 885041d Compare March 18, 2024 15:25
Copy link

netlify bot commented Mar 18, 2024

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit b6c6b94
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/66155a5ffdbbf20008cdbda2
😎 Deploy Preview https://deploy-preview-2213--tetragon.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch 6 times, most recently from a1da1b6 to 6484d96 Compare March 19, 2024 10:50
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch 3 times, most recently from aafefc2 to c297931 Compare April 4, 2024 15:46
Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general idea makes sense to me but there are lot of changes in this PR and it's difficult to review it. Can you please split it into multiple patches and add commit messages on why each change is needed?

I would also like to avoid introducing yet another variant of our object files if we can avoid it.

pkg/sensors/tracing/kprobe_test.go Outdated Show resolved Hide resolved
bpf/process/bpf_generic_retkprobe.c Show resolved Hide resolved
bpf/Makefile Outdated Show resolved Hide resolved
@kkourt kkourt removed the request for review from tpapagian April 5, 2024 13:10
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch 2 times, most recently from 2b81a3c to 78ea803 Compare April 9, 2024 15:08
When we swap all probe_read()s to probe_read_kernel() or
probe_read_user() (ditto for _str) users will need to be able to specify
if an argument is userspace data or not.

Signed-off-by: Kevin Sheldrake <[email protected]>
Add the UserspaceData flag to the meta value, taking into account the
default value if it is missing from the arg in the policy.

Modify how tracepoints use the meta value to permit adding this flag.

Signed-off-by: Kevin Sheldrake <[email protected]>
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch from 78ea803 to b6c6b94 Compare April 9, 2024 15:10
We should always use the probe_read_kernel or probe_read_user helpers
over the probe_read helper (ditto for _str versions).

This commit changes most probe_read to either probe_read_kernel or
probe_read_user (ditto for _str versions).

Signed-off-by: Kevin Sheldrake <[email protected]>
Some calls to probe_read could be reading either kernel or user memory.
Add some helpers that take a userspace boolean to determine which type
of memory to read. Also introduce a masked version that masks the size
variable with a given mask immediately before (using inline asm) the
probe_read_*. This will prevent the test for the userspace boolean
spilling our registers after the masking and so will retain their
bounds.

Signed-off-by: Kevin Sheldrake <[email protected]>
We copy bpf_core_read.h from libbpf. We now need to copy it again to
take in updates that let us use probe_read helpers with CORE. This will
fail our static tests because they warn about register reuse. While this
is indeed a potential problem, it is unlikely to affect us in reality.
It is better to keep bpf_core_read.h the same as the upstream version
than to fix it locally (which is tricky). We could potentially attempt
to fix the warnings and upstream the changes, but this is outside the
scope of this PR. Hence, accept the warnings.

Changes introduced by the updated bpf_core_read.h require macros for
bpf_probe_read_kernel and bpf_probe_read_user. These were introduced in
the previous commit in the new probe_read_kernel_or_user.h. This commit
adds that include to source files that require access to these macros.

Signed-off-by: Kevin Sheldrake <[email protected]>
The argument meta value indicates (among others) whether an argument is
in userspace memory or in kernel memory. This generally works for kprobes,
tracepoints and uprobes, but raw_syscall tracepoints are slightly
different. The raw_syscall argument (an array of uint64) exists in
kernel memory but the buffers the pointers reference might be userspace.

This commit adds a raw_syscall flag to indicate to the BPF programs that
the pointers are in kernel memory even if the buffers they point to are
in userspace.

Signed-off-by: Kevin Sheldrake <[email protected]>
Having introduced helpers that can read kernel or user memory depending
on a boolean, and having also provided information over location of
buffers, including in raw_syscalls, to the BPF programs, this commit
uses the probe_read_kernel_or_user helpers to read the memory from the
correct location.

Signed-off-by: Kevin Sheldrake <[email protected]>
The key for the retprobe map includes the pid_tgid, but in cases where
this is not available, it uses the FP register in its place. The
retprobe_map_get_key function is used to get this ID. This commit
changes the filter_args_reject function to use retprobe_map_get_key
instead of just using the pid_tgid.

Signed-off-by: Kevin Sheldrake <[email protected]>
The return arguments can be buffers and these can reside in kernel or
userspace. This commit adds meta information to the return arguments
(both function return arg and any params that should be read on return)
so that the BPF programs know whether the data resides in kernel or user
memory.

Signed-off-by: Kevin Sheldrake <[email protected]>
The addition of code to optionally read memory from kernel or user
addresses caused the retkprobe program to exceed the 4096 instruction
limit required by some older kernels. This commit divides one retkprobe
program into two and connects them together with a tailcal.

Signed-off-by: Kevin Sheldrake <[email protected]>
In some raw_syscalls policies the selector indices are set to the
list index of the relevant argument instead of the actual index of the
relevant argument. This usually doesn't break anything because these are
the actual indices that are used in the config with the BPF programs.
There is a potential for error, however, as the code tries to match the
selector index number with an argument's index number; on a match it
rewrites the selector index with the argument's list index. Therefore if
the first argument to a function isn't listed in the args section then
an explicit list index in the selectors section could reference the
wrong argument.

In order to discourage the direct use of list indices, the examples and
test code have been changed so the selectors use the arg index and not
its list index.

Signed-off-by: Kevin Sheldrake <[email protected]>
@kevsecurity kevsecurity force-pushed the pr/kevsecurity/add-probe-read-kernel-str branch from b6c6b94 to 179404a Compare April 10, 2024 12:47
@kevsecurity kevsecurity requested a review from kkourt April 10, 2024 13:25
Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting the PR. FWIW, I still find it hard to navigate, but it's definitely better.
I'm adding some first comments. I did not go over the full PR and plan to do another pass soon.

static inline __attribute__((always_inline)) bool
is_userspace_data(unsigned long argm)
{
return (argm & ARGM_USERSPACE_DATA) != 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, just use argm & ARGM_USERSPACE_DATA. No need for != 0


if arg.SizeArgIndex > 0 {
if arg.SizeArgIndex > 15 {
return 0, fmt.Errorf("invalid SizeArgIndex value (>15): %v", arg.SizeArgIndex)
}
meta = int(arg.SizeArgIndex)
meta = meta | int(arg.SizeArgIndex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: meta |= ...

if arg.IsUserspaceData == nil {
// If not set in policy, use the default.
if userspaceDataDefault {
meta = meta | argUserspaceDataBit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: meta |= ...

} else {
// Otherwise, use the provided value.
if *arg.IsUserspaceData {
meta = meta | argUserspaceDataBit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: meta |= ...

}
// set MetaArg's upper half-word equal to the number of bytes we need to copy
out.MetaArg = out.MetaArg & 0xffff
out.MetaArg = out.MetaArg | (nbytes << 16)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a check and return an error if nbytes is cannot fit in the number of bits we have.

bpf/process/types/basic.h Show resolved Hide resolved
bpf/process/types/probe_read_kernel_or_user.h Show resolved Hide resolved
bpf/process/types/probe_read_kernel_or_user.h Show resolved Hide resolved
bpf/process/types/probe_read_kernel_or_user.h Show resolved Hide resolved
Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another pass of the patches I had not reviewed before. Thanks!

@@ -34,6 +34,7 @@ const (
argReturnCopyBit = 1 << 4
argMaxDataBit = 1 << 5
argUserspaceDataBit = 1 << 6
argRawSyscallsBit = 1 << 7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a different flag?

From what I can tell from the code the flag is used in

// the const_buf_type that represents an array of arguments for raw_syscalls
// is special, as the array contents are in kernel memory, but they point
// to userspace memory. In this case, they will be marked as userspace, but
// we actually want to read kernel memory. The raw_syscalls bit of the meta
// value indicates when the arg is special in this way.
probe_read_kernel_or_user_masked(args, size, 0x3ff, (char *)arg, userspace && !raw_syscalls);

And set in

metaTp, err := getMetaValue(specArg, syscall, rawSyscalls && specArg.Index == 5 && (specArg.Type == ""))

Cant' we just do

userspaceDataDefault = syscall &&  !(rawSyscalls && specArg.Index == 5 && (specArg.Type == ""))
getMetaValue(arg, userSpaceDataDefault)

And use only one flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to prepare for being able to filter on arguments in raw_syscalls, and therefore arrays of arguments which could potentially be pointers. I thought it was a bit much to include that in this PR, but I was already thinking ahead for a following PR. Can do as you suggest if you prefer though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do as you suggest if you prefer though.

Thanks! I would prefer to keep the code as simple as possible, even if we end up extending this for filtering.

I've also thought about filtering raw_syscalls.

One way would be to do as you suggest, and use the arguments but this seems tricky because we need to pass types for everything.

I think a better approach would be to do two-level filtering where we in raw_syscall event we to filter the type of the syscall and then we enable hooks on syscalls/ for filtering on the arguments. This seems simpler to me and it re-uses the arg filtering we already have. We would need to have a way to pass information from the raw_syscall hook to the specific syscall hook, but we already have something similar Jiri's notify enforcer action.

In any case, that's for another PR.

bpf/lib/generic.h Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants