LX libaudit clients cause kernel memory usage to expand until exhausted #366

arekinath opened this issue Jun 3, 2021 · 0 comments

The Linux libaudit library uses NETLINK sockets to send audit events. It opens a netlink socket, then calls sendto() to write the event out (see https://github.com/linux-audit/audit-userspace/blob/master/lib/netlink.c#L239) and then it checks for an ACK from the kernel.

When checking for an ACK, libaudit looks specifically for an NLMSG_ERROR message (see https://github.com/linux-audit/audit-userspace/blob/master/lib/netlink.c#L287). It calls recvfrom() with MSG_PEEK to look for the NLMSG_ERROR type, and only does a real read without MSG_PEEK once it has seen one. If any other message arrives, it will never read from the netlink socket again.
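
To make the peek-then-read behaviour concrete, here is a minimal userland sketch of that pattern. It is not libaudit's code: it uses an AF_UNIX datagram socketpair instead of a real netlink socket, and `demo_nlhdr`, `demo_check_ack`, `demo_send` and `demo_drain` are hypothetical names invented for illustration. The `DEMO_*` constants are assumed to carry the same values as the Linux `NLMSG_ERROR`, `NLMSG_DONE` and `NLM_F_MULTI` definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the netlink message header layout. */
struct demo_nlhdr {
	uint32_t len;
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

#define	DEMO_NLMSG_ERROR	2	/* assumed equal to NLMSG_ERROR */
#define	DEMO_NLMSG_DONE		3	/* assumed equal to NLMSG_DONE */
#define	DEMO_NLM_F_MULTI	0x2	/* assumed equal to NLM_F_MULTI */

/*
 * Peek at the message at the head of the queue. Consume it only if it
 * is the ACK type; otherwise leave it queued. Returns 1 if an ACK was
 * consumed, 0 if something else (or nothing) is at the head.
 */
static int
demo_check_ack(int fd)
{
	struct demo_nlhdr hdr;
	ssize_t n;

	assert(fd >= 0);
	n = recv(fd, &hdr, sizeof (hdr), MSG_PEEK | MSG_DONTWAIT);
	if (n < (ssize_t)sizeof (hdr) || hdr.type != DEMO_NLMSG_ERROR)
		return (0);
	n = recv(fd, &hdr, sizeof (hdr), MSG_DONTWAIT);
	return (n == (ssize_t)sizeof (hdr));
}

/* Queue one header-only message of the given type on fd. */
static int
demo_send(int fd, uint16_t type, uint16_t flags)
{
	struct demo_nlhdr hdr;

	assert(fd >= 0);
	memset(&hdr, 0, sizeof (hdr));
	hdr.len = sizeof (hdr);
	hdr.type = type;
	hdr.flags = flags;
	if (send(fd, &hdr, sizeof (hdr), 0) != (ssize_t)sizeof (hdr))
		return (-1);
	return (0);
}

/* Read and discard everything queued on fd; return the message count. */
static int
demo_drain(int fd)
{
	struct demo_nlhdr hdr;
	int count = 0;

	while (recv(fd, &hdr, sizeof (hdr), MSG_DONTWAIT) > 0)
		count++;
	return (count);
}
```

If the sender only ever queues `DEMO_NLMSG_DONE` messages, `demo_check_ack()` returns 0 every time and nothing is ever removed from the queue, which is the situation described for LX.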

Unfortunately, in e.g. lx_netlink_au_um, we just call lx_netlink_reply, which sets up an NLMSG_DONE message (with NLM_F_MULTI in the header). This is not the kind of ACK which libaudit is expecting, and so libaudit never actually reads from its netlink socket on LX.

When libaudit stops reading from the netlink socket, replies start to queue up. Normally this backpressure is handled by the layer calling the su_recv callback (e.g. the TCP/IP stack) -- you're meant to watch for ENOSPC from that socket upcall and set a flag to stop sucking in new messages until the downcall comes to tell you things are unblocked again.

Unfortunately, in lx_netlink_reply_sendup, after we call su_recv, we have:

	if (error != 0)
		lx_netlink_flowctrld++;

And that's it. End of function. We don't set any flags, we just increment a global counter (which is never read anywhere in the code). This means that we can accumulate replies on the socket queue of a netlink socket indefinitely.
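
For contrast, here is a rough userland model of the flag-based backpressure described above. It is a sketch, not a patch: `demo_sock`, `demo_su_recv`, `demo_reply_sendup` and `demo_clr_flowctrl` are hypothetical stand-ins for the real socket state, upcall and downcall. It assumes the upcall still accepts the message that fills the queue and reports ENOSPC alongside it, and it refuses further replies while flow-controlled (blocking the sender would be another option).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-socket state; not the real lx_netlink structures. */
struct demo_sock {
	size_t	queued;		/* replies sitting on the receive queue */
	size_t	queue_limit;	/* depth at which the upcall reports ENOSPC */
	bool	flowctrld;	/* set while the consumer is not draining */
	size_t	refused;	/* replies not queued while flow-controlled */
};

/* Stand-in for the su_recv upcall: queue the reply, ENOSPC once full. */
static int
demo_su_recv(struct demo_sock *s)
{
	s->queued++;
	return (s->queued >= s->queue_limit ? ENOSPC : 0);
}

/*
 * Send one reply upstream unless the socket is flow-controlled.
 * Returns 0 if the reply was queued, EAGAIN if it was refused.
 */
static int
demo_reply_sendup(struct demo_sock *s)
{
	assert(s != NULL);
	if (s->flowctrld) {
		s->refused++;
		return (EAGAIN);
	}
	if (demo_su_recv(s) == ENOSPC)
		s->flowctrld = true;	/* remember it, don't just count it */
	return (0);
}

/* Stand-in for the downcall made once the consumer has drained the queue. */
static void
demo_clr_flowctrl(struct demo_sock *s)
{
	assert(s != NULL);
	s->queued = 0;
	s->flowctrld = false;
}
```

With a limit of 4, a hundred calls to `demo_reply_sendup()` leave 4 replies queued rather than 100. With only a counter and no flag, all 100 would be queued.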

Now, this might not sound like a big deal: each netlink reply is ~20 bytes long, so you might say we can accumulate an awful lot of them before this becomes a critical issue. Alas, in lx_netlink_reply_msg we always call allocb() with lxns_bufsize, which is set to 4096. Because of the header on the front, this actually results in an allocation from the kmem_alloc_8192 cache. For each one of these replies on the queue, we are setting aside a bit over 8k of memory.
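
The rounding effect can be illustrated with a small sketch. The cache size list below is an assumed, abbreviated subset (the real allocator has many more size classes), and the 64-byte header used in the usage note is a made-up figure; the only point is that anything even slightly larger than 4096 lands in the next cache up.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed, abbreviated list of cache sizes around the size of interest. */
static const size_t demo_cache_sizes[] = { 2048, 4096, 8192, 16384 };

/* Return the smallest listed cache size that fits len, or 0 if none. */
static size_t
demo_cache_for(size_t len)
{
	size_t n = sizeof (demo_cache_sizes) / sizeof (demo_cache_sizes[0]);
	size_t i;

	assert(len > 0);
	for (i = 0; i < n; i++) {
		if (len <= demo_cache_sizes[i])
			return (demo_cache_sizes[i]);
	}
	return (0);
}
```

Here `demo_cache_for(4096)` picks the 4096 cache, while `demo_cache_for(4096 + 64)` picks 8192, so a ~20 byte reply ends up pinning roughly 8k.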

What's even better is that amongst libaudit's clients is the ever-wonderful systemd. It runs for a very long time, and it produces one of these audit events every time a unit (service) changes state.

On a machine with ~150 LX zones running, I am currently allocating a bit over 1GB per day of these buffers due to systemd alone, which will persist until the machine or the zones are rebooted. Eventually, the kernel memory usage expands and pushes out ARC, causes kmem_reap to kick in, and the machine grinds to a halt and never recovers.
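
As a back-of-the-envelope check on those numbers, the following sketch turns a leak rate into a buffer count. It treats "a bit over 1GB" as exactly 1 GiB and "a bit over 8k" as exactly 8192 bytes, so the result is only a rough estimate; `demo_buffers_per_day` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leaked buffers per day for a given leak rate and buffer size. */
static uint64_t
demo_buffers_per_day(uint64_t bytes_per_day, uint64_t bytes_per_buffer)
{
	assert(bytes_per_buffer != 0);
	return (bytes_per_day / bytes_per_buffer);
}
```

Under those assumptions, `demo_buffers_per_day(1ULL << 30, 8192)` gives 131072 buffers per day, or a little under 900 per zone per day across 150 zones. That is roughly one unread reply per zone every 100 seconds, which seems plausible for unit state changes.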

Netlink should be replying to audit requests with single-part NLMSG_ERROR responses to be compatible with real Linux, and the LX netlink code needs to correctly handle ENOSPC from su_recv.
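
As a sketch of what a single-part NLMSG_ERROR response looks like on the wire: a header of type NLMSG_ERROR with no multi-part flag, followed by an error code and a copy of the request header. The code below is illustrative userland C, not the LX implementation; `demo_nlhdr`, `demo_nlerr` and `demo_build_ack` are hypothetical names, with layouts assumed to mirror the Linux `struct nlmsghdr` and `struct nlmsgerr`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the netlink message header layout. */
struct demo_nlhdr {
	uint32_t len;
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* Assumed ACK payload layout: error code, then the request's header. */
struct demo_nlerr {
	int32_t error;
	struct demo_nlhdr req;
};

#define	DEMO_NLMSG_ERROR	2	/* assumed equal to NLMSG_ERROR */

/*
 * Fill buf with a single-part ACK for req. An error of 0 means success.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t
demo_build_ack(const struct demo_nlhdr *req, int32_t error, void *buf,
    size_t buflen)
{
	struct demo_nlhdr hdr;
	struct demo_nlerr err;
	size_t total = sizeof (hdr) + sizeof (err);

	assert(req != NULL && buf != NULL);
	if (buflen < total)
		return (0);

	memset(&hdr, 0, sizeof (hdr));
	hdr.len = (uint32_t)total;
	hdr.type = DEMO_NLMSG_ERROR;
	hdr.flags = 0;			/* single-part: no multi flag */
	hdr.seq = req->seq;		/* echo the request's sequence */
	hdr.pid = req->pid;

	err.error = error;
	err.req = *req;

	memcpy(buf, &hdr, sizeof (hdr));
	memcpy((char *)buf + sizeof (hdr), &err, sizeof (err));
	return (total);
}
```

A reply built this way has the NLMSG_ERROR type in its header, which is what the libaudit ACK check described above looks for.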

arekinath added a commit to eait-itig/illumos-gate that referenced this issue Jun 22, 2021
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Apr 26, 2022
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue May 16, 2022
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Mar 27, 2023
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue May 2, 2023
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Nov 21, 2023
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Dec 4, 2023
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Dec 12, 2023
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Jun 23, 2024
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Jun 24, 2024
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
arekinath added a commit to eait-itig/illumos-gate that referenced this issue Nov 6, 2024
… expand until exhausted

TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER