forked from illumos/illumos-gate
LX libaudit clients cause kernel memory usage to expand until exhausted #366
arekinath added a commit to eait-itig/illumos-gate that referenced this issue on Jun 22, 2021: … expand until exhausted TritonDataCenter#367 LX libaudit wants LX_AUDIT_USER
The Linux `libaudit` library uses `NETLINK` sockets to send audit events. It opens a netlink socket, then calls `sendto()` to write the event out (see https://github.com/linux-audit/audit-userspace/blob/master/lib/netlink.c#L239) and then it checks for an ACK from the kernel.

When checking for an ACK, libaudit looks specifically for an `NLMSG_ERROR` message: https://github.com/linux-audit/audit-userspace/blob/master/lib/netlink.c#L287 It does `recvfrom()` with `MSG_PEEK` to look for the `NLMSG_ERROR` code, and then only does a real read without `MSG_PEEK` once it's seen it. If any other message arrives, it will never read from the netlink socket again.

Unfortunately, in e.g. `lx_netlink_au_um`, we just call `lx_netlink_reply`, which sets up an `NLMSG_DONE` message (with `NLM_F_MULTI` in the header). This is not the kind of ACK which libaudit is expecting, and so libaudit never actually reads from its netlink socket on LX.

When libaudit stops reading from the netlink socket, replies start to queue up. Normally this backpressure is handled by the layer calling the `su_recv` callback (e.g. the TCP/IP stack) -- you're meant to watch for `ENOSPC` from that socket upcall and set a flag to stop sucking in new messages until the downcall comes to tell you things are unblocked again.

Unfortunately, in `lx_netlink_reply_sendup`, after we call `su_recv`, that error is not acted on at all. End of function. We don't set any flags, we just increment a global counter (which is never read anywhere in the code). This means that we can accumulate replies on the socket queue of a netlink socket indefinitely.
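The flag-and-resume pattern described above can be sketched in plain user-space C. This is only an illustrative model, not the LX netlink code: every `fake_*` name is hypothetical, and the upcall here is a simplified stand-in that assumes the message is still queued when `ENOSPC` is reported.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the upper layer's receive queue. */
typedef struct {
	size_t queued;	/* bytes currently queued for the reader */
	size_t limit;	/* receive buffer limit */
} fake_upper_t;

/* Hypothetical stand-in for per-socket netlink state. */
typedef struct {
	fake_upper_t upper;
	bool flow_blocked;	/* set once the upcall reports ENOSPC */
	unsigned long held;	/* replies held back while blocked */
} fake_nlsock_t;

/*
 * Simplified model of a receive upcall (assumed semantics): the message
 * is queued, and ENOSPC is returned once the queue reaches its limit.
 */
static int
fake_recv_upcall(fake_upper_t *up, size_t len)
{
	up->queued += len;
	return (up->queued >= up->limit ? ENOSPC : 0);
}

/*
 * Send a reply upstream, honouring flow control. Returns true if the
 * reply was queued, false if it was held back because the reader has
 * not drained the socket yet.
 */
static bool
fake_reply_sendup(fake_nlsock_t *s, size_t len)
{
	if (s->flow_blocked) {
		s->held++;
		return (false);
	}
	if (fake_recv_upcall(&s->upper, len) == ENOSPC)
		s->flow_blocked = true;	/* stop until the downcall arrives */
	return (true);
}

/*
 * Model of the "things are unblocked" downcall: the reader drained
 * some bytes, so clear the flag if there is room again.
 */
static void
fake_rx_resume(fake_nlsock_t *s, size_t drained)
{
	if (drained > s->upper.queued)
		drained = s->upper.queued;
	s->upper.queued -= drained;
	if (s->upper.queued < s->upper.limit)
		s->flow_blocked = false;
}
```

With the flag in place, the amount of memory queued per socket is bounded by the receive limit plus at most one in-flight reply, instead of growing for as long as the reader stays silent.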
Now, this might not sound like a big deal: each netlink reply is ~20 bytes long, you may say, so we can accumulate an awful lot of them before this becomes a critical issue. Alas, in `lx_netlink_reply_msg` we always call `allocb()` with `lxns_bufsize`, which is set to 4096. Because of the header on the front, this actually results in an allocation from the `kmem_alloc_8192` cache. For each one of these replies on the queue we are setting aside a bit over 8k of memory.

What's even better is that amongst libaudit's clients is the ever-wonderful `systemd`. It runs for a very long time, and it produces one of these audit events every time a unit (service) changes state.

On a machine with ~150 LX zones running, I am currently allocating a bit over 1GB per day of these buffers due to systemd alone, which will persist until the machine or the zones are rebooted. Eventually, the kernel memory usage expands and pushes out ARC, causes `kmem_reap` to kick in, and the machine grinds to a halt and never recovers.

Netlink should be replying to audit requests with single-part `NLMSG_ERROR` responses to be compatible with real Linux, and the LX netlink code needs to correctly handle `ENOSPC` from `su_recv`.
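As a rough illustration of what a single-part `NLMSG_ERROR` reply looks like on the wire, here is a minimal user-space sketch. It is not the LX implementation: the `sk_*` names are hypothetical, the structures are local mirrors of the Linux netlink header and error payload layouts, and the numeric constants are copied from Linux's `<linux/netlink.h>` so that the sketch builds without Linux headers. Echoing the request's `nlmsg_pid` into the reply is a simplification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in Linux's <linux/netlink.h>, mirrored locally. */
#define	SK_NLMSG_ERROR	0x2
#define	SK_NLMSG_DONE	0x3
#define	SK_NLM_F_MULTI	0x2

/* Local mirror of the netlink message header layout. */
typedef struct {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
} sk_nlmsghdr_t;

/* Local mirror of the NLMSG_ERROR payload. */
typedef struct {
	int32_t error;		/* 0 means success, i.e. a plain ACK */
	sk_nlmsghdr_t msg;	/* header of the request being acknowledged */
} sk_nlmsgerr_t;

/*
 * Build a single-part ACK for `req` into `buf`. Returns the number of
 * bytes written, or 0 if the buffer is too small.
 */
static size_t
sk_build_ack(const sk_nlmsghdr_t *req, void *buf, size_t buflen)
{
	sk_nlmsghdr_t hdr;
	sk_nlmsgerr_t err;
	size_t total = sizeof (hdr) + sizeof (err);

	if (buflen < total)
		return (0);

	memset(&hdr, 0, sizeof (hdr));
	hdr.nlmsg_len = (uint32_t)total;
	hdr.nlmsg_type = SK_NLMSG_ERROR;
	hdr.nlmsg_flags = 0;		/* single part: no NLM_F_MULTI */
	hdr.nlmsg_seq = req->nlmsg_seq;	/* lets the client match the reply */
	hdr.nlmsg_pid = req->nlmsg_pid;	/* simplification, see lead-in */

	err.error = 0;
	err.msg = *req;

	memcpy(buf, &hdr, sizeof (hdr));
	memcpy((char *)buf + sizeof (hdr), &err, sizeof (err));
	return (total);
}

/*
 * The check a client like the one described above effectively applies:
 * only an NLMSG_ERROR message counts as the ACK.
 */
static int
sk_is_expected_ack(const void *buf, size_t len)
{
	sk_nlmsghdr_t hdr;

	if (len < sizeof (hdr))
		return (0);
	memcpy(&hdr, buf, sizeof (hdr));
	return (hdr.nlmsg_type == SK_NLMSG_ERROR);
}
```

A message built by `sk_build_ack()` passes `sk_is_expected_ack()`, while a header carrying `SK_NLMSG_DONE` with `SK_NLM_F_MULTI` does not, which mirrors the mismatch described in this issue.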