Sysmon throwing many SysmonError drop events with EventID of "QUEUE" #813

Open
branchnetconsulting opened this issue Jun 24, 2024 · 10 comments

Comments

@branchnetconsulting

Across many client sites and a variety of Windows versions where we run Sysmon 15.14, we are seeing many Sysmon error events (255) with this description:
Events dropped from driver queue...
This has been happening on dozens of Windows systems, both end-user and server systems, including at least Windows 11 and Server 2019, and perhaps others as well. We have used Sysmon broadly for years, and before upgrading to 15.14 we do not recall ever seeing "Events dropped" complaints from Sysmon.
Individual systems frequently produce sustained multi-hour bursts of these error events at more than 100 per second, and we are concerned that the issue may be degrading system performance. During such a burst, the affected Windows system keeps producing other, normal Sysmon events at normal low volumes.

We use a lightly customized version of the "balanced" config from the Sysmon Modular project.
Any advice on how to further diagnose or treat this issue would be sincerely appreciated.

Kevin Branch

@wzr

wzr commented Aug 22, 2024

Same here, on v15.12. It seems to happen mostly on idle machines with no user logged in. A high number of the hosts with this error are recently installed laptops that stay plugged into the network, without any human interaction, until the users pick them up.

@foxmsft
Collaborator

foxmsft commented Aug 22, 2024

v15.1 is the first version that prints that message. That error is an FYI; before that, it was silently ignored.

@wzr

wzr commented Aug 22, 2024

The immediate challenge is that it is very chatty. We get upwards of 20M events/day for machines with the issue (~20-30 out of ~6600), and in some cases up to 200M events/day. We can adjust the WEC XPath to ignore all EventID 255 in the subscription, but that creates a blind spot for actual Sysmon issues, and the errors still add a lot of backpressure in the Sysmon event log.
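One way to narrow that blind spot might be to suppress only the queue-drop flavor of EventID 255 rather than every 255 event. The Python sketch below just assembles such a subscription query as XML. It rests on assumptions not confirmed in this thread: that the error event carries an EventData field named `ID` whose value is `QUEUE` (inferred from the issue title), that the channel is the default `Microsoft-Windows-Sysmon/Operational`, and that the subscription accepts a `Suppress` element. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: Sysmon logs to its default channel.
CHANNEL = "Microsoft-Windows-Sysmon/Operational"


def build_subscription_query(error_id: str = "QUEUE") -> str:
    """Build a query that selects everything except one kind of error 255.

    Assumption: event 255 exposes the error kind as EventData/Data[@Name='ID'].
    """
    suppress_xpath = (
        "*[System[(EventID=255)]] and "
        f"*[EventData[Data[@Name='ID']='{error_id}']]"
    )
    query_list = ET.Element("QueryList")
    query = ET.SubElement(query_list, "Query", {"Id": "0", "Path": CHANNEL})
    # Forward every Sysmon event...
    ET.SubElement(query, "Select", {"Path": CHANNEL}).text = "*"
    # ...except 255 events of the given kind; other 255 errors still pass.
    ET.SubElement(query, "Suppress", {"Path": CHANNEL}).text = suppress_xpath
    return ET.tostring(query_list, encoding="unicode")


if __name__ == "__main__":
    print(build_subscription_query())
```

This only filters at collection time; it would not reduce the volume written to the local Sysmon event log, so the backpressure concern remains.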

@foxmsft
Collaborator

foxmsft commented Aug 22, 2024

My initial assumption was that the default event queue size would be "enough for any number of events". I'm not that concerned about it being chatty, as that can be tuned.

I'm more interested in why it's dropping the events, and avoiding it altogether. Can you please send me an email at hotmail? I want to follow up.

@branchnetconsulting
Author

Our concern is that the massive volume we are seeing of these errors on any given host does not appear to correlate with an abnormally high volume of common Sysmon events on the same host. It's almost like once triggered, Sysmon reporting of this warning message starts thrashing even when the system is not producing enough events to reasonably stress the event queue. Either that or somehow the queue is getting into a broken state where even low levels of incoming messages still get blocked. I would be happy to give further feedback on this. My email is [email protected].

@wzr

wzr commented Aug 23, 2024

In my case, it seems that ProcessAccess and RegistryEvent are (I assume) the busy queues.
These are the numbers across a sample of 100 computers, for just 24 hours:

| Description | Count |
| --- | ---: |
| Events dropped from driver queue: ProcessAccess:1 | 837351817 |
| Events dropped from driver queue: ProcessAccess:2 | 225073628 |
| Events dropped from driver queue: ProcessAccess:3 | 50324386 |
| Events dropped from driver queue: ProcessAccess:4 | 17270094 |
| Events dropped from driver queue: RegistryEvent:1 | 15456089 |
| Events dropped from driver queue: ProcessAccess:5 | 8366108 |
| Events dropped from driver queue: ProcessAccess:1 RegistryEvent:1 | 7699987 |
| Events dropped from driver queue: ProcessAccess:6 | 5328869 |
| Events dropped from driver queue: ProcessAccess:7 | 3862218 |
| Events dropped from driver queue: ProcessAccess:8 | 2918265 |
| Events dropped from driver queue: ProcessAccess:9 | 2355140 |
| Events dropped from driver queue: ProcessAccess:2 RegistryEvent:1 | 2071065 |
| Events dropped from driver queue: ProcessAccess:10 | 1968974 |
| Events dropped from driver queue: ProcessAccess:11 | 1529965 |
| Events dropped from driver queue: ProcessAccess:12 | 1249597 |
| Events dropped from driver queue: ProcessAccess:13 | 1033853 |
| Events dropped from driver queue: RegistryEvent:2 | 957377 |

[...]
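For anyone wanting to run the same breakdown on their own data, here is a minimal Python sketch that rolls descriptions like the ones above up into per-event-type totals. It assumes that the number after each event type (for example `ProcessAccess:3`) is how many events of that type were dropped in that single report; nothing in this thread confirms that reading. The function names are hypothetical.

```python
import re
from collections import Counter

PREFIX = "Events dropped from driver queue:"
# Matches "EventType:N" pairs such as "ProcessAccess:3".
PAIR = re.compile(r"(\w+):(\d+)")


def parse_description(description: str) -> dict:
    """Split one error description into {event_type: number}."""
    if not description.startswith(PREFIX):
        raise ValueError(f"unexpected description: {description!r}")
    tail = description[len(PREFIX):]
    return {name: int(number) for name, number in PAIR.findall(tail)}


def summarize(rows):
    """Aggregate (description, count) rows per event type.

    Returns two Counters: how many error reports mention each type, and
    the estimated dropped events per type (number * count), under the
    assumption that the number is a per-report drop count.
    """
    reports, dropped = Counter(), Counter()
    for description, count in rows:
        for name, number in parse_description(description).items():
            reports[name] += count
            dropped[name] += number * count
    return reports, dropped


if __name__ == "__main__":
    sample = [
        ("Events dropped from driver queue: ProcessAccess:1", 837351817),
        ("Events dropped from driver queue: RegistryEvent:1", 15456089),
        ("Events dropped from driver queue: ProcessAccess:1 RegistryEvent:1", 7699987),
    ]
    reports, dropped = summarize(sample)
    for name in reports:
        print(name, reports[name], dropped[name])
```

Feeding it the full export rather than the truncated table would show how strongly ProcessAccess dominates relative to RegistryEvent.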

@wzr

wzr commented Aug 30, 2024

@foxmsft I emailed you at your GitHub username at hotmail. Did it ever reach you?

> My initial assumption was that the default event queue size would be "enough for any number of events". I'm not that concerned about it being chatty, as that can be tuned.
>
> I'm more interested in why it's dropping the events, and avoiding it altogether. Can you please send me an email at hotmail? I want to follow up.

@wzr

wzr commented Dec 2, 2024

@branchnetconsulting Any luck on your end with this?

@foxmsft
Collaborator

foxmsft commented Dec 2, 2024

Not yet, the ball is in my court.

@wzr

wzr commented Dec 2, 2024

@foxmsft @branchnetconsulting We haven't been able to identify the root cause, but all of the hosts that are producing queue errors, seemingly at random, appear to be recently installed end-user devices running Windows 11. We can reproduce it with reasonable confidence, on roughly 7-8 out of 10 devices.
