Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ETW event publishing #15

Merged
merged 46 commits into from
Aug 18, 2023
Merged

Conversation

kirbyUK
Copy link
Contributor

@kirbyUK kirbyUK commented Jul 23, 2023

Initial code to add ETW logging for plugins, as described in #3 here.

Here's a screenshot showing some ETW events logged from a plugin.

etw-log-screenshot

Design considerations

I have tried to keep the interface as close as possible to the existing DTrace ETW one, but doing so has caused a couple of design issues I'd like to talk over.

The first is that I implemented the variadic function using C++ templates, but they're not really suitable at all. Since the APIs object is global, and made in the driver, we can only define one log function with all the specific parameters, and all plugins have to call that. That is, if you wanted different parameters, you'd have to recompile and reinstall the driver. I think for this we're going to have to use C-style variadic functions, but that will diverge from the linked example at least slightly because we'll need some kind of number-of-fields parameter. I don't really see any way around it unless you can think of something else, however, or unless we stray much more from the existing DTrace style (e.g. use classes?).

Secondly, the code spins up the event - including all of the memory allocations needed, fires the event, then tears it all down again. It would definitely be more performant to not do this, but I'm not too sure what we'd cache providers on, or where. I haven't benchmarked how much of a difference it would make, it might not really be an issue.

As it stands at the moment, I have a recreation of the Microsoft example as a plugin. Give it a go and let me know what you think.

@google-cla
Copy link

google-cla bot commented Jul 23, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@stevemk14ebr
Copy link
Collaborator

stevemk14ebr commented Jul 24, 2023

This is simply put, excellent. Thank you so much for continuing your original PR and completing this!

I agree with you that templates are not suitable for this, we can probably port this to a printf style interface without too much trouble, at least at the interface boundary that Dlls call this must be done - we perhaps can keep the templates for child calls. As far as the caching, we could make a global vector that contains a structure that holds all of the fields needed to register a new provider. The user may be required to do a one time call to 'setupEtwTrace' or something like that with the necessary details which would dispatch to the driver where all the ETW structures and allocations occur, then the results are saved to said structure, and the structure finally saved into the global vector. Users could register multiple providers this way and refer to them later if necessary inside of the ETW trace call. All plugin Dll would therefore just need 'setupEtwTrace' and 'EtwTrace' calls to be exposed.

I should have some time to review this in depth within the next few weeks. Please sign the CLA when possible :)

C/STrace/Interface.h Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
@stevemk14ebr
Copy link
Collaborator

Thanks for the continued work! Let me know when you're ready for a second review 😄

@kirbyUK
Copy link
Contributor Author

kirbyUK commented Aug 10, 2023

I've got the two big changes - the C-style variadic functions and the caching - down now. I've still got a little bit more to do, but feel free to start having a look whenever you can if you'd like :)

I've had to use a few patterns I'm not a huge fan of for the caching, namely constructing objects in initialise functions because I can't throw exceptions, and using a destruct function because I can't use any destructors. This looks like it's just standard restrictions for a kernel driver if I'm not mistaken? I think given the circumstances they're fine, let me know if you have any thoughts.

@stevemk14ebr
Copy link
Collaborator

stevemk14ebr commented Aug 10, 2023

Yea you can't use destructors for global class objects because destructors of this type rely on the atexit registration performed inside of a C++ static initializer list which gets invoked by the runtime at process startup (for usermode binaries). We don't have a CRT so this would break. You can use destructors for stack or heap allocated object still though, as the destructor call is inserted inline by the compiler on object deletion/scope lifetime end.

I believe you can however use constructors still. You're correct exceptions are a no-no but you could initialize each pointer to nullptr, and if an allocation fails just ignore that failure. Each member function that uses a pointer that is set in the constructor would first branch to exit if any required pointer is null, basically defer constructor errors until later. I don't like this pattern at all though, you'll see I mostly use the initializer member function pattern myself.

EDIT: I've explicitly called out some changes I think we can do regarding destructors in my review

@stevemk14ebr stevemk14ebr self-requested a review August 10, 2023 16:56
C/STrace/EtwLogger.h Show resolved Hide resolved
C/STrace/EtwLogger.h Show resolved Hide resolved
C/STrace/EtwLogger.h Outdated Show resolved Hide resolved
//
// Unregister any registered ETW providers.
//
for (auto i = 0; i < g_ProviderCache.len(); i++)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be sufficient to just call g_ProviderCache.Destruct as it handles destruction of it's child objects. You should be able to use the destructors rather than a Destruct call for the contained objects. The only thing that needs to not use normal C++ destructors is Global classes, which is only this cache vector itself.

Copy link
Contributor Author

@kirbyUK kirbyUK Aug 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very bizarrely the commit 8fc0611 breaks the code - EventViewer can't decode the events anymore. What's even weirder though is that I put a breakpoint on the destructor and it's not even called? Will need to investigate, but reverted for now

C/STrace/EtwLogger.cpp Outdated Show resolved Hide resolved
C/STrace/EtwLogger.cpp Show resolved Hide resolved
C/STrace/EtwLogger.cpp Outdated Show resolved Hide resolved
C/STrace/EtwLogger.cpp Outdated Show resolved Hide resolved
C/STrace/EtwLogger.cpp Show resolved Hide resolved
C/STrace/EtwLogger.cpp Outdated Show resolved Hide resolved
@stevemk14ebr
Copy link
Collaborator

stevemk14ebr commented Aug 17, 2023

I took a look at all the destructor stuff. I think I've got it in a working state, can you try this and let me know if it works for you?

git apply destructors.patch

destructors.patch

If this is ok and works for you then I think we're ready to merge ;)

EDIT: How did you get traceview to decode the message field? I get a decode error when I select Auto. I setup my trace to use the GUID d7827ef0-cc9e-4b7c-a322-be5280ff3622

image

EDIT 2: Oh weird ok this breaks decoding too I see what you're saying. Hmm, maybe we just ignore it until I can figure this out

@stevemk14ebr
Copy link
Collaborator

This works so I'm merging the bulk of it now. We can open additional PRs for adjustments (such as maybe destructors) later.

@stevemk14ebr stevemk14ebr merged commit 6f4394b into mandiant:main Aug 18, 2023
2 checks passed
@stevemk14ebr
Copy link
Collaborator

Really nice work on this one @kirbyUK ! Future work includes switching out the textual logger system entirely for this ETW provider mechanism and I'll probably eventually write some sort of frontend usermode UI to consume this data nicely. We still need to figure out ETW event consumption parsing iirc yes?

fyi:
https://twitter.com/stevemk14ebr/status/1692552935421386818?s=20

@stevemk14ebr
Copy link
Collaborator

I figured out the destructor problem. When EtwTrace called push_back to put the new EtwProvider into the vector the temporary was being destructed. This called EtwUnregister which broke tracing. There was a minor bug in FindProvider as well that caused the compiler to erase the call entirely as it thought it was a no-op so this provider was constantly re-made and destroyed. It's a miracle it worked ;)

@kirbyUK
Copy link
Contributor Author

kirbyUK commented Aug 20, 2023

Hi, sorry for the radio silence on this - I was away for the weekend with no mobile signal!

Thanks a lot for taking a look at the destructors, sorry about the initial bug. I've been hit by the problem of returning temporaries quite a few times in the past weeks, I should have spotted it! 😅

You're right in that we still need to figure out event decoding. I popped tdh.dll up in IDA once and when MS say it's a userspace library they really mean it, there's barely any kernel calls there and it's quite complicated! Now that I know about actually sending events I have a better idea on how they work though, so that could provide a decent start.

Thanks again!

@stevemk14ebr
Copy link
Collaborator

Looks like TDH.dll has symbols though ;)

You can pull the PDB using https://twitter.com/stevemk14ebr/status/1694390773435805899?s=20

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants