Understanding Null Pointer Dereference in Windows Kernel Drivers
In this blog post, we’ll explore one of the classic yet dangerous bugs: null pointer dereference. We’ll break down what it really means, build a custom vulnerable driver, and see firsthand how it can bring down an entire Windows system with a blue screen of death (BSOD).

Introduction

A null pointer dereference happens when a driver tries to access memory through a pointer that hasn’t been properly initialized, usually pointing to address 0x0. In user mode, this might just crash an app, but in kernel mode, it’s a lot more serious. Since the kernel operates with full system privileges with limited error handling, dereferencing a null pointer can trigger a blue screen of death (BSOD) and bring down the entire system. These bugs often slip through when developers assume a pointer is valid without checking, making them both common and dangerous.

Simple NULL Pointer Dereference Vulnerability (Video Buffer Simulation)

We’re looking at a small, simple but deadly custom driver that pretends to be a graphics/video driver. It randomly fails to allocate a “video buffer” and then blindly writes a 0xDEADBEEF magic value, even if the buffer is NULL! We intentionally crash the system (BSOD) for fun, and if you open a debugger, you’ll spot the famous DEADBEEF pattern in memory.

The vulnerability arises because the driver attempts to write 0xDEADBEEF to a video buffer without verifying if the memory allocation succeeded. If the buffer allocation fails and returns NULL, this write will cause a NULL pointer dereference, leading to an instant system crash (BSOD).

Simple BSOD with Null Pointer Dereference in a Custom Driver

In my custom driver, I’ve implemented a vulnerable IOCTL handler that simulates a simple video buffer allocation. One of the vulnerabilities involves a NULL pointer dereference triggered when the allocation randomly fails and the driver blindly writes to the NULL pointer.
This isn’t a full exploitation write-up, just a demonstration of how careless memory handling in drivers can crash the system. We’ll explore advanced exploitation paths in future posts.

The Windows Debugger shows a crash in the DispatchIoctl function of my custom driver, specifically at the instruction mov dword ptr [rdi], 0xDEADBEEF, where RDI is NULL. This confirms a classic NULL pointer dereference, as the kernel attempts to write to address 0x0, causing a SYSTEM_THREAD_EXCEPTION_NOT_HANDLED BSOD. This happens because the video buffer allocation failed and the pointer remained NULL, but the driver blindly wrote to it, leading to the crash.

Null Pointer Dereference in a Custom TCP-like Windows Kernel Driver

This is a null pointer dereference vulnerability embedded in a custom Windows kernel driver that mimics processing of TCP-like network packets. We created our own structure, TCP_HEADER, which includes a field named PayloadPointer, intended to represent a pointer to the packet’s actual data. The vulnerability arises because the driver assumes that this pointer is always valid, without performing any null or sanity checks. If a malicious user-mode application crafts a TCP_HEADER with PayloadPointer set to NULL and passes it to the driver, the kernel will blindly attempt to access *(NULL).

Simple BSOD with Null Pointer Dereference in a Custom Driver

In my custom driver, I’ve implemented a vulnerable IOCTL handler that simulates TCP packet parsing. One of the vulnerabilities involves a null pointer dereference triggered by sending a TCP-like structure with a PayloadPointer set to NULL. This isn’t a full exploitation write-up, just a demonstration of how a malformed user-supplied packet can crash the kernel. We’ll explore advanced exploitation paths in future posts.

The Windows Debugger shows a crash in the DeviceIoControlHandler function, specifically at the instruction movzx edx, byte ptr [rax], where rax is NULL.
This confirms a classic null pointer dereference, as the kernel tries to read from address 0x0, leading to a SYSTEM_THREAD_EXCEPTION_NOT_HANDLED BSOD. In kernel mode, this doesn’t just crash the app; it takes down the whole system.

Null Pointer Dereference in Custom EDR Driver: File Path Vulnerability

This is a null pointer dereference vulnerability in a custom-built Windows kernel driver that simulates how an EDR (Endpoint Detection and Response) component might scan executable files provided by user mode. The vulnerability arises when the driver blindly trusts a user-supplied pointer to a file path without checking if it’s valid.

The driver receives a FILE_SCAN_REQUEST structure from user mode. This structure includes a pointer (Filename) and a length (FilenameLength). The idea is to copy the filename string into a local kernel buffer (scanBuffer) so the driver can inspect or scan the file for threats. The vulnerable line performs that copy straight from the user-supplied structure. The problem? The driver never checks whether request->Filename is a valid, non-NULL pointer. If user mode sends a NULL here, the driver blindly dereferences 0x0.

Simple BSOD with Null Pointer Dereference in a Custom EDR Driver

I wrote a user-mode tool that prompts for a file path and sends it to the kernel via DeviceIoControl. If I provide a legitimate path, the driver attempts to scan the file normally. But if I just hit Enter without typing anything, the tool sends a NULL pointer. The CPU raises a page fault, and Windows responds with a BSOD. This is a classic null pointer dereference, often caused by developers assuming the user-supplied pointer is always valid.
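The missing check can be sketched in plain user-mode C. This is only an illustration, not the driver's code: the FILE_SCAN_REQUEST layout, the status values, and the copy_filename_checked helper are assumptions made for the example, since the post does not show the real definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-mode stand-in for the post's FILE_SCAN_REQUEST;
   the real structure layout is not shown in the post. */
typedef struct {
    const char *Filename;
    size_t      FilenameLength;
} FILE_SCAN_REQUEST;

/* Illustrative status codes (not NTSTATUS values). */
enum {
    SCAN_OK                = 0,
    SCAN_INVALID_PARAMETER = -1,
    SCAN_BUFFER_TOO_SMALL  = -2
};

/* Copy the filename into scanBuffer only after validating the request. */
int copy_filename_checked(const FILE_SCAN_REQUEST *request,
                          char *scanBuffer, size_t scanBufferSize)
{
    if (request == NULL || scanBuffer == NULL || scanBufferSize == 0)
        return SCAN_INVALID_PARAMETER;

    /* The check the vulnerable driver skips: reject a NULL Filename. */
    if (request->Filename == NULL)
        return SCAN_INVALID_PARAMETER;

    /* Leave room for the terminator so the copy cannot overrun the buffer. */
    if (request->FilenameLength >= scanBufferSize)
        return SCAN_BUFFER_TOO_SMALL;

    memcpy(scanBuffer, request->Filename, request->FilenameLength);
    scanBuffer[request->FilenameLength] = '\0';
    return SCAN_OK;
}
```

Passing a request whose Filename is NULL now returns SCAN_INVALID_PARAMETER instead of faulting. In a real driver a NULL check alone would not be enough: a non-NULL user pointer can still be invalid, so the pointer would normally also be probed (for example with ProbeForRead inside a __try/__except block) and the data captured before use.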
In older Windows versions (XP, Vista, 7), null pointer dereference vulnerabilities in kernel drivers could be exploited by mapping the NULL page (0x0) from user mode and placing attacker-controlled data there. If the kernel dereferenced a null pointer, this would lead to arbitrary code execution in ring 0, enabling full system compromise. Starting with Windows 8, Microsoft mitigated this entire class of bugs by blocking NULL page allocation and disabling NTVDM by default. NTVDM (used for running 16-bit apps on x86 systems) previously allowed NULL page mapping, which attackers abused to revive this old technique on Windows 10 x86. Today, these mitigations effectively neutralize most null dereference exploits in modern Windows systems.

Conclusion

In this post,
Understanding Arbitrary Access Primitives in Windows Kernel
In this blog post, we will explore some of the most powerful and commonly abused vulnerabilities in kernel mode: arbitrary access primitives. From reading kernel memory and hijacking execution flows, to directly interacting with physical memory or model-specific registers (MSRs), each of these primitives opens doors to high-impact, post-exploitation techniques. Whether you’re writing an exploit, doing rootkit research, or reverse-engineering drivers, understanding these vulnerabilities is essential.

There are five key types of arbitrary access vulnerabilities we’ll explore in this series: arbitrary read, arbitrary write, physical read, physical write, and MSR read. Each offers unique capabilities for kernel exploitation and post-exploitation.

Arbitrary Read

Arbitrary read allows an attacker to read memory from any address in kernel space. By supplying a user-controlled pointer, the attacker makes the kernel read from that location and return the data. This can be used to leak kernel base addresses, tokens, or function pointers. It’s typically the first step toward bypassing KASLR or escalating privileges.

To demonstrate this vulnerability, we created a custom vulnerable driver exposing an IOCTL_ARBITRARY_READ operation. The handler is vulnerable because it blindly reads from a user-supplied kernel address (rw->Address) without validating it first. If the pointer is invalid or points to sensitive memory, it may leak kernel information or crash the system.

Exploiting Arbitrary Read Vulnerability

Using WinDbg (the lm command lists loaded modules and their base addresses), we identify the base address of ntoskrnl.exe as fffff80425ea5000. By passing this address to our arbitrary read PoC, we successfully leak the MZ header (0x905a4d), confirming a valid kernel memory read. This demonstrates the ability to leak kernel pointers, a crucial step for bypassing KASLR. This leak proves that our vulnerability allows reading any memory in the kernel address space, a powerful primitive for further exploitation.
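To see why the leaked value reads as 0x905a4d, it helps to decode the first bytes of a PE image by hand. The sketch below assumes the usual DOS header start (4D 5A 90 00) and uses two hypothetical helpers, read_le32 and looks_like_pe_image; it is a user-mode illustration and does not touch kernel memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble a little-endian 32-bit value from four raw bytes,
   independent of the host's own byte order. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* A PE image begins with the ASCII characters 'M' 'Z' (0x4D 0x5A). */
int looks_like_pe_image(const uint8_t *p, size_t len)
{
    return len >= 2 && p[0] == 'M' && p[1] == 'Z';
}
```

Feeding the bytes 4D 5A 90 00 to read_le32 yields 0x00905A4D, which matches the value printed by the PoC once the leading zero byte is dropped. A check like looks_like_pe_image is a quick way for a PoC to confirm that a leaked address really is an image base.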
Bonus Tip

Arbitrary read vulnerabilities often involve functions like RtlCopyMemory, memcpy, or memmove, where a driver copies data from a user-supplied kernel address without validation. Safer APIs like MmCopyMemory exist but are rarely misused. The root cause is usually the absence of checks like ProbeForRead or MmIsAddressValid, allowing attackers to read sensitive kernel memory. These bugs typically surface in DeviceIoControl handlers that directly trust user input like rw->Address.

Arbitrary Write

Arbitrary write lets a userland attacker overwrite any memory address in kernel space. It is often used to hijack execution, such as overwriting function pointers or token privileges. If the attacker knows what to write and where, they can gain full system access. Combined with read, it’s a devastating primitive for kernel exploitation.

To demonstrate this vulnerability, we created a custom vulnerable driver exposing an IOCTL_ARBITRARY_WRITE operation. This IOCTL handler implements an arbitrary write vulnerability by directly writing a user-supplied value (rw->Value) to a user-specified kernel address (rw->Address) without validating the pointer. The lack of access checks allows attackers to overwrite sensitive kernel structures, potentially leading to privilege escalation or system instability.

Exploiting the Arbitrary Write Vulnerability

To demonstrate the power of an arbitrary write vulnerability, we used WinDbg to locate the second entry in the HalDispatchTable. We chose HalDispatchTable only for the purpose of the demo; any other kernel address you want to overwrite would work the same way. Now, we will run our PoC to perform an arbitrary write. We’ll attempt to write the value 0x4141414 to the target kernel address fffff804262cd254. After executing the PoC, we can confirm that the value at this address has been successfully overwritten.

MSR Read

MSR (model-specific register) read vulnerabilities expose critical CPU-level settings.
By using a vulnerable driver that allows arbitrary RDMSR calls, attackers can extract values like IA32_LSTAR (which stores the kernel’s syscall entry point). This breaks KASLR and can bypass syscall hooking mechanisms, making it a powerful primitive in both EDR evasion and advanced kernel exploitation.

This driver uses __readmsr(msr->MsrId) to read from a Model-Specific Register (MSR) based on a user-supplied ID and returns the result to user mode via msr->Value. MSRs store critical CPU configuration data, including pointers to kernel functions. Without validating the MSR ID, this gives attackers access to privileged information. Registers like IA32_LSTAR or IA32_SYSENTER_EIP can reveal kernel base addresses, enabling KASLR bypass.

Exploiting MSR Read Vulnerability

In this PoC, we demonstrate a classic exploitation technique by leaking the value of the IA32_LSTAR MSR (Model-Specific Register) located at 0xC0000082. This register holds the address of the kernel’s SYSCALL entry point, typically pointing to the nt!KiSystemCall64 function within ntoskrnl.exe. By reading its value from user mode via an MSR read vulnerability, we effectively bypass Kernel Address Space Layout Randomization (KASLR), a crucial Windows security mechanism. To confirm the leak in WinDbg, use !address 0xfffff804261f8180 and ensure the leaked address falls within the memory range of ntoskrnl.exe.

Physical Read & Physical Write

Physical memory access vulnerabilities let attackers bypass virtual memory protections to read or write raw RAM directly. With physical read, one can inspect memory-mapped devices, firmware, or hidden kernel structures, which is useful for uncovering secrets or debugging hardware-level code. Physical write is even more potent, allowing direct tampering with hardware registers or kernel memory, potentially disabling security features or planting persistent backdoors.
While dangerous and often system-crashing if misused, in expert hands, these primitives are essential tools in advanced kernel exploitation, rootkit development, and hypervisor-level research.

These handlers allow user-mode input to map arbitrary physical memory addresses using MmMapIoSpace() without validating the rw->Address field. In both read and write cases, the driver maps and accesses the physical memory directly using the user-provided address. This is vulnerable because attackers can specify sensitive or protected physical addresses and read secrets (e.g., kernel code, credentials) or write malicious values (e.g., patching kernel code, disabling protections).

Vulnerable MSR IOCTL Handler in AMDPowerProfiler.sys

Today we’re diving into a real-world example of a vulnerable kernel driver: AMDPowerProfiler.sys. This driver exposes unsafe access to Model-Specific Registers (MSRs) via an IOCTL handler. By accepting a user-controlled pointer without validation, it gives attackers powerful read/write primitives to sensitive CPU registers directly from user mode.

a1 is a pointer passed from user mode, likely via DeviceIoControl, but it’s never validated (e.g., no ProbeForRead, ProbeForWrite, or try/except). If the first byte at a1 is non-zero (*(_BYTE *)a1), the driver calls __readmsr on the
Understanding Double Free in Windows Kernel Drivers
What is Double-Free?

A double-free vulnerability occurs when a program frees the same memory block multiple times. This typically happens when ExFreePoolWithTag or ExFreePool is called twice on the same pointer, causing corruption in the Windows kernel memory allocator. If an attacker can predict or control the reallocation of this memory, they may be able to corrupt memory structures, overwrite critical pointers, or redirect execution flow to controlled memory regions. Double-free vulnerabilities often lead to heap corruption, kernel crashes (BSOD), or even arbitrary code execution, if exploited properly.

1. Classic Double-Free (Same Function Call Twice)

Concept: The driver allocates memory using ExAllocatePoolWithTag and frees it twice using ExFreePoolWithTag. This causes corruption in the pool allocator, potentially leading to heap corruption or arbitrary execution.

In this example, we implement a custom kernel driver that allocates a pool of memory, frees it twice, and then intentionally accesses it, triggering a BSOD.

The vulnerability occurs because g_DoubleFreeMemory is freed twice using ExFreePoolWithTag, leading to a double-free bug. After the first free, the pointer still holds the now-invalid memory address, allowing a second ExFreePoolWithTag call on an already freed block. This can lead to memory corruption, potential use-after-free (UAF) scenarios, and arbitrary code execution if an attacker reallocates the freed memory.

Simple BSOD with Double-Free in a Custom Driver

The exploit will follow the same pattern as previously explained, as shown with the blue screen below.

2. Double-Free via Memory Descriptor List

IoFreeMdl is used to release a Memory Descriptor List (MDL) in Windows kernel mode. Incorrect handling, such as double-freeing an MDL, can lead to system crashes or exploitation opportunities. This guide demonstrates creating a custom kernel driver that contains a double-free vulnerability and a user-mode PoC to trigger it.
In this code, an MDL (g_Mdl) is allocated using IoAllocateMdl, and its successful allocation is logged. The first call to IoFreeMdl(g_Mdl) correctly frees the MDL. A KeDelayExecutionThread call introduces a 1-second delay before the driver attempts to free the already-freed MDL again, triggering a double-free vulnerability.

Simple BSOD with Double-Free in a Custom Driver

This user-mode PoC opens a handle to the vulnerable driver (DoubleFreeLink) and sends an IOCTL request (IOCTL_TRIGGER_DOUBLE_FREE) to trigger the double-free vulnerability in the kernel driver. If successful, the exploit could lead to a system crash or potential exploitation.

BSOD Triggered: The system crashes with a BUGCHECK_CODE: 0x4E (PFN_LIST_CORRUPT) due to the double-free of an MDL in the kernel driver.

Making Double-Free More Challenging

We’ve explored basic use-after-free (UAF) and double-free vulnerabilities, which might seem easy to understand. However, in real-world scenarios, these bugs are much harder to detect and exploit. Unlike simple examples, real UAF and double-free issues are rare and often require luck to find. Now, let’s step up the challenge. I’ll introduce a slightly more complex case that mirrors real-world scenarios but remains understandable.

0. Setup: Struct-Based Resource Handling

Before diving into allocation, let’s understand the structure. This struct mimics a common pattern in driver development: wrapping raw buffers inside helper structures. These wrappers often abstract buffer ownership and lifecycle management, but when misused, they also obscure bugs like double-free and UAF. That’s exactly what happens here.

1. Allocation Phase

This setup is clean and typical in real-world Windows drivers. But here’s the catch: no centralized memory tracking, no flags, and no safeguard against double cleanup. A disaster waiting to happen if callbacks are reused. In this step, we allocate memory twice:

2. Double-Free via Wrapped FreeHandle Routine

The double-free vulnerability is triggered when the buffer pDummy->Buffer is first manually freed. This simulates a typical cleanup scenario like CDownloadBuffer::Release(), but the buffer pointer is never nullified or flagged as freed. Later, the driver calls a helper routine wrapped around the cleanup phase. Inside FreeHandle(), the same buffer is freed again without validation. Because FreeHandle() blindly assumes ownership and responsibility for cleanup, it unknowingly triggers a second free on an already-freed memory block. This kind of cleanup wrapping, common in error handling paths, DriverUnload, or exception-safe routines, makes such bugs deceptively difficult to detect in large codebases. The result? A dangerous double-free that can corrupt memory or open the door to further exploitation.

Summary: Wrapping Around Danger – Double-Free in Disguise

This driver shows a classic double-free bug: memory is freed once directly, then again via a cleanup callback (FreeHandle). The issue lies in freeing pDummy->Buffer twice without resetting or checking ownership. What makes it tricky is how the second free is wrapped in a callback, just like real-world code, where cleanup is scattered across destructors or handlers, making such bugs harder to catch in large systems.

Double-Free (Mitigation): Double-free vulnerabilities can be avoided by nullifying pointers after the first free, and checking their state before every deallocation. In complex code with shared pointers or cleanup callbacks, use flags or state checks to ensure memory is freed only once.

Bonus Tip: Spotting Double-Free in Windows Drivers

To identify double-free vulnerabilities, start by looking for deallocation functions. In user mode, watch for free, delete, GlobalFree, or Release. In Windows kernel drivers, key functions include ExFreePoolWithTag, IoFreeMdl, ObDereferenceObject, MmFreeContiguousMemory, and RtlFreeHeap.
Many of these calls are wrapped inside internal cleanup functions or callbacks (like CDownloadBuffer::Release or FreeHandle), which can obscure the actual free. Always trace pointer lifecycle: if it’s freed and still accessed or freed again, that’s a bug. Check if the pointer is nullified or checked post-free—if not, it might be reused unsafely.
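The "nullify after free" mitigation can be sketched in user-mode C, with malloc and free standing in for the pool APIs. The DUMMY_HANDLE type, the free_and_null helper, and the g_free_calls counter are assumptions made for this example; only the FreeHandle name and the pDummy->Buffer pattern come from the post.

```c
#include <stdlib.h>

/* Hypothetical wrapper mirroring the post's pDummy->Buffer pattern. */
typedef struct {
    void *Buffer;
} DUMMY_HANDLE;

/* Counts real frees; present only so the behaviour can be observed. */
static int g_free_calls = 0;

/* Free *pp at most once, then clear it so later cleanup paths see NULL. */
void free_and_null(void **pp)
{
    if (pp == NULL || *pp == NULL)
        return;             /* already released, or nothing to release */
    free(*pp);
    *pp = NULL;
    g_free_calls++;
}

/* Cleanup callback analogous to FreeHandle(): safe to call even after
   the buffer was already released manually through free_and_null(). */
void FreeHandle(DUMMY_HANDLE *h)
{
    if (h != NULL)
        free_and_null(&h->Buffer);
}
```

If the manual release and the FreeHandle callback both go through free_and_null, the second call finds a NULL pointer and returns without freeing again. This only helps when every path uses the same pointer variable; a copy of the pointer held elsewhere would still dangle, which is why ownership flags or reference counts are needed in more complex code.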
Understanding Use-After-Free (UAF) in Windows Kernel Drivers
In this blog post, we’ll explore use-after-free (UAF) vulnerabilities in Windows kernel drivers. We will start by developing a custom vulnerable driver and analyzing how UAF occurs. Additionally, we will explain double free vulnerabilities, their implications, and how they can lead to system crashes or privilege escalation. Finally, we’ll develop a proof-of-concept (PoC) exploit to demonstrate the impact of these vulnerabilities, including triggering a blue screen of death (BSOD).

What is Use-After-Free?

A use-after-free (UAF) vulnerability occurs when a program continues to use a pointer after the associated memory has been freed. This can lead to memory corruption, arbitrary code execution, or system crashes.

Common APIs That Allocate and Free Memory in Windows Kernel Drivers

In Windows kernel development, memory allocation and deallocation are crucial operations. Improper management of allocated memory can lead to use-after-free (UAF) vulnerabilities, resulting in arbitrary code execution, privilege escalation, and system crashes (BSODs). This section explores various allocation and deallocation functions in Windows kernel drivers, their correct usage, and potential security risks.

1. Use-After-Free Classic Pool-Based

The Windows kernel provides paged and non-paged memory pools for allocation. In the case of classic pool-based UAF, the Windows kernel driver allocates memory using ExAllocatePoolWithTag(), deallocates it with ExFreePoolWithTag(), and then mistakenly accesses it. This results in a crash (BSOD) due to accessing invalid memory. Such vulnerabilities are critical, as they can be exploited to execute arbitrary code, escalate privileges, or corrupt kernel memory.

In this example, we implement a custom kernel driver that allocates a pool of memory, frees it, and then intentionally accesses it, triggering a BSOD.

Memory Allocation

The kernel driver uses ExAllocatePoolWithTag() to allocate memory for storing data (in this case, wrenchData).
This memory is part of the non-paged pool, meaning it remains in physical memory and isn’t swapped out.

Memory Deallocation

The memory is then freed using ExFreePool(wrenchData). However, the pointer wrenchData still holds the address of the now-freed memory. The problem arises because the pointer is not nullified or reset after freeing the memory.

Use-After-Free

Use-after-free happens when the freed memory is accessed again, as demonstrated by the code RtlCopyMemory(wrenchData->data, L"WKL UAF Attack!", sizeof(L"WKL UAF Attack!")). The kernel tries to copy data into the freed memory, which leads to unpredictable behavior. This memory is no longer valid and accessing it may cause system instability or crashes.

Overwriting the Pointer

The pointer wrenchData is then deliberately set to an invalid address (0x500). This step is crucial because it could lead to further exploitation if this invalid memory location is accessed in the future, causing a crash (BSOD) or other unintended behavior.

Simple BSOD with UAF in a Custom Driver

For now, I’ll take a simple UAF scenario and demonstrate how it can cause a BSOD using IOCTL. This is not full exploit development, just a basic crash to illustrate a use-after-free. We’ll dive deeper into exploitation techniques in future blog posts.

This PoC demonstrates a use-after-free (UAF) vulnerability in a kernel driver. It opens the vulnerable device and sends an IOCTL command (IOCTL_TEST_CODE) that triggers the UAF. The driver attempts to access memory (wrenchData) that has already been freed, leading to invalid memory access, which could cause memory corruption, system instability, or a BSOD. In future posts, we’ll explore how to turn this into a fully working exploit.

The crash occurs when the driver attempts to access freed memory, specifically in the ExFreeHeapPool function.
The invalid memory access happens due to a use-after-free (UAF) condition, where a pointer to freed memory is still being dereferenced (mov rbx, qword ptr [rax+10h]). This results in accessing invalid or corrupted memory, leading to a system crash or potential memory corruption, as seen in the stack trace.

2. Use-After-Free in IRP-Based Memory Management

The IRP-based memory management involves several key APIs, such as IoAllocateIrp, which allocates an IRP for processing I/O requests, and IoFreeIrp, which frees the IRP when it’s no longer needed. Additionally, IoCallDriver is used to send the IRP to another driver for further processing, while IoCompleteRequest signals the completion of the request.

In our custom driver, we allocate memory for an IRP using IoAllocateIrp and process the request. However, after completing the request, we mistakenly free the IRP using IoFreeIrp but later attempt to access or modify the buffer that was passed with the IRP. This can lead to a use-after-free vulnerability, as the memory is no longer valid after being freed.

In this code, the driver processes an IOCTL request and allocates memory for the IRP buffer (IRP_BUFFER) located in the system buffer of the IRP. It then copies the string “IRP Data” into buffer->data. After the IRP is processed, it is freed with the line IoFreeIrp(Irp);. However, the driver proceeds to access buffer->data after the IRP is freed, which leads to a use-after-free (UAF) vulnerability. Accessing the memory of buffer->data after it has been deallocated results in undefined behavior, such as crashes or potential security exploits.

Simple BSOD with UAF in a Custom Driver

The exploit will follow the same pattern as previously explained. Let’s now examine the issue using WinDbg, as shown below. The crash appears to be related to a use-after-free (UAF) vulnerability.
Specifically, the faulting address ffff860d71dfb9f0 seems to indicate that the IRP (I/O Request Packet) was freed, but the driver or process continued to access the freed memory. The IoFreeIrp call in the kernel driver appears to have been followed by an attempt to access the freed IRP buffer (located at ffff860d71dfb9f0), which caused the system to trigger a bug check (error code 1232). The stack trace points to the IOCTL handler in the kernel driver (KernelPool!IOCTL+0x90), which is where the memory access occurred after the IRP was freed.

3. Use-After-Free via ObDereferenceObject()

The Windows kernel manages objects like FILE_OBJECT, DEVICE_OBJECT, and ETHREAD using reference counting. When an object is created or accessed, its reference count increases, and when it’s no longer needed, the reference count decreases. The function responsible for this is ObDereferenceObject(). If an object is freed while another part of the system
Understanding Integer Overflow in Windows Kernel Exploitation
In this blog post, we will explore integer overflows in Windows kernel drivers and cover how arithmetic operations can lead to security vulnerabilities. We will analyze real-world cases, build a custom vulnerable driver, and demonstrate how these flaws can impact memory allocations and system stability.

What is Integer Overflow in the Kernel?

Integer overflow occurs when an arithmetic operation exceeds the maximum value a data type can hold, causing it to wrap around. In the Windows kernel, integer overflows can lead to memory corruption, buffer overflows, or incorrect size calculations in kernel allocations, often resulting in heap corruption, out-of-bounds writes, and bug checks (AKA “blue screen of death” or BSOD). These vulnerabilities can arise in multiple ways:

Before we dive into integer overflow vulnerabilities in the Windows kernel, let’s first understand data types and how they work in memory.

Understanding Data Types

When working with low-level programming in C and C++, especially in Windows kernel and user mode applications, choosing the right data type is critical. A wrong choice can lead to integer overflows, memory corruption, privilege escalation, and serious security vulnerabilities. To make things easier, I’ve put together a cheat sheet that you can refer back to whenever you’re analyzing a kernel driver or a user-mode application for potential bugs. This table gives you a quick overview of how different data types store values and where things can go wrong. Use this as your go-to reference when hunting for integer overflows, wraparounds, and other dangerous bugs in kernel and user-mode applications.
| Data Type | Size (x64/x86) | Signed Range | Unsigned Range | Used In |
|---|---|---|---|---|
| char | 1 byte | -128 to 127 | 0 to 255 | User & Kernel |
| unsigned char | 1 byte | N/A | 0 to 255 | User & Kernel |
| signed char | 1 byte | -128 to 127 | N/A | User & Kernel |
| short | 2 bytes | -32,768 to 32,767 | 0 to 65,535 | User & Kernel |
| unsigned short | 2 bytes | N/A | 0 to 65,535 | User & Kernel |
| signed short | 2 bytes | -32,768 to 32,767 | N/A | User & Kernel |
| int | 4 bytes | -2,147,483,648 to 2,147,483,647 | 0 to 4,294,967,295 | User & Kernel |
| unsigned int | 4 bytes | N/A | 0 to 4,294,967,295 | User & Kernel |
| signed int | 4 bytes | -2,147,483,648 to 2,147,483,647 | N/A | User & Kernel |
| long (Windows) | 4 bytes (x86/x64) | -2,147,483,648 to 2,147,483,647 | 0 to 4,294,967,295 | User & Kernel |
| unsigned long | 4 bytes | N/A | 0 to 4,294,967,295 | User & Kernel |
| signed long | 4 bytes | -2,147,483,648 to 2,147,483,647 | N/A | User & Kernel |
| long long | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 0 to 18,446,744,073,709,551,615 | User & Kernel |
| unsigned long long | 8 bytes | N/A | 0 to 18,446,744,073,709,551,615 | User & Kernel |
| signed long long | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | N/A | User & Kernel |
| SIZE_T | 8 bytes (x64) / 4 bytes (x86) | N/A | 0 to 18,446,744,073,709,551,615 (x64) / 4,294,967,295 (x86) | User & Kernel |
| SSIZE_T | 8 bytes (x64) / 4 bytes (x86) | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (x64) / -2,147,483,648 to 2,147,483,647 (x86) | N/A | User & Kernel |
| ULONG | 4 bytes | N/A | 0 to 4,294,967,295 | User & Kernel |
| ULONGLONG | 8 bytes | N/A | 0 to 18,446,744,073,709,551,615 | User & Kernel |
| DWORD | 4 bytes | N/A | 0 to 4,294,967,295 (same as unsigned int) | User & Kernel |
| NTSTATUS | 4 bytes | Varies (signed) | N/A | User & Kernel |
| HANDLE | 8 bytes (pointer) | System pointer | System pointer | User & Kernel |

The above data sheet provides a comprehensive reference for both user mode and kernel mode data types, covering their sizes, ranges, and potential overflow scenarios.
This information is based on official Microsoft documentation and kernel data types and serves as a valuable resource for identifying vulnerabilities related to integer overflows in kernel drivers.

Common Data Types That Can Cause Integer Overflow in Kernel

| Data Type | Size | Signed/Unsigned | Range | Overflow Type |
|---|---|---|---|---|
| ULONG | 4 bytes | Unsigned | 0 to 4,294,967,295 (0xFFFFFFFF) | Unsigned wraparound |
| LONG | 4 bytes | Signed | -2,147,483,648 to 2,147,483,647 | Signed overflow |
| ULONG64 | 8 bytes | Unsigned | 0 to 18,446,744,073,709,551,615 | Large value overflow |
| LONG64 | 8 bytes | Signed | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | Signed overflow |
| SIZE_T | 4 bytes (x86) / 8 bytes (x64) | Unsigned | Platform-dependent | Unsigned wraparound |
| SSIZE_T | 4 bytes (x86) / 8 bytes (x64) | Signed | Platform-dependent | Signed overflow |
| LONG_PTR | 4 bytes (x86) / 8 bytes (x64) | Signed | Platform-dependent | Pointer arithmetic overflow |
| INT64 | 8 bytes | Signed | Same as LONG64 | Multiplication overflow |

Network Packet Overflow in Custom Windows Kernel Drivers (Addition ULONG Overflow)

I am demonstrating a custom Windows kernel driver that simulates the processing of network packets. To understand the vulnerability, let’s first discuss ULONG and its range. In Windows, ULONG is a 32-bit unsigned integer, meaning it can hold values from 0x00000000 (0 in decimal) to 0xFFFFFFFF (4,294,967,295 in decimal). Since it cannot store negative values, any arithmetic operation that exceeds 0xFFFFFFFF causes an integer overflow, wrapping the value back to a much smaller number instead of continuing to increase. This behavior is the root cause of the vulnerability in my custom driver.

The vulnerable function in this custom driver takes a user-supplied packet size and adds 0x1000 to determine how much memory to allocate for storing the packet. However, if an attacker provides a large value like 0xFFFFFFFF, adding 0x1000 causes an integer wraparound, meaning instead of a large allocation, the kernel ends up allocating a much smaller buffer than expected.
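The wraparound is easy to reproduce in ordinary user-mode C. The sketch below is not the driver's code: uint32_t stands in for ULONG, and the names PACKET_HEADROOM, alloc_size_unchecked, and alloc_size_checked are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PACKET_HEADROOM 0x1000u  /* the 0x1000 added to the packet size */

/* Mirrors the vulnerable calculation: 32-bit unsigned addition
   wraps around silently when the sum exceeds 0xFFFFFFFF. */
uint32_t alloc_size_unchecked(uint32_t packetSize)
{
    return packetSize + PACKET_HEADROOM;
}

/* Checked variant: refuse any size whose sum would not fit in 32 bits. */
bool alloc_size_checked(uint32_t packetSize, uint32_t *outSize)
{
    if (packetSize > UINT32_MAX - PACKET_HEADROOM)
        return false;               /* the addition would wrap */
    *outSize = packetSize + PACKET_HEADROOM;
    return true;
}
```

Calling alloc_size_unchecked(0xFFFFFFFF) returns 0x00000FFF, while alloc_size_checked rejects the same input. In an actual driver, the helpers in ntintsafe.h (such as RtlULongAdd) offer the same kind of checked addition without hand-written comparisons.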
For example, 0xFFFFFFFF + 0x1000 wraps around to 0x00000FFF, allocating only 4,095 bytes instead of the intended large buffer.
Triggering the Bug: Integer Wraparound in Packet Allocation
I created a simple PoC (proof of concept) to trigger the vulnerable line in the driver. The function takes a user-supplied packet size and adds 0x1000 for memory allocation. However, providing a large value like 0xFFFFFFFF causes an integer wraparound, resulting in a much smaller allocation (0x00000FFF instead of the intended large buffer), leading to a crash. The crash occurred at the instruction movntps xmmword ptr [rcx-10h], xmm0, which attempted to write beyond the allocated buffer at rcx = ffff860d714f1010, already out of bounds of the 0x1000-byte allocation at RAX. This confirms an out-of-bounds memory write due to the integer overflow in the allocation size calculation.
Packet Size Overflow in Custom Windows Kernel Drivers (Signed Integer Overflow Long)
I am demonstrating a custom Windows kernel driver that simulates the processing
Harnessing the Power of Cobalt Strike Profiles for EDR Evasion – Part 2
This blog post is a continuation of the previous entry, “Harnessing the Power of Cobalt Strike Profiles for EDR Evasion”, in which we covered the malleable profile aspects of Cobalt Strike and its role in security solution evasion. Since the release of version 4.9, Cobalt Strike has introduced a number of significant updates aimed at improving operator flexibility, evasion techniques, and custom beacon implementation. In this post, we’ll dive into the latest features and enhancements, examining how they impact tradecraft and integrate into modern adversary simulation workflows. We will build an OPSEC-safe malleable C2 profile that incorporates the latest best practices and features. All code and scripts referenced throughout this post are available on our GitHub repository.
CS 4.9 – Post-Exploitation DLLs
Cobalt Strike 4.9 introduces a new malleable C2 option, post-ex.cleanup. This option specifies whether or not to clean up the post-exploitation reflective loader memory when the DLL is loaded. Our initial attempt was to extract the post-exploitation DLLs from the Cobalt Strike JAR file:
Upon checking for strings, nothing was detected, as the DLLs are encrypted. When checking the documentation, we stumbled upon the POSTEX_RDLL_GENERATE hook. This hook fires when the beacon is tasked to perform a post-exploitation task such as keylogging, taking a screenshot, running Mimikatz, etc. According to the documentation, the raw post-ex DLL binary is passed as the second argument. So we created a simple script to save its value to disk:
Load the CNA script into the Cobalt Strike client, and task the beacon to perform a post-exploitation task (in this case, a screenshot):
Tasking the beacon with all the possible post-exploitation tasks provides us with all 10 post-ex DLLs:
After extracting the DLLs, find all the strings within.
We came up with the following profile configuration (shortened for readability) to prevent potential static detection:
The full profile with all the found strings can be found here. Note: It is highly recommended to replace the plaintext strings with something meaningful to the operator, since the modified strings will be output during or after the post-exploitation job. For example, in the image below we modified the strings to show them in reverse during a port scan:
Beacon Data Store
Beacon data store allows us to store items to be executed multiple times without having to resend the item. The default data store size is 16 entries, although this can be modified by configuring the stage.data_store_size option within your Malleable C2 profile to match your needs:
WinHTTP Support
Even though there is a new profile option to set a default internet library, we will not be including the option in our profile. The reason is that both libraries are heavily monitored by security solutions and there is no difference in terms of evasion between them. What matters is a good red team infrastructure that bypasses network and memory detection. However, if you prefer to use a specific library (in this case winhttp.dll), the following option can be applied to the profile:
CS 4.10 – BeaconGate
BeaconGate is a feature that instructs Beacon to intercept supported API calls via a custom Sleep Mask. This allows the developer to implement advanced evasion techniques without having to gain control over Beacon’s API calls through IAT hooking in a UDRL, a method that is both complex and difficult to execute. It is recommended that you have the profile configured to proxy all 23 functions that Cobalt Strike currently supports (as of 4.11). This can be done by setting the new stage.beacon_gate Malleable C2 option, as demonstrated below:
The profile will also enable the use of BeaconGate, which we will start experimenting with later.
This is crucial; otherwise, the changes will not be applied to exported Beacons. To get started, we need to work with the Sleepmask-VS project from Fortra’s repository. If you prefer a Linux environment for development, you can use the Artifact Kit template instead. The BeaconGateWrapper function in /library/gate.cpp is where these API calls are handled. The following demo code checks if the VirtualAlloc function is called. This enables us to intercept the execution flow and add the evasion mechanism(s):
The same can be applied to all the other supported high-level API functions. In this example, we are going to implement a callback spoofing mechanism. Since the goal of this blog is to explain how the BeaconGate implementation works, we will use HulkOperator’s code for the spoofing mechanism. The custom SetupConfig function expects a function pointer to spoof. This can be obtained from the functionCall structure. The functionPtr field holds the pointer to the WinAPI function you want to hook. To access the function’s name, you can use functionCall->function, and for the number of arguments, use functionCall->numOfArgs. Individual argument values can be retrieved via functionCall->args[i]. Here’s a proof of concept showing how the final code looks:
Next time you export a Beacon, the spoof mechanism will be applied. The final implementation code can be found here.
CS 4.11 – Novel Process Injection
Cobalt Strike 4.11 introduced a custom process injection technique, ObfSetThreadContext. This injection technique bypasses the modern detection of injected threads (where the start address of a thread is not backed by a Portable Executable image on disk) by making use of various gadgets to redirect execution.
By default, this new option will automatically set the injected thread start address as the (legitimate) remote image entry point, but it can additionally be configured with a custom module and offset as shown below:
The option above sets ObfSetThreadContext as the default process injection technique. The injection techniques listed after it serve as a backup when the default injection technique fails. This happens in certain cases (e.g. x86 -> x64 injection, self-injection, etc.).
CS 4.11 – sRDI with evasion capabilities
According to Fortra, version 4.11 ports Beacon’s default reflective loader to a new prepend/sRDI style loader with several new evasive features added. sRDI enables the transformation of DLL files into position-independent shellcode. It functions as a comprehensive PE loader, handling correct section permissions, TLS callbacks, and various integrity
Windows Kernel Buffer Overflow
In this blog post, we will explore buffer overflows in Windows kernel drivers. We’ll begin with a brief discussion of user-to-kernel interaction via IOCTL (input/output control) requests, which often serve as an entry point for these vulnerabilities. Next, we’ll delve into how buffer overflows occur in kernel-mode code, examining different types such as stack overflow, heap overflow, memset overflow, memcpy overflow, and more. Finally, we’ll analyze real-world buffer overflow cases and demonstrate potential exploitation in vulnerable drivers.
Understanding IOCTL in Windows Kernel Drivers
When working with Windows kernel drivers, understanding communication between user-mode applications and kernel-mode drivers is crucial. One common way to achieve this is through IOCTL (input/output control). IOCTL allows user-mode applications to send commands and data to drivers using the DeviceIoControl() function. In the kernel, these requests arrive as I/O Request Packets (IRPs) and are handled by the dispatch routine the driver registers for IRP_MJ_DEVICE_CONTROL. The driver processes the IRP, performs the requested action, and optionally returns data to the user-mode application. We won’t dive too deep into the details, but we’ll cover the basics of IOCTL and how it functions through a simple driver example. This diagram is sourced from MatteoMalvica.
Breaking Down IOCTL and IRP in Custom Driver
Define a Custom IOCTL
The line highlighted in red defines a custom IOCTL (input/output control) code using the CTL_CODE macro, which is used by both user-mode applications and kernel-mode drivers to communicate.
Handling IOCTL Requests (IRP_MJ_DEVICE_CONTROL)
In the driver, IOCTL requests are handled inside the IOCTL function, which is assigned to IRP_MJ_DEVICE_CONTROL. Before calling DeviceIoControl(), a user-mode application must first obtain a handle to the driver using CreateFile().
This handle is necessary to communicate with the driver and ensures that the IOCTL request is sent to the correct device. The handle is passed to DeviceIoControl() along with an IOCTL code and a buffer, which are processed by the routine registered for IRP_MJ_DEVICE_CONTROL (in this case, the IOCTL function).
Retrieving IRP Details
Inside the IOCTL function, the driver extracts details about the request using IoGetCurrentIrpStackLocation(Irp). Irp->AssociatedIrp.SystemBuffer is used to access the data sent from user mode because, for buffered I/O (METHOD_BUFFERED), that is where the I/O manager places a kernel copy of the caller’s buffer. Meanwhile, irpSp->Parameters.DeviceIoControl.InputBufferLength provides the size of the received data, which the driver must check before touching the buffer. The I/O stack location pointer irpSp (retrieved using IoGetCurrentIrpStackLocation(Irp)) gives access to the request-specific parameters, such as the IOCTL code and the buffer lengths.
Custom Function
The IOCTL function processes user-mode requests sent via DeviceIoControl(). It checks the IOCTL code, retrieves the user buffer, and prints the received message if data is available. Finally, it sets the status and completes the request.
Sending an IOCTL from User Mode to a Kernel Driver
This simple program communicates with a Windows kernel driver by issuing an IOCTL (input/output control) request. It begins by opening a handle to the driver (\\.\Hello) and then transmits data using DeviceIoControl with the IOCTL_PROC_DATA code. If the operation succeeds, the driver processes the input; otherwise, an error message is displayed. Finally, the program closes the device handle and terminates.
Running the User-Mode Application to Communicate with the Driver
In our previous blog post, we explored kernel debugging and how to load a custom driver. Now, it’s time to run the user-mode application we just created. Once everything is set up, execute the .exe file, and we should see the message appear in DebugView or WinDbg.
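The IOCTL_PROC_DATA code that the program passes to DeviceIoControl is produced by the CTL_CODE macro, which packs four fields into one 32-bit value. A minimal Python sketch of that bit packing follows. The arguments used for illustration (FILE_DEVICE_UNKNOWN, function 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS) are assumptions, since the driver's actual CTL_CODE line is not reproduced in the text; ctl_code and decode_ctl_code are hypothetical helper names.

```python
# Field values as defined in the Windows SDK/WDK headers
FILE_DEVICE_UNKNOWN = 0x22
METHOD_BUFFERED = 0
FILE_ANY_ACCESS = 0


def ctl_code(device_type: int, function: int, method: int, access: int) -> int:
    """Pack the four fields the same way the CTL_CODE macro does."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode_ctl_code(code: int) -> dict:
    """Split a 32-bit IOCTL value back into its four fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }
```

Under these assumed arguments, ctl_code(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS) evaluates to 0x222000, and decoding 0x222000 gives back function 0x800, which is why the same request can be described by either number.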
I’ll demonstrate this using DebugView to show how the communication works between user mode and kernel mode.
Strange! As you can see in the image, the IOCTL code in user mode appears as 0x222000, but in kernel mode, it shows up as 0x800. This happens due to how CTL_CODE generates the full 32-bit IOCTL value: the function number (0x800) is only one of the fields packed into it, alongside the device type, access, and transfer method. You can decode the IOCTL using OSR’s IOCTL Decoder tool: OSR Online IOCTL Decoder.
Buffer Overflow
A buffer overflow happens when more data is written to a buffer than it can hold, causing it to overflow into adjacent memory. Example: Imagine a glass designed to hold 250ml of water. If you pour 500ml, the extra water spills over, just like excess data spilling into unintended memory areas, potentially causing crashes or security vulnerabilities.
Memory Allocation in Kernel Drivers and Buffer Overflow Risks
In kernel driver development, proper memory management is even more critical than in user mode, because an unhandled fault in kernel mode crashes the entire system rather than a single process. When memory operations are not handled carefully, they can lead to buffer overflows, causing severe security vulnerabilities such as kernel crashes, privilege escalation, and even arbitrary code execution. For this article, I have developed a custom vulnerable driver to demonstrate how buffer overflows occur in kernel mode. Before diving into exploitation, let’s first explore the common memory allocation and manipulation functions used in Windows kernel drivers. Understanding these functions will help us identify how overflows happen and why they can be exploited.
Understanding Kernel Memory Allocation & Vulnerabilities
Memory allocation in kernel-mode drivers typically involves dynamically requesting memory from system pools or handling buffers passed from user-mode applications. Below are some common kernel memory allocation functions:
1. Heap-Based Buffer Overflow
Here, the driver allocates memory from the NonPagedPool and copies user-supplied data into it using RtlCopyMemory without checking the buffer size.
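The effect of the missing length check can be modeled without a kernel. The following is a minimal Python sketch, not the driver's code: it assumes a 128-byte allocation that sits directly in front of a neighbouring chunk, and all names (make_pool, unchecked_copy, checked_copy, neighbour_intact) are hypothetical.

```python
CHUNK_SIZE = 128  # assumed size of the simulated pool allocation


def make_pool() -> bytearray:
    """Two adjacent chunks: ours (zeroed) and a neighbour filled with 0xAA."""
    return bytearray(CHUNK_SIZE) + bytearray(b"\xAA" * CHUNK_SIZE)


def unchecked_copy(pool: bytearray, data: bytes) -> None:
    """Copy len(data) bytes to the start of the pool with no validation."""
    pool[0:len(data)] = data


def checked_copy(pool: bytearray, data: bytes) -> bool:
    """Refuse the copy when the input is larger than the destination chunk."""
    if len(data) > CHUNK_SIZE:
        return False
    pool[0:len(data)] = data
    return True


def neighbour_intact(pool: bytearray) -> bool:
    """True while the adjacent chunk still holds its original 0xAA bytes."""
    return pool[CHUNK_SIZE:] == b"\xAA" * CHUNK_SIZE
```

Copying 160 bytes with unchecked_copy overwrites the first 32 bytes of the neighbour, whereas checked_copy rejects the same input and leaves the adjacent chunk untouched. In a driver, the equivalent guard is a comparison of the input length against the allocation size before calling RtlCopyMemory.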
If the input is too large, it overflows into adjacent memory, corrupting the kernel heap.
Example Vulnerability: Heap Overflow in Custom Driver
Impact: Memory is allocated using ExAllocatePoolWithTag(NonPagedPool, 128, ‘WKL’), but RtlCopyMemory copies inputLength bytes without validation, leading to a heap overflow if inputLength is greater than 128.
2. Stack-Based Buffer Overflow
Here, the driver copies data from a user-supplied buffer to a small stack buffer using RtlCopyMemory, without verifying whether the destination buffer is large enough. If the input size is too large, it overwrites stack memory, potentially leading to system crashes or arbitrary code execution.
Example Vulnerability: Stack Overflow in Custom Driver
Impact: A small stack buffer, stackBuffer[100], is used, and RtlCopyMemory copies user data without checking if inputLength exceeds 100 bytes, causing a stack overflow.
3. Overwriting Memory with Memset
Here, the driver fills a kernel buffer with a fixed value using memset, but
Understanding Windows Kernel Pool Memory
This blog covers Windows pool memory from scratch, including memory types, debugging in WinDbg, and analyzing pool tags. We’ll also use a custom tool to enumerate pool tags effortlessly and explore the segment heap. This is the first post in our VR (Vulnerability Research) & XD (Exploit Development) series, laying the foundation for heap overflows, pool spraying, and advanced kernel exploitation.
What is the Windows Kernel Pool?
The Windows Kernel Pool is a memory region used by the Windows kernel and drivers to store system-critical structures. In short, the Kernel Pool is the kernel-land version of the user-mode “heap”. Unlike user-mode memory, the kernel pool is shared across all processes, meaning any corruption in the kernel pool can crash the entire system (BSOD).
Pool Internals
Essentially, chunks that are allocated and in use, or kept free, live on a page that is either pageable or non-pageable. Accordingly, there are two types of pool, the paged pool and the non-paged pool:
To sum up, in order to take advantage of a heap corruption vulnerability, such as a use-after-free (UAF), a researcher needs to determine whether it is a UAF on the non-paged pool or a UAF on the paged pool. This is important because the paged pool and non-paged pool are different heaps, meaning they are separate locations in memory. In simpler terms, to replace the freed chunk before it is used again, the replacement object must come from the same pool. This means that there are different object structures that can be placed on the non-paged pool or, respectively, the paged pool.
Setting Up Kernel Debugging
To get started with kernel debugging, you need to set up a Windows VM and configure it using the following admin commands. Typically, this setup requires two machines: a debuggee system, which is our target Windows machine, and a debugger system, from which we will issue debug commands.
For basic debugging, you can use local kernel debugging (lkd) on a single system. If you haven’t installed it yet, you can download the Windows Debugging Tools from Microsoft’s official website. Now, on your base machine, start WinDbg and enter the port number and key. After that, restart the virtual machine. The following screenshot shows kernel debugging on the virtual machine.
First, to get a basic view of pool memory while kernel debugging, we can use the !vm 1 command in WinDbg. This provides a summary of system memory usage, including information about paged pool and non-paged pool allocations. Here, 157 KB represents the current available memory in the system, while 628 KB shows the total committed memory, meaning memory that has been allocated and is in use. This helps in analyzing memory consumption and potential allocation issues in kernel debugging. If you want to explore further, you can use the !vm 2 command in WinDbg, which provides a more detailed breakdown of memory usage across different pool types and memory zones than !vm 1.
Windows provides ExAllocatePoolWithTag, the primary API used for pool allocations in kernel mode. Drivers use it to allocate dynamic memory, similar to how malloc works in user mode. Note: While ExAllocatePoolWithTag has been deprecated in favor of ExAllocatePool2, it is still widely used in existing drivers, so we will examine this function. Later, I will show in detail how to develop a kernel driver that uses ExAllocatePoolWithTag. Here’s a short explanation of the key parameters used in Windows pool memory allocation:
There’s more than one kind of _POOL_TYPE. If you want to explore more, you can check out Microsoft’s documentation. We are only focusing on paged pool, non-paged pool, and pool tag.
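Since the pool tag is one of the parameters we focus on, it helps to see how it is encoded. The tag passed to ExAllocatePoolWithTag is a 32-bit value written in driver source as a character literal of up to four characters, and on little-endian x64 its bytes end up in memory in reverse order, which is why drivers usually spell their tags backwards. The Python sketch below models this, assuming the usual compiler behaviour in which the first character of a multi-character literal lands in the most significant byte; the helper names and the sample tag '1gaT' are illustrative only.

```python
import struct


def tag_literal_to_ulong(literal: str) -> int:
    """Value a C multi-character constant such as '1gaT' evaluates to."""
    value = 0
    for ch in literal:
        value = (value << 8) | ord(ch)
    return value


def tag_as_displayed(tag: int) -> str:
    """Tag bytes in memory order (little-endian), the order pool tags are listed in."""
    raw = struct.pack("<I", tag)
    return raw.rstrip(b"\x00").decode("ascii")
```

Here tag_literal_to_ulong("1gaT") is 0x31676154, and reading those four bytes back in memory order gives "Tag1".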
It is also worth mentioning that every chunk of memory on a pool has a dedicated pool header structure inline in front of the allocation, which we will examine shortly in WinDbg. Now let’s use the !pool <address> command in WinDbg to analyze a specific memory address. We want to display details about a pool allocation, including its PoolType, PoolTag, BlockSize, and owning process/module. As we can see in the screenshot above, the memory allocation is categorized as paged pool. The details also tell us whether the chunk is allocated or free, and we can discover the pool tag; sometimes the output also gives the binary name, driver name, and other information. Feel free to explore.
So, the question arises: how do we find the address of a pool allocation? It’s actually quite simple! If we check the documentation, we can see that ExAllocatePoolWithTag is a function provided by NtosKrnl.exe (the Windows kernel). This means we can set breakpoints in WinDbg to track memory allocations in real time. So first let’s locate the API with the command x /D nt!ExAlloca* in the debugger and then set a breakpoint. Let’s set a breakpoint at that specific address and see if it gets triggered. As shown below, we’re using the bp <address> command. As soon as we resume the debugger with the g (Go) command, it will hit the breakpoint and we can view the information gathered from the registers. In WinDbg, when analyzing a call to ExAllocatePoolWithTag, you can check the registers to understand the allocation request: on x64, RCX holds the PoolType, RDX the NumberOfBytes, and R8 the Tag.
By monitoring these values, you can determine how drivers allocate memory and track specific pool tags in the kernel. The rax register is also of interest, since it holds the address returned by the allocation, but first we need to step out of the function with gu. Now, let’s use !pool <address>. But isn’t this strange? We were looking for the tag NDNB. Here’s a handy tip: to find more interesting data, use the command !pool @rax 2.
What is a Pool Tag?
A Pool Tag is a four-character identifier that helps track memory allocations in Windows kernel pools (PagedPool, NonPagedPool, etc.). Every time memory is allocated using APIs like ExAllocatePoolWithTag, a pool tag is assigned to identify the allocation’s origin. This is useful for debugging memory leaks, analysing kernel memory
HuntingCallbacks – Enumerating the Entire system32
What are Callbacks?
Certain Windows APIs support passing a function pointer as one of their parameters. This function is then called when a particular event is triggered or a scenario takes place. Either way, this is usually user-controlled and can be abused from an offensive perspective by passing a malicious function or shellcode. Some well-known APIs that accept callbacks are EnumChildWindows, RegisterClass, etc. There has been much research on the topic of callbacks, where they have been abused for call stack evasion, sleep timers, evasion from memory scanners, DLL loading and execution, etc. In this blog post, we try to uncover previously unknown callbacks that could be abused maliciously and produce a tool for the same. In this research, we aim to show how various static analysis methods can be used to automate the discovery of previously unknown scenarios and details, and we hope to see more similar contributions in the future.
Overview
There are two main types of branching in most assembly languages: direct and indirect.
Windows APIs that support callback opportunities usually take in a function pointer address (or a structure, as you will see) as one of their function arguments. The prototype of the function usually is of the format below:
To find a potential function that allows the user to send in a callback function pointer, it needs to satisfy two conditions:
As mentioned above, indirect calls are usually compiled to call reg, where reg is any of the registers. But since this research is focused on scanning through multiple Windows DLLs, we need to accommodate Control Flow Guard, or CFG.
Control Flow Guard
CFG, in a very simple sense, is a protection mechanism that is placed during compile time within Windows executables to prevent malicious indirect calls that could be abused using ROPs and other memory corruption bugs.
Most, if not all, of the DLLs that we will be scanning are compiled with Control Flow Guard, meaning any function pointers passed through a potential function will not result in a call reg signature. As shown in the following picture, when CFG is enabled, the callee is passed via the RAX register to another wrapper function called __guard_xfg_dispatch_icall_fptr. This function takes in the RAX register to perform the call further. The function __guard_xfg_dispatch_icall_fptr is just a wrapper for a bunch of other sub-functions, which eventually boils down to _guard_dispatch_icall_nop. This function is the final wrapper around the instruction jmp rax, where the control flow is redirected toward the user-controlled function pointer. So our plan to scan for potential target Windows APIs that support a callback opportunity is as follows:
Note: Some functions have certain checks and verifications that will need to be passed in order to have code reachability towards the CFG dispatch. While executing those potential target functions, the user might need to pass in other parameters or sometimes initialize structures.
Prepping Miasm
To implement my scanning idea, I chose to go with the Miasm framework because it contains a lot of the features that I needed to overengineer the solution. There are several alternatives that could be used that perhaps would be faster or offer a more efficient solution, but coming from a CTF background, Miasm seemed to be the simplest and most appropriate choice, although the concepts described could very well be applicable to other tools. Miasm is a reverse engineering framework that supports symbolic execution and contains its own lifter and IR. It supports Use-Def graphs, a built-in disassembler, a PE loader, and an emulator, amongst other things. Each of these features will be necessary down the line.
Before we proceed to scan every function, we need to initialize a bunch of things with Miasm that will need to be queried moving forward. First, we start by reading the file and passing it to the PE container, which is defined by the ContainerPE class. The file data that was read can be passed to the function Container.from_string(data, loc_key), whose return value will be a ContainerPE instance. Since we are going to scan an entire directory, we need to check if the files we are scanning are DLLs (some executables also have exported functions, but that is for a future update). We check if the file that we read is a DLL by parsing the Characteristics field from its COFF header. If the Characteristics field contains the IMAGE_FILE_DLL flag, then we can be mostly sure that we are dealing with a DLL. We then proceed to extract all the functions in the export table of the DLL. Miasm uses an object called LocationDB() that, simply speaking, keeps track of the symbol names and the corresponding offsets of those symbols within the binary. This is sort of like a separate database populated whenever a new symbol for an offset is defined in the binary. Since the Import Address Table (IAT) contains symbols that we will need to check against, we need to populate the loc_db by adding their details and corresponding offsets within the binary. This can be performed by parsing the IAT and retrieving both the offset and the symbol using Miasm’s built-in functions. Once all the previously mentioned objects and details are initialized, we can proceed to the core part of the hunt to find the potential target functions.
Hunting for Indirect Function Calls
In some DLLs, we notice that exported APIs behave as a proxy call to some other DLL; these functions are usually not in the .text section.
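Going back to the DLL check described above: the tool reads the Characteristics field through Miasm's parsed PE object, but the check itself is simple enough to sketch with the standard library alone. The following is a minimal, Miasm-independent illustration under that assumption. The function name is_dll is hypothetical, while the offsets and the IMAGE_FILE_DLL value come from the PE/COFF format.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics flag that marks an image as a DLL


def is_dll(data: bytes) -> bool:
    """Return True if the raw PE image has IMAGE_FILE_DLL set in its COFF header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return False
    if len(data) < pe_offset + 24:
        return False
    # Characteristics is the last 2-byte field of the 20-byte COFF header
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_DLL)
```

When scanning a directory, a filter like this can be applied to each file's bytes before handing the data to the PE container, so that non-DLL files are skipped early.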
Attempting to disassemble these exported APIs will obviously raise an exception, since there is no actual code block to be disassembled and the exported API is merely a proxy call. Hence, to keep the tool from crashing midway, we need to add a check that verifies that the address of the exported API we are scanning is within the .text section and that the section has the flags 0x60000020. This can be implemented as follows:
We then initialize the disassembly engine of Miasm and
LayeredSyscall – Abusing VEH to Bypass EDRs
Asking any offensive security researcher how an EDR could be bypassed will result in one of many possible answers, such as removing hooks, direct syscalls, indirect syscalls, etc. In this blog post, we will take a different perspective and abuse Vectored Exception Handlers (VEH) as a foundation to produce a legitimate thread call stack and employ indirect syscalls to bypass user-land EDR hooks.
Disclaimer: The research below must only be used for ethical purposes. Please be responsible and do not use it for anything illegal. This is for educational purposes only.
Introduction
EDRs use user-land hooks that are usually placed in ntdll.dll, or sometimes in kernel32.dll, which are loaded into every process in the Windows operating system. They typically implement their hooking procedure in one of two ways:
Hooks are not placed in every function within the target DLL. Within ntdll.dll, most of the hooks are placed in the Nt* syscall wrapper functions. These hooks are often used to redirect the execution safely to the EDR’s DLL to examine the parameters and determine if the process is performing any malicious actions. Some popular bypasses for circumventing these hooks are:
There are more bypass techniques, such as blocking any unsigned DLL from being loaded, blocking the EDR’s DLL from being loaded by monitoring LdrLoadDll, etc. On the flip side, there are detection strategies that could be employed to detect and perhaps prevent the above-mentioned evasion techniques:
The research presented below attempts to address the above detection strategies.
LayeredSyscall – Overview
The general idea is to generate a legitimate call stack before performing the indirect syscall while switching modes to the kernel land and also to support up to 12 arguments. Additionally, the call stack could be of the user’s choice, with the assumption that one of the stack frames satisfies the size requirement for the number of arguments of the intended Nt* syscall.
The implemented concept could also allow the user to produce not only the legitimate call stack but also the indirect syscall in between the user’s chosen Windows API, if needed. Vectored Exception Handler (VEH) is used to provide us with control over the context of the CPU without raising any alarms. As exception handlers are not widely attributed to malicious behavior, they provide us with access to hardware breakpoints, which will be abused to act as a hook. Note that the call stack generation mentioned here is not constructed by the tool or by the user, but rather performed by the system, without the need to perform unwinding operations of our own or separate allocations in memory. This means the call stack could be changed by simply calling another Windows API if detections for one are present.
VEH Handler #1 – AddHwBp
We register the first handler required to set up the hardware breakpoints at two key locations, the syscall opcode and the ret opcode, both within Nt* syscall wrappers within ntdll.dll. The handler is registered to handle EXCEPTION_ACCESS_VIOLATION, which is generated by the tool just before the actual call to the syscall takes place. This could be performed in many ways, but we’ll use a basic read of a null pointer to generate the exception. However, since we must support any syscall that the user could call, we need a generic approach to set the breakpoint. We can implement a wrapper function that takes one argument and proceeds to trigger the exception. The handler can then retrieve the address of the Nt* function by accessing the RCX register, which stores the first argument passed to the wrapper function. Once retrieved, we perform a memory scan to find the offsets where the syscall opcode and the ret opcode (just after the syscall opcode) are present. We can do this by checking that the opcodes 0x0F and 0x05 are adjacent to each other like in the code below.
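The byte scan described above amounts to a small pattern search. Here is a minimal Python sketch of the idea that operates on a bytes object holding a copy of a wrapper's code, not the tool's actual implementation (which walks process memory from inside the exception handler). The function name and the 32-byte search window are assumptions.

```python
SYSCALL = b"\x0f\x05"  # the two adjacent bytes that encode the syscall instruction
RET = 0xC3             # near return opcode


def find_syscall_and_ret(stub: bytes, limit: int = 32):
    """Return (syscall_offset, ret_offset) within the first `limit` bytes, or None."""
    window = stub[:limit]
    idx = window.find(SYSCALL)
    while idx != -1:
        ret_idx = idx + len(SYSCALL)
        # Accept the match only when a ret immediately follows the syscall
        if ret_idx < len(window) and window[ret_idx] == RET:
            return idx, ret_idx
        idx = window.find(SYSCALL, idx + 1)
    return None
```

Requiring the ret directly after the two syscall bytes avoids matching a stray 0x0F 0x05 sequence elsewhere in the wrapper, and the two returned offsets correspond to the two addresses at which the handler places its breakpoints.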
Syscalls in Windows, as seen in the following screenshot, are encoded using the opcode bytes 0x0F and 0x05. Two bytes after the start of the syscall instruction, you can find the ret opcode, 0xC3. Hardware breakpoints are set using the registers Dr0, Dr1, Dr2, and Dr3, while Dr6 and Dr7 hold the status and control flags for those registers. The handler uses Dr0 and Dr1 to set the breakpoints at the syscall and the ret offsets. As seen in the code below, we set them by writing to ExceptionInfo->ContextRecord->Dr0 and Dr1. We also set bit 0 and bit 2 of the Dr7 register (the local enable bits for Dr0 and Dr1) to let the processor know that the breakpoints are enabled. As you can see in the image below, the exception is thrown because we are trying to read a null pointer address. Once the exception is thrown, the handler will take charge and place the breakpoints. Take note: once the exception is triggered, it is necessary to advance the RIP register by the number of bytes required to skip the instruction that generated the exception. In this case, it was 2 bytes. After that, the CPU resumes execution and the breakpoints act as our hooks. We will see this performed in the second handler below.
VEH Handler #2 – HandlerHwBp
This handler contains three major parts:
Part #1 – Handling the Syscall Breakpoint
Hardware breakpoints, when hit, generate the exception code EXCEPTION_SINGLE_STEP, which we check for in order to handle our breakpoints. First in the control flow, we check if the exception was generated at the Nt* syscall start using the member ExceptionInfo->ExceptionRecord->ExceptionAddress, which points to the address where the exception was generated. We proceed to save the context of the CPU when the exception was generated.
This allows us to query the arguments stored, which according to Microsoft’s calling convention, are stored in RCX, RDX, R8, and R9, and also allows us to use the RSP register to query the rest of the arguments, which will be further explained later. Once stored, we can change the RIP to point to our demo function; in