LayeredSyscall – Abusing VEH to Bypass EDRs
Asking any offensive security researcher how an EDR could be bypassed will result in one of many possible answers, such as removing hooks, direct syscalls, or indirect syscalls. In this blog post, we take a different perspective and abuse Vectored Exception Handlers (VEH) as a foundation to produce a legitimate thread call stack and employ indirect syscalls to bypass user-land EDR hooks.

Disclaimer: The research below must only be used for ethical purposes. Please be responsible and do not use it for anything illegal. This is for educational purposes only.

Introduction

EDRs use user-land hooks that are usually placed in ntdll.dll, or sometimes in kernel32.dll, which are loaded into every process in the Windows operating system. They implement their hooking procedure typically in one of two ways:

Hooks are not placed in every function within the target dll. Within ntdll.dll, most of the hooks are placed in the Nt* syscall wrapper functions. These hooks are often used to redirect execution safely to the EDR’s dll, which examines the parameters to determine whether the process is performing any malicious actions.

Some popular bypasses for circumventing these hooks are:

There are more bypass techniques, such as blocking any unsigned dll from being loaded, or blocking the EDR’s dll from being loaded by monitoring LdrLoadDll.

On the flip side, there are detection strategies that could be employed to detect and perhaps prevent the above-mentioned evasion techniques:

The research presented below attempts to address the above detection strategies.

LayeredSyscall – Overview

The general idea is to generate a legitimate call stack before performing the indirect syscall that switches to kernel mode, while also supporting up to 12 arguments. Additionally, the call stack can be of the user’s choice, on the assumption that one of the stack frames satisfies the size requirement for the number of arguments of the intended Nt* syscall.
The implemented concept also allows the user to produce not only the legitimate call stack but also the indirect syscall in the middle of a Windows API of the user’s choice, if needed.

A Vectored Exception Handler (VEH) gives us control over the CPU context without raising any alarms. Exception handlers are not widely regarded as malicious behavior, and they give us access to hardware breakpoints, which we abuse to act as hooks.

Note that the call stack mentioned here is not constructed by the tool or by the user. It is produced by the system, with no need for unwinding operations of our own or separate allocations in memory. This means the call stack can be changed simply by calling another Windows API if detections for one are present.

VEH Handler #1 – AddHwBp

We register the first handler to set up hardware breakpoints at two key locations, the syscall opcode and the ret opcode, both inside the Nt* syscall wrappers in ntdll.dll.

The handler is registered to handle EXCEPTION_ACCESS_VIOLATION, which the tool generates just before the actual call to the syscall takes place. This could be done in many ways, but we’ll use a basic read of a null pointer to generate the exception.

However, since we must support any syscall that the user could call, we need a generic approach to set the breakpoint. We can implement a wrapper function that takes one argument and proceeds to trigger the exception. The handler can then retrieve the address of the Nt* function from the RCX register, which stores the first argument passed to the wrapper function.

Once the address is retrieved, we perform a memory scan to find the offsets of the syscall opcode and of the ret opcode that immediately follows it. We can do this by checking that the bytes 0x0F and 0x05 are adjacent to each other.
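As a rough illustration of the scan described above, the following C sketch searches a byte buffer for 0x0F 0x05 immediately followed by 0xC3. The function name find_syscall_ret and the array nt_stub_sample are assumptions made for this example. The sample bytes only imitate a typical x64 Nt* stub layout and are not copied from a real ntdll.dll, whereas the handler would scan the memory of the Nt* function itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample: bytes laid out like a typical x64 Nt* stub. */
static const uint8_t nt_stub_sample[] = {
    0x4C, 0x8B, 0xD1,                               /* mov r10, rcx        */
    0xB8, 0x18, 0x00, 0x00, 0x00,                   /* mov eax, 0x18       */
    0xF6, 0x04, 0x25, 0x08, 0x03, 0xFE, 0x7F, 0x01, /* test byte [..], 1   */
    0x75, 0x03,                                     /* jne  +3             */
    0x0F, 0x05,                                     /* syscall             */
    0xC3,                                           /* ret                 */
    0xCD, 0x2E,                                     /* int 0x2e            */
    0xC3                                            /* ret                 */
};

/*
 * Hypothetical helper: returns the offset of the first `syscall` (0F 05)
 * that is directly followed by `ret` (C3), or -1 if no such sequence exists.
 * The ret opcode is then located at the returned offset plus 2.
 */
static ptrdiff_t find_syscall_ret(const uint8_t *code, size_t len)
{
    assert(code != NULL);
    /* i + 2 < len keeps all three reads inside the buffer. */
    for (size_t i = 0; i + 2 < len; i++) {
        if (code[i] == 0x0F && code[i + 1] == 0x05 && code[i + 2] == 0xC3)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

Calling find_syscall_ret(nt_stub_sample, sizeof nt_stub_sample) on the sample returns offset 18, which puts the ret opcode at offset 20. Requiring the trailing 0xC3 is a design choice of this sketch: it avoids matching a stray 0x0F 0x05 pair that happens to appear inside another instruction's operand.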
Syscalls in Windows are encoded with the opcode bytes 0x0F and 0x05. Two bytes after the start of the syscall instruction, you can find the ret opcode, 0xC3.

Hardware breakpoints are set using the registers Dr0, Dr1, Dr2, and Dr3, while Dr6 and Dr7 hold the status and control flags for them. The handler uses Dr0 and Dr1 to set the breakpoints at the syscall and ret offsets. We enable them through ExceptionInfo->ContextRecord->Dr0 and Dr1. We also set bit 0 and bit 2 of the Dr7 register (the local-enable bits for Dr0 and Dr1) to let the processor know that the breakpoints are enabled.

The exception is thrown because we are trying to read from a null pointer address. Once the exception is thrown, the handler takes charge and places the breakpoints.

Note that once the exception is triggered, the RIP register must be advanced by the number of bytes needed to step past the instruction that generated the exception. In this case, it was 2 bytes. After that, execution continues and the breakpoints act as our hooks. We will see this performed in the second handler below.

VEH Handler #2 – HandlerHwBp

This handler contains three major parts:

Part #1 – Handling the Syscall Breakpoint

Hardware breakpoints, when hit, generate an exception with the code EXCEPTION_SINGLE_STEP, which we check for in order to handle our breakpoints. First in the control flow, we check whether the exception was generated at the start of the Nt* syscall using the member ExceptionInfo->ExceptionRecord->ExceptionAddress, which points to the address where the exception was generated.

We then save the CPU context as it was when the exception was generated.
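The two steps described so far, arming the breakpoints in the first handler and recognising the syscall breakpoint in the second, can be sketched in portable C as follows. This is a minimal sketch under stated assumptions, not the tool's code: struct cpu_context and struct exception_event are hypothetical stand-ins for the few fields of the Windows CONTEXT and EXCEPTION_RECORD structures that matter here, the function names are invented for illustration, and comparing the exception address against Dr0 is one possible way to identify the syscall breakpoint.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric value of EXCEPTION_SINGLE_STEP (STATUS_SINGLE_STEP) on Windows. */
#define SINGLE_STEP_CODE 0x80000004u

/* Hypothetical stand-in for the CONTEXT fields used in this sketch. */
struct cpu_context {
    uint64_t rip, rsp, rcx, rdx, r8, r9;
    uint64_t dr0, dr1, dr7;
};

/* Hypothetical stand-in for the exception record fields used here. */
struct exception_event {
    uint32_t code;     /* exception code                          */
    uint64_t address;  /* address where the exception was raised  */
};

/* Local-enable bit in Dr7 for debug register `slot` (0..3): bits 0, 2, 4, 6. */
static uint64_t dr7_local_enable(unsigned slot)
{
    assert(slot < 4);
    return (uint64_t)1 << (slot * 2);
}

/* Handler #1 step: Dr0 on `syscall`, Dr1 on the `ret` two bytes after it. */
static void arm_breakpoints(struct cpu_context *ctx, uint64_t syscall_addr)
{
    ctx->dr0 = syscall_addr;
    ctx->dr1 = syscall_addr + 2;
    /* The R/W and LEN fields (bits 16 and up) are left untouched;
       zero in those fields selects break-on-execute. */
    ctx->dr7 |= dr7_local_enable(0) | dr7_local_enable(1);
}

/*
 * Handler #2, part 1: if the event is a single-step exception raised at the
 * address held in Dr0, copy the context into `saved` and return 1.
 * Any other event is left alone and 0 is returned.
 */
static int save_on_syscall_hit(const struct exception_event *ev,
                               const struct cpu_context *ctx,
                               struct cpu_context *saved)
{
    if (ev->code != SINGLE_STEP_CODE || ev->address != ctx->dr0)
        return 0;
    *saved = *ctx;
    return 1;
}
```

Setting only bits 0 and 2 yields a Dr7 value of 0x5 when the register starts at zero. In a real handler these fields would live in ExceptionInfo->ContextRecord and ExceptionInfo->ExceptionRecord rather than in the stand-in structures used here.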
This allows us to query the arguments, which under Microsoft’s x64 calling convention are passed in RCX, RDX, R8, and R9, and also allows us to use the RSP register to reach the rest of the arguments, as explained later. Once the context is stored, we can change the RIP to point to our demo function; in