Bypassing CrowdStrike Falcon and MDE


For the past couple of months, I have been diving into malware development in C with the goal of better understanding how offensive security professionals get past security solutions like EDR. Endpoint Detection and Response (EDR) solutions are critical for protecting organizations from malicious activities.

In this post, the goal is not to go in depth and provide the exact code I used to bypass different security solutions. Instead, I want to share my journey in malware development and help guide others in researching the specific methods I used to evade these products.



What is EDR?

Endpoint Detection and Response (EDR) products are advanced security solutions that monitor for suspicious activity and respond to threats on endpoints.

They consist of two parts: a user-mode application and a kernel-mode driver. Both parts gather information by collecting data on events such as system logs, network traffic, inter-process communications (IPCs), RPC calls, authentication attempts, and user activity. EDRs also analyze and correlate this data, and can automatically respond to threats by isolating endpoints or terminating suspicious processes.

I would also like to add that the antivirus, or Next-Gen AV (NGAV) as they like to call it, ships with and takes advantage of the EDR telemetry, but you aren’t really “bypassing EDR” unless you mute telemetry or find a way that doesn’t generate the logs. AV alerts are not the same as EDR telemetry.

When most people talk about bypassing EDR, they are really referring to the NGAV. With that being said, throughout this blog I am bypassing the NGAV: my payloads are not killed and do not generate any AV alerts, but the EDR telemetry still exists for Blue Teamers to use.

Lastly, all the data and alerts collected are sent and logged in a dashboard or console which can be used for monitoring and threat hunting.

dashboard.png


How EDRs Detect You

EDRs can monitor processes using the methods below:

Bypassing EDRs requires multiple techniques because they use multiple methods to monitor processes. It is also important to know that applying bypass techniques may let your loader avoid detection, but not the C2 payload in use. This could be due to well-known anomalies associated with the C2: it executes a noisy command such as spawning cmd.exe, it uses recognizable indicators like Cobalt Strike’s default pipe names, or it loads Sliver’s AMSI bypass.


Bypass Techniques

I decided to perform my testing on CrowdStrike Falcon and Microsoft Defender for Endpoint (MDE). I didn’t need a bypass technique for every method mentioned in the previous section, so the techniques I talk about below are specifically the ones I used to bypass these products.

Technique 1 - Custom Functions & Dynamic Linking

Using custom functions and dynamically linking functions is critical to the success of your loader. You do not want to have suspicious imported functions in your binary’s IAT because EDRs like Falcon and Elastic are extremely sensitive to this. They will see imports associated with common TTPs and flag your binary instantly. This also helps prevent you from potentially executing hooked functions. I used a custom GetProcAddress to manually retrieve the addresses of different functions in ntdll.dll, and a custom GetModuleHandle to retrieve a handle to ntdll.dll, so that I don’t have to execute those hooked functions. Implementing the functions manually also lets you better understand how they work.

For a custom GetModuleHandle, you can get the first item in the linked list containing all of the DLLs loaded in your process by going through PEB→Ldr to obtain the PEB_LDR_DATA structure. This structure contains the InMemoryOrderModuleList.Flink member, which gives you a pointer to the first element of the doubly-linked list. This pointer points to an LDR_DATA_TABLE_ENTRY structure, which you can use to get the next item in the list through the InInitializationOrderLinks.Flink member if defined, or the Reserved2[0] member of the structure if undefined. Keep looping until the FullDllName.Buffer member of the structure matches the name of the DLL you want a handle to.

For a custom GetProcAddress, it takes a bit more effort. You have to get the IMAGE_DOS_HEADER structure, then from there get the IMAGE_NT_HEADERS structure. After that you can obtain the IMAGE_OPTIONAL_HEADER structure. This structure lets you retrieve the export data directory of a DLL and loop through the addresses, names, and ordinals to retrieve the function you want.

Lastly I suggest using LdrLoadDll found in ntdll.dll to load a DLL instead of LoadLibraryA since that is one less thing that you need to import or dynamically link.

Note:
Make sure that you do not hardcode strings for the target functions or DLLs and instead use hashes because the strings could get flagged. If you need to use strings, like in the case of LdrLoadDll, then have it encrypted, and decrypt it at runtime right before passing the string into LdrLoadDll.

Technique 2 - Indirect Syscalls

EDR solutions will often perform a technique called API inline hooking, where they replace bytes of an API function so that their own code performs some additional actions before the original function is called. EDRs do this to analyze what you are passing into functions. MDE doesn’t do userland API hooking, though Falcon does, which is why I had to use indirect syscalls. Falcon will place a jmp instruction in the NT functions to point to their own code, instead of the typical mov eax, <ssn here>. Falcon will also scramble the NT functions so that they are in a different order than they would typically be, making it more difficult to just hardcode the SSN or use a simpler solution for retrieving it.

hooks.png

To avoid calling hooked functions, we can use the Direct Syscalls technique to perform the argument passing into the registers and execute the syscall instruction ourselves so that we don’t execute the hooked NT functions. However there are IoCs associated with this like having the syscall instruction hardcoded in your code, and having syscalls executed in the memory region of your process instead of Ntdll.dll.

Indirect Syscalls are better because we can perform all the preparation and argument passing ourselves, and then redirect our instruction pointer to a syscall instruction in Ntdll with the jmp instruction. Since we already passed the data into the registers, everything will execute perfectly fine. To use Indirect Syscalls in your project, make a .asm file that contains a syscall stub for the NT functions you want to use, and call the indirect syscall function in your C code instead of the normal NT function like NtAllocateVirtualMemory.

Here’s an example syscall stub to put in your code:

.code
extern SCAddr: DWORD            ; Declare the function in the C file to fetch the address to a syscall instruction and save it to SCAddr
extern NtAVMssn: DWORD          ; Declare the function in the C file to retrieve the SSN for each NT function you want to execute, and save it to NtAVMssn

MyNtAllocateVirtualMemory PROC
	mov r10, rcx                  ; Save registers.
	mov eax, NtAVMssn             ; Move our target functions SSN into the eax register
	jmp qword ptr [SCAddr]        ; Jump to the syscall instruction for indirect syscalls
	ret
MyNtAllocateVirtualMemory ENDP

I personally used Indirect Syscalls for allocating memory, writing memory, execution delay, unhooking, etc. Also quick tip: Falcon picks up on SysWhispers3 generated code while MDE doesn't. So that's why I had to do it manually in order to bypass Falcon.

Technique 3 - Anti-Sandbox

Anti-sandbox techniques are a malware author’s attempts to detect isolated malware analysis environments. Common techniques include checking the sandbox’s hardware specs, time fast-forwarding, specific installed software, the number of running processes, the hostname, CPU temperature, user activity such as mouse clicks, open web browser tabs, etc. A friend of mine and I put as many public techniques as we could find and implement into a single binary and uploaded it to VirusTotal to see how the different techniques fared. Almost none of the public techniques worked against the majority of the sandboxes; we found only 1 that did. I decided to dive deeper on my own, while my friend also dug deeper on his own for work. I came across a very low-level function for detecting VMs that I could only find barely documented on a single random website, but I got it to work flawlessly. The downside is that detecting only a VM isn’t a good idea: what if clients are running thin clients and connecting to a VDI?

Digging Deeper

I decided to dig deeper into user actions as a way to identify a sandbox. I came across a pretty unique way to identify user interaction through specific processes, and if user activity was found, it would write a file to disk as proof. It worked against every sandbox on VirusTotal: no files were written there, but the files were written on my own machines since I was actively using them. It was beautiful. I will not be sharing this technique yet 😔. Maybe soon! 😃

Takeaways

I highly recommend making a binary filled with different anti-sandbox techniques and uploading it to VirusTotal so you can see what sandboxes look and act like. This will help you find techniques that work and discover new ones!

Technique 4 - Unhooking DLLs

As discussed in Technique 2 - Indirect Syscalls, EDRs will hook NT functions in ntdll.dll and other functions in other DLLs to see what you are doing. What we did earlier to evade EDR was perform indirect syscalls. However, it isn’t easy or convenient to implement indirect syscalls for absolutely every function you want to execute, and some of them, like LdrLoadDll, do not have syscalls at all. Additionally, once we load the shellcode for our favorite C2, chances are it will not implement indirect syscalls either and will use the normal WinAPI functions, getting our payload caught instantly. To circumvent this, we can perform something called DLL unhooking. This is where we replace the entire .text section (the actual executable code) of our hooked DLLs with an unhooked version. The unhooked .text sections can come from the internet, suspended processes, mapping the DLL into memory, or reading the bytes from disk.

dlls.png

This is a good picture from ired.team. ired also shows some example code for the implementation.

One thing to note is that if you are reading the bytes from disk (CreateFileA then ReadFile), the offset of the text section from the base address will typically be 1024 bytes. If you are mapping the DLL from disk (CreateFileA, CreateFileMapping, MapViewOfFile) then the module will be loaded in memory and the offset of the text section from the base address will typically be 4096 bytes. You can also obtain this offset programmatically from the BaseOfCode (and SizeOfCode for its size) members of the IMAGE_OPTIONAL_HEADER structure, which is a member of the IMAGE_NT_HEADERS structure, which you locate through the e_lfanew member of the IMAGE_DOS_HEADER structure. You can also get the text section's size from the Misc.VirtualSize member of its IMAGE_SECTION_HEADER structure; the section headers sit immediately after the NT headers.
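To make the difference between the on-disk offset and the in-memory offset concrete, here is a minimal, portable sketch that parses a PE file held in a buffer and reports both values for the .text section. The names text_section_info and find_text_section are made up for this illustration, and the field offsets are taken from the published PE format rather than from any code used in my loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    uint32_t virtual_address; /* offset from the base once mapped as an image */
    uint32_t virtual_size;    /* size of the section in memory */
    uint32_t raw_offset;      /* offset from the start of the file on disk */
    uint32_t raw_size;        /* size of the section in the file */
} text_section_info;

/* Little-endian readers so the parser does not depend on host alignment. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out if a .text section header is found, else 0. */
int find_text_section(const uint8_t *file, size_t size, text_section_info *out)
{
    assert(file != NULL && out != NULL);

    /* IMAGE_DOS_HEADER: "MZ" magic, e_lfanew stored at offset 0x3C. */
    if (size < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return 0;
    size_t nt = rd32(file + 0x3C);

    /* IMAGE_NT_HEADERS: 4-byte signature plus 20-byte file header. */
    if (nt > size || size - nt < 24 || memcmp(file + nt, "PE\0\0", 4) != 0)
        return 0;
    uint16_t section_count = rd16(file + nt + 6);
    uint16_t optional_size = rd16(file + nt + 20);

    /* Section headers (40 bytes each) follow the optional header. */
    size_t sec = nt + 24 + optional_size;
    for (uint16_t i = 0; i < section_count; i++, sec += 40) {
        if (sec > size || size - sec < 40)
            return 0;
        if (memcmp(file + sec, ".text\0\0\0", 8) == 0) {
            out->virtual_size    = rd32(file + sec + 8);
            out->virtual_address = rd32(file + sec + 12);
            out->raw_size        = rd32(file + sec + 16);
            out->raw_offset      = rd32(file + sec + 20);
            return 1;
        }
    }
    return 0;
}
```

For a DLL where the text section starts 1024 bytes into the file and 4096 bytes into the mapped image, raw_offset would be 1024 and virtual_address would be 4096. Reading the values from the headers avoids relying on those numbers being the same for every DLL and build.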

Technique 5 - Drip Allocation

This is a method of allocating the memory your shellcode will be written into. Typically, what we see online is people allocating one massive memory region for their shellcode, changing its permissions to RWX, and then executing it by creating a thread. This is terrible. It is a super common technique, and EDRs look for this common chain of events. We can bypass event-based detections by switching it up. Drip allocation is a thing I heard of from DripLoader. I didn’t look at DripLoader’s code, but the idea is simple: slowly allocate 4 KB pages (the smallest possible size) in a for loop with a delay at the end of each iteration. Then, in the next loop (or the same one), write your shellcode 4 KB at a time. This will break that chain of events.

Technique 6 - Payload Decryption

Up to this point, our C2’s shellcode should have been encrypted in our loader to prevent having our memory region scanned and the shellcode picked up instantly. Before we execute our shellcode, we need to decrypt it. Some people implement the decryption routine in their own code, but that’s unnecessary. Thankfully the Windows OS comes with many system functions for encryption/decryption. SystemFunction032 and SystemFunction033 are specifically for RC4. To use them, all we have to do is pass the base address and size of our memory region along with the key, and the region will get decrypted for us, no decryption routine of our own necessary! This will largely cut down on the size of the binary.

We can utilize SystemFunction032 to RC4 encrypt/decrypt a memory region, specifically the region containing our shellcode. From there we can modify our memory region to RX and then execute it.
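To see why a single function can handle both encryption and decryption, here is a minimal textbook RC4 in portable C. It is only a reference sketch of the cipher itself, not the Windows implementation, and the name rc4_crypt is made up for this illustration. RC4 XORs the buffer with a keystream derived from the key, so running it twice with the same key restores the original bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Textbook RC4: transforms buf in place. Calling it a second time with the
 * same key undoes the first call, because XOR with the same keystream is
 * its own inverse. */
void rc4_crypt(uint8_t *buf, size_t len, const uint8_t *key, size_t key_len)
{
    uint8_t S[256];
    uint8_t tmp;

    assert(buf != NULL && key != NULL && key_len > 0);

    /* Key-scheduling algorithm: permute S according to the key. */
    for (int k = 0; k < 256; k++)
        S[k] = (uint8_t)k;
    uint8_t j = 0;
    for (int k = 0; k < 256; k++) {
        j = (uint8_t)(j + S[k] + key[(size_t)k % key_len]);
        tmp = S[k]; S[k] = S[j]; S[j] = tmp;
    }

    /* Pseudo-random generation: XOR each byte with the next keystream byte. */
    uint8_t i = 0;
    j = 0;
    for (size_t n = 0; n < len; n++) {
        i = (uint8_t)(i + 1);
        j = (uint8_t)(j + S[i]);
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        buf[n] ^= S[(uint8_t)(S[i] + S[j])];
    }
}
```

The widely published test vector with key "Key" and input "Plaintext" produces the bytes BB F3 16 E8 D9 40 AF 0A D3, and a second call with the same key returns "Plaintext".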

Example Code

typedef struct {
	DWORD	Length;         // Size of the data to encrypt/decrypt
	DWORD	MaximumLength;  // Max size of the data to encrypt/decrypt, although often it's the same as Length (USTRING.Length = USTRING.MaximumLength = X)
	PVOID	Buffer;         // The base address of the data to encrypt/decrypt
} USTRING;

typedef NTSTATUS(WINAPI* fnSystemFunction032)(USTRING* data, USTRING* key);

fnSystemFunction032 pSystemFunction032 = (fnSystemFunction032)GetProcAddress(LoadLibraryA("advapi32.dll"), "SystemFunction032");
 
unsigned char Payload[] = { shellcode here };
DWORD PayloadSize = sizeof(Payload);
unsigned char Rc4Key[] = { key here };
DWORD KeySize = sizeof(Rc4Key);

USTRING Data = {
	.Length         = PayloadSize, // Parameter name from function
	.MaximumLength  = PayloadSize, // Typically same as length
	.Buffer         = Payload
}; // pass our shellcode and its size into the USTRING structure

USTRING	Key = {
	.Length         = KeySize,
	.MaximumLength  = KeySize, // same as length
	.Buffer         = Rc4Key
}; // Pass our key and its size into the USTRING structure
 
pSystemFunction032(&Data, &Key); // Pass it in here. The shellcode in Data will be decrypted.

Note that if you are going to use a custom version of GetProcAddress and GetModuleHandle, you will need to target Cryptsp.dll instead, since SystemFunction032 in Advapi32.dll is a forwarded function pointing to SystemFunction032 in Cryptsp.dll.

Technique 7 - Payload Execution

Next up is payload execution. Online, I typically see people create a thread and point it directly at their decrypted code. The chain of events of allocating memory → writing shellcode → creating a thread to execute it is extremely common and is picked up very easily. Additionally, tools like Sysmon log when threads are created. It is best to execute your payload in a less suspicious way. Some examples of code execution are using APCs, creating function pointers, hijacking the RIP register of a created sacrificial thread, callback functions, and fibers.

I personally used callback functions to execute my payload, though there are many other methods that work just fine.

Technique 8 - Removing Unnecessary Imports & Data Entropy

When we compile our program in Visual Studio, it will by default have the CRT library linked to it. The CRT library exports lots of functions for your binary to use that make life easy. Things like rand, printf, memcpy, malloc, sprintf_s, etc.

imports.png

You can view the imports of your binary using: dumpbin /imports DirectSyscalls.exe or by using pestudio which will show you a good amount of information, and even highlight what imports look bad and what MITRE Technique ID they have associated with them. I found this particularly useful.

pestudio.png

A lot of these functions I don’t even use, yet here they are, making my binary more suspicious. To get past this, you need to remove the CRT library, which will get rid of all the DLLs and unused functions from your Import Address Table (IAT). The downside to this is that you need to make sure you have custom implementations of the removed functions. To be real, Google and ChatGPT are your best friends here. They can cook something up that you can slightly modify to get working with your code quickly.

I also suggest removing debug information, and remember to disable security checks, because you will get errors if you leave security checks enabled with the CRT library removed.

Additionally, I wanted to compare Falcon and MDE here. Falcon is a lot more sensitive to entropy than MDE. For example, I stored my shellcode in the resource section of my binary and Falcon flagged it immediately, resulting in me having to stage the payload. However, MDE didn’t care when I had 100 KB of encrypted shellcode in my resource section. As far as it was concerned, there was absolutely nothing suspicious going on here!
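To make "entropy" concrete: it is usually measured as Shannon entropy in bits per byte, where 0 means every byte is identical and 8 means every byte value is equally likely, which is what encrypted or compressed data looks like. The sketch below is a generic calculation and not how Falcon or MDE actually score a file; the names log2_unit and shannon_entropy are made up for this illustration, and the small log2 helper exists only so the example builds without linking the math library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* log2(x) for x in (0, 1], computed without libm: scale x into [1, 2) to
 * get the integer part, then extract fractional bits by repeated squaring. */
static double log2_unit(double x)
{
    assert(x > 0.0 && x <= 1.0);
    double result = 0.0;
    while (x < 1.0) {
        x *= 2.0;
        result -= 1.0;
    }
    double bit = 0.5;
    for (int i = 0; i < 40; i++) { /* 40 fractional bits of precision */
        x *= x;
        if (x >= 2.0) {
            x /= 2.0;
            result += bit;
        }
        bit /= 2.0;
    }
    return result;
}

/* Shannon entropy of a buffer in bits per byte (0.0 to 8.0). */
double shannon_entropy(const uint8_t *data, size_t len)
{
    size_t counts[256] = {0};
    double entropy = 0.0;

    if (len == 0)
        return 0.0;
    assert(data != NULL);

    /* Count how often each byte value occurs. */
    for (size_t i = 0; i < len; i++)
        counts[data[i]]++;

    /* Sum -p * log2(p) over the byte values that occur. */
    for (int b = 0; b < 256; b++) {
        if (counts[b] == 0)
            continue;
        double p = (double)counts[b] / (double)len;
        entropy -= p * log2_unit(p);
    }
    return entropy;
}
```

Running this over each section of a binary shows which one stands out. Ordinary code and text score well below 8, while a large encrypted blob sits close to it.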


Wrapping Up

The combination of these techniques resulted in me bypassing both Microsoft Defender for Endpoint and CrowdStrike Falcon for initial access and obtaining command execution for simple commands without generating any alerts.

Of course I still have a lot to learn as I dive deeper into malware development but this has been a great journey so far. I also need to develop my own post-exploitation tools to avoid detection when trying to perform extremely malicious actions like dumping LSASS.

I really hope this blog was able to explain some topics at a high level, and maybe it also helps someone understand the different types of techniques necessary to bypass these security solutions.