Introduction

In malware development, evasion often matters more than functionality. A beacon with advanced capabilities is useless if it gets flagged and killed immediately. One of the trickiest challenges is implementing execution delays. Traditional sleep mechanisms like NtDelayExecution() and NtWaitForSingleObject() are heavily monitored by EDRs. These syscalls are low-hanging fruit for security products: call them and you’ve essentially announced your presence.

In this post, we’ll explore a simple technique to delay execution without touching the usual syscalls. No NtDelayExecution(). No NtWaitForSingleObject(). Just a method that flies under the radar while giving us the control we need.

Why Does My Beacon’s Nap Keep Getting Interrupted by EDRs?

The first thing that raises suspicion for EDRs and memory scanning tools is the state of the thread. To better understand why, we need to look at what these syscalls actually do.

Both syscalls put the thread into a waiting state, but for different reasons: NtWaitForSingleObject() waits for an object to be signaled, while NtDelayExecution() waits for a timer to expire.

  1. Syscall Transition & Thread State Change
    The thread transitions from user mode to kernel mode through the syscall instruction, and the kernel dispatches the call through the SSDT. The thread’s state in the KTHREAD structure changes from Running to Waiting. For NtWaitForSingleObject(), the thread is added to the object’s wait list. For NtDelayExecution(), a kernel timer associated with the thread is armed. In both cases, the scheduler takes the thread off the CPU and will not schedule it again until the wait is satisfied.

  2. Context Switch & Reactivation
    A context switch occurs: the CPU switches to another ready thread, and the waiting thread consumes no CPU cycles. Once the wait condition is satisfied (object signaled or timer expired), the kernel marks the thread as Ready and places it back in the scheduler’s queue. When the thread is next scheduled, execution resumes in user mode.

Why EDRs Love These:
Both syscalls force the thread into a Waiting state, which is easily detectable. EDRs can enumerate threads, inspect their states, and flag any thread sitting in a wait state with suspicious timing patterns. Combined with heavy hooking on these syscalls, traditional sleep mechanisms are extremely easy to detect.

Now that we know how EDRs catch us, we can figure out a way to avoid detection. We’ll do this by creating our own delay execution routine that doesn’t change the thread’s state or rely on system calls.

The Solution: Using KUSER_SHARED_DATA

The simplest way to delay execution without calling suspicious functions is to use the system time, which we can read without making any API calls. The trick is the KUSER_SHARED_DATA structure, a special read-only memory page that Windows maps into every running process.

This structure contains system information like performance counters and version details, but what we care about is the system time. The best part? KUSER_SHARED_DATA lives at a fixed memory address, so we always know where to find it and its data.

Reading System Time

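The system time sits at offset 0x14 of KUSER_SHARED_DATA, which puts it at address 0x7FFE0014 in both 32-bit and 64-bit processes. (Address 0x7FFE0008 holds a different field, the interrupt time, which tracks roughly the time since boot rather than the wall-clock time.) The value is stored as a KSYSTEM_TIME structure whose first 8 bytes form a 64-bit number in FILETIME format (100-nanosecond ticks since January 1, 1601 UTC). The kernel refreshes it on every clock tick.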
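As a rough sketch of how that field is laid out and decoded, the C below mirrors the three-part KSYSTEM_TIME structure from the Windows headers. The type name ksystem_time_t and the helpers read_system_time() and filetime_to_unix_seconds() are hypothetical names chosen for illustration, not part of any Windows API. The retry loop is the usual way to get a consistent snapshot when the value is read as two 32-bit halves; on 64-bit builds a single 8-byte read is normally enough.

```c
#include <stdint.h>

/* Same field layout as KSYSTEM_TIME in the Windows headers
   (the type name here is made up for this sketch). */
typedef struct {
    uint32_t LowPart;    /* low 32 bits of the tick count */
    int32_t  High1Time;  /* high 32 bits */
    int32_t  High2Time;  /* second copy of the high 32 bits */
} ksystem_time_t;

/* Hypothetical helper: assemble a consistent 64-bit tick count.
   If the two high copies differ, an update was in progress, so retry. */
static uint64_t read_system_time(const volatile ksystem_time_t *t)
{
    int32_t  high;
    uint32_t low;

    do {
        high = t->High1Time;
        low  = t->LowPart;
    } while (high != t->High2Time);

    return ((uint64_t)(uint32_t)high << 32) | low;
}

/* Hypothetical helper: FILETIME ticks (100 ns since 1601) to Unix seconds.
   11644473600 is the number of seconds between 1601-01-01 and 1970-01-01. */
static uint64_t filetime_to_unix_seconds(uint64_t ticks)
{
    return ticks / 10000000ULL - 11644473600ULL;
}
```

On Windows, the pointer passed to read_system_time() would be (const volatile ksystem_time_t *)0x7FFE0014. Taking the pointer as a parameter keeps the decoding logic checkable against a local test value on any platform.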

Why This Works:

  • No syscalls: we’re just reading memory directly
  • No wait state: the thread never enters Waiting; it is either Running or, when preempted, Ready
  • Nothing for EDRs to hook: we’re just reading a number from memory
  • Looks completely normal: processes read this shared page all the time

Implementation

We’ll use assembly for this because it gives us direct control over CPU registers, whereas compilers often generate extra code that uses the stack. But don’t worry: the same technique works in C/C++, Rust, or any language that lets you read from a specific memory address.

How the Delay Works

The idea is simple:

  • Take the number of seconds we want to wait
  • Convert it to Windows time format
  • Add it to the current time to get our “stop time”
  • Keep checking the current time until we reach the stop time

In x64 NASM, with the number of seconds passed in rcx (Windows x64 calling convention):

section .text
    global DelayExecution

DelayExecution:
    mov rax, 0x989680       ; 10,000,000 = number of 100-ns ticks in one second
    mul rcx                 ; rax = seconds (first argument, in rcx) * ticks per second; clobbers rdx
    mov rcx, 0x7FFE0014     ; address of KUSER_SHARED_DATA.SystemTime
    add rax, [rcx]          ; stop time = current time + wait time

.wait:                      ; local label ("loop" is an instruction mnemonic, so it can't be a label name)
    cmp rax, [rcx]          ; compare the stop time against the current time
    ja .wait                ; stop time still ahead of current time: keep looping
    ret                     ; stop time reached: return

Here’s the same thing in C:

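void DelayExecution(unsigned int seconds) {
    /* KUSER_SHARED_DATA.SystemTime; volatile forces a fresh read on every pass,
       otherwise the compiler may read it once and spin forever */
    volatile unsigned long long *CurrentTime = (volatile unsigned long long *)0x7FFE0014;

    /* Widen before multiplying so delays above ~214 seconds don't overflow a 32-bit int */
    unsigned long long StopTime = *CurrentTime + (unsigned long long)seconds * 0x989680ULL;

    while (*CurrentTime < StopTime) {
        /* spin until the system time reaches the stop time */
    }
}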
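To make the arithmetic behind those steps concrete, here is a small sketch that separates the stop-time calculation from the memory read so it can be checked on its own. The names TICKS_PER_SECOND, stop_time() and deadline_reached() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* 100-ns ticks in one second; the same constant as 0x989680 above. */
#define TICKS_PER_SECOND 10000000ULL

/* Hypothetical helper: convert the delay to ticks and add it to "now".
   The cast widens seconds to 64 bits before the multiplication. */
static uint64_t stop_time(uint64_t now_ticks, uint32_t seconds)
{
    return now_ticks + (uint64_t)seconds * TICKS_PER_SECOND;
}

/* Hypothetical helper: true once the current time has reached the stop time. */
static int deadline_reached(uint64_t now_ticks, uint64_t stop_ticks)
{
    return now_ticks >= stop_ticks;
}
```

For example, a 5-second delay is 50,000,000 ticks. A 300-second delay is 3,000,000,000 ticks, which no longer fits in a signed 32-bit integer, so the multiplication has to happen in 64 bits.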

Making It Better

This technique works, but we can make it harder to detect:

Add Junk Code:
Instead of just looping and checking time, throw in some useless calculations or function calls between checks. This makes it look less obvious to anyone analyzing the code and confuses automated detection tools.

Skip Time Entirely:
Another option is to forget about time and just run slow operations instead. Things like multiplying huge matrices or doing complex math problems naturally take time to finish. The downside? You can’t control exactly how long it takes, because that depends on the CPU and system load.

Both approaches make your delay look more like normal program behavior and less like a beacon sleeping.