
Frida Stalker — Technical Reference

Source: frida-gum/gum/backend-arm64/gumstalker-arm64.c
Supported architectures: AArch64 (ARM64), Intel 64 (x86-64), IA-32
Primary use platforms: Android, iOS (AArch64); Linux, macOS, Windows (x86-64/IA-32)


Architecture Overview

Stalker operates on one basic block at a time:

  1. A block starting at real_address is read by GumArm64Relocator
  2. An instrumented copy is written to a slab by GumArm64Writer
  3. Branch/return instructions at the end of the block are virtualized — replaced with code that re-enters Stalker via an entry gate
  4. The entry gate calls gum_exec_ctx_replace_current_block_with() to instrument the next block
  5. Previously instrumented blocks are cached in a hashtable keyed on real_address; on cache hit, control jumps directly to the instrumented copy (subject to trustThreshold)

Prerequisite knowledge: Capstone disassembler (cs_insn), GumArm64Writer/GumArm64Relocator APIs, AArch64 calling conventions (AAPCS64), AArch64 Link Register (X30/LR) semantics.
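The lookup-or-instrument cycle above can be sketched as a toy cache keyed on real_address. All names here are illustrative stand-ins, not Frida API; the placeholder "instrumented copy" is just a distinct pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of Stalker's block cache: instrumented copies are keyed on
 * the block's real_address; a hit skips re-instrumentation. */
#define CACHE_CAPACITY 64

typedef struct {
  const void *real_address;   /* key: start of original block */
  void       *code_address;   /* value: instrumented copy in a slab */
} CacheEntry;

static CacheEntry cache[CACHE_CAPACITY];
static size_t cache_len = 0;
static size_t instrument_count = 0;  /* how many blocks were (re)compiled */

/* Stand-in for relocator + writer: pretend the instrumented copy is just
 * a pointer derived from the original address. */
static void *instrument_block(const void *real_address) {
  instrument_count++;
  return (void *)((uintptr_t)real_address + 1);  /* placeholder copy */
}

/* Model of gum_exec_ctx_replace_current_block_with(): return the
 * instrumented copy, compiling it only on a cache miss. */
void *obtain_block_for(const void *real_address) {
  for (size_t i = 0; i != cache_len; i++) {
    if (cache[i].real_address == real_address)
      return cache[i].code_address;       /* cache hit */
  }
  void *code = instrument_block(real_address);
  if (cache_len < CACHE_CAPACITY)
    cache[cache_len++] = (CacheEntry){ real_address, code };
  return code;
}
```

The real hashtable also re-validates cached blocks against trustThreshold (see Configuration Options); this sketch only shows the hit/miss path.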


Entry Points

Function | Description
gum_stalker_follow_me(self, transformer, sink) | Follow the current thread; uses LR to find the start address
gum_stalker_follow(self, thread_id, transformer, sink) | Follow another thread; uses ptrace/gum_process_modify_thread() to inject
gum_exec_ctx_replace_current_block_with(ctx, start_address) | Re-enter Stalker to instrument the next block (called by entry gates)

gum_stalker_follow_me — Assembly Bootstrap (AArch64)

gum_stalker_follow_me:
  stp x29, x30, [sp, -16]!    ; save FP and LR
  mov x29, sp
  mov x3, x30                 ; pass original LR as 4th arg (return address)
  bl  _gum_stalker_do_follow_me
  ldp x29, x30, [sp], 16
  br  x0                      ; branch to instrumented entry point returned in X0

Following Another Thread (gum_stalker_follow)

void gum_stalker_follow(GumStalker *self, GumThreadId thread_id,
                        GumStalkerTransformer *transformer, GumEventSink *sink);

CPU Context Structure (AArch64)

typedef GumArm64CpuContext GumCpuContext;

struct _GumArm64CpuContext {
  guint64 pc;
  guint64 sp;       /* X31 */
  guint64 x[29];
  guint64 fp;       /* X29 — frame pointer */
  guint64 lr;       /* X30 */
  guint8  q[128];   /* FPU/NEON/CRYPTO (SIMD) registers */
};

JavaScript API

Stalker.follow([threadId, options])   // start stalking threadId (or current thread)
Stalker.unfollow([threadId])
Stalker.exclude(range)                // { base, size } — exclude a memory range
Stalker.parse(events)                 // parse raw binary event buffer → JS array of tuples
Stalker.addCallProbe(address, callback[, data])  // add probe for a function address
Stalker.removeCallProbe(id)
Stalker.trustThreshold                // integer property (default: 1)

Options Object

{
  events: {
    call: false,       // emit GUM_CALL event on each call instruction
    ret: false,        // emit GUM_RET event on each return
    exec: false,       // emit GUM_EXEC event on each instruction
    block: false,      // emit GUM_BLOCK event when a block is executed
    compile: false,    // emit GUM_COMPILE event when a block is instrumented
  },
  onReceive(events) {},       // raw binary blob, parse with Stalker.parse()
  onCallSummary(summary) {},  // aggregated { address: callCount } map (more efficient)
  transform(iterator) {},     // custom transformer (see Transformer section)
  data: ptr("0x...")          // user data passed to C transformer/callout
}

Configuration Options

trustThreshold

Controls how many times a block must execute unchanged before it is re-used without re-comparison.

Value | Behavior
-1 | Never trust; always re-instrument (slowest)
0 | Trust immediately from first execution
N (default: 1) | Trust after N consecutive executions with identical bytes
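The decision can be modeled as a small function over the block's snapshot and recycle counter (cf. GumExecBlock's real_snapshot and recycle_count). This is a sketch with invented names, not the actual Frida code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model of the trustThreshold decision. */
typedef struct {
  const unsigned char *real_begin;    /* live original code */
  size_t               real_size;
  unsigned char        snapshot[64];  /* copy taken at instrumentation time */
  int                  recycle_count;
} Block;

/* Returns true when the cached copy may be reused without recompiling. */
bool block_may_reuse(Block *block, int trust_threshold) {
  if (trust_threshold < 0)
    return false;                     /* -1: never trust, always recompile */
  if (block->recycle_count >= trust_threshold)
    return true;                      /* already trusted: no comparison */
  if (memcmp(block->snapshot, block->real_begin, block->real_size) != 0) {
    block->recycle_count = 0;         /* bytes changed: start over */
    return false;
  }
  block->recycle_count++;             /* one more identical execution */
  return block->recycle_count >= trust_threshold;
}
```

Note the key property: once trusted, the byte comparison is skipped entirely, which is where the speedup comes from (and why self-modifying code needs a low or negative threshold).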

Stalker.exclude(range)

const libc = Process.getModuleByName('libc.so');
Stalker.exclude({ base: libc.base, size: libc.size });

Memory: Slabs

Instrumented code is stored in 4 MB slabs (GUM_CODE_SLAB_MAX_SIZE = 4 * 1024 * 1024).

struct _GumSlab {
  guint8    *data;         // tail start (after header)
  guint      offset;       // current write position in tail
  guint      size;         // usable tail size
  GumSlab   *next;         // singly-linked list of slabs
  guint      num_blocks;
  GumExecBlock blocks[];   // zero-length array; actual entries in header region
};

Block Allocation

#define GUM_EXEC_BLOCK_MIN_SIZE 1024  // minimum bytes required before allocating a new slab
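The allocation policy amounts to a bump allocator over chained slabs: when the remaining tail drops below the minimum, a fresh slab is chained in. A simplified model (the real allocator also reserves header space for GumExecBlock entries and never hands out more than the tail holds):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy bump allocator over a chain of slabs (cf. GumSlab). */
typedef struct Slab Slab;
struct Slab {
  unsigned char *data;
  size_t         offset;   /* current write position in tail */
  size_t         size;     /* usable tail size */
  Slab          *next;
};

static Slab *slab_new(size_t size) {
  Slab *s = malloc(sizeof(Slab));
  s->data = malloc(size);
  s->offset = 0;
  s->size = size;
  s->next = NULL;
  return s;
}

/* Reserve `wanted` bytes; opens a fresh slab when the remainder of the
 * current one drops below `min_size` (cf. GUM_EXEC_BLOCK_MIN_SIZE). */
unsigned char *slab_reserve(Slab **head, size_t wanted, size_t min_size,
                            size_t slab_size) {
  if (*head == NULL || (*head)->size - (*head)->offset < min_size) {
    Slab *fresh = slab_new(slab_size);
    fresh->next = *head;              /* newest slab at the front */
    *head = fresh;
  }
  unsigned char *p = (*head)->data + (*head)->offset;
  (*head)->offset += wanted;
  return p;
}
```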

GumExecBlock Fields

struct _GumExecBlock {
  GumExecCtx    *ctx;
  GumSlab       *slab;
  guint8        *real_begin;      // start of original code
  guint8        *real_end;        // end of original code
  guint8        *real_snapshot;   // copy of original bytes (begins at code_end; used for trust comparison)
  guint8        *code_begin;      // start of instrumented copy
  guint8        *code_end;        // end of instrumented copy
  GumExecBlockFlags flags;
  gint           recycle_count;   // trust threshold counter
};

Layout in slab tail per block: [instrumented code][original snapshot][BRK #14 debug marker]


Helpers (Inline Code Fragments)

Six helper functions are written into each slab (or reused from a nearby slab within ±128 MB):

Helper | Function
last_prolog_minimal | Save caller-saved registers (minimal context)
last_epilog_minimal | Restore caller-saved registers
last_prolog_full | Save all registers matching GumArm64CpuContext layout
last_epilog_full | Restore all registers
last_stack_push | Push GumExecFrame onto side-stack
last_stack_pop_and_go | Pop frame and branch to instrumented return target

Helpers are called with direct BL (±128 MB range). If the new slab is beyond 128 MB from existing helpers, fresh copies are written into the new slab.
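The reachability check behind that reuse decision is simple: AArch64's BL encodes a signed 26-bit word offset, giving ±128 MB of reach at 4-byte granularity. A sketch of the test (illustrative names):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Can a direct BL at `from` reach a helper at `to`? BL's immediate is a
 * signed 26-bit count of 4-byte words: +/-128 MB, word-aligned. */
bool bl_can_reach(uint64_t from, uint64_t to) {
  int64_t distance = (int64_t)(to - from);
  if (distance % 4 != 0)
    return false;                       /* instructions are word-aligned */
  return distance >= -(int64_t)(128 * 1024 * 1024) &&
         distance <   (int64_t)(128 * 1024 * 1024);
}
```

When this check fails for a newly allocated slab, Stalker has no choice but to emit fresh helper copies into that slab.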


Context Save/Restore

Context Types

Type | Registers Saved | When Used
GUM_PROLOG_MINIMAL | X0–X18, X29, X30, Q0–Q7, NZCV flags | Default; all code paths that don’t need callout/probe visibility
GUM_PROLOG_FULL | All registers matching GumArm64CpuContext | Required by Stalker.addCallProbe() and iterator.putCallout()

Prologue Inline Stub (written at each instrumented block)

// Written by gum_exec_ctx_write_prolog():
stp x19, lr, [sp, -(16 + GUM_RED_ZONE_SIZE)]!   // save X19 (scratch) and LR; skip red zone
bl  <last_prolog_minimal_or_full>

Red zone: 128-byte region below SP that a leaf function may use; prologue advances SP past it before touching the stack.

Epilogue Inline Stub

// Written by gum_exec_ctx_write_epilog():
bl  <last_epilog_minimal_or_full>
ldp x19, x20, [sp, (16 + GUM_RED_ZONE_SIZE)]     // restore X19 and X20 (post-adjust)

Reading Registers from Saved Context

// Emits code (does not read directly) to load source_register → target_register:
gum_exec_ctx_load_real_register_into(ctx, target_reg, source_reg, gc);

Frames (Side-Stack)

struct _GumExecFrame {
  gpointer real_address;   // original return address
  gpointer code_address;   // instrumented landing pad address
};

last_stack_push (pseudo-code)

void last_stack_push_helper(gpointer real_address, gpointer code_address) {
  GumExecFrame **x16 = &ctx->current_frame;
  GumExecFrame  *x17 = *x16;
  if (((guintptr) x17 & (page_size - 1)) != 0) {  // not page-aligned = not exhausted
    x17--;
    x17->real_address = real_address;
    x17->code_address = code_address;
    *x16 = x17;
  }
  // if exhausted: silently discard (fall back to slow path on return)
}

last_stack_pop_and_go (pseudo-code)

// Called by virtualized RET; x16 = return register value
void last_stack_pop_and_go_helper(gpointer x16) {
  GumExecFrame **x0 = &ctx->current_frame;
  GumExecFrame  *x1 = *x0;
  gpointer x17 = x1->real_address;
  if (x17 == x16) {                         // fast path: expected return
    x17 = x1->code_address;                 // go to instrumented landing pad
    x1++;
    *x0 = x1;                               // pop frame
    goto x17;
  } else {                                  // slow path: unexpected return
    *x0 = ctx->first_frame;                 // clear entire side-stack
    ctx->return_at = x16;
    minimal_prolog();
    gum_exec_ctx_replace_current_block_from_ret(ctx, ctx->return_at);
    minimal_epilog();
    goto ctx->resume_at;                    // branch to newly instrumented block
  }
}
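The two helpers above can be condensed into a runnable C model of the side-stack: frames grow downward through a page, a page-aligned frame pointer signals exhaustion, and a mismatched return address clears the whole stack. Names and layout are illustrative, not the real implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

typedef struct {
  void *real_address;   /* original return address */
  void *code_address;   /* instrumented landing pad */
} Frame;

typedef struct {
  Frame *current;       /* grows downward from first_frame */
  Frame *first_frame;   /* highest slot; initial value of current */
} FrameStack;

/* cf. last_stack_push: silently discard when the page is exhausted. */
void frame_push(FrameStack *s, void *real, void *code) {
  if (((uintptr_t)s->current & (PAGE_SIZE - 1)) == 0)
    return;             /* page-aligned: exhausted, rely on slow path */
  s->current--;
  s->current->real_address = real;
  s->current->code_address = code;
}

/* cf. last_stack_pop_and_go: returns the landing pad on the fast path,
 * or NULL (and clears the side-stack) on an unexpected return. */
void *frame_pop_matching(FrameStack *s, void *real) {
  if (s->current != s->first_frame && s->current->real_address == real) {
    void *code = s->current->code_address;
    s->current++;
    return code;
  }
  s->current = s->first_frame;   /* unexpected return: clear side-stack */
  return NULL;
}
```

The design choice worth noting: the side-stack is purely an optimization, so dropping frames (on exhaustion or mismatch) is always safe — the slow path re-enters Stalker and recovers.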

Transformer

// Default transformer — passes all instructions through unchanged:
static void gum_default_stalker_transformer_transform_block(
    GumStalkerTransformer *transformer,
    GumStalkerIterator    *iterator,
    GumStalkerOutput      *output)
{
  while (gum_stalker_iterator_next(iterator, NULL))
    gum_stalker_iterator_keep(iterator);
}

Custom Transformer (JavaScript)

Stalker.follow(threadId, {
  transform(iterator) {
    let instruction;
    while ((instruction = iterator.next()) !== null) {
      if (instruction.mnemonic === 'bl') {
        iterator.putCallout(onCall);   // insert a callout before this instruction
      }
      iterator.keep();                 // emit the instruction as-is
    }
  }
});

function onCall(context) {
  // context is a CpuContext — read/write registers
  console.log('call to', context.pc);
}

Callout Structure

typedef void (* GumStalkerCallout)(GumCpuContext *cpu_context, gpointer user_data);

struct _GumCalloutEntry {
  GumStalkerCallout  callout;
  gpointer           data;
  GDestroyNotify     data_destroy;
  gpointer           pc;
  GumExecCtx        *exec_context;
};

Virtualizing Branch/Return Instructions

Termination Conditions (EOB / EOI)

State | Meaning | Triggered by
EOB (End of Block) | Block ends here | Any branch, call, or return instruction
EOI (End of Input) | No valid instructions follow | Unconditional branch or return (not calls — the callee returns)
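The distinction reduces to a small classifier over instruction kinds. A sketch using an invented enum (not Capstone's ARM64_INS_* ids):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative instruction kinds (not Capstone ids). */
typedef enum { INSN_ADD, INSN_B, INSN_B_COND, INSN_BL, INSN_RET } InsnKind;

/* EOB: any transfer of control ends the block. */
bool ends_block(InsnKind k) {
  return k == INSN_B || k == INSN_B_COND || k == INSN_BL || k == INSN_RET;
}

/* EOI: only unconditional transfers guarantee nothing valid follows.
 * Calls do NOT set it — execution resumes right after the BL when the
 * callee returns; conditional branches fall through. */
bool ends_input(InsnKind k) {
  return k == INSN_B || k == INSN_RET;
}
```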

gum_exec_block_virtualize_branch_insn

Handles: unconditional branches (B, BR), conditional branches (B.cond, CBZ, CBNZ, TBZ, TBNZ), and call instructions (BL, BLR, BLRAA, BLRAAZ, BLRAB, BLRABZ).

Conditional branch output pattern:

INVERSE_CONDITION  is_false          ; e.g. CBZ → CBNZ
  jmp_transfer_code(target, cond_entry_gate)
is_false:
  jmp_transfer_code(fallthrough_addr, cond_entry_gate)

Call instruction handling:

  1. Emit call event (if configured)
  2. Check for registered call probes → emit probe call code if any
  3. If target in excluded range (immediate only): emit original call + jmp_transfer_code to re-enter Stalker at return address using excluded_call_imm gate
  4. If target in register: emit runtime check against excluded ranges via gum_exec_block_check_address_for_exclusion()
  5. Else: emit gum_exec_block_write_call_invoke_code():
    • Emit entry gate call to instrument callee
    • Call last_stack_push helper with real and instrumented return addresses
    • Emit landing pad (initially re-enters Stalker; may be backpatched to direct branch)
    • Branch to instrumented callee via exec_generated_code

gum_exec_block_virtualize_ret_insn

  1. Emit return event (if configured)
  2. Emit ret_transfer_code which loads return register into X16 and jumps to last_stack_pop_and_go

AArch64 Call Instructions (all update LR with return address)

BL, BLR, BLRAA, BLRAAZ, BLRAB, BLRABZ


Entry Gates

#define GUM_ENTRYGATE(name) gum_exec_ctx_replace_current_block_from_##name
#define GUM_DEFINE_ENTRYGATE(name)                                       \
  static gpointer GUM_THUNK GUM_ENTRYGATE(name)(                         \
      GumExecCtx *ctx, gpointer start_address) {                         \
    if (counters_enabled) total_##name##s++;                             \
    return gum_exec_ctx_replace_current_block_with(ctx, start_address);  \
  }

Defined entry gates:

Gate Name | Trigger
call_imm | Call to immediate address
call_reg | Call via register
post_call_invoke | Landing pad re-enter after call
excluded_call_imm | Call to excluded range (immediate)
excluded_call_reg | Call to excluded range (register)
ret | Return instruction (unexpected; slow path)
jmp_imm | Unconditional branch to immediate
jmp_reg | Unconditional branch via register
jmp_cond_cc | Conditional branch (B.cond)
jmp_cond_cbz | CBZ
jmp_cond_cbnz | CBNZ
jmp_cond_tbz | TBZ
jmp_cond_tbnz | TBNZ
jmp_continuation | Exhausted block continuation

Events

Event Types

Type | Constant | Description
Call | GUM_CALL | A call instruction was executed
Return | GUM_RET | A return instruction was executed
Execute | GUM_EXEC | A single instruction was executed
Block | GUM_BLOCK | A basic block was executed
Compile | GUM_COMPILE | A basic block was instrumented

Event Emitter Functions

Each event emitter function calls gum_exec_block_write_unfollow_check_code() to embed an unfollow check in the generated code.

Event Delivery
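Events are not delivered one at a time: they accumulate in a buffer and are handed to the sink in batches — which is why onReceive receives a raw blob to be decoded with Stalker.parse(). A toy model of that batching, with all names invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of batched event delivery. */
typedef struct {
  int   type;       /* call / ret / exec / block / compile */
  void *location;
  void *target;
} Event;

#define BATCH_CAPACITY 4

typedef void (*SinkFunc)(const Event *events, size_t count, void *user_data);

typedef struct {
  Event    batch[BATCH_CAPACITY];
  size_t   len;
  SinkFunc on_receive;        /* may be NULL in tests */
  void    *user_data;
  size_t   total_delivered;   /* bookkeeping for observation */
} EventQueue;

void event_queue_flush(EventQueue *q) {
  if (q->len == 0)
    return;
  if (q->on_receive != NULL)
    q->on_receive(q->batch, q->len, q->user_data);
  q->total_delivered += q->len;
  q->len = 0;
}

/* Append an event; flush automatically when the batch fills. */
void event_queue_emit(EventQueue *q, Event ev) {
  q->batch[q->len++] = ev;
  if (q->len == BATCH_CAPACITY)
    event_queue_flush(q);
}
```

Batching keeps the per-instruction cost of event emission down to a buffer write; the expensive sink callback runs only once per batch.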


Call Probes

const id = Stalker.addCallProbe(targetAddress, (args) => {
  // args[0], args[1], ... — function arguments
}, optionalData);

Stalker.removeCallProbe(id);

Backpatching (Optimization)

Deterministic branches can bypass Stalker entirely after first execution:

Branch Type | Optimization
Unconditional branch to immediate | Replace with direct branch to instrumented block
Conditional branch (both paths known) | Replace condition + direct branches to both instrumented targets
Indirect branch (BR X0) with stable target | Emit compare + direct branch if it matches; fall back to Stalker if not

Controlled by trustThreshold. Landing pads start as Stalker re-entry code; backpatched to direct branches once trust is established.
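The indirect-branch case behaves like an inline cache: once a target proves stable, the landing pad becomes "compare, then branch directly; otherwise re-enter Stalker". A toy model (all names invented; the placeholder instrumented copy is just a derived pointer):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of backpatching an indirect branch. */
typedef struct {
  void *expected_target;   /* real address seen when the patch was made */
  void *cached_code;       /* instrumented copy for that target */
  int   patched;
} IndirectPatch;

static size_t stalker_reentries = 0;

/* Stand-in for the entry-gate path that instruments `target`. */
static void *reenter_stalker(IndirectPatch *p, void *target) {
  stalker_reentries++;
  void *code = (void *)((uintptr_t)target + 1);  /* placeholder copy */
  p->expected_target = target;                   /* backpatch for next time */
  p->cached_code = code;
  p->patched = 1;
  return code;
}

void *dispatch_indirect(IndirectPatch *p, void *target) {
  if (p->patched && target == p->expected_target)
    return p->cached_code;              /* fast path: Stalker not involved */
  return reenter_stalker(p, target);    /* slow path */
}
```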


Unfollow

void gum_stalker_unfollow_me(GumStalker *self);
void gum_stalker_unfollow(GumStalker *self, GumThreadId thread_id);

Unfollow Current Thread

  1. Set ctx->state = GUM_EXEC_CTX_UNFOLLOW_PENDING
  2. Each event emission calls gum_exec_block_write_unfollow_check_code(), which embeds a runtime call to gum_exec_ctx_maybe_unfollow()
  3. gum_exec_ctx_maybe_unfollow() checks state; if pending and pending_calls == 0: calls gum_exec_ctx_unfollow() → sets resume_at, clears TLS context key, sets state to GUM_EXEC_CTX_DESTROY_PENDING
  4. Special case: if the next block is gum_stalker_unfollow_me itself, gum_exec_ctx_replace_current_block_with() returns the original uninstrumented address — the thread exits Stalker without further instrumentation
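Steps 1–3 form a small deferred state machine: the pending state only takes effect once no instrumented calls are in flight. A sketch with illustrative names mirroring the states above:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the deferred unfollow. */
typedef enum {
  CTX_ACTIVE,
  CTX_UNFOLLOW_PENDING,
  CTX_DESTROY_PENDING
} CtxState;

typedef struct {
  CtxState state;
  int      pending_calls;   /* instrumented calls not yet returned */
} ExecCtx;

void ctx_request_unfollow(ExecCtx *ctx) {
  ctx->state = CTX_UNFOLLOW_PENDING;
}

/* cf. gum_exec_ctx_maybe_unfollow(): returns true when the thread may
 * now leave Stalker; the context itself is reclaimed later. */
bool ctx_maybe_unfollow(ExecCtx *ctx) {
  if (ctx->state != CTX_UNFOLLOW_PENDING || ctx->pending_calls != 0)
    return false;
  ctx->state = CTX_DESTROY_PENDING;
  return true;
}
```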

Unfollow Another Thread

Freeze/Thaw

On systems without RWX page support (W^X enforcement), slab pages cannot be writable and executable at the same time: gum_stalker_thaw() makes a page writable before code is written into it, and gum_stalker_freeze() flips it back to executable before the generated code runs.


Miscellaneous

Exclusive Load/Store Handling

AArch64 exclusive load/store pairs (LDXR/STXR family) are used for atomic primitives (mutexes, semaphores). Inserting event instrumentation between them would break the exclusive monitor.

Solution: Track exclusive_load_offset in the iterator; suppress non-essential instrumentation for up to 4 instructions following an exclusive load.

// Exclusive load instructions reset the counter:
case ARM64_INS_LDAXR: case ARM64_INS_LDXR: /* ... */
  gc->exclusive_load_offset = 0;

// Exclusive store instructions clear the guard:
case ARM64_INS_STXR: case ARM64_INS_STLXR: /* ... */
  gc->exclusive_load_offset = GUM_INSTRUCTION_OFFSET_NONE;

// Instrumentation emitted only while outside the exclusive window:
if (gc->exclusive_load_offset == GUM_INSTRUCTION_OFFSET_NONE)
  gum_exec_block_write_exec_event_code(...);
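Pulled together, the guard behaves like a per-instruction gate. The sketch below is a runnable model with invented names; the exact point at which the window opens and closes relative to the load/store itself is an assumption of this sketch, not taken from the source:

```c
#include <assert.h>
#include <stdbool.h>

#define OFFSET_NONE      (-1)
#define EXCLUSIVE_WINDOW 4    /* max suppressed instructions after a load */

typedef struct {
  int exclusive_load_offset;  /* instructions since last exclusive load */
} GenCtx;

/* Call once per instruction at instrumentation time; returns true when
 * per-instruction event code may be emitted for it. */
bool may_instrument(GenCtx *gc, bool excl_load, bool excl_store) {
  bool allowed = (gc->exclusive_load_offset == OFFSET_NONE);
  if (excl_load) {
    gc->exclusive_load_offset = 0;             /* open the window */
  } else if (excl_store) {
    gc->exclusive_load_offset = OFFSET_NONE;   /* close the window */
  } else if (gc->exclusive_load_offset != OFFSET_NONE) {
    gc->exclusive_load_offset++;
    if (gc->exclusive_load_offset > EXCLUSIVE_WINDOW)
      gc->exclusive_load_offset = OFFSET_NONE; /* window expired */
  }
  return allowed;
}
```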

Exhausted Blocks

If fewer than GUM_EXEC_BLOCK_MIN_SIZE (1024) bytes remain in the slab tail, the iterator returns FALSE early. gum_exec_ctx_obtain_block_for() treats this as an implicit B <next_instruction> using the jmp_continuation entry gate — the block is split and the remainder becomes a new block in a new slab.

Syscall Virtualization (Linux/AArch64 only)

Handles SVC instruction for clone(2) syscall to prevent new threads inheriting Stalker instrumentation:

// Pseudo-code for generated instrumentation:
if x8 == __NR_clone:
  x0 = do_original_syscall()
  if x0 == 0:           // child thread
    goto original_instruction_address  // exit Stalker; run uninstrumented
  return x0             // parent thread: continue normally
else:
  return do_original_syscall()

AArch64 syscall convention: args in X0–X7, syscall number in X8, return value in X0.

Pointer Authentication (iOS ARMv8.3+)

PAC uses unused high bits of pointers to store cryptographic authentication codes:

pacia lr, sp      ; sign LR using SP and key → LR'
stp fp, lr, [sp, #-FRAME_SIZE]!
; ...
ldp fp, lr, [sp], #FRAME_SIZE
autia lr, sp      ; verify LR'; fault if corrupted
ret lr

When reading pointer registers (e.g., for indirect branch target or return address), Stalker must strip PAC before use:

gum_arm64_writer_put_xpaci_reg(cw, reg);   // strip PAC from reg

Applies to: determining branch/return destinations, all indirect pointer reads from application registers.
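A much-simplified model of what stripping computes, assuming a user-space pointer and a 48-bit virtual address space (the real XPACI instruction also honors TBI and the bit-55 address-space select, so this is illustrative only):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified PAC strip: with a 48-bit VA, the authentication code lives
 * in the pointer's high bits; clearing them recovers the raw address.
 * Assumes a user-space pointer (bit 55 == 0). */
uint64_t strip_pac(uint64_t ptr) {
  const uint64_t va_mask = (1ULL << 48) - 1;   /* keep bits [47:0] */
  return ptr & va_mask;
}
```

Without this step, comparing a signed return address against a cached real_address (or hashing it for the block cache) would never match.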


Performance Notes