Payload Obfuscation for Red Teams
| Instructor: | Duncan Ogilvie |
|---|---|
| Duration: | 2 days |
| Format: | On-site training with lectures and guided exercises. |
| Price: | TBD |
| Registration: | training@ogilvie.pl |
Description
Payload obfuscation can move sensitive logic out of native instruction streams and into a virtual execution environment. This training teaches participants how VM-based obfuscation works, how to compile payload logic to RISC-V, and how to execute that code inside a compact interpreter embedded in a host process.
The course starts from first principles with a small custom VM. Participants reverse a bytecode program, identify opcode handlers, write bytecode by hand, inspect interpreter dispatch, and compare a simple switch-based VM with a direct-threaded variant. This gives participants a concrete model for why virtualization raises reverse-engineering cost and why writing bytecode manually does not scale.
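As a rough illustration of the switch-based dispatch described above, the following sketch interprets a tiny bytecode program that evaluates `a == 42 ? 1337 : 0`. The opcode names, their numbering, and the operand layout are hypothetical and chosen only for this example; the course's mini VM defines its own instruction set.

```python
# Hypothetical opcodes for illustration only; not the course's actual opcode table.
OP_LOADI, OP_ADD, OP_JNE, OP_JMP, OP_HALT = range(5)


def run(bytecode, regs):
    """Interpret bytecode with a switch-style dispatch loop and return register 0."""
    pc = 0
    while True:
        op = bytecode[pc]
        if op == OP_LOADI:      # LOADI rd, imm : regs[rd] = imm
            rd, imm = bytecode[pc + 1:pc + 3]
            regs[rd] = imm
            pc += 3
        elif op == OP_ADD:      # ADD rd, rs : regs[rd] += regs[rs]
            rd, rs = bytecode[pc + 1:pc + 3]
            regs[rd] += regs[rs]
            pc += 3
        elif op == OP_JNE:      # JNE ra, rb, target : jump if regs[ra] != regs[rb]
            ra, rb, target = bytecode[pc + 1:pc + 4]
            pc = target if regs[ra] != regs[rb] else pc + 4
        elif op == OP_JMP:      # JMP target
            pc = bytecode[pc + 1]
        elif op == OP_HALT:     # stop and return the result register
            return regs[0]
        else:
            raise ValueError(f"unknown opcode {op} at pc={pc}")


# r0 holds the input a; the numbers in the comments are bytecode offsets.
SELECT = [
    OP_LOADI, 1, 42,      # 0:  r1 = 42
    OP_JNE, 0, 1, 12,     # 3:  if r0 != r1 goto 12
    OP_LOADI, 0, 1337,    # 7:  r0 = 1337
    OP_JMP, 15,           # 10: goto 15
    OP_LOADI, 0, 0,       # 12: r0 = 0
    OP_HALT,              # 15: return r0
]


def select(a):
    """Run the SELECT program with r0 = a and r1 = 0."""
    return run(SELECT, [a, 0])
```

Even in this small program the jump targets are absolute offsets that have to be recomputed whenever an instruction is inserted, which hints at why hand-written bytecode does not scale.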
The second part introduces RISC-V as a practical VM instruction set. Participants learn the RV64 register model, common instructions, calling convention, position-independent shellcode constraints, linker scripts, ELF containers, raw binary extraction, tracing, and disassembly workflow. They compile small C payloads to rv64im, run them in riscvm, and debug failures with traces and instruction references.
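To make instruction reading concrete, here is a minimal sketch that decodes a 32-bit I-type word and prints an `addi` using ABI register names. The field layout follows the standard RISC-V base encoding; the function names are made up for this example, and only `addi` is handled.

```python
# ABI names for x0..x31 as defined by the standard RISC-V calling convention.
ABI_NAMES = [
    "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
    "s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5",
    "a6", "a7", "s2", "s3", "s4", "s5", "s6", "s7",
    "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6",
]


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode_i_type(word):
    """Split a 32-bit I-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": sign_extend((word >> 20) & 0xFFF, 12),
    }


def format_addi(word):
    """Render an addi instruction as text; other instructions are rejected."""
    f = decode_i_type(word)
    if f["opcode"] != 0x13 or f["funct3"] != 0:
        raise ValueError("not an addi instruction")
    return f"addi {ABI_NAMES[f['rd']]}, {ABI_NAMES[f['rs1']]}, {f['imm']}"
```

For example, `format_addi(0x00150513)` yields `addi a0, a0, 1`, and `format_addi(0xFF010113)` yields `addi sp, sp, -16`, a typical stack-frame allocation at the start of a function.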
The final part builds useful payloads and hardens the VM. Participants study the host/guest memory model, ecall-based syscalls, import resolution, host_call, the LLVM transpiler, the minimal runtime, relocation handling, payload packaging, opcode shuffling, instruction encryption, direct dispatch, C2 integration, and interpreter obfuscation tradeoffs.
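The idea of instruction encryption with a position-dependent key can be sketched as follows. This is an assumption-laden toy: the key-mixing function, the constants, and the names `pc_key`, `encrypt_payload`, and `fetch` are hypothetical and do not describe the scheme that `riscvm` actually implements.

```python
import struct

MASK32 = 0xFFFFFFFF


def pc_key(base_key, pc):
    """Derive a 32-bit key from the base key and the program counter (hypothetical mix)."""
    x = (base_key ^ (pc * 0x9E3779B1)) & MASK32
    x ^= x >> 15
    return x & MASK32


def encrypt_payload(code, base_key):
    """XOR every little-endian 32-bit instruction word with the key for its offset."""
    count = len(code) // 4
    words = struct.unpack(f"<{count}I", code[:count * 4])
    encrypted = [w ^ pc_key(base_key, i * 4) for i, w in enumerate(words)]
    return struct.pack(f"<{count}I", *encrypted)


def fetch(code, pc, base_key):
    """Decrypt a single instruction word at fetch time; the payload stays encrypted in memory."""
    (word,) = struct.unpack_from("<I", code, pc)
    return word ^ pc_key(base_key, pc)
```

Because the key depends on the offset, two identical plaintext instructions encrypt to different bytes, while a fetch at the correct program counter still recovers the original word. An analyst who dumps memory sees only ciphertext, but one who hooks the fetch routine sees every decrypted instruction.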
Teaching
The training is exercise-driven. Each lecture block introduces a mechanism that is applied immediately in a lab task. Participants work inside a prepared development environment and build a complete pipeline from C source code to obfuscated RISC-V payload bytes executed by a host VM.
The course goal is pipeline literacy. Participants learn how to inspect each stage, diagnose broken payloads, reason about VM feature mismatches, read the generated RISC-V, and evaluate which obfuscation layers increase analyst effort.
The material is designed for authorized red teams, security researchers, and reverse engineers who need to understand code virtualization from both the builder's and the analyst's perspective.
Learning Objectives
- Explain VM-based obfuscation, bytecode, VM contexts, opcode handlers, and interpreter dispatch.
- Reverse simple VM bytecode and translate it into C-like pseudocode.
- Write basic bytecode programs for arithmetic, conditionals, and loops.
- Understand why a well-supported ISA such as RISC-V is useful for payload virtualization.
- Read common RV64 instructions, registers, ABI names, branches, loads, stores, and calls.
- Compile freestanding C code to `rv64im` object code with Clang.
- Use linker scripts, map files, ELF containers, relocations, and `llvm-objcopy` to produce raw shellcode bytes.
- Debug RISC-V payloads with VM traces, Ghidra, and instruction references.
- Understand the shared host/guest memory model used by `riscvm`.
- Implement and use VM syscalls through the RISC-V `ecall` convention.
- Resolve host imports and call host functions from RISC-V with `resolve_import` and `host_call`.
- Understand how LLVM bitcode extraction and import maps feed the transpiler.
- Follow the transpiler pipeline from Windows API C code to RISC-V LLVM IR and shellcode.
- Understand the role of `crt0`, import initialization, relocation processing, minimal CRT functions, and payload startup.
- Build and run example payloads through a controlled VM and C2 demo.
- Apply VM hardening features such as opcode shuffling, instruction encryption, and direct dispatch.
- Understand interpreter-obfuscation options such as LLVM obfuscation passes, native rewriting, junk insertion, opaque predicates, and per-sample variation.
- Evaluate limitations such as missing host-to-guest callbacks, restricted C++ support, and the difficulty of translating existing x64 shellcode.
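The `ecall` objective in the list above can be pictured with a small host-side dispatcher. This sketch assumes the common RISC-V convention of passing the syscall number in `a7` and arguments and the return value in `a0`; the syscall numbers, the `Guest` class, and the use of a `bytearray` as stand-in guest memory are hypothetical and are not taken from `riscvm`.

```python
A0, A7 = 10, 17                 # x10 and x17 in the standard register file

SYS_EXIT = 10000                # hypothetical syscall numbers
SYS_PRINT_STRING = 10001


class Guest:
    """Minimal guest state: 32 registers, a memory buffer, and captured output."""

    def __init__(self, memory):
        self.regs = [0] * 32
        self.memory = memory
        self.output = []
        self.exit_code = None


def read_cstring(memory, addr):
    """Read a NUL-terminated ASCII string starting at addr."""
    end = memory.index(0, addr)
    return memory[addr:end].decode("ascii")


def handle_ecall(guest):
    """Dispatch on a7, read arguments from a0, and write the result back to a0."""
    number = guest.regs[A7]
    if number == SYS_EXIT:
        guest.exit_code = guest.regs[A0]
    elif number == SYS_PRINT_STRING:
        text = read_cstring(guest.memory, guest.regs[A0])
        guest.output.append(text)
        guest.regs[A0] = len(text)
    else:
        raise ValueError(f"unknown syscall {number}")
```

In an interpreter the `ecall` handler is the only point where guest code leaves the VM, which makes this boundary a natural place for tracing and for deciding which host functionality a payload may reach.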
Outline
- Day 1: VM Obfuscation and RISC-V Payloads
  - Environment setup
    - GitHub Codespaces onboarding
    - Repository fork, machine type selection, and toolchain verification
    - Project tour: `exercise_*`, `riscvm`, `payload`, `transpiler`, and `obfuscator`
    - Local Docker workflow for offline use after the training
  - VM-based obfuscation model
    - Native execution versus virtualized execution
    - Bytecode, VM contexts, virtual registers, program counters, and handlers
    - Reverse-engineering cost model for custom VMs
    - Non-goals: no turnkey advanced obfuscator and no opaque compiler-theory deep dive
  - Mini VM analysis
    - `minivm.cpp` structure
    - Active bytecode extraction
    - Opcode table recovery
    - Register and context layout
    - Labels, jumps, conditionals, and bytecode patching
    - Switch dispatch versus direct-threaded dispatch
  - Mini VM exercises
    - Recover the executed bytecode
    - Document all opcodes and their semantics
    - Translate the bytecode into C pseudocode
    - Write bytecode for `a + b`, `a * b`, `a - b`, and `a == 42 ? 1337 : 0`
    - Bonus: implement `fib(n)` and analyze the direct-threaded binary in a disassembler
  - RISC-V as a payload VM ISA
    - Why RISC-V fits this use case
    - `rv64im` scope and disabled compressed instructions
    - Register aliases: `zero`, `ra`, `sp`, temporaries, saved registers, and argument registers
    - Common instructions: `addi`, `mv`, `sw`, `lw`, `add`, `blt`, `jal`, `jalr`, and `ret`
    - Pseudo-instructions and reference documentation workflow
  - RISC-V shellcode build pipeline
    - Freestanding C payload structure
    - Clang `riscv64` target selection
    - `-march=rv64im` and `-mcmodel=medany`
    - Linker script layout for raw shellcode
    - ELF as a temporary container for symbols and relocations
    - `llvm-objdump` disassembly and `llvm-objcopy` raw binary extraction
    - Map files and symbol lookup
  - RISC-V shellcode exercises
    - Build and run a `hello` payload in the Linux VM build
    - Run with `--trace` and inspect the generated trace
    - Load the ELF in Ghidra and comment each instruction
    - Explain what happens when `_start` returns instead of calling `exit`
    - Complete a build script that automates compile, link, dump, and run steps
  - Host interaction model
    - Host process versus RISC-V guest
    - Shared address space and pointer handling
    - Code, data, heap, and stack layout
    - Why guest code cannot directly execute host instructions
    - VM exit and re-entry through `ecall`
  - VM syscall interface
    - `ecall` convention and syscall numbers
    - Argument and return registers
    - Debug print syscalls
    - Memory helper syscalls
    - `resolve_import` for module and function lookup
    - `host_call` for invoking host functions
    - Syscall tracing and debugging
  - Host interaction exercises
    - Recover available syscalls from `riscvm.cpp`
    - Implement a `print_string` syscall stub
    - Resolve and call `puts` from RISC-V code
    - Use `host_call` to pass arguments into a host function
    - Bonus: create a payload that reads a lab file and displays its contents
- Day 2: Transpilation, Runtime, and Hardening
  - From handwritten stubs to automated payload builds
    - Limits of writing VM payload code by hand
    - Payload source constraints
    - Windows API-oriented C payloads
    - Debug builds versus hardened builds
  - LLVM transpiler pipeline
    - Windows payload compilation with Clang/MinGW
    - Embedding and extracting LLVM bitcode
    - Import map generation from the PE import table
    - LLVM IR transformation from host imports to RISC-V VM runtime calls
    - Pointer-sized argument casting and `host_call` argument arrays
    - RISC-V LLVM IR emission
    - Object generation, linking, relocation extraction, and raw binary packaging
  - Minimal runtime and loader
    - `crt0` responsibilities
    - Relocation application at startup
    - Import resolution before `main`
    - Global constructor initialization
    - `main` invocation and VM `exit`
    - Minimal CRT functions for allocation, `new`/`delete`, strings, and output
    - Unsupported runtime features such as exceptions and RTTI-heavy C++
  - Payload exercises
    - Build `riscvm` for Windows with tracing enabled
    - Build the `payload` project
    - Run the `hello` payload
    - Run the message-box payload through Wine and NoVNC or a Windows host
    - Run the C2 test payload through the controlled demo server
    - Inspect the generated bitcode, import map, RISC-V object, map file, and final payload bytes
  - C2 integration patterns
    - Embedding the VM as a library
    - Loading payload bytes from disk, memory, or a network channel
    - HTTP POST demo server workflow
    - Polling versus push-style payload delivery
    - Custom syscall boundary design for a framework
    - Logging, tracing, and controlled lab execution
  - VM hardening overview
    - Feature flags and payload/VM compatibility checks
    - Layering obfuscation features safely
    - Debuggability tradeoffs when tracing is disabled
    - Static signatures against an unmodified interpreter
  - Instruction encryption
    - Whole-payload encryption versus decode-time instruction encryption
    - Position-dependent keys derived from the program counter
    - Fetch-time decryption inside `riscvm_fetch`
    - Feature metadata appended to protected payloads
    - What encryption hides and what memory dumping can still recover
  - Opcode shuffling
    - Primary opcode, `funct3`, and `funct7` remapping
    - `opcodes.json`, generated headers, and shuffled payload bytes
    - Keeping the interpreter and payload mapping synchronized
    - Breaking standard RISC-V disassemblers
    - Per-sample randomization and its effect on custom tooling
  - Direct dispatch and interpreter control flow
    - Switch-based dispatch recognition
    - Direct-threaded dispatch with computed targets
    - Handler-to-handler jumps
    - Performance and reverse-engineering impact
    - Compiler flags used to preserve the desired dispatch shape
  - Hardening exercises
    - Build `riscvm` with hardening enabled
    - Verify that encrypted payloads still execute
    - Generate a new opcode shuffling map
    - Compare traces and disassembly before and after hardening
    - Bonus: add a new lab payload to the build configuration
  - Interpreter obfuscation and signature resistance
    - Why a static VM interpreter is easy to fingerprint
    - Handler-level obfuscation goals
    - LLVM-based obfuscator options
    - Native rewriting with junk instructions and opaque predicates
    - Liveness checks and behavior-preserving rewrites
    - Environment keying and custom feature gates
  - Limitations and design tradeoffs
    - No automatic translation of existing x64 shellcode
    - No host-to-guest callbacks without additional stubs
    - Limited C++ runtime support
    - Host API calls remain observable behavior
    - VM size, speed, compatibility, and analysis-cost tradeoffs
    - Follow-up paths for custom syscalls, callbacks, stronger interpreter rewriting, and defensive deobfuscation tooling
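The opcode shuffling topic in the outline can be illustrated with a minimal sketch that permutes only the 7-bit primary opcode field of each instruction word. The major opcode values are the standard RV64IM ones; the seeded permutation, the dictionary format, and the function names are assumptions made for this example and do not reflect the layout of the course's `opcodes.json`. Remapping of `funct3` and `funct7` is omitted.

```python
import random

# Standard major opcodes: LOAD, MISC-MEM, OP-IMM, AUIPC, OP-IMM-32, STORE,
# OP, LUI, OP-32, BRANCH, JALR, JAL, SYSTEM.
OPCODES = [0x03, 0x0F, 0x13, 0x17, 0x1B, 0x23, 0x33, 0x37, 0x3B, 0x63, 0x67, 0x6F, 0x73]


def make_shuffle_map(seed):
    """Build a per-sample permutation of the primary opcodes from a seed."""
    shuffled = OPCODES[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(OPCODES, shuffled))


def invert(mapping):
    """Build the reverse mapping that the interpreter side would apply."""
    return {shuffled: original for original, shuffled in mapping.items()}


def remap_word(word, mapping):
    """Replace bits 6:0 of a 32-bit instruction word according to mapping."""
    return (word & 0xFFFFFF80) | mapping[word & 0x7F]


def shuffle_payload(words, mapping):
    """Apply the opcode mapping to every instruction word of a payload."""
    return [remap_word(w, mapping) for w in words]
```

A payload shuffled with `make_shuffle_map(seed)` only runs on an interpreter that applies `invert` of the same map, which is why the payload and interpreter mappings have to stay synchronized. A stock RISC-V disassembler will misread any word whose opcode was moved.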
Requirements and Recommendations
Prerequisites
Participants should be familiar with:
- C programming. This is required for the hands-on payload exercises.
- Basic reverse engineering concepts.
- Assembly at a modest level. Prior RISC-V experience is not required.
- Python basics for build scripts and helper tooling.
- Command-line workflows with CMake, Clang, and common LLVM tools.
Helpful but optional:
- LLVM IR familiarity.
- Windows API experience.
- Prior exposure to VMProtect, Themida, OLLVM, or similar obfuscation systems.
Workstation Requirements
Each participant needs their own workstation. The prepared environment requires:
- A browser.
- A free personal GitHub account.
- Access to GitHub Codespaces during the training.
The exercises can also be run after the training with Docker. A Windows VM or host is useful for follow-up testing, but the prepared Codespaces environment uses Wine and NoVNC for the workshop labs.
Classroom Requirements
The training is delivered on-site only. A dedicated classroom with a projector is required. The training uses a collaborative format with frequent questions, live troubleshooting, and shared exercise discussion.
Instructor
Duncan Ogilvie is the creator of x64dbg and co-author of RISC-Y Business: Raging against the reduced machine. He has professional experience in DRM, mobile security, reverse engineering, and binary tooling. The course materials focus on practical VM internals, transparent build pipelines, and the tradeoffs between obfuscation strength, debuggability, and analyst effort.