Chapter 3: Fuzzing
Fuzzing is an automated vulnerability discovery technique that has found thousands of critical security bugs in production software. This chapter covers the fundamentals of fuzzing, key tools, and methodologies for finding vulnerabilities.
3.1. 2.1 Fuzzing Fundamentals
What is Fuzzing?
Fuzzing is the process of feeding a program with malformed, unexpected, or random data in an attempt to provoke a crash or anomalous behavior. It is a form of black-box or gray-box testing that does not require access to the source code.
Types of Fuzzers:
- Dumb Fuzzers: Generate random inputs without knowledge of the input format. They are simple but inefficient.
- Smart Fuzzers: Have knowledge of the input format and generate inputs that are more likely to cause errors.
- Coverage-Guided Fuzzers: (The most common today) Use instrumentation to track which parts of the code are executed with a given input. Then, they mutate the inputs to explore new code paths.
3.2. 2.2 AFL++ and Coverage-Guided Fuzzing
AFL++ is a state-of-the-art coverage-guided fuzzer that has found a large number of vulnerabilities in real-world software.
How it Works:
- Instrumentation: The source code is compiled with special instrumentation that records code coverage.
- Initial Corpus: The fuzzer starts with a set of valid test inputs (the corpus).
- Mutation: AFL++ mutates the corpus inputs in various ways (bit flipping, arithmetic, etc.).
- Execution: The instrumented program is run with the mutated input.
- Analysis: If the mutated input explores a new code path, it is added to the corpus. If it causes a crash, it is saved for analysis.
3.3. 2.3 FuzzTest and In-Process Fuzzing
FuzzTest is an in-process fuzzing framework from Google. Unlike AFL++, which runs the program in a separate process for each input, FuzzTest runs the fuzzer within the same process as the code being tested. This can be much faster for certain types of applications.
3.4. 2.4 Honggfuzz and Protocol Fuzzing
Honggfuzz is another popular coverage-guided fuzzer. It is known for its performance and advanced features, such as persistent fuzzing and network protocol fuzzing.
3.5. 2.5 Syzkaller and Kernel Fuzzing
Syzkaller is a specialized fuzzer for finding vulnerabilities in operating system kernels. It uses a system call description language to generate programs that exercise the kernel interfaces. Syzkaller has been extremely successful in finding vulnerabilities in the Linux kernel.
3.6. 2.6 Practical AFL++ Configuration
Step-by-Step Installation
# Install build dependencies
sudo apt update
sudo apt install -y build-essential gcc-13-plugin-dev cpio python3-dev
libcapstone-dev pkg-config libglib2.0-dev libpixman-1-dev
automake autoconf python3-pip ninja-build cmake git wget meson
# Install LLVM 19 (check for the latest version at https://apt.llvm.org/)
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 19 all
# Verify LLVM installation
clang-19 --version
llvm-config-19 --version
# Install Rust (required for some AFL++ components)
curl --proto '=https' --tlsv1.2 -sSf "https://sh.rustup.rs" | sh
source ~/.cargo/env
# Compile and install AFL++
mkdir -p ~/soft && cd ~/soft
git clone --depth 1 https://github.com/AFLplusplus/AFLplusplus.git
cd AFLplusplus
make distrib
sudo make install
# Verify installation
which afl-fuzz
afl-fuzz --version
Target Compilation with Instrumentation
# Compile C/C++ program with AFL++ instrumentation
CC=/usr/local/bin/afl-clang-fast
CXX=/usr/local/bin/afl-clang-fast++
cmake ..
make -j$(nproc)
# Enable sanitizers for better bug detection
export AFL_USE_ASAN=1
export AFL_USE_UBSAN=1
export ASAN_OPTIONS="detect_leaks=1:abort_on_error=1:symbolize=1"
Fuzzer Execution
# Configure the system for optimal fuzzing
echo core | sudo tee /proc/sys/kernel/core_pattern
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# Create seed corpus
mkdir -p seeds
for i in {0..4}; do
dd if=/dev/urandom of=seeds/seed_$i bs=64 count=10 2>/dev/null
done
# Run the fuzzer
afl-fuzz -i seeds/ -o findings/ -m none -d -- ./target_binary @@
# Parallel fuzzing (multiple instances)
# Terminal 1: Master instance
afl-fuzz -i seeds/ -o findings/ -M Master -- ./target @@
# Terminal 2+: Slave instances
afl-fuzz -i seeds/ -o findings/ -S Slave1 -- ./target @@
afl-fuzz -i seeds/ -o findings/ -S Slave2 -- ./target @@
# Check status
afl-whatsup findings/
3.7. 2.7 Crash Analysis and Exploitability Assessment
Crash analysis is the process of determining whether a crash discovered by fuzzing represents an exploitable vulnerability. This section covers the tools and methodologies for systematic crash triage.
Decision Tree for Crash Analysis
CRASH RECEIVED
│
▼
┌───────────────────────┐
│ Source code │
│ available? │
└───────────────────────┘
│ │
Yes No
│ │
▼ ▼
┌─────────────────────┐ ┌─────────────────────┐
│ Recompile with │ │ Use debugger │
│ ASAN + UBSAN │ │ (GDB/WinDbg) │
└─────────────────────┘ └─────────────────────┘
│ │
▼ ▼
┌─────────────────────┐ ┌─────────────────────┐
│ Execute crash │ │ Analyze registers │
│ Get report │ │ and memory │
└─────────────────────┘ └─────────────────────┘
│ │
└────────┬───────────┘
▼
┌───────────────────────────┐
│ Classify vulnerability │
│ with CASR │
└───────────────────────────┘
3.7.1. 2.7.1 Case Study: Heap Buffer Overflow Analysis
Scenario: Fuzzing discovered a crash in an image parser. Let's analyze step by step.
Vulnerable Code:
// vuln_parser.c - Vulnerable image parser
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
void build_huffman_table(uint8_t *input, size_t size) {
if (size < 8) return;
uint32_t table_size = *(uint32_t*)input;
uint8_t *codes = input + 4;
uint8_t *table = malloc(256);
memcpy(table, codes, table_size);
printf("Built Huffman table with %u codes\n", table_size);
free(table);
}
ASAN Output:
==37160==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x511000000140 at pc 0x56d6a37d0f62 bp 0x7ffd9f024440 sp 0x7ffd9f023c00
WRITE of size 512 at 0x511000000140 thread T0
Interpretation:
| Field | Value | Meaning |
|---|---|---|
| Bug Type | heap-buffer-overflow | Heap overflow |
| Operation | WRITE of size 512 | Writing 512 bytes |
| Location | vuln_parser.c:16 | Bug line |
| Allocation | 256-byte buffer | Buffer allocated |
| Overflow | 512 - 256 = 256 | Overflow amount |
Assessment: EXPLOITABLE - Attacker controls both overflow size and data.
3.7.2. 2.7.2 Case Study: Use-After-Free Analysis
Vulnerable Code:
typedef struct {
char *name;
void (*process)(void);
} Handler;
Handler *handler = NULL;
void unregister_handler(void) {
if (handler) {
free(handler);
// BUG: Missing handler = NULL
}
}
void call_handler(void) {
if (handler) {
handler->process(); // UAF!
}
}
Exploitation Strategy: 1. Free handler 2. Heap grooming: allocate objects of same size 3. Reclaim freed memory with controlled data 4. Trigger UAF → code execution
Assessment: EXPLOITABLE
3.7.3. 2.7.3 Case Study: Integer Overflow → Heap Corruption
Vulnerable Code:
void process_image(uint32_t width, uint32_t height, uint8_t *data) {
size_t buffer_size = width * height * 4; // overflow!
uint8_t *buffer = malloc(buffer_size);
for (size_t i = 0; i < width * height; i++) {
buffer[i] = data[i]; // massive overflow
}
}
Chain: Integer overflow → malloc(0) → loop with large bounds → heap corruption
Assessment: EXPLOITABLE
3.8. 2.8 Developing Fuzzing Harnesses
3.8.1. 2.8.1 Example: JSON Parser Harness
// fuzz_json.c
#include <json-c/json.h>
#include <stdint.h>
#include <stddef.h>
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
const char *data1 = (const char *)data;
json_tokener *tok = json_tokener_new();
json_object *obj = json_tokener_parse_ex(tok, data1, size);
if (obj) {
json_object_to_json_string_ext(obj, JSON_C_TO_STRING_PRETTY);
json_object_put(obj);
}
json_tokener_free(tok);
return 0;
}
Compilation:
clang-19 -g -fsanitize=address,fuzzer -I./json-c fuzz_json.c json-c/libjson-c.a -o fuzz_json
./fuzz_json corpus/ -max_total_time=300
3.8.2. 2.8.2 Harness Design Principles
| Principle | Description | Impact |
|---|---|---|
| In-Process Execution | LLVMFuzzerTestOneInput without fork | 10-100x faster |
| Direct API Target | Call core functions | Avoids arg parsing |
| Coverage Maximization | Exercise multiple paths | Finds more bugs |
| Proper Cleanup | Free memory | Prevents OOM |
| Sanitizer Compatible | ASAN/UBSAN | Better detection |
3.8.3. Chapter 2 Conclusions
-
Fuzzing finds real vulnerabilities - Not just theoretical crashes.
-
Coverage-guided fuzzing is powerful - AFL++, Honggfuzz, FuzzTest.
-
Sanitizers are essential - ASAN, UBSAN detect subtle bugs.
-
Time matters - Many bugs require hours/days.
-
Corpus quality affects results - Valid inputs reach deep code.
-
Parsers are primary targets - Image, protocol, file format parsers.
Discussion Questions:
-
Why is in-process faster than file-based wrappers?
-
How does corpus quality affect fuzzer penetration?
- What are the risks of over-mocking in harnesses?
- How to determine if fuzzing has reached diminishing returns?