Fuzzing
Fuzzing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a program. The program is then monitored for exceptions such as crashes or memory leaks.
Types of fuzzing
- Dumb Fuzzing (Black-Box):
  - Applies very simple modifications to legitimate data, or generates completely random data. It treats the program as a black box and is unaware of its internal structure.
  - It is easy to set up and finds low-hanging fruit (shallow bugs).
- White-Box Fuzzing:
  - Leverages program analysis, such as symbolic execution, to systematically increase coverage or reach critical program locations.
- Coverage-Guided Fuzzing (Grey-Box):
  - Leverages instrumentation to gain information about the program’s execution (code coverage or basic-block transitions).
  - The fuzzer prioritises inputs that explore new code paths.
  - Because this feedback steers mutation toward unexplored code, it is efficient for vulnerability detection.
- Smart Fuzzing:
  - Uses knowledge of the input structure to generate inputs.
  - This produces many inputs that the target application can parse, which improves code coverage and shortens fuzzing time.
  - This requires an input model.
- Mutation-Based Fuzzing:
  - Generates inputs by modifying an existing corpus of valid seed inputs.
- Generation-Based Fuzzing:
  - Generates inputs from scratch, often leveraging a predefined input model.
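To make the difference between random, mutation-based, and generation-based input creation concrete, here is a minimal Python sketch. It is only an illustration: the toy arithmetic grammar and the function names `random_input`, `mutate`, and `generate` are assumptions made for this example and do not come from any real fuzzer.

```python
import random

# Toy grammar for arithmetic expressions; a stand-in for a real input model.
GRAMMAR = {
    "<expr>": [["<term>"], ["<term>", "+", "<expr>"]],
    "<term>": [["<digit>"], ["(", "<expr>", ")"]],
    "<digit>": [[d] for d in "0123456789"],
}


def random_input(rng: random.Random, max_len: int = 16) -> bytes:
    """Dumb fuzzing: completely random bytes, no knowledge of the format."""
    return bytes(rng.randrange(256) for _ in range(rng.randrange(max_len + 1)))


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Mutation-based: apply one small random change to a valid seed."""
    data = bytearray(seed)
    op = rng.choice(["flip", "insert", "delete"])
    if op == "insert" or not data:
        # Insert a random byte (also the only option for an empty seed).
        data.insert(rng.randrange(len(data) + 1), rng.randrange(256))
    elif op == "flip":
        # Flip a single bit of one byte.
        data[rng.randrange(len(data))] ^= 1 << rng.randrange(8)
    else:
        # Delete one byte.
        del data[rng.randrange(len(data))]
    return bytes(data)


def generate(symbol: str, rng: random.Random, depth: int = 0, max_depth: int = 6) -> str:
    """Generation-based: expand the grammar from scratch."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol, emitted as-is
    options = GRAMMAR[symbol]
    # Past the depth limit, take the first (non-recursive) expansion so we terminate.
    expansion = options[0] if depth >= max_depth else rng.choice(options)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in expansion)


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_input(rng))           # random bytes, rarely a valid expression
    print(mutate(b"1+(2+3)", rng))     # a valid seed with one small change
    print(generate("<expr>", rng))     # always a well-formed expression
```

Random bytes almost never get past an input parser, mutated seeds stay close to valid inputs, and generated inputs are always well-formed. This is why smart and generation-based fuzzing need an input model.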
Why is Fuzzing Used?
Fuzzing is an important component of security testing and software development for several reasons:
- Vulnerability Detection:
  - It is used to find security vulnerabilities that could be exploited with malicious intent.
  - Some of these vulnerabilities are:
    - Memory corruption bugs in languages like C or C++:
      - Buffer overflows
      - Use after free
    - Code injection issues.
- Black-Box Testing:
  - Fuzzing allows researchers to identify flaws in software without having full access to the source code, using a black-box approach.
- Validation:
  - Fuzzing can be used to validate the findings of static analysis tools.
  - If static analysis reports a potential problem, fuzzing can generate an input that actually triggers that issue.
Corpus
A corpus is the set of input samples that the fuzzer uses to explore the target program. The samples are usually small, valid, and diverse inputs that help the fuzzer maximize code coverage.
A good corpus is crucial for code coverage because fuzzers such as AFL++ and libFuzzer mutate the files in the corpus to discover new paths, crashes, and behaviors.
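One way to keep a corpus small and diverse is to drop samples that add no new coverage. The sketch below is a toy illustration of that idea, similar in spirit to corpus minimization tools such as `afl-cmin`, but it is not how those tools are implemented. The `toy_coverage` function is a hypothetical stand-in for real instrumentation.

```python
def toy_coverage(data: bytes) -> frozenset:
    """Hypothetical stand-in for instrumentation: which 'features' an input exercises."""
    hit = {"entry"}
    if data.startswith(b"{"):
        hit.add("object")
    if data.startswith(b"["):
        hit.add("array")
    if b'"' in data:
        hit.add("string")
    return frozenset(hit)


def minimize_corpus(corpus, coverage_of):
    """Keep only samples that contribute coverage not seen in earlier samples."""
    kept, seen = [], set()
    # Visit small inputs first so they are preferred over larger equivalents.
    for sample in sorted(corpus, key=len):
        covered = coverage_of(sample)
        if not covered <= seen:  # sample reaches something new
            kept.append(sample)
            seen |= covered
    return kept


if __name__ == "__main__":
    corpus = [b'{"a":1}', b"{}", b"[]", b"[1,2]", b'{"b":2}']
    for sample in minimize_corpus(corpus, toy_coverage):
        print(sample)
```

Visiting the smallest samples first is a deliberate choice here: small inputs execute faster, so the fuzzer gets more executions per second from the same coverage.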
Fuzzing using AFL++
AFL++ is a coverage-based fuzzer. It keeps track of which areas of the binary are being executed.
Main coverage of AFL++
- PCGUARD = afl-clang-fast
  - This is a type of code coverage instrumentation mechanism implemented within LLVM for fuzzing purposes. It works by inserting additional instructions that record the program’s execution path.
- LTO = afl-clang-lto
  - This is an LLVM-based instrumentation technique used with the fuzzer AFL++ that utilizes Link-Time Optimization (LTO). LTO works during the linking phase, enabling whole-program optimization and precise instrumentation for fuzzing.
- GCC_PLUGIN = afl-gcc-fast
Which one to choose?
- afl-clang-fast: quicker to set up, but offers less sophisticated optimizations. Use it:
  - When you need fast compilation times.
  - When fuzzing smaller and simpler applications.
- afl-clang-lto: produces larger binaries but results in more effective fuzzing. Use it:
  - When you require whole-program optimization (large or complex applications).
  - When performance is critical.
How it works:
As a coverage-guided fuzzer, AFL++ instruments the target binary to gather information on how inputs affect code paths during execution. It uses a genetic algorithm to evolve its input set over time, mutating and recombining inputs that produce interesting results in order to explore as much of the code base as possible.
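The feedback loop described above can be sketched in a few lines of Python. This is a deliberately simplified toy, not AFL++'s actual algorithm: `toy_target` is a hypothetical program that reports which branches it executed, the only mutation is a single byte overwrite, and recombination (splicing) is left out.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns branch ids, raises on the bug."""
    covered = {0}
    if data[:1] == b"F":
        covered.add(1)
        if data[1:2] == b"U":
            covered.add(2)
            if data[2:3] == b"Z":
                raise RuntimeError("simulated crash")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte with a random value (input must be non-empty)."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def fuzz(target, seeds, rng, max_execs=200_000):
    """Return the first crashing input found, or None if the budget runs out.

    Assumes the seeds are non-empty and do not crash the target themselves.
    """
    queue = list(seeds)
    seen = set()
    for seed in queue:
        seen |= target(seed)  # baseline coverage of the initial corpus
    for _ in range(max_execs):
        child = mutate(rng.choice(queue), rng)
        try:
            covered = target(child)
        except RuntimeError:
            return child  # crash found
        if not covered <= seen:
            # New coverage: keep this input so later mutations build on it.
            seen |= covered
            queue.append(child)
    return None


if __name__ == "__main__":
    print(fuzz(toy_target, [b"AAA"], random.Random(1)))
```

Without the coverage check, the fuzzer would have to guess all three magic bytes at once. Because it keeps each input that gets one branch deeper, it can find them one at a time.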
AFL++ key features:
- Instrumentation:
  - Involves compiling the code with a specialized AFL++ compiler. This compiler inserts additional instructions into the program, allowing the fuzzer to monitor execution and record a coverage map.
  - It also enables the use of sanitizers (for example, AddressSanitizer), which help detect bugs such as out-of-bounds memory accesses that may not cause a crash on their own but could still lead to issues.
- Mutation:
  - The process of altering existing test inputs to create new ones that could trigger new code paths or reveal hidden vulnerabilities.
  - The mutated inputs are executed against the target program, and those that expand coverage or cause crashes are prioritized for further mutation.
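To give a feel for what a coverage map records, the sketch below imitates the scheme described in the classic AFL technical documentation: each basic block has an id, and every transition increments the map entry at `cur_location ^ prev_location`, after which `prev_location` is set to `cur_location >> 1`. The block ids here are made up, and AFL++'s PCGUARD and LTO modes assign and store ids differently, so treat this only as a conceptual model.

```python
MAP_SIZE = 1 << 16  # size of the toy coverage map


class CoverageMap:
    """Toy edge-coverage map in the style of classic AFL."""

    def __init__(self) -> None:
        self.hits = bytearray(MAP_SIZE)
        self.prev = 0

    def visit(self, block_id: int) -> None:
        """Record the transition from the previous block to block_id."""
        index = (block_id ^ self.prev) % MAP_SIZE
        self.hits[index] = (self.hits[index] + 1) & 0xFF  # 8-bit counter that wraps
        # The shift makes A->B and B->A map to different entries.
        self.prev = block_id >> 1

    def edges(self) -> set:
        """Indices of all map entries that were hit at least once."""
        return {i for i, count in enumerate(self.hits) if count}


def trace(block_ids) -> set:
    """Run a sequence of (hypothetical) block ids through a fresh map."""
    cov = CoverageMap()
    for block_id in block_ids:
        cov.visit(block_id)
    return cov.edges()


if __name__ == "__main__":
    A, B = 0x1234, 0x5678  # made-up block ids
    print(trace([A, B]) == trace([B, A]))  # direction matters, so the sets differ
```

A fuzzer built on such a map treats an input as interesting when it sets an entry that no earlier input has set.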
Link Time Optimization
Link Time Optimization allows compilers to perform whole program optimizations during the linking phase, leading to faster and smaller executables.
In the context of fuzzing, it enables specific instrumentation modes in some fuzzers. The benefits are:
- Faster binaries.
- More precise coverage information (edge coverage).
- Better optimization compatibility.
- More reliable feedback to the fuzzer (improves performance and path discovery).
White Box
- Build from source using afl-gcc-fast / afl-g++-fast or afl-clang-fast / afl-clang++-fast
  - e.g. CC=afl-gcc-fast CXX=afl-g++-fast ./configure --prefix="$HOME/install_folder/"
- Run the fuzzer
  - e.g. afl-fuzz -i ./input -o ./output -s 123 -- ./binary ./output_information
    - -i indicates the directory for input cases
    - -o indicates the directory where AFL++ will store its output (the queue of mutated inputs, crashes, and hangs)
    - -s indicates the static random seed to use
    - @@, when it appears in the target’s command line, is the placeholder that AFL++ will substitute with each input file name
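The `@@` placeholder can be pictured with a small Python sketch. It is a conceptual model only, not AFL++ source code: the helper name `build_argv` and the example file path are hypothetical. It reflects the documented behaviour that `@@` is replaced by the path of the current input file, and that the input is fed on standard input when no `@@` is given.

```python
def build_argv(target_cmd, input_path):
    """Return (argv, use_stdin) for one execution of the target.

    target_cmd is everything after `--` on the afl-fuzz command line.
    """
    uses_placeholder = any("@@" in arg for arg in target_cmd)
    # Replace the placeholder wherever it appears in an argument.
    argv = [arg.replace("@@", input_path) for arg in target_cmd]
    # Without a placeholder, the test case is delivered on stdin instead.
    return argv, not uses_placeholder


if __name__ == "__main__":
    # Hypothetical path; the fuzzer picks the real location of the test case.
    print(build_argv(["./binary", "@@"], "/tmp/current_input"))
    print(build_argv(["./binary"], "/tmp/current_input"))
```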