forked from innovimax/afl-1
Unofficial American Fuzzy Lop repo
tn3rt/afl
==================
american fuzzy lop
==================

1) The afl-fuzz approach
------------------------

American Fuzzy Lop is a brute-force fuzzer coupled with a genetic algorithm.
It uses a modified form of edge coverage to pick up subtle, local-scale
changes to program control flow.

Algorithm:

  1) Load user-supplied initial test cases into the queue,

  2) Take the next input file from the queue,

  3) Attempt to trim the test case to the smallest size that doesn't alter
     the measured behavior of the program,

  4) Repeatedly mutate the file using traditional fuzzing strategies,

  5) If the instrumentation records a new state transition, add the mutated
     output as a new entry in the queue,

  6) Go to 2.

The discovered test cases are periodically culled to eliminate ones that have
been obsoleted by newer, higher-coverage finds. The tests undergo several
other effort-minimization steps.

The tool creates a self-contained corpus of interesting test cases. These are
extremely useful for seeding other, labor- or resource-intensive testing
regimes - for example, for stress-testing browsers, office applications,
graphics suites, or closed-source tools.

The fuzzer is thoroughly tested to deliver out-of-the-box performance far
superior to blind fuzzing or coverage-only tools.

2) Instrumenting programs for use with AFL
------------------------------------------

When source code is available, instrumentation can be injected by a companion
tool that works as a drop-in replacement for gcc or clang in any standard
build process for third-party code.

The correct way to recompile the target program may vary depending on the
specifics of the build process, but a nearly-universal approach would be:

$ CC=/path/to/afl/afl-gcc ./configure
$ make clean all

For C++ programs, you'd also want to set CXX=/path/to/afl/afl-g++.

The clang wrappers (afl-clang and afl-clang++) can be used in the same way;
clang users may also leverage a higher-performance instrumentation mode, as
described in llvm_mode/README.llvm.

For testing libraries, a simple program needs to read data from stdin or from
a file and pass it to the library. Link this executable against a static
version of the instrumented library, or make sure that the correct .so file
is loaded at runtime (usually by setting LD_LIBRARY_PATH). The simplest
option is a static build:

$ CC=/path/to/afl/afl-gcc ./configure --disable-shared

Setting AFL_HARDEN=1 when calling 'make' will cause the CC wrapper to
automatically enable code hardening options that make it easier to detect
simple memory bugs. Libdislocator, a helper library included with AFL (see
libdislocator/README.dislocator), can help uncover heap corruption issues,
too.

3) Instrumenting binary-only apps
---------------------------------

When source code is *NOT* available, the fuzzer offers experimental support
for fast, on-the-fly instrumentation of black-box binaries. This is
accomplished with a version of QEMU running in the lesser-known "user space
emulation" mode.

The QEMU feature can be built with:

$ cd qemu_mode
$ ./build_qemu_support.sh

See qemu_mode/README.qemu for more.

The mode is approximately 2-5x slower and less conducive to parallelization
than compile-time instrumentation.

4) Choosing initial test cases
------------------------------

To operate correctly, the fuzzer requires one or more starting files that
contain good examples of normal input data. There are two basic rules:

  - Keep the files small. Under 1 kB is ideal. See perf_tips.txt for a
    discussion of why size matters.

  - Use multiple test cases only if they are functionally different from
    each other. Using fifty different vacation photos to fuzz an image
    library is useless.

Good examples of starting files are in the testcases/ subdirectory.

PS. If a large corpus of data is available for screening, consider using the
afl-cmin utility to find a subset of distinct files that exercise different
code paths.

5) Fuzzing binaries
-------------------

afl-fuzz carries out the fuzzing process. It requires a read-only directory
with initial test cases, a separate place to store its findings, and the path
to the target.

For targets that accept input from stdin, the syntax is:

$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program [...params...]

For targets that take input from a file, use '@@' to mark the location in the
target's command line where the input file name should be placed. The fuzzer
will substitute it:

$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@

Use the -f option to have the mutated data written to a specific file. This
is useful if the program expects a particular file extension.

Non-instrumented binaries can be fuzzed in QEMU mode (add -Q to the command
line) or in a traditional, blind-fuzzer mode (specify -n).

Use the -t and -m options to override the default timeout and memory limit.
Targets such as compilers and video decoders may need these adjusted.

See perf_tips.txt for optimization tips.

afl-fuzz starts with an array of deterministic fuzzing steps, which can take
several days but tend to produce neat test cases. Add the -d option for
quicker results.

6) Interpreting output
----------------------

See status_screen.txt for information on the UI stats and on monitoring the
process. If any UI elements are shown in red, consult that file!

Ctrl-C stops the fuzzing. Let the fuzzer complete at least one queue cycle,
which may take hours or days.

Three subdirectories are created in the output directory and updated in real
time:

  - queue/   - test cases for every distinctive execution path, plus the
               starting files given by the user. This is the synthesized
               corpus.

               Before using this corpus, shrink it with afl-cmin. The tool
               finds a smaller subset of files with equivalent edge coverage.

  - crashes/ - unique test cases that caused the tested program to receive a
               fatal signal.
               The entries are grouped by the received signal (e.g., SIGSEGV,
               SIGILL, SIGABRT).

  - hangs/   - unique test cases that cause the tested program to time out.
               The time limit before a run is classified as a hang is the
               larger of 1 second and the value of the -t parameter. The
               value can be fine-tuned by setting AFL_HANG_TMOUT, but this
               is rarely necessary.

Crashes and hangs are considered "unique" if the associated execution paths
involve state transitions not seen in previously-recorded faults. If a bug
can be reached in multiple ways, there will be some count inflation early on,
but this should quickly taper off.

File names for crashes and hangs are correlated with the parent, non-faulting
queue entries, which helps with debugging.

When a crash can't be reproduced, check that the memory limit is the same as
the one used by the tool. Try:

$ LIMIT_MB=50
$ ( ulimit -Sv $[LIMIT_MB << 10]; /path/to/tested_binary ... )

Change LIMIT_MB to match the -m parameter passed to afl-fuzz.

Any existing output directory can also be used to resume aborted jobs:

$ ./afl-fuzz -i- -o existing_output_dir [...etc...]

If gnuplot is installed, afl-plot can generate some pretty graphs for any
active fuzzing task. See http://lcamtuf.coredump.cx/afl/plot/.

7) Parallelized fuzzing
-----------------------

Each afl-fuzz instance takes up roughly one core. On multi-core systems,
parallelization is necessary to fully utilize the hardware. See
parallel_fuzzing.txt for fuzzing a common target on multiple cores or
networked machines.

The parallel fuzzing mode also offers a simple way of interfacing AFL with
other fuzzers and with symbolic or concolic execution engines.

8) Fuzzer dictionaries
----------------------

The afl-fuzz mutation engine is optimized for compact data formats like
images, multimedia, compressed data, regular expression syntax, or shell
scripts. It is less suited for verbose and redundant languages like HTML,
SQL, or JavaScript.

afl-fuzz can seed the fuzzing process with an optional dictionary of language
keywords, magic headers, and other tokens associated with the targeted data
type.
It uses this to reconstruct the underlying grammar in real time:

  http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html

To use this feature, create a dictionary in the format discussed in
dictionaries/README.dictionaries. Then point the fuzzer to it with the -x
option. (Several dictionaries are already provided in that subdirectory.)

There isn't a way to provide more structured descriptions of the underlying
syntax, but the fuzzer will likely figure out syntax structures based on the
instrumentation feedback alone. For an example, see:

  http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html

PS. When an explicit dictionary isn't given, afl-fuzz will try to extract
existing syntax tokens from the input corpus. This works for some types of
input, but isn't nearly as good as the -x mode.

If a dictionary is hard to find, let AFL run for a while, and then use the
token capture library. See libtokencap/README.tokencap.

9) Crash triage
---------------

The grouping of crashes produces a small data set that can be triaged
manually or with a simple GDB or Valgrind script. Every crash is traceable to
its parent non-crashing test case in the queue, making it easier to diagnose
faults.

Some fuzzing crashes can be difficult to evaluate for exploitability without
a lot of debugging and code analysis. For this, afl-fuzz has a "crash
exploration" mode, enabled with the -C flag.

In this mode, the fuzzer takes one or more crashing test cases as input and
uses its feedback-driven fuzzing strategies to enumerate all code paths that
can be reached in the program while keeping it in the crashing state.

Changes that do not affect the execution path or do not result in a crash are
rejected.

The output is a corpus of files that show the degree of control the attacker
has over the faulting address, or whether an initial out-of-bounds read is
possible.

Try afl-tmin for test case minimization:

$ ./afl-tmin -i test_case -o minimized_result -- /path/to/program [...]

The tool works with crashing and non-crashing test cases.
In the crash mode, it will accept instrumented and non-instrumented binaries.
In the non-crashing mode, the minimizer makes the file simpler without
altering the execution path.

The minimizer accepts the -m, -t, -f and @@ syntax in a manner compatible
with afl-fuzz.

The afl-analyze tool takes an input file, attempts to sequentially flip
bytes, and observes the target's behavior. It then color-codes the input
based on which sections appear to be critical. It can offer quick insights
into complex file formats. See technical_details.txt for more.

10) Going beyond crashes
------------------------

Fuzzing is a wonderful and underutilized technique for discovering
non-crashing design and implementation errors, too. Quite a few interesting
bugs have been found by modifying the target programs to call abort() when,
say:

  - Two bignum libraries produce different outputs when given the same
    fuzzer-generated input,

  - An image library produces different outputs when decoding the same image
    several times in a row,

  - A serialization library fails to produce stable outputs,

  - A compression library produces an output inconsistent with the input file
    when it compresses and decompresses a blob.

Implementing these sanity checks can take little time. If you are the
maintainer of a particular package, you can make this code conditional with
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION (a flag also shared with
libfuzzer) or #ifdef __AFL_COMPILER (this one is just for AFL).

11) Common-sense risks
----------------------

Keep in mind that:

  - Targeted programs may end up erratically grabbing gigabytes of memory or
    filling up disk space with junk files. You shouldn't be fuzzing on
    systems where the prospect of data loss is not an acceptable risk.

  - Fuzzing involves billions of reads and writes to the filesystem. On
    modern systems, this will usually be heavily cached, resulting in fairly
    modest "physical" I/O - but there are many factors that may alter this
    equation.
    Monitor for potential trouble with the iostat command:

    $ iostat -d 3 -x -k [...optional disk ID...]

12) Known limitations & areas for improvement
---------------------------------------------

  - AFL detects faults by checking for the first spawned process dying due to
    a signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers
    for these signals may need to have the relevant code commented out.
    Faults in child processes spawned by the fuzzed target may evade
    detection unless you manually add some code to catch that.

  - The fuzzer offers limited coverage if encryption, checksums,
    cryptographic signatures, or compression are used to completely wrap the
    data format to be tested.

    You can comment out the relevant checks (see
    experimental/libpng_no_checksum/). You can also write a postprocessor
    (see experimental/post_library/).

  - There are trade-offs with ASAN and 64-bit binaries. See
    notes_for_asan.txt.

  - There is no direct support for fuzzing network services, background
    daemons, or interactive apps that require UI interaction. Simple code
    changes might make them behave.

    Preeny may be an option, too: https://github.com/zardus/preeny

    Tips for modifying network-based services:
    https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop

  - AFL doesn't output human-readable coverage data. If you want to monitor
    coverage, use afl-cov from Michael Rash: https://github.com/mrash/afl-cov

  - See http://lcamtuf.coredump.cx/prep/ to report problems with afl.

Beyond this, see INSTALL for platform-specific tips.