docs/SAND.md
SAND introduces a new fuzzing workflow that can greatly reduce (or even eliminate) sanitizer overhead and combine different sanitizers in one fuzzing campaign.
The key idea behind SAND is that sanitizing every input wastes fuzzing power, because bug-triggering inputs are extremely rare (~1%), so most inputs are not worth sending through sanitizers. Therefore, if we can somehow "predict" whether an input could trigger a bug, based on what we define as its "execution pattern", we can save a great deal of fuzzing power by sanitizing only a small proportion of all inputs. That is exactly how SAND works.
For a normal fuzzing workflow, we have:
1. Build the target with ASAN enabled to get target_asan.
2. Fuzz it with afl-fuzz -i seeds -o out -- ./target_asan

For the SAND fuzzing workflow, this is slightly different:

1. Build the target without any sanitizer to get target_native, which we will call the "native binary". This is usually done by compiling your project with afl-clang-fast/lto(++) without AFL_USE_ASAN/UBSAN/MSAN.
2. Build the target with a sanitizer enabled to get target_asan. This step can be repeated for other sanitizers, such as MSAN and UBSAN. ASAN and UBSAN can also be built together into a single binary.
3. Fuzz with afl-fuzz -i seeds -o out -w ./target_asan -- ./target_native. Note that -w can be specified multiple times.

Then you get:

- throughput close to that of afl-fuzz -i seeds -o out -- ./target_native
- bug-finding capability comparable to that of afl-fuzz -i seeds -o out -- ./target_asan

Take test-instr.c as an example.
afl-clang-fast test-instr.c -o ./native
This is just like a normal build, except that it uses afl-clang-fast.
AFL_LLVM_ONLY_FSRV=1 AFL_USE_UBSAN=1 AFL_USE_ASAN=1 afl-clang-fast test-instr.c -o ./asanubsan
AFL_LLVM_ONLY_FSRV=1 AFL_USE_MSAN=1 afl-clang-fast test-instr.c -o ./msan
Note that AFL_LLVM_ONLY_FSRV=1 is crucial: it enables the forkserver but disables pc instrumentation. You may also reuse existing sanitizer-enabled binaries, i.e. binaries built without AFL_LLVM_ONLY_FSRV=1, at the cost of reduced speed.
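The sanitizer builds above all follow the same shape, so they can be generated from a list of sanitizer names. The sketch below is a dry run that only prints the build commands. `san_build_cmd` is a hypothetical helper name, not part of AFL++, and it assumes each sanitizer is switched on through an `AFL_USE_<NAME>=1` variable as in the commands above.

```shell
#!/bin/sh
# Print (do not run) the build command for one sanitizer-enabled binary.
# usage: san_build_cmd <source> <output> <SANITIZER> [SANITIZER ...]
san_build_cmd() {
  src=$1; outbin=$2
  shift 2
  # Forkserver only, no pc instrumentation, as required for SAND.
  vars="AFL_LLVM_ONLY_FSRV=1"
  for s in "$@"; do
    vars="$vars AFL_USE_${s}=1"   # e.g. AFL_USE_ASAN=1
  done
  echo "$vars afl-clang-fast $src -o $outbin"
}

san_build_cmd test-instr.c ./asanubsan UBSAN ASAN
san_build_cmd test-instr.c ./msan MSAN
```

The two calls print the same two build commands used in this example. Piping the output to `sh` would actually run them.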
mkdir /tmp/test
echo "a" > /tmp/test/a
AFL_NO_UI=1 AFL_SKIP_CPUFREQ=1 afl-fuzz -i /tmp/test -o /tmp/out -w ./asanubsan -w ./msan -- ./native @@
That's it!
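As a convenience, the final step can be wrapped in a small helper that assembles the afl-fuzz command line from any number of sanitizer-enabled binaries. This is only a sketch: `sand_cmd` is a hypothetical name, it assumes the target reads its input from a file passed as `@@`, and it prints the command instead of executing it.

```shell
#!/bin/sh
# Print the afl-fuzz command for a SAND campaign (dry run, nothing is executed).
# usage: sand_cmd <seed_dir> <out_dir> <native_binary> [sanitizer_binary ...]
sand_cmd() {
  seeds=$1; out=$2; native=$3
  shift 3
  cmd="afl-fuzz -i $seeds -o $out"
  for san in "$@"; do
    cmd="$cmd -w $san"   # one -w per sanitizer-enabled binary
  done
  echo "$cmd -- $native @@"
}

sand_cmd /tmp/test /tmp/out ./native ./asanubsan ./msan
```

With no sanitizer binaries given, the helper falls back to a plain afl-fuzz invocation on the native binary.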
By default, SAND uses the hash of the simplified coverage map as the execution pattern, i.e. if an input has a unique simplified coverage map, it is sent to the sanitizers for inspection. This should work for most cases. However, if you are strongly worried about missing bugs, try AFL_SAN_ABSTRACTION=unique_trace afl-fuzz ..., which selects every input that has a unique (unsimplified) coverage map. Note that this increases the number of sanitized inputs by 4-10 times, leading to much lower throughput. Alternatively, SAND also supports AFL_SAN_ABSTRACTION=coverage_increase, which is essentially equivalent to running the sanitizers on the corpus only and thus has almost zero overhead, at the cost of missing ~15% of bugs in our evaluation.
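To make the difference between the default pattern and unique_trace concrete, here is a toy model in shell. It is not SAND's actual implementation. It assumes a coverage map is a space-separated list of per-edge hit counts and that "simplifying" means collapsing each count to hit or not hit. The names `simplify`, `pattern_key`, `should_sanitize` and the `MODE` variable are hypothetical. The coverage_increase mode is left out for brevity.

```shell
#!/bin/sh
# Toy model of execution-pattern filtering; not SAND's actual code.
# A "coverage map" is a space-separated list of per-edge hit counts.

MODE=${MODE:-default}   # "default" (simplified map) or "unique_trace" (raw map)
SEEN=""                 # '|'-separated hashes of patterns already sanitized

# Collapse every hit count to 0 or 1 so only the set of covered edges matters.
simplify() {
  out=""
  for c in $1; do
    if [ "$c" -gt 0 ]; then out="$out 1"; else out="$out 0"; fi
  done
  echo "${out# }"
}

# Hash of the execution pattern for one coverage map.
pattern_key() {
  if [ "$MODE" = "unique_trace" ]; then
    printf '%s' "$1" | cksum
  else
    simplify "$1" | cksum
  fi
}

# Succeeds (exit status 0) only the first time a pattern is seen.
should_sanitize() {
  key=$(pattern_key "$1")
  case "|$SEEN|" in
    *"|$key|"*) return 1 ;;
  esac
  SEEN="$SEEN|$key"
  return 0
}

should_sanitize "3 0 1" && echo "sanitize"   # first time this edge set is seen
should_sanitize "7 0 2" || echo "skip"       # same edges, only the counts differ
```

In the default mode the second input is skipped because it covers the same edges as the first. With `MODE=unique_trace` both inputs would be sanitized, which illustrates why that mode sends many more inputs to the sanitizers.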
Though we just used ASAN as an example, SAND works best if you provide more sanitizers, for example, UBSAN and MSAN.
You can do this via afl-fuzz -i seeds -o out -w ./target_asan -w ./target_msan -w ./target_ubsan -- ./target_native. Don't worry about slow sanitizers like MSAN: SAND can still run very fast because only a small fraction of inputs is sanitized.
The execution patterns evaluated in our papers target the common bug classes that ASAN, MSAN and UBSAN catch. For other bug types, you will probably need to define new execution patterns and re-evaluate them.
If throughput is much lower than expected, this is generally because too many inputs are going through the sanitizers, for example because the target is unstable. You can check the stats in plot_file to confirm this, and try switching execution patterns as described above.