Starting Your First Fuzzing Run with AFL


2025-07-06 Update From: SLTechnology News&Howtos shulou


This article explains in detail how to start your first fuzzing run with AFL, in the hope of giving readers who want to get started a simpler and easier path.

I. Preface

Fuzz testing (fuzzing) is one of the most effective techniques for finding vulnerabilities, and in recent years it has become the method of choice for many security researchers. Easy-to-use tools such as AFL, LibFuzzer, and honggfuzz have appeared one after another and greatly lowered the barrier to entry. While learning vulnerability discovery, the students at Alpha Lab found the resources available online rather disorganized, which leaves beginners unsure where to start. This series collects and organizes the better blog posts, papers, and tools encountered along the way, shares some ideas and experience from the learning process, and fills in a few topics that are not covered elsewhere online.

Because the subject is broad, the material is split into a series of articles that focus only on AFL, a landmark tool. The series starts with basic usage and concepts, then covers the follow-up work after a fuzzing run, ways to speed up fuzzing, assorted tips, and an analysis of the source code.

II. A brief introduction to AFL

AFL (American Fuzzy Lop) is a coverage-guided fuzzing tool developed by security researcher Michał Zalewski (@lcamtuf). It records the code coverage of each input sample and adjusts the samples to increase coverage, which raises the probability of finding vulnerabilities. The workflow is roughly as follows:

① Instrument the program when compiling it from source, in order to record code coverage

② Select some input files and add them to the input queue as the initial test set

③ "Mutate" the files in the queue according to certain strategies

④ If a mutated file increases coverage, keep it and add it to the queue

⑤ Repeat the process in a loop, recording the files that trigger crashes along the way
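The loop described in steps ② to ⑤ can be sketched in a few lines of Python. This is a toy model, not AFL's implementation: the `target` function, its branch names, and the single-byte `mutate` strategy are all invented for illustration, and coverage is a plain set instead of AFL's shared-memory map.

```python
import random

def target(data: bytes):
    """Hypothetical program under test: returns (branches reached, crashed?)."""
    cov = {"start"}
    if data[:1] == b"F":
        cov.add("F")
        if data[1:2] == b"U":
            cov.add("FU")
            if data[2:3] == b"Z":
                cov.add("FUZ")
                return cov, True      # stands in for a crash
    return cov, False

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte (assumes a non-empty input)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seeds, iterations, rng_seed=0):
    rng = random.Random(rng_seed)
    queue = list(seeds)               # step 2: initial test set
    seen = set()
    for s in queue:
        seen |= target(s)[0]
    crashes = []
    for _ in range(iterations):       # step 5: keep looping
        child = mutate(rng.choice(queue), rng)   # step 3: mutate a queue entry
        cov, crashed = target(child)
        if crashed:
            crashes.append(child)     # record inputs that trigger the crash
        elif not cov <= seen:         # step 4: new coverage, keep the file
            seen |= cov
            queue.append(child)
    return queue, crashes, seen
```

Calling `fuzz([b"AAAA"], 200_000)` grows the queue only when a mutation reaches a branch that no earlier input reached, which is the feedback that steers the search toward the crashing prefix.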

III. Selecting and evaluating a target

Before you start fuzzing, you first need to choose a target. An AFL target is usually a program or library that accepts external input, typically from a file (a later article will also describe how to fuzz a network program).

1. What language is it written in?

AFL is mainly used to test C/C++ programs, so this is the first criterion when looking for software. (There are also AFL-based fuzzers for Java, such as kelinci and java-afl, but the author has not evaluated how effective they are.)

2. Is it open source?

AFL can instrument source code at compile time, and it can also instrument binaries through its QEMU mode. The former is considerably more efficient, and plenty of suitable open source projects can be found on GitHub.

3. Program version

Target the latest version of the software; otherwise it would be embarrassing to work hard to find a vulnerability only to learn that it was reported and fixed long ago.

4. Are there sample programs and test cases?

If the target ships ready-made example code, which is common for open source libraries, we can call the library through it without writing another program. If the target has test cases, building the corpus later will be easier.

5. Project size

Some programs are very large and divided into several modules. To fuzz efficiently, decide which part to fuzz before you begin. The source code reader Understand is recommended here: its treemap view shows the structure and scale of a project at a glance. In the treemap of the ImageMagick source code, for example, gray boxes represent folders and blue boxes represent files, with box size and color reflecting line count and file complexity, respectively.

6. Has the program had vulnerabilities before?

If a program has already had multiple vulnerabilities disclosed, it very likely still contains undiscovered ones. ImageMagick, for example, has new, hard-to-exploit vulnerabilities found almost every month, and a few serious, high-impact ones every year. In 2017 alone there were 357 CVEs! (Image source: medium.com)

IV. Building a corpus

AFL needs some initial input data (also called seed files) as the starting point for fuzzing. The seeds can even be meaningless data, because AFL can work out the structure of the file format through heuristics. lcamtuf gives an interesting example on his blog: when fuzzing djpeg with nothing but the string "hello" as input, AFL produced a large number of JPEG images out of thin air!

Powerful as AFL is, a high-quality corpus is still necessary if you want to fuzz faster. This section answers three questions: how to choose input files, where to find them, and how to trim the files you have found.

1. Choosing

(1) Valid input

Although invalid input sometimes triggers bugs and crashes, valid input finds more execution paths faster.

(2) As small as possible

Smaller files reduce testing and processing time and also save memory. AFL suggests keeping seeds under 1 KB, although you can weigh this against the program you are testing; the details are in perf_tips.txt in the AFL documentation.
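As a quick way to apply the size guideline, the following Python sketch lists the seeds that exceed a limit. The helper names `scan` and `oversized` are hypothetical, and the 1024-byte default simply mirrors the 1 KB suggestion above; adjust it for your target.

```python
from pathlib import Path

def scan(corpus_dir):
    """Map each regular file in corpus_dir to its size in bytes."""
    return {p.name: p.stat().st_size
            for p in Path(corpus_dir).iterdir() if p.is_file()}

def oversized(sizes, limit=1024):
    """Return the names of seeds larger than `limit` bytes, biggest first."""
    big = [name for name, size in sizes.items() if size > limit]
    return sorted(big, key=lambda name: -sizes[name])
```

For example, `oversized(scan("input_dir"))` returns the candidates worth shrinking or dropping before a run, where `input_dir` is whatever directory holds your seeds.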

2. Finding

Use the test cases provided by the project itself

The target program's bug submission page

Use a format converter to generate hard-to-find file formats from existing files

Some test cases are provided in the testcases directory of the AFL source code

Other open source corpora:

  afl generated image test sets
  fuzzer-test-suite
  libav samples
  ffmpeg samples
  fuzzdata
  moonshine

3. Pruning

Large corpora found online often contain a great many files, so they need to be trimmed down. This work is known as corpus distillation. AFL provides two tools for it: afl-cmin and afl-tmin.

(1) Remove input files that execute the same code: afl-cmin

The core idea of afl-cmin is to find the smallest subset of the corpus that has the same coverage as the whole corpus. For example, if several files cover the same code, the redundant ones are discarded. It is used as follows:

$ afl-cmin -i input_dir -o output_dir -- /path/to/tested/program [params]

More often the target reads its input from a file. In that case, put "@@" on the target's command line in place of the input file name, and the fuzzer substitutes the actual file path:

$ afl-cmin -i input_dir -o output_dir -- /path/to/tested/program [params] @@

In the following example, a corpus of 1253 PNG files is reduced to only 60.
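The idea of keeping a small subset with the same coverage can be illustrated with a greedy set-cover sketch in Python. This is only an approximation of the concept, not afl-cmin's actual algorithm; the function name, file names, and tuple ids are invented for the example.

```python
def minimize_corpus(coverage, sizes):
    """Greedy sketch: pick files until every covered tuple is still covered.

    coverage: dict mapping file name -> set of tuple ids that file reaches
    sizes:    dict mapping file name -> file size in bytes
    """
    uncovered = set().union(*coverage.values())
    kept = []
    while uncovered:
        # Prefer the file that adds the most uncovered tuples; on a tie,
        # prefer the smaller file.
        best = max(sorted(coverage),
                   key=lambda n: (len(coverage[n] & uncovered), -sizes[n]))
        kept.append(best)
        uncovered -= coverage[best]
    return kept
```

With `coverage = {"a.png": {1, 2, 3}, "b.png": {1, 2}, "c.png": {3, 4}, "d.png": {4}}`, the file `b.png` adds nothing that `a.png` does not already cover, so it is dropped from the result.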

(2) Reduce the size of a single input file: afl-tmin

With the overall corpus size reduced, each remaining file can be processed in more detail. How afl-tmin shrinks a file is not examined here and will be explained in a later article if there is an opportunity; this section only covers how to use it (which is simple enough, and interested readers can look up the details themselves).

afl-tmin has two working modes, instrumented mode and crash mode. The default is instrumented mode, as shown below:

$ afl-tmin -i input_file -o output_file -- /path/to/tested/program [params] @@

Crash mode is used when the input file crashes the target: afl-tmin then keeps any smaller variant that still crashes, regardless of the execution path. The -x parameter additionally makes afl-tmin treat a non-zero exit code as a crash:

$ afl-tmin -x -i input_file -o output_file -- /path/to/tested/program [params] @@
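To give a rough feel for what test case minimization involves, here is a Python sketch of block deletion: remove a chunk, keep the result only if the target still behaves the same, and retry with smaller chunks. It is a simplified illustration under stated assumptions, not afl-tmin's implementation; `same_behavior` is a hypothetical callback standing in for "same execution path" (instrumented mode) or "still crashes" (crash mode).

```python
def trim(data: bytes, same_behavior) -> bytes:
    """Delete blocks of decreasing size while same_behavior(candidate) holds."""
    block = max(len(data) // 2, 1)
    while block >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + block:]
            if candidate and same_behavior(candidate):
                data = candidate      # deletion was harmless, keep it
            else:
                pos += block          # this block is needed, move on
        block //= 2
    return data
```

For instance, with a predicate that only requires the bytes `IHDR` to be present, `trim(b"xxxxIHDRyyyyzzzz", lambda d: b"IHDR" in d)` strips everything around that marker.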

afl-tmin accepts a single input file, so a corpus can be processed in batch with a simple shell script. If the corpus contains many large files, this may take several days or even longer!

$ for i in *; do afl-tmin -i "$i" -o "tmin-$i" -- ~/path/to/tested/program [params] @@; done

The following picture shows how the size of the corpus changes after pruning in the two modes:

At this point you can run afl-cmin again, and you will find that it filters out some more files.

V. Building the program under test

As mentioned earlier, AFL instruments the program at compile time to record code coverage. To do this, compile the target with one of the two compiler wrappers that AFL provides. The process differs little from a normal build, so this section is only a brief demonstration.

1. afl-gcc mode

afl-gcc/afl-g++ are wrappers around gcc/g++ and are used in exactly the same way: the wrapper passes the arguments it receives on to the real compiler. We only need to set the compiler to afl-gcc/afl-g++ when building, as demonstrated below. If the program is not built with autoconf, simply change the compiler in the Makefile to afl-gcc/afl-g++.

$ ./configure CC="afl-gcc" CXX="afl-g++"

When fuzzing a shared library, you may need to write a small demo program that passes input to the library (in fact, most projects come with something similar). In that case you can set LD_LIBRARY_PATH so that the program loads the instrumented .so file, but the easiest approach is a static build, as follows:

$ ./configure --disable-shared CC="afl-gcc" CXX="afl-g++"

2. LLVM mode

LLVM mode gives faster fuzzing. Build it from the llvm_mode directory, then build the target program with afl-clang-fast, as shown below:

$ cd llvm_mode
$ apt-get install clang
$ export LLVM_CONFIG=`which llvm-config` && make && cd ..
$ ./configure --disable-shared CC="afl-clang-fast" CXX="afl-clang-fast++"

The author ran into errors when compiling with a newer version of clang, and the build succeeded after switching to clang-3.9. If the default clang on your system is too new, you can install several versions and switch between them with update-alternatives.

VI. Start fuzzing

afl-fuzz is the main program of AFL. It is not hard to use, but the ingenious design behind it is worth studying. Since this first article only aims to give readers a preliminary understanding, this section simply demonstrates how to run the fuzzer and skips the other details for now.

1. White-box testing

(1) Testing the instrumented program

After compiling the program, you can optionally use afl-showmap to trace the execution path of a single input. It prints the program's output and the number of captured tuples. Tuples carry the branch information used to measure coverage and will be explained in detail in the next article.

$ afl-showmap -m none -o /dev/null -- ./build/bin/imagew 23.bmp out.png
[*] Executing './build/bin/imagew'...
-- Program output begins --
23.bmp -> out.png
Processing: 13x32
-- Program output ends --
[+] Captured 1012 tuples in '/dev/null'.

Given different inputs, afl-showmap normally captures different tuples, which shows that the instrumentation works. The afl-cmin tool mentioned earlier uses afl-showmap to eliminate redundant input files.

$ afl-showmap -m none -o /dev/null -- ./build/bin/imagew 111.pgm out.png
[*] Executing './build/bin/imagew'...
-- Program output begins --
111.pgm -> out.png
Processing: 7x7
-- Program output ends --
[+] Captured 970 tuples in '/dev/null'.
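AFL's technical notes describe a tuple as a (previous block, current block) pair whose hit is recorded at index `cur ^ prev` in a 64 KB map, with `prev` then set to `cur >> 1`. A minimal Python sketch of that bookkeeping follows. The block ids are made up (AFL assigns random ids at compile time), and a dict of plain counts stands in for the real byte map.

```python
MAP_SIZE = 1 << 16  # 64 KB map

def trace(block_ids):
    """Return {map index: hit count} for a sequence of basic-block ids."""
    bitmap = {}
    prev = 0
    for cur in block_ids:
        idx = (cur ^ prev) % MAP_SIZE
        bitmap[idx] = bitmap.get(idx, 0) + 1
        prev = cur >> 1   # the shift keeps A->B distinct from B->A
    return bitmap
```

Because of the shift, `trace([0x1234, 0x5678])` and `trace([0x5678, 0x1234])` populate different map entries, so the two execution orders count as different tuples; this is why two inputs that visit the same blocks can still be reported with different tuple counts.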

(2) Running the fuzzer

Before running afl-fuzz, check whether the system is configured to send core dump notifications to an external program. If it is, the delay before crash information reaches the fuzzer increases, and crashes may be misreported as timeouts, so we temporarily modify the core_pattern file, as shown below:

$ echo core > /proc/sys/kernel/core_pattern
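Whether the change is needed can be checked up front. On Linux, a core_pattern that begins with "|" means core dumps are piped to an external program, which is the situation described above. The following Python sketch performs that check; the function names are hypothetical and the path assumes a Linux system.

```python
from pathlib import Path

def pipes_core_dumps(pattern: str) -> bool:
    """True if a core_pattern value hands core dumps to an external program."""
    return pattern.lstrip().startswith("|")

def core_pattern_needs_change(path="/proc/sys/kernel/core_pattern") -> bool:
    """Read the current setting and report whether it should be changed."""
    return pipes_core_dumps(Path(path).read_text())
```

A value such as `|/usr/share/apport/apport %p` would be flagged, while the plain `core` written by the command above would not.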

You can then run afl-fuzz, usually in the following form:

$ afl-fuzz -i testcase_dir -o findings_dir /path/to/program [params]

Or use "@@" in place of the input file name, and the fuzzer substitutes the actual file path:

$ afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@
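The "@@" placeholder can be pictured as a string replacement over the target's argument list. The Python sketch below is illustrative only: `substitute_input` is a hypothetical helper and the file path in the example is an assumption, since afl-fuzz performs the substitution internally.

```python
def substitute_input(argv, input_path):
    """Replace every "@@" in the target's arguments with the input file path."""
    return [arg.replace("@@", input_path) for arg in argv]
```

For example, `substitute_input(["/path/to/program", "@@"], "findings_dir/.cur_input")` yields the command line the target would actually be run with for that test case.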

If nothing is wrong, the fuzzer starts working. It first preprocesses the files in the input queue; then it prints warnings about the corpus, for example the following figure warns of a large file (14.1 KB) and too many input files; finally it starts the main fuzzing loop and displays the status screen.

(3) Using screen

A fuzzing run usually lasts a long time, and if the terminal running the afl-fuzz instance is closed unexpectedly, the fuzzing is interrupted too. Starting each instance in a screen session makes it easy to detach and reattach. The usage of screen is not covered further here; you can look it up yourself.

$ screen afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@

You can also name each session to make reattaching easier.

$ screen -S fuzzer1
$ afl-fuzz -i testcase_dir -o findings_dir /path/to/program [params] @@
[detached from 6999.fuzzer1]
$ screen -r fuzzer1

2. Black-box testing

Black-box testing, generally speaking, means testing a program without its source code, which is where AFL's QEMU mode comes in. It is enabled in much the same way as LLVM mode and also has to be compiled first. Note, however, that the version of QEMU used by AFL is quite old, and the function memfd_create() defined in util/memfd.c conflicts with the function of the same name in glibc. A patch for QEMU fixes this; after applying it, run the script build_qemu_support.sh to download and compile QEMU automatically.

$ apt-get install libini-config-dev libtool-bin automake bison libglib2.0-dev -y
$ cd qemu_mode
$ ./build_qemu_support.sh
$ cd .. && make install

From then on, you can fuzz in QEMU mode simply by adding the -Q option.

$ afl-fuzz -Q -i testcase_dir -o findings_dir /path/to/program [params] @@

3. Parallel testing

(1) Single-system parallel testing

If you have a multi-core machine, you can bind one afl-fuzz instance to each core; that is, you can run as many afl-fuzz instances as the machine has cores, which greatly improves overall execution speed. Everyone probably knows how many cores their machine has, but here is how to check:

$ cat /proc/cpuinfo | grep "cpu cores" | uniq
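Note that the "cpu cores" field reports the number of cores per physical package, which can differ from the number of logical CPUs when hyper-threading or multiple sockets are involved. The Python sketch below counts both from the text of /proc/cpuinfo. The function name is hypothetical, and the parsing assumes the usual x86 layout with "processor", "physical id", and "core id" fields.

```python
def count_cpus(cpuinfo: str):
    """Return (logical CPUs, physical cores) parsed from /proc/cpuinfo text."""
    logical = 0
    physical = set()
    for block in cpuinfo.strip().split("\n\n"):   # one block per logical CPU
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            logical += 1
            physical.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(physical)
```

On a Linux machine, `count_cpus(open("/proc/cpuinfo").read())` gives both numbers, which helps when deciding how many instances to start.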

To fuzz in parallel with afl-fuzz, the usual practice is to designate one master fuzzer with the -M parameter and multiple slave fuzzers with the -S parameter.

$ screen afl-fuzz -i testcases/ -o sync_dir/ -M fuzzer1 -- ./program
$ screen afl-fuzz -i testcases/ -o sync_dir/ -S fuzzer2 -- ./program
$ screen afl-fuzz -i testcases/ -o sync_dir/ -S fuzzer3 -- ./program

These two types of fuzzer use different strategies. The master performs deterministic testing, that is, it applies specific, non-random mutations to the input files, while the slaves perform purely random mutation.
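When the number of instances grows, the command lines for one master and several slaves can be generated instead of typed by hand. The Python sketch below only builds the argument lists; the function name and the `fuzzerN` naming scheme are assumptions for illustration, and launching the commands (for example inside screen sessions) is left to the reader.

```python
def parallel_commands(instances, in_dir, sync_dir, target_argv):
    """Build afl-fuzz argument lists: the first is -M, the rest are -S."""
    commands = []
    for i in range(1, instances + 1):
        role = "-M" if i == 1 else "-S"
        commands.append(["afl-fuzz", "-i", in_dir, "-o", sync_dir,
                         role, f"fuzzer{i}", "--", *target_argv])
    return commands
```

Calling `parallel_commands(3, "testcases/", "sync_dir/", ["./program"])` produces one `-M fuzzer1` command and two `-S` commands, all sharing the same output directory.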

Note that -o here specifies a synchronization directory. In parallel testing all fuzzers cooperate, passing new test cases to one another when one of them finds a new code path. As the following figure shows from Fuzzer0's point of view, it looks at the corpora of the other fuzzers and pulls in the test cases it is interested in by comparing ids.

The afl-whatsup tool shows the running status of each fuzzer and an overall summary. With the -s option it shows only the summary, whose figures are the totals across all fuzzers.
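Each fuzzer instance keeps a `fuzzer_stats` file of `key : value` lines in its directory under the synchronization directory, and a summary like the one described above can be approximated by adding up selected fields. The Python sketch below assumes the `execs_done` and `unique_crashes` field names used by AFL 2.x; the helper names are hypothetical, and this is not how afl-whatsup itself is implemented.

```python
from pathlib import Path

def parse_stats(text):
    """Parse the 'key : value' lines of one fuzzer_stats file into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats

def summarize(stats_texts, keys=("execs_done", "unique_crashes")):
    """Sum the chosen numeric fields across all fuzzers."""
    totals = dict.fromkeys(keys, 0)
    for text in stats_texts:
        stats = parse_stats(text)
        for key in keys:
            totals[key] += int(stats.get(key, 0))
    return totals

def read_sync_dir(sync_dir):
    """Collect the contents of every <sync_dir>/<fuzzer>/fuzzer_stats file."""
    return [f.read_text() for f in sorted(Path(sync_dir).glob("*/fuzzer_stats"))]
```

For example, `summarize(read_sync_dir("sync_dir/"))` returns the combined execution and crash counts for all instances sharing that directory.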

The afl-gotcpu tool shows the usage status of each core.

(2) Multi-system parallel testing

Multi-system parallelism works on the same principle as the single-system mechanism described above. You need a simple script that does two things: on each system, compress the files under queue in every fuzzer instance directory; then distribute the archives to the other machines over SSH and extract them there.

Let's look at an example. Suppose there are two machines with the following basic information:

Host       IP address      Instances
fuzzer1    172.21.5.101    2
fuzzer2    172.21.5.102    4

To synchronize data automatically, authentication needs to go through authorized_keys. To synchronize the input queue of each instance on fuzzer2 to fuzzer1, you can do the following:

#!/bin/sh

# All hosts to be synchronized
FUZZ_HOSTS='172.21.5.101 172.21.5.102'

# SSH user
FUZZ_USER=root

# Synchronization directory
SYNC_DIR='/root/syncdir'

# Synchronization interval (seconds)
SYNC_INTERVAL=$((30 * 60))

if [ "$AFL_ALLOW_TMP" = "" ]; then
  if [ "$PWD" = "/tmp" -o "$PWD" = "/var/tmp" ]; then
    echo "[-] Error: do not use shared /tmp or /var/tmp directories with this script." 1>&2
    exit 1
  fi
fi

rm -rf .sync_tmp 2>/dev/null
mkdir .sync_tmp || exit 1

while :; do

  # Package the data on every machine
  for host in $FUZZ_HOSTS; do
    echo "[*] Retrieving data from ${host}..."
    ssh -o 'passwordauthentication no' ${FUZZ_USER}@${host} \
      "cd '$SYNC_DIR' && tar -czf - SESSION*" > ".sync_tmp/${host}.tgz"
  done

  # Distribute the data
  for dst_host in $FUZZ_HOSTS; do
    echo "[*] Distributing data to ${dst_host}..."
    for src_host in $FUZZ_HOSTS; do
      test "$src_host" = "$dst_host" && continue
      echo "    Sending fuzzer data from ${src_host}..."
      # The source listing breaks off after the tar command. The input
      # redirect and the closing lines below are assumed from context.
      ssh -o 'passwordauthentication no' ${FUZZ_USER}@$dst_host \
        "cd '$SYNC_DIR' && tar -xkzf - &>/dev/null" < ".sync_tmp/${src_host}.tgz"
    done
  done

  sleep $SYNC_INTERVAL
done
