This article explains Fuzz result analysis and code coverage in detail. I hope you will have a better understanding of the relevant topics after reading it.
I. Preface
The following describes when to finish the fuzzing process and what needs to be done after testing is completed, and begins to introduce some AFL-related principles step by step. These are the main issues discussed in this article:
1. When to finish the fuzzing work
2. What files afl-fuzz generates
3. How to verify and classify the generated crashes
4. How to evaluate the results of fuzzing
5. Code coverage and related concepts
6. How AFL records code coverage
II. Working status of Fuzzer
Because afl-fuzz never stops on its own, when to stop testing is usually decided from the status afl-fuzz reports. In addition to the previously mentioned methods of viewing afl-fuzz status through the status window and afl-whatsup, here are a few more.
1. afl-stats
afl-stats is one of the tools in the afl-utils collection (this collection contains other useful programs, which will be described later); its output is similar to that of afl-whatsup.
Before using it, you need a configuration file to set the output directory of each afl-fuzz instance:
{"fuzz_dirs": ["/root/syncdir/SESSION000", "/root/syncdir/SESSION001", ... "/root/syncdir/SESSION00x"]}
Then specify the configuration file to run:
$ afl-stats -c afl-stats.conf
[SESSION000 on fuzzer1]
 Alive: 1 Execs: 64m Speed: 0.3x Pend: 6588 Crashes: 101
[SESSION001 on fuzzer1]
 Alive: 1 Execs: 105m Speed: 576.6x Pend: 417/0 Crashes: 291
...
2. Custom afl-whatsup
afl-whatsup displays status by reading the fuzzer_stats file in each afl-fuzz output directory, but every check has to be run manually, which is tedious. You can modify it to show fuzzer status in real time: the basic idea is simply to wrap the existing code in a loop, with whatever adjustments suit your preferences.
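As an illustration of that idea, here is a minimal Python sketch (my own, not part of afl-utils) that parses the fuzzer_stats key/value format and polls each instance's output directory in a loop:

```python
import time
from pathlib import Path

def parse_fuzzer_stats(text):
    """Parse AFL's fuzzer_stats format: one 'key : value' pair per line."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats

def watch(sync_dir, interval=5, rounds=1):
    """Print a one-line summary per fuzzer instance, `rounds` times."""
    for i in range(rounds):
        for stats_file in sorted(Path(sync_dir).glob("*/fuzzer_stats")):
            s = parse_fuzzer_stats(stats_file.read_text())
            print(f"{stats_file.parent.name}: "
                  f"execs={s.get('execs_done', '?')} "
                  f"paths={s.get('paths_total', '?')} "
                  f"crashes={s.get('unique_crashes', '?')}")
        if i + 1 < rounds:
            time.sleep(interval)

# Demo on a stats snippet in the same format afl-fuzz writes:
sample = "execs_done        : 64000000\npaths_total       : 1432\nunique_crashes    : 101"
print(parse_fuzzer_stats(sample)["unique_crashes"])  # 101
```

Pointing watch() at the sync directory of a parallel run gives a crude live dashboard without touching the afl-whatsup script itself.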
3. afl-plot
All of the above are command-line tools; if you want more intuitive results, you can use afl-plot to draw trend graphs of the various state metrics.
# install the gnuplot dependency
$ apt-get install gnuplot
$ afl-plot afl_state_dir graph_output_dir
Taking a libtiff test as an example, enter the afl-plot output directory and open index.html to see the following three graphs:
First is the change in path coverage: when the number of pending favs drops to zero and the total paths count has essentially stopped growing, it is very unlikely that the fuzzer will discover anything new.
Then there are the changes in crashes and timeouts.
Finally, there is the change in execution speed. Note that if execution gets slower and slower over time, one possibility is that the fuzzer is running out of shared resources.
4. Pythia
In the course of researching this material, the author also came across pythia, an extension of AFL. Although I cannot say how effective it is, it is worth mentioning here. Its distinguishing feature is that it can estimate the probability of discovering new crashes and paths; compared with the original AFL, its interface adds the following fields:
Correctness: while no crash has been found, an estimate of the probability of finding an input that causes a crash.
Fuzzability: indicates how difficult it is to find new paths in the program. A higher number means the program is easier to fuzz.
Current paths: the number of paths found so far.
Path coverage: path coverage.
III. When to End Testing
The purpose of checking afl-fuzz's working status is to provide a basis for deciding when to stop testing; testing can usually be stopped when the following conditions are met.
(1) The "cycles done" field in the status window has turned green. This field's color can serve as a reference for when to stop: as the number of cycles grows, it changes gradually from magenta to yellow, then blue, then green. Once it turns green, continuing to fuzz is unlikely to find anything new, so you can stop afl-fuzz with Ctrl-C.
(2) It has been a long time since the last new path (or crash) was discovered. Exactly how long is a judgment call; if it has been a week or more, for example, most people will have run out of patience.
(3) The target program's code has been almost completely covered by the test cases. This is rare, but should be possible for some small programs; how to calculate coverage is described below.
(4) Among the metrics provided by pythia mentioned above, once path coverage reaches 99% (usually unlikely), you can stop fuzzing if you do not expect to find more crashes, since many crashes may share the same root cause. Another signal is the correctness value reaching 1e-08, which, according to the pythia developers, means roughly 100 million executions are expected between the last discovery of a path/unique crash and the next one; this can also be used as a stopping criterion.
IV. Output Results
There are many files in the afl-fuzz output directory, and sometimes you may need some of them when writing auxiliary tools. Take the sync directory used when multiple fuzzing instances run in parallel as an example:
$ tree -L 3
.
├── fuzzer1
│   ├── crashes
│   │   ├── id:000000,sig:06,src:000019+000074,op:splice,rep:2
│   │   ├── ...
│   │   ├── id:000002,sig:06,src:000038+000125,op:splice,rep:4
│   │   └── README.txt
│   ├── fuzz_bitmap
│   ├── fuzzer_stats
│   ├── hangs
│   │   └── id:000000,src:000007,op:flip1,pos:55595
│   ├── plot_data
│   └── queue
│       ├── id:000000,orig:1.png
│       ├── ...
│       └── id:000101,sync:fuzzer10,src:000102
└── fuzzer2
    ├── crashes
    ├── ...
queue: stores all test cases with unique execution paths.
crashes: unique test cases that cause the target to receive a fatal signal and crash.
crashes/README.txt: saves the command-line parameters used to run the target on these crash files.
hangs: unique test cases that cause the target to time out.
fuzzer_stats: the running status of afl-fuzz.
plot_data: data used by afl-plot for drawing.
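As a sketch of how these files can be consumed programmatically, the Python below reads the CSV-style plot_data format to check whether the total path count has plateaued; the sample rows are made up for illustration, but the column layout matches the header afl-fuzz writes:

```python
import csv
import io

# hypothetical plot_data rows in afl-fuzz's CSV format
PLOT_DATA = """\
# unix_time, cycles_done, cur_path, paths_total, pending_total, pending_favs, map_size, unique_crashes, unique_hangs, max_depth, execs_per_sec
1560000000, 0, 12, 150, 140, 30, 3.10%, 0, 0, 2, 650.1
1560003600, 1, 97, 420, 260, 12, 4.05%, 2, 0, 4, 612.4
1560007200, 3, 401, 423, 80, 0, 4.06%, 5, 1, 5, 598.9
"""

def paths_over_time(text):
    """Extract (unix_time, paths_total) pairs from a plot_data file."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the commented header line
    return [(int(row[0]), int(row[3])) for row in reader]

points = paths_over_time(PLOT_DATA)
# total paths grew by only 3 in the last interval: the fuzzer is plateauing
print(points[-1][1] - points[-2][1])  # 3
```

The same parsing approach works for any of the per-instance plot_data files in the sync directory, which is exactly what afl-plot does before handing the data to gnuplot.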
V. Processing the Test Results
At this point we may have collected a lot of crashes, so the next step is to determine whether the bugs behind them are exploitable, and if so, how. This is another large topic, and in my opinion much harder than everything covered so far: it requires a deep understanding of common binary vulnerability classes, operating system security mechanisms, code auditing, and debugging. But for a simple analysis and classification of crashes, the following methods can help.
1. Crash exploration mode
This is an operating mode of afl-fuzz, also known as peruvian were-rabbit mode, used to assess the exploitability of bugs; details can be found on lcamtuf's blog.
$ afl-fuzz -m none -C -i poc -o peruvian-were-rabbit_out -- ~/src/LuPng/a.out @@ out.png
For example, when you find that the target program tries to write to, or jump to, a memory address that clearly comes from the input file, you can guess that the bug should be exploitable; however, vulnerabilities such as NULL pointer dereferences are not so easy to judge.
Taking a test case that causes a crash as the input to afl-fuzz, and using the -C option to turn on crash exploration mode, you can quickly generate many crashes that are related to the input crash but slightly different, and from them judge, for example, whether you can control the address or the size of the memory being accessed. The author did not find a suitable example in practice, but an article describes a very good one: a tcpdump stack overflow vulnerability, where crash exploration mode generated 42 new crashes from a single crash, reading adjacent memory of different sizes.
2. triage_crashes
In the experimental directory of the AFL source code there is a script, triage_crashes.sh, that can help us replay the collected crashes. In the following example, 11 is the SIGSEGV signal, which may mean a buffer overflow caused the process to reference invalid memory; 06 is the SIGABRT signal, possibly caused by an abort/assert call or a double free. These results can serve as a rough reference.
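That replay-the-crash-and-record-the-signal loop can be sketched in a few lines of Python (the stand-in "target" that always raises SIGSEGV is an assumption for demonstration; in practice you would point it at the real binary). On Linux, subprocess reports death-by-signal as a negative return code:

```python
import signal
import subprocess
import sys
import tempfile
from collections import defaultdict
from pathlib import Path

def triage(target_cmd, crash_dir):
    """Re-run the target on each collected crash file and bucket the files
    by the fatal signal that killed the process (returncode == -signum)."""
    buckets = defaultdict(list)
    for crash in sorted(Path(crash_dir).glob("id:*")):
        # substitute the crash file for the @@ placeholder, as afl-fuzz does
        cmd = [str(crash) if arg == "@@" else arg for arg in target_cmd]
        proc = subprocess.run(cmd, capture_output=True, timeout=10)
        if proc.returncode < 0:  # negative => terminated by a signal
            buckets[signal.Signals(-proc.returncode).name].append(crash.name)
    return dict(buckets)

# Demo with a stand-in "target" that always dies with SIGSEGV.
with tempfile.TemporaryDirectory() as d:
    Path(d, "id:000000,sig:11").write_bytes(b"\x00")
    fake_target = [sys.executable, "-c",
                   "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)", "@@"]
    print(triage(fake_target, d))  # on Linux: {'SIGSEGV': ['id:000000,sig:11']}
```

This is essentially what triage_crashes.sh does with shell and gdb, just without the niceties.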
$ ~/afl-2.52b/experimental/crash_triage/triage_crashes.sh fuzz_out ~/src/LuPng/a.out @@ out.png 2>&1 | grep SIGNAL
+ ID 000000, SIGNAL 11
+ ID 000001, SIGNAL 06
+ ID 000002, SIGNAL 06
+ ID 000003, SIGNAL 06
+ ID 000004, SIGNAL 11
+ ID 000005, SIGNAL 11
+ ID 000006, SIGNAL 11
...
3. crashwalk
Of course, the two approaches above are fairly crude. If you want more detailed crash classification, as well as the specific cause of each crash, crashwalk is a good choice. The tool is based on gdb's exploitable plugin and is relatively easy to install; on Ubuntu you only need the following steps:
$ apt-get install gdb golang
$ mkdir tools
$ cd tools
$ git clone https://github.com/jfoote/exploitable.git
$ mkdir go
$ export GOPATH=~/tools/go
$ export CW_EXPLOITABLE=~/tools/exploitable/exploitable/exploitable.py
$ go get -u github.com/bnagy/crashwalk/cmd/...
crashwalk supports two modes, AFL and Manual. The former obtains the target's execution command by reading the crashes/README.txt file (mentioned in the Output Results section above), while the latter lets you specify the parameters manually. The two usages are as follows:
# Manual Mode
$ ~/tools/go/bin/cwtriage -root syncdir/fuzzer1/crashes/ -match id -- ~/parse @@
# AFL Mode
$ ~/tools/go/bin/cwtriage -root syncdir -afl
The output of both modes is the same, as shown in the figure above. This tool gives much more detail than the previous methods, but with a large number of crashes the results can still be overwhelming.
4. afl-collect
The last tool, which I highly recommend, is afl-collect. It is also part of the afl-utils suite and likewise uses exploitable to check the exploitability of crashes. It can automatically discard invalid crash samples, remove duplicates, and automate sample classification. The command line is a little longer:
$ afl-collect -j 8 -d crashes.db -e gdb_script ./afl_sync_dir ./collection_dir -- /path/to/target --target-opts
But the result is very intuitive, like this:
VI. Code Coverage and Related Concepts
Code coverage is an extremely important concept in fuzz testing, used both to evaluate and to improve the testing process: the more code is executed, the more likely bugs are to be found. After all, bugs in covered code are not guaranteed to be found, but bugs in uncovered code can never be found. This section therefore describes code coverage in detail.
1. Code coverage (Code Coverage)
Code coverage is a metric for how much code has been exercised, that is, whether a given line of the source code has been executed; for a binary program the same idea applies to whether a given instruction in the assembly has been executed. There are many ways to measure it; both GCC's GCOV and LLVM's SanitizerCoverage provide three granularities of coverage: function, basic block, and edge. For more details, refer to the official LLVM documentation.
2. Basic Block
A basic block, abbreviated BB, is a sequence of instructions executed strictly in order: once the first instruction in a BB executes, all the following instructions execute as well, so every instruction in a BB runs the same number of times. A BB must therefore satisfy:
There is only one entry point: no instruction inside the BB is the target of any jump instruction.
There is only one exit point: only the last instruction transfers control to another BB.
Drag the above program into IDA, and you can see that it is also divided into four basic blocks:
3. Edge
AFL's technical whitepaper mentions that the fuzzer captures edge coverage through instrumentation. So what is an edge? We can view the program as a control flow graph (CFG): each node represents a basic block, and an edge represents a transition between basic blocks. Knowing how many times each basic block and each transition executes, you know how many times each statement and branch in the program executes, which gives finer-grained coverage information than recording BBs alone.
4. Tuple
In its implementation, AFL uses a tuple (branch_src, branch_dst) to record information about the current basic block plus the previous basic block, thereby capturing the target's execution flow and code coverage. The pseudocode is as follows:
cur_location = <COMPILE_TIME_RANDOM>;        /* mark the current basic block with a compile-time random number */
shared_mem[cur_location ^ prev_location]++;  /* XOR the current and previous block IDs; bump the hit count in shared memory */
prev_location = cur_location >> 1;           /* the right shift distinguishes the jump A->B from B->A */
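To see why that right shift matters, here is a small Python model of the scheme (purely illustrative; the two block IDs are made-up stand-ins for AFL's compile-time random values). Without the shift, A->B and B->A would XOR to the same bitmap index:

```python
MAP_SIZE = 1 << 16  # AFL's default bitmap size (64 KB)

# hypothetical compile-time random IDs for two basic blocks
block_id = {"A": 0x3FA9, "B": 0x91C2}

def run(trace):
    """Simulate AFL's tuple recording over a sequence of basic-block names."""
    hits = {}
    prev_location = 0
    for name in trace:
        cur_location = block_id[name]
        idx = (cur_location ^ prev_location) % MAP_SIZE
        hits[idx] = hits.get(idx, 0) + 1   # shared_mem[cur ^ prev]++
        prev_location = cur_location >> 1  # the shift keeps direction information
    return set(hits)

# With the shift, the traces A->B and B->A touch different bitmap entries:
print(run(["A", "B"]) == run(["B", "A"]))  # False
# Without it, the two directions would collide, since XOR is symmetric:
print((block_id["A"] ^ block_id["B"]) == (block_id["B"] ^ block_id["A"]))  # True
```

The shift also keeps tight self-loops (A->A) from always hashing to index 0, since cur ^ (cur >> 1) is nonzero for any nonzero block ID.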
5. Instrumentation
The assembly code actually inserted is shown in the following figure: it first saves the values of various registers and sets ecx/rcx, then calls __afl_maybe_log. The content of that method is fairly complex and will not be covered here, but its main function matches the pseudocode above: it records coverage and writes it into shared memory.
6. Calculate code coverage
Now that the concepts related to code coverage are clear, let's look at how to calculate the code coverage achieved by our test cases on the earlier test targets.
The first tool needed is GCOV, which ships with gcc, so no separate installation is required. Similar in principle to afl-gcc's instrumented compilation, gcc instruments the program at compile time so that it emits code coverage information when executed.
Another tool is LCOV, a graphical front end for GCOV: it collects gcov data for multiple source files and generates HTML pages of the source code annotated with coverage information.
The last tool is afl-cov, which drives the first two tools to compute code coverage results from afl-fuzz's test cases. On Ubuntu you can install it with apt-get install afl-cov, but that version does not seem to support branch coverage statistics, so it is better to fetch the latest version from GitHub and run the Python script from the checkout without installing:
$ apt-get install lcov
$ git clone https://github.com/mrash/afl-cov.git
$ ./afl-cov/afl-cov -V
afl-cov-0.6.2
Taking Fuzz libtiff as an example, the code coverage process for calculating the Fuzzing process is as follows:
The first step is to recompile the source code for gcov: add the "-fprofile-arcs" and "-ftest-coverage" options to CFLAGS, and optionally point --prefix at a new directory so as not to overwrite the previously afl-instrumented binaries.
$ make clean
$ ./configure --prefix=/root/tiff-4.0.10/build-cov CC="gcc" CXX="g++" CFLAGS="-fprofile-arcs -ftest-coverage" --disable-shared
$ make
$ make install
The second step is to run afl-cov. The -d option specifies the afl-fuzz output directory; --live handles an AFL directory that is still being updated in real time, making afl-cov exit when afl-fuzz stops; --enable-branch-coverage turns on edge (branch) coverage statistics; -c specifies the source code directory; and the final -e option sets the command to execute, in which AFL_FILE, similar to "@@" in afl, is replaced with a test case, and LD_LIBRARY_PATH points at the program's library files.
$ cd ~/tiff-4.0.10
$ afl-cov -d ~/syncdir --live --enable-branch-coverage -c . -e "cat AFL_FILE | LD_LIBRARY_PATH=./build-cov/lib ./build-cov/bin/tiff2pdf AFL_FILE"
The result of successful execution is as follows:
With the --live option we can compute coverage while the fuzzer is running, or we can compute it after testing ends; either way we end up with an HTML report. It provides an overview page showing the coverage of each directory, and you can click into a directory to view the coverage of a specific file.
Clicking into each file shows more detailed data: the number in front of each line of code is the number of times that line was executed, and lines that were never executed are marked in red.