Hey there.
These days I've been playing with libfuzzer, a tool that comes with clang compiler and that allows us to fuzz a program compiled with clang. The fuzzing consists on passing (pseudo-)random data as program input and check if that breaks.
To do this with libfuzzer, it is required to define in the program a function
called LLVMFuzzerTestOneInput
that accepts a buffer of bytes as argument. Then
libfuzzer will call this function in a loop with different data. The function
implementation will be something like this:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
DoSomethingInterestingWithMyAPI(Data, Size);
return 0; // Non-zero return values are reserved for future use.
}
Afterwards, you need to compile the project with clang and the following options:
clang -g -O1 -fsanitize=fuzzer,address fuzzer-file.c project.c -o fuzz-project
This way clang will create an executable (fuzz-project
in this case) that will
allow us to fuzz the library. Besides, libfuzzer also adds code to instrument
the program (doing this libfuzzer is able to know what code paths are executed
in each execution and create more eficient tests that discover new code paths).
If you are interested in libfuzzer, you can check the following tutorials:
They are really good, since they already have prepared examples so you can focus on learning libfuzzer instead of worrying about compile the target programs.
But to show how to compile target programs is the goal of this post, since in that is something you will need to do in the real world, and that can be a daunting task if don't know how to do it. In this case, we will see how to incorporate libfuzzer to a project that is compiled by using autotools.
You can identify projects using autotools since they require the following steps to compile them:
autoreconf -i -f -v
./configure
make
In my case I wanted to fuzz libhtp, a library to parse the HTTP protocol in a safe and eficient way. This library is used in software like the Suricata IDS so it can detect network attacks.
Therefore, libhtp needs to be able to parse even malformed data without breaking, since this could be used by an attacker to send a malicious packet that breaks software like Suricata.
What libfuzzer does to fuzz is to execute the same function in a loop in the same process. This makes libfuzzer faster than other tools like AFL that spawns a process per each test case. However, as a consequence, if the program breaks by any reason, then libfuzzer ends its execution.
Consequentially libfuzzer should be used in programs/libraries that are not allowed to break under any reason, such as libhtp, since if you are able to break it, that means that you have detected a failure that needs to be fixed.
On the contrary, if you want to fuzz a software that is allowed to break with malformed data, like a video player or pdf reader, libfuzzer is not the best option, since it is possible that the fuzzing process will be stopped frequently without getting a relevant issue and it would be very complicated to make it work. In this scenario a better approach would be to use another fuzzer like AFL.
Ok, let's back to the topic. The first thing to do is to clone libhtp with git:
git clone https://github.com/OISF/libhtp
And following the instructions in the README, it is possible to compile the project with the following commands:
./autogen.sh
./configure
make
In this case we can know that the project is using autotools since
autogen.sh
executesautoreconf -i -f -v
and there are several files likeMakefile.am
andconfigure.ac
, that are used by autotools.
Ok, compile the project in the regular way seems easy, but how do I compile this
with clang and libfuzzer? And where should I put the libfuzzer
LLVMFuzzerTestOneInput
function? In a project file? In an external file? And
what should I write in the LLVMFuzzerTestOneInput
function?
Step by step. Firstly, what should we have to write in LLVMFuzzerTestOneInput
?
That is not an easy question since it depends on the project. The point is that
we need to pass to the library a data buffer and check if it breaks. This
probably will involve to initialize the library and then call to a function that
processes the data buffer. It sounds easier than it actually is.
In this case I was able to write a test following the examples contained in the
project (specifically in the extras
folder). After a while inspecting how it
works it I was able to write this function:
#include <stddef.h>
#include <stdint.h>
#include <htp/htp.h>
#include <htp/htp_list.h>
#include <htp/htp_table.h>
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
htp_cfg_t *cfg = htp_config_create();
htp_connp_t *connp = htp_connp_create(cfg);
htp_connp_req_data(connp, 0, data, size);
// Release
htp_connp_destroy_all(connp);
htp_config_destroy(cfg);
return 0; // Non-zero return values are reserved for future use.
}
Remember that all the memory that we reserve in the
LLVMFuzzerTestOneInput
function must be freed before it ends, since there is a risk to create a memory leak and run out of memory after execute this function several times (which is what libfuzzer does).
Ok, now that we have a function to parse data, where do we put it? We should put
it in an external file not related with the library (and not in the library code
as I did firstly). After many attempts and errors, I finally create a the file
fuzz_htp.c
in the project root directory.
Now that our test is ready is time to compile the library with clang and
libfuzzer. Firstly we indicate that we want to use clang, so we set the clang
value in the CC
environment variable with the export CC=clang
command (we
use CC
since it is a C project, in case of a C++ project, we should set
CXX
).
Secondly, we indicate that we want to use libfuzzer in the CFLAGS
variable, so
we execute export CFLAGS="-g -fsanitize=fuzzer,address"
(in case of a C++
project, we would use the CXXFLAGS
variable). These parameters indicate that
we want to add debug symbols (-g
), so the error output gives us information
about failures in the source code, and on the other side
-fsanitize=fuzzer,address
express that we want to use libfuzzer and
AddressSanitizer.
AddressSanitizer allows to detect more read and write out of limits errors. It creates a shadow memory that keeps information related to the current state of the normal memory and it is able to discover if the program access to memory addresses that are not reserved or out of buffer limits.
Then, to compile, the next commands will be used:
export CC=clang
export CFLAGS="-g -fsanitize=fuzzer,address"
./autogen.sh
./configure
make
Notwithstanding, it raises an error:
~/libhtp $ export CC=clang
~/libhtp $ export CFLAGS="-g -fsanitize=fuzzer,address"
~/libhtp $ ./autogen.sh
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in '.'.
libtoolize: copying file './ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
autoreconf: running: /usr/bin/autoconf --force
autoreconf: running: /usr/bin/autoheader --force
autoreconf: running: automake --add-missing --copy --force-missing
configure.ac:86: installing './compile'
configure.ac:7: installing './missing'
htp/Makefile.am: installing './depcomp'
autoreconf: Leaving directory `.'
~/projects/libhtp $ ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for gcc... clang
checking whether the C compiler works... no
configure: error: in `/home/user/projects/libhtp':
configure: error: C compiler cannot create executables
See `config.log' for more details
If we check the config.log
file, we can find the following:
configure:3410: checking whether the C compiler works
configure:3432: clang -g -fsanitize=fuzzer,address -O2 -O2 conftest.c >&5
/usr/bin/ld: /tmp/conftest-859fac.o: in function `main':
/home/user/projects/libhtp/conftest.c:14: multiple definition of `main'; /usr/lib/llvm-10/lib/clang/10.0.0/lib/linux/libclang_rt.fuzzer-x86_64.a(fuzzer.o):(.text.main+0x0): first defined here
/usr/bin/ld: /usr/lib/llvm-10/lib/clang/10.0.0/lib/linux/libclang_rt.fuzzer-x86_64.a(fuzzer.o): in function `main':
(.text.main+0x12): undefined reference to `LLVMFuzzerTestOneInput'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
There are two errors, on one hand multiple definition of `main'
and on the
other undefined reference to `LLVMFuzzerTestOneInput'
.
The first error is caused by a conflict between libfuzzer, that wants to create
a binary with a main
function, and autotools, that defines a main function in
an auxiliar file.
The second error is about libfuzzer trying to link with our function
LLVMFuzzerTestOneInput
, but since our file is not in the compilation process
(yet), libfuzzer cannot find it.
In order to solve these errors we can divide the compilation in two separate
steps. Firstly, we need to compile the library and then our fuzzer by linking
our file fuzz_htp.c
with the library.
To do this we need to start by indicating to libfuzzer that we don't want to
generate the fuzzer executable with the library, but we want the instrumentation
code (so libfuzzer can find new code paths). So we change the CFLAGS
value
from -g -fsanitize=fuzzer,address
to -g -fsanitize=fuzzer-no-link,address
.
Then, the following commands are used to compile the library:
export CC=clang
export CFLAGS="-g -fsanitize=fuzzer-no-link,address"
./autogen.sh
./configure
make
If we executed this, then it works (I skip the output for the sake of space).
Now is time to link our fuzzer with the library, but, where is the library?
Well, autotools usually move the generated libraries to the .libs
hidden
directory, that can be in the root folder or in the source code folder. Anyway,
we can use find
to get the libraries:
~/libhtp $ find . -name *.a
./htp/lzma/.libs/liblzma-c.a
./htp/.libs/libhtp-c.a
./htp/.libs/libhtp.a
#+end_scr
Here they are. We want to test =libhtp.a= so we execute the following:
#+begin_src
clang -g -fsanitize=fuzzer,address fuzz_htp.c -I . htp/.libs/libhtp.a -lz -o fuzz-htp
In this command we specify that we want to generate the fuzzing binary that
calls our function in fuzz_htp.c
, we link the library htp/.libs/libhtp.a
and
we indicate that we want to use the header files (.h
) in this project with
-I .
(-lz
is to link a required compression library).
Finally, we execute ./fuzz-htp
and wait for the library to break.
~/libhtp$ ./fuzz-htp
INFO: Seed: 2543270220
INFO: Loaded 1 modules (3551 inline 8-bit counters): 3551 [0x5f8f10, 0x5f9cef),
INFO: Loaded 1 PC tables (3551 PCs): 3551 [0x5f9cf0,0x607ae0),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 112 ft: 113 corp: 1/1b exec/s: 0 rss: 31Mb
NEW_FUNC[1/12]: 0x55fcd0 in htp_connp_REQ_FINALIZE /home/user/projects/libhtp/htp/htp_request.c:838
NEW_FUNC[2/12]: 0x5626d0 in htp_connp_REQ_PROTOCOL /home/user/projects/libhtp/htp/htp_request.c:726
#4 NEW cov: 166 ft: 182 corp: 2/5b lim: 4 exec/s: 0 rss: 33Mb L: 4/4 MS: 2 ShuffleBytes-CrossOver-
#6 NEW cov: 166 ft: 186 corp: 3/9b lim: 4 exec/s: 0 rss: 33Mb L: 4/4 MS: 2 CopyPart-ChangeBinInt-
#8 NEW cov: 166 ft: 196 corp: 4/11b lim: 4 exec/s: 0 rss: 33Mb L: 2/4 MS: 2 ShuffleBytes-CopyPart-
#13 NEW cov: 166 ft: 197 corp: 5/13b lim: 4 exec/s: 0 rss: 33Mb L: 2/4 MS: 5 EraseBytes-ChangeByte-ChangeByte-ChangeByte-InsertByte-
#20 NEW cov: 166 ft: 199 corp: 6/16b lim: 4 exec/s: 0 rss: 33Mb L: 3/4 MS: 2 ChangeByte-InsertByte-
NEW_FUNC[1/4]: 0x55ee90 in htp_connp_req_receiver_finalize_clear /home/user/projects/libhtp/htp/htp_request.c:131
NEW_FUNC[2/4]: 0x5634c0 in htp_connp_REQ_IGNORE_DATA_AFTER_HTTP_0_9 /home/user/projects/libhtp/htp/htp_request.c:910
#47 NEW cov: 184 ft: 218 corp: 7/20b lim: 4 exec/s: 0 rss: 33Mb L: 4/4 MS: 2 ChangeBinInt-CrossOver-
#52 NEW cov: 191 ft: 229 corp: 8/24b lim: 4 exec/s: 0 rss: 33Mb L: 4/4 MS: 5 CrossOver-ChangeBit-EraseBytes-ShuffleBytes-CrossOver-
#53 NEW cov: 191 ft: 231 corp: 9/28b lim: 4 exec/s: 0 rss: 33Mb L: 4/4 MS: 1 CopyPart-
#56 NEW cov: 191 ft: 232 corp: 10/30b lim: 4 exec/s: 0 rss: 33Mb L: 2/4 MS: 3 ChangeBit-ShuffleBytes-EraseBytes-
#83 NEW cov: 192 ft: 233 corp: 11/34b lim: 4 exec/s: 0 rss: 33Mb L: 4/4 MS: 2 CopyPart-ChangeBit-
We should generate some initial test cases to improve the fuzzing process, but we are going to let that part for another post.
See you and happy fuzzing.