How to test the Linux kernel entry code

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article covers further techniques for testing the Linux kernel entry code with a fuzzer.

General-purpose registers + delta state

One thing we haven't covered so far is setting the other general-purpose registers to random values as well. The entry code does use some general-purpose registers while it works, and if there is a bug somewhere, random register values make it more likely to show up as a crash.
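The idea can be made concrete with a small code generator. The following is a minimal sketch, assuming the fuzzer writes x86-64 machine code into a byte buffer through an out pointer, as the other snippets in this article suggest. The helper names emit_mov_imm64 and emit_random_gprs are hypothetical, and rand() is only a stand-in for whatever random source the fuzzer really uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* x86-64 register numbers as used in instruction encodings. */
enum { REG_RSP = 4, NUM_GPRS = 16 };

/*
 * Emit "movabs $value, %reg": a REX.W prefix, opcode B8+rd, and a
 * 64-bit little-endian immediate. Returns the advanced output pointer.
 */
static uint8_t *emit_mov_imm64(uint8_t *out, unsigned reg, uint64_t value)
{
    assert(reg < NUM_GPRS);
    *out++ = (uint8_t)(0x48 | (reg >> 3)); /* REX.W, plus REX.B for r8..r15 */
    *out++ = (uint8_t)(0xb8 | (reg & 7));  /* opcode B8+rd */
    for (int i = 0; i < 8; i++)
        *out++ = (uint8_t)(value >> (8 * i));
    return out;
}

/* Emit loads of random values into every general-purpose register except %rsp. */
static uint8_t *emit_random_gprs(uint8_t *out)
{
    for (unsigned reg = 0; reg < NUM_GPRS; reg++) {
        if (reg == REG_RSP)
            continue; /* keep the stack pointer usable */
        /* Two rand() calls combined; not every bit is guaranteed random. */
        uint64_t value = ((uint64_t)rand() << 32) ^ (uint64_t)rand();
        out = emit_mov_imm64(out, reg, value);
    }
    return out;
}
```

Each load is 10 bytes long, so the 15 registers take 150 bytes of the code buffer. Skipping %rsp is a design choice in this sketch: later instructions such as pushf/popf still need a valid stack.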

We may also want to catch more subtle bugs: ones that do not crash the kernel outright but that might, say, leak a kernel address in a register that user space should never see. One way to check that the kernel behaved correctly (that our registers and flags were preserved, and so on) is to write out the register state after returning from kernel mode. This is not hard to do, because we can store the values of all (or at least most) registers at a fixed address, for example in the data page we already use for other purposes. The difficulty is combining this with running multiple entry attempts/system calls in a single child process, since the sanity checks have to be interleaved with the entry attempts, which can get cumbersome.
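Once the register state has been written out, the check itself can be simple. The sketch below is a heuristic under stated assumptions: it assumes 4-level paging, where kernel virtual addresses lie at or above 0xffff800000000000, and it assumes the fuzzer has two snapshots of the registers (before and after the entry attempt) as plain arrays. The names find_leaked_register and looks_like_kernel_pointer are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Start of the upper canonical half with 4-level paging (an assumption). */
#define KERNEL_HALF_START 0xffff800000000000ULL

/* Heuristic: does this value fall in the kernel half of the address space? */
static int looks_like_kernel_pointer(uint64_t value)
{
    return value >= KERNEL_HALF_START;
}

/*
 * Compare two register snapshots. Return the index of the first register
 * that changed into something resembling a kernel pointer, or -1 if none.
 */
static int find_leaked_register(const uint64_t *before, const uint64_t *after,
                                size_t nregs)
{
    assert(before != NULL && after != NULL);
    for (size_t i = 0; i < nregs; i++) {
        if (after[i] != before[i] && looks_like_kernel_pointer(after[i]))
            return (int)i;
    }
    return -1;
}
```

This will produce false positives: a system call that returns a negative error code leaves a small negative number in %rax, which also lands in the upper half. A real check would need to exclude the registers that the system call ABI is allowed to change.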

Minimizing the probability of crashes

As we mentioned in the second article, a crash in the child process is quite expensive because it means starting a completely new child process. Avoiding crashes as far as possible (and running as many entry attempts as possible in the same child process) may therefore be a viable strategy for improving the fuzzer's performance. It consists of two main parts:

Saving and restoring state where needed. For example, you want to save and restore %rsp so that subsequent pushf/popf instructions keep working.

Recovering from signals. For example, installing a signal handler lets you restore the process to a known good state instead of letting it die.

Checking the generated assembly code

It is easy to make mistakes when generating assembly code, and they are hard to notice: the program is expected to crash anyway, so you cannot tell that what you got was not what you intended. I had exactly this kind of bug and did not notice it for two years: I accidentally used the wrong byte order when encoding the address of the ljmp operand, so the fuzzer never actually ran anything in 32-bit compatibility mode!
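To illustrate the byte-order pitfall, here is a minimal sketch of encoding a far pointer. The article does not say which form of ljmp the fuzzer uses; this assumes the memory-operand form, whose m16:32 operand is a 32-bit offset followed by a 16-bit segment selector, both little-endian. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write a 6-byte m16:32 far pointer: the 32-bit offset first, then the
 * 16-bit segment selector, each least significant byte first.
 */
static uint8_t *emit_far_pointer(uint8_t *out, uint32_t offset, uint16_t selector)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        *out++ = (uint8_t)(offset >> (8 * i));
    *out++ = (uint8_t)(selector & 0xff);
    *out++ = (uint8_t)(selector >> 8);
    return out;
}

/* Decode the offset again, e.g. for a self-check inside the fuzzer. */
static uint32_t far_pointer_offset(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

A round-trip check like far_pointer_offset() costs almost nothing at run time and catches a swapped byte order immediately, instead of leaving a whole mode untested without anyone noticing.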

An easy way to examine assembly code is to use a disassembly library such as udis86 and then validate the generated code manually.

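    #include <udis86.h>

    ...

    /* Disassemble everything between mem (start of the code buffer)
     * and out (current write position), in AT&T syntax. */
    ud_t u;
    ud_init(&u);
    ud_set_vendor(&u, UD_VENDOR_INTEL);
    ud_set_mode(&u, 64);
    ud_set_pc(&u, (uint64_t) mem);
    ud_set_input_buffer(&u, (unsigned char *) mem, (char *) out - (char *) mem);
    ud_set_syntax(&u, UD_SYN_ATT);

    /* Print one line per instruction: address, then disassembly. */
    while (ud_disassemble(&u))
        fprintf(stderr, "%lx %s\n", (unsigned long) ud_insn_off(&u), ud_insn_asm(&u));
    fprintf(stderr, "\n");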

KVM/Xen/Intel/AMD interactions

In one case, we saw an interaction with KVM where starting any KVM instance corrupted the size of the GDTR (GDT register), which let the fuzzer cause a crash by using segments beyond the expected size of the GDT. This bug turned out to be exploitable for gaining ring 0 execution. In another case, we saw a runtime interaction in a hardware-accelerated nested guest (a guest inside a guest).

In general, KVM needs to emulate some features of the underlying hardware, which adds considerable complexity. The fuzzer may well find bugs in hypervisors such as KVM or Xen, so it is valuable to run it on different bare-metal CPUs and under multiple hypervisors.

To create a KVM instance programmatically, see the article "KVM host in a few lines of code" by Serge Zaitsev.

A related, interesting experiment might be to compile the fuzzer for Windows or other operating systems that run on x86 and see how they fare. I briefly tested the Linux binary on WSL (Windows Subsystem for Linux) and nothing bad happened.

Configuration / boot options

Configuration and boot options affect exactly how the entry code behaves. Here are the options I found in a recent kernel:

    $ grep -oh 'CONFIG_[A-Z0-9_]*' arch/x86/entry/entry_64*.S | sort | uniq
    CONFIG_DEBUG_ENTRY
    CONFIG_IA32_EMULATION
    CONFIG_PARAVIRT
    CONFIG_RETPOLINE
    CONFIG_STACKPROTECTOR
    CONFIG_X86_5LEVEL
    CONFIG_X86_ESPFIX64
    CONFIG_X86_L1_CACHE_SHIFT
    CONFIG_XEN_PV

In fact, there are more options than these, hidden away in header files. Building multiple kernels with different combinations of these options can help reveal broken combinations that may only show up in edge cases triggered by the fuzzer.

Documentation/admin-guide/kernel-parameters.txt also lists a number of boot options that may affect the entry code. Here is a Python script that generates a random combination of such options, suitable for passing on the kernel command line when booting under KVM:

    import random

    # Boot options that may affect the entry code.
    flags = """nopti nospectre_v1 nospectre_v2 spectre_v2_user=off
    spec_store_bypass_disable=off l1tf=off mds=off tsx_async_abort=off
    kvm.nx_huge_pages=off noapic noclflush nosmap nosmep noexec32 nofxsr
    nohugeiomap nosmt noxsave noxsaveopt noxsaves intremap=off nolapic
    nomce nopat nopcid norandmaps noreplace-smp nordrand nosep nosmp
    nox2apic""".split()

    # Pick five distinct options and a random nmi_watchdog setting.
    print(' '.join(random.sample(flags, 5)), "nmi_watchdog=%u" % (random.randrange(2),))

Ftrace

When ftrace is enabled, extra code is patched into the entry code, for example for system call tracing and irqflags tracing. This is probably well worth testing too, so I suggest tweaking the files under /sys/kernel/tracing before running the fuzzer.

PTRACE_SYSCALL

We have seen that ptrace changes the way system call entry/exit is handled (because the process needs to be stopped and the tracer needs to be notified), so it is a good idea to run some of the entry attempts under ptrace() using PTRACE_SYSCALL. It may also be interesting to try changing some or all of the tracee's registers while it is stopped by ptrace. Getting this right is quite difficult, so I will not cover it here.
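For orientation only, the basic PTRACE_SYSCALL loop (without any register modification) might look like the following sketch. It is not the author's implementation: count_syscall_stops, abandon_child and attempt_getpid are hypothetical names, and the attempt is a single raw getpid system call standing in for generated entry code.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one entry attempt: a single raw system call. */
static void attempt_getpid(void)
{
    syscall(SYS_getpid);
}

/* Kill and reap the tracee after a tracer-side error. */
static long abandon_child(pid_t child)
{
    int status;
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}

/*
 * Run attempt() in a traced child and count syscall-entry/exit stops.
 * The count includes system calls made by the C library around the
 * attempt. Returns -1 if tracing could not be set up.
 */
static long count_syscall_stops(void (*attempt)(void))
{
    assert(attempt != NULL);
    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP); /* wait here until the tracer resumes us */
        attempt();
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1; /* the child exited: ptrace is not permitted here */
    /* Report syscall stops as SIGTRAP | 0x80 to tell them from real traps. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1)
        return abandon_child(child);

    long stops = 0, deliver = 0;
    for (;;) {
        /* Resume until the next syscall entry or exit, forwarding signals. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)deliver) == -1)
            return abandon_child(child);
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return stops;
        deliver = 0;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;
        else
            deliver = WSTOPSIG(status); /* pass real signals on to the tracee */
    }
}
```

Forwarding non-syscall signals matters here: if a crashing attempt's SIGSEGV were suppressed, the tracee would simply fault again on the same instruction and the loop would never end.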

mkinitrd.sh

When I test in a VM, I prefer to bundle the program into an initrd and run it as init (PID 1), so that I do not need to copy it into a file system image. You can use a script like this:

    #! /bin/bash

    set -e
    set -x

    rm -rf initrd/
    mkdir initrd/

    g++ -static -Wall -std=c++14 -O2 -g -o initrd/init main.cc -lm

    (cd initrd/ && (find | cpio -o -H newc)) \
        | gzip -c \
        > initrd.entry-fuzz.gz

If you are using Qemu/KVM, just pass -initrd initrd.entry-fuzz.gz and it will run the fuzzer immediately after booting.

Taint checking

If the fuzzer does trigger some kind of kernel crash or bug, it is useful to make sure we do not miss it. I personally like to put oops=panic panic_on_warn panic=-1 on the kernel command line and pass -no-reboot to Qemu/KVM; this ensures that any warning makes Qemu exit immediately (leaving any diagnostics on the terminal). If you are running the fuzzer on a dedicated bare-metal machine (for example, using the initrd method above), you can use panic=0 instead, which just hangs the machine.

If you are testing on an ordinary workstation and do not want the whole machine to die, you can instead check whether the kernel has become tainted (which happens whenever there is a WARNING or BUG) and simply exit:

    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int tainted_fd = open("/proc/sys/kernel/tainted", O_RDONLY);
    if (tainted_fd == -1)
        error(EXIT_FAILURE, errno, "open()");

    /* Remember the taint state from before the first test case. */
    char tainted_orig_buf[16];
    ssize_t tainted_orig_len = pread(tainted_fd, tainted_orig_buf, sizeof(tainted_orig_buf), 0);
    if (tainted_orig_len == -1)
        error(EXIT_FAILURE, errno, "pread()");

    while (1) {
        // generate + run test case
        ...

        /* Re-read the taint state and stop as soon as it changes. */
        char tainted_buf[16];
        ssize_t tainted_len = pread(tainted_fd, tainted_buf, sizeof(tainted_buf), 0);
        if (tainted_len == -1)
            error(EXIT_FAILURE, errno, "pread()");

        if (tainted_len != tainted_orig_len || memcmp(tainted_buf, tainted_orig_buf, tainted_len)) {
            fprintf(stderr, "Kernel became tainted, stopping.\n");
            // TODO: dump hex bytes or disassembly
            exit(EXIT_FAILURE);
        }
    }

Network logging

If the kernel crashes and it is not clear what went wrong, it is useful to log everything you attempt over the network. Here is a simple sketch of UDP logging:

    #include <arpa/inet.h>
    #include <errno.h>
    #include <error.h>
    #include <netinet/in.h>
    #include <stdlib.h>
    #include <sys/socket.h>

    int main(...)
    {
        int udp_socket = socket(AF_INET, SOCK_DGRAM, 0);
        if (udp_socket == -1)
            error(EXIT_FAILURE, errno, "socket(AF_INET, SOCK_DGRAM, 0)");

        /* Address of the log server. */
        struct sockaddr_in remote_addr = {};
        remote_addr.sin_family = AF_INET;
        remote_addr.sin_port = htons(21000);
        inet_pton(AF_INET, "10.5.0.1", &remote_addr.sin_addr.s_addr);

        /* connect() on a UDP socket just fixes the destination for write(). */
        if (connect(udp_socket, (const struct sockaddr *) &remote_addr, sizeof(remote_addr)) == -1)
            error(EXIT_FAILURE, errno, "connect()");

        ...
    }

Then, after generating the code for each entry/exit, you can simply dump it to this socket:

    write(udp_socket, (char *) mem, out - (uint8_t *) mem);

The hope is that the last data received by the log server (10.5.0.1:21000 in this case) contains the assembly code that caused the crash. Depending on the exact use case, it may be necessary to add some framing so that the start and end of each test case can be determined easily.
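One possible framing is a small fixed header in front of each test case. The layout below is purely hypothetical (a 4-byte tag "EFZ1", a 32-bit sequence number and a 32-bit length, both little-endian); the article does not specify any format, and the names frame_test_case and put_le32 are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FRAME_HEADER_SIZE = 12 };

/* Store a 32-bit value least significant byte first. */
static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Build one framed test case in frame[]: tag, sequence number, code
 * length, then the generated code itself. Returns the number of bytes
 * to send, or 0 if the frame buffer is too small.
 */
static size_t frame_test_case(uint8_t *frame, size_t frame_size, uint32_t seq,
                              const uint8_t *code, size_t code_len)
{
    assert(frame != NULL && code != NULL);
    if (frame_size < FRAME_HEADER_SIZE ||
        code_len > frame_size - FRAME_HEADER_SIZE ||
        code_len > UINT32_MAX)
        return 0;

    memcpy(frame, "EFZ1", 4);                 /* hypothetical tag */
    put_le32(frame + 4, seq);                 /* detects lost datagrams */
    put_le32(frame + 8, (uint32_t)code_len);  /* detects truncation */
    memcpy(frame + FRAME_HEADER_SIZE, code, code_len);
    return FRAME_HEADER_SIZE + code_len;
}
```

The returned length would be passed to write(udp_socket, frame, n) in place of the raw code buffer. UDP already preserves datagram boundaries, so the main benefit here is the sequence number: a gap in the numbers on the log server shows that some test cases never arrived.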

Checking that the fuzzer catches known bugs

Over the years, many bugs have been found in the entry code. We can therefore build some old, vulnerable kernels and run the fuzzer on them to make sure it really does catch these known bugs. We can also use the time it takes to find them as a measure of the fuzzer's efficiency, but we must be careful not to over-optimize it to the point where it finds only these bugs.

Code coverage / instrumentation feedback

Instrumentation

One reason fuzzers like AFL and syzkaller are so effective is that they use code coverage to measure very accurately the effect of tweaking individual bits of a test case. This is usually done by compiling the C code with a special compiler flag that emits extra code to collect coverage data. That is much trickier for assembly code, and entry code in particular, because we do not know exactly what state the CPU is in (and which registers/state we can clobber) without manually inspecting every instruction of the code.

However, if we really want code coverage, there is a way to get it: the x86 instruction set includes an instruction that takes both an immediate value and an immediate address and does not affect any other state (such as flags): movb $value, (addr). The only thing we need to take care of is that addr is a compile-time constant address that is always mapped to some physical memory and marked present in the page tables, so that accessing it never causes a page fault. Fortunately, Linux already provides such a mechanism: fixmaps, that is, "compile-time virtual memory allocation". With these we can statically allocate a compile-time constant virtual address that points to the same underlying physical page for all tasks and contexts. Because the page is shared between tasks, we have to clear or otherwise save/restore the values when switching between processes.
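The principle can be shown with a user-space analogue. This is not the kernel patch mentioned in this article: here a static array stands in for the fixmap page, and the COVER macro and helper names are hypothetical. On x86-64, compilers typically turn each COVER(id) into a single movb with an immediate operand and a constant address, though that is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { COVERAGE_POINTS = 64 };

/* Stand-in for the fixmap page: a buffer whose address is fixed at link time. */
static volatile uint8_t coverage_map[COVERAGE_POINTS];

/* Mark coverage point id as reached: one store of a constant to a constant address. */
#define COVER(id) do { coverage_map[(id)] = 1; } while (0)

/* Clear all coverage points, e.g. before each test case. */
static void coverage_reset(void)
{
    for (size_t i = 0; i < COVERAGE_POINTS; i++)
        coverage_map[i] = 0;
}

/* Has coverage point id been reached since the last reset? */
static int coverage_hit(size_t id)
{
    assert(id < COVERAGE_POINTS);
    return coverage_map[id] != 0;
}

/* Number of distinct coverage points reached since the last reset. */
static size_t coverage_count(void)
{
    size_t count = 0;
    for (size_t i = 0; i < COVERAGE_POINTS; i++)
        count += coverage_map[i] != 0;
    return count;
}

/* Example of instrumented code with two paths. */
static int classify(int x)
{
    COVER(0);
    if (x < 0) {
        COVER(1);
        return -1;
    }
    COVER(2);
    return 1;
}
```

After coverage_reset() and classify(5), points 0 and 2 are set and point 1 is not, so a fuzzer can tell that the negative branch has not been exercised yet. Resetting before each test case mirrors the need to clear or save/restore the shared page described above.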

By combining C macros and assembler macros, we get a minimally intrusive coverage primitive that can be dropped in anywhere in the entry code to record the code paths taken. I have written a patch for this, but some edge cases still need to be dealt with (for example, it does not fully work when SMAP is enabled). I also doubt that the x86 maintainers would appreciate having these coverage annotations sprinkled over the entry code.

On the fuzzer side, one thing that complicates instrumentation feedback is that you need a whole system for keeping track of test cases, their results, and (possibly) which mutations you applied to each test case. Because of this, I have chosen to ignore code coverage for now; in any case, it is a general fuzzing topic that has little to do with x86 or entry code in particular.

Performance counter / hardware feedback

A completely different way to collect code coverage is to use performance counters. I know of two recent projects that do this:

Resmack Fuzz Test

kAFL

The biggest benefit here is obviously that no instrumentation (modification of the kernel) is needed. The biggest drawback is that performance counters are not entirely deterministic (possibly because of external factors such as hardware interrupts). The approach may also not work well for entry code, because that code runs for only a very short time.
