.:: Phrack Magazine ::.

Current issue : #59 | Release date : 2002-07-28 | Editor : Phrack Staff
Title : Execution path analysis
Author : Jan K. Rutkowski
                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x3b, Phile #0x0a of 0x12


|=------=[ Execution path analysis: finding kernel based rootkits ]=-----=|
|=-----------------------------------------------------------------------=|
|=----------=[ Jan K. Rutkowski <[email protected]> ]=----------=|


--[ Introduction

Over the years many techniques have been developed for masking the presence
of an attacker on a compromised system. In order to stay invisible, modern
backdoors modify kernel structures and code, with the result that nobody
can trust the kernel. Nobody, including IDS tools...

In this article I present a technique based on counting the instructions
executed in selected system calls, which can be used to detect various
kernel rootkits. This includes programs like SucKIT or prrf (see [SUKT01]
and [PALM01]), which do not modify the syscall table. I focus on Linux
kernel 2.4 running on 32-bit Intel processors (ia32).

The PatchFinder source code, a proof of concept for the described
technique, is included at the end of the article.

I am not going to explain how to write a kernel rootkit; for details I
refer the reader to the references. I do, however, briefly characterize the
known techniques so that their resistance to the presented detection method
can be described.

--[ Background

Let's take a quick look at typical kernel rootkits. Such programs must
solve two problems: find a way to get into the kernel, and modify the
kernel in a smart way. On Linux the first task can be achieved by using
Loadable Kernel Modules (LKM) or the /dev/kmem device.

----[ getting into the kernel

Using LKM is the easiest and most elegant way to modify the running kernel.
It was probably first discussed by halflife in [HALF97]. There are many
popular backdoors which use LKM (see [KNAR01], [ADOR01], [PALM01]).
However, this technique has a weak point: LKM can be disabled on some
systems.

When LKM support is not available we can use a technique developed by
Silvio Cesare, which uses /dev/kmem to access kernel memory directly (see
[SILV98]). There is no easy work-around for this method, since patching the
do_write_mem() function is not sufficient, as was recently shown by
Guillaume Pelat (see [MMAP02]).

----[ modifying syscall table

Provided that we can write to kernel memory, we face the question of what
to modify.

Many rootkits modify the syscall table in order to redirect some useful
system calls like sys_read(), sys_write(), sys_getdents(), etc. For details
see [HALF97] and the source code of one of the popular rootkits ([KNAR01],
[ADOR01]). However, this method can be detected by simply comparing the
current syscall table with the original one, saved after the kernel was
built.

When the LKM mechanism is enabled in the system, we can use a simple module
which reads the syscall table (directly accessing kernel memory) and then
passes it to userland (through the /proc filesystem, for example).

Unfortunately, when LKM is not supported we can not read kernel memory
reliably, since we use sys_read() or sys_mmap() to read or mmap /dev/kmem.
We can not be sure that the malicious code we are trying to find does not
alter the sys_read()/sys_mmap() system calls.

----[ modifying kernel code

Instead of changing pointers in the syscall table, a malicious program can
alter some code in the kernel, like the system_call function. In this case
analysis of the syscall table would not show anything. Therefore we would
like to scan kernel memory and check whether the code area has been
modified.

This is simple to implement if LKM is enabled. However, if we do not have
LKM support, we must access kernel memory through /dev/kmem and again we
face the problem of unreliable sys_read()/sys_mmap().

SucKIT (see [SUKT01]) is an example of a rootkit which uses /dev/kmem to
access the kernel and then changes the system_call code, not touching the
original syscall table. Although SucKIT does not alter sys_read() and
sys_mmap() behavior, this feature can be added, making it impossible to
detect such a backdoor by conventional techniques (i.e. memory scanning
through /dev/kmem)...

----[ modifying other pointers

In the previous issue of Phrack palmers presented the nice idea of changing
some pointers in the /proc filesystem (see [PALM01]). Again, if our system
has LKM enabled we can, at least theoretically, check all the kernel
structures and find out if somebody has changed some pointers. However,
this could be difficult to implement, because we have to foresee all the
potential places the rootkit may exploit.

With LKM disabled, we face the same problem as explained in the paragraphs
above.

--[ Execution path analysis (stepping the kernel)

As we can see, detection of kernel rootkits is not trivial. Of course, if
we have LKM support enabled we can, theoretically, scan the whole kernel
memory and find the intruder. However, we must be very careful in deciding
what to look for. Differences in the code of course indicate that something
is wrong. But although a change to some data should also be treated as an
alarm (see prrf.o again), modifications of other structures might be the
result of normal daily kernel activity.

Things become even more complicated when we disable LKM in our kernel (to
be more secure :)). Then, as I have just said, we can not read kernel
memory reliably, because we are not sure that sys_read() returns the real
bytes (so we can't read /dev/kmem). We are also not sure that sys_mmap2()
fills the mapped pages with the correct bytes...

Let's try from the other side. If somebody has modified some kernel
functions, it is very probable that the number of instructions executed
during some system calls (e.g. sys_getdents(), in case an attacker is
trying to hide files) will differ from the number in the original kernel.
Indeed, malicious code must perform some additional actions, like cutting
off secret filenames, before it returns results to userland. This implies
the execution of many more instructions than on an uninfected system. We
can measure this difference!

----[ hardware stepper 

The ia32 processor can be told to work in single-step mode. This is
achieved by setting the TF bit (mask 0x100) in the EFLAGS register. In this
mode the processor generates a debug exception (#DB) after the execution of
every instruction.

What happens when the #DB exception is generated? The processor stops
execution of the current process and calls the debug exception handler. The
#DB exception handler is described by a trap gate at interrupt vector 1.

Intel processors have an array of 256 gates, each describing the handler
for a specific interrupt vector (it is probably Intel's secret why they
call these scalar numbers 'vectors'...).

For example, at position 0x80 there is a gate which tells where the handler
of the 0x80 trap, the Linux system call, is located. As we all know, this
trap is generated by a process by means of the 'int 0x80' instruction. This
array of 256 gates is called the Interrupt Descriptor Table (IDT) and is
pointed to by the idtr register.

In the Linux kernel, you can find the #DB handler in the
arch/i386/kernel/entry.S file. It is called 'debug'. As you can see, after
some uninteresting operations it calls the do_debug() function, which is
defined in arch/i386/kernel/traps.c.

Because the #DB exception serves not only single stepping but many other
debugging activities, the do_debug() function is a little bit complex.
However, that does not matter for us. The only thing we are interested in
is that, after detecting that the #DB exception was caused by single
stepping (TF bit), a SIGTRAP signal is sent to the traced process. The
process may catch this signal. So it looks like we can do something like
this in our userland program:

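	#include <signal.h>
	#include <stdio.h>
	#include <unistd.h>

	volatile int traps = 0;

	/* SIGTRAP handler: called once per single-stepped instruction */
	void sigtrap (int sig) {
		traps++;
	}

	int main () {
		...
		signal (SIGTRAP, sigtrap);

		/* xor_eflags() toggles the given bits in EFLAGS */
		xor_eflags (0x100);		/* set TF */
		/* call syscall we want to test */
		read (fd, buff, sizeof (buff));
		xor_eflags (0x100);		/* clear TF */

		printf ("testing syscall takes %d instructions\n", traps);
	}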

It looks simple and elegant. However, it has one disadvantage: it does not
work as we want. In the variable traps we will find only the number of
instructions executed in userland. As we all know, read() is only a wrapper
around the 'int 0x80' instruction, which makes the processor call the 0x80
exception handler. Unfortunately the processor clears the TF flag when
executing 'int x' (and this instruction causes a privilege level change).

In order to step the kernel, we must insert some code into it which will be
responsible for setting the TF flag for some processes. A good place to
insert such code is the beginning of the 'system_call' assembler routine
(defined in arch/i386/kernel/entry.S), which is the entry point of the 0x80
exception handler.

As I mentioned before, the address of 'system_call' is stored in the gate
located at position 0x80 in the Interrupt Descriptor Table (IDT). Each gate
(the IDT consists of 256 of them) has the following format:

	struct idt_gate {
		unsigned short  off1;
		unsigned short  sel;
		unsigned char   none, flags;
		unsigned short  off2;
	} __attribute__ ((packed));

The 'sel' field holds the segment selector, and in the case of Linux it is
equal to __KERNEL_CS. The handler routine is placed at
((off2 << 16) + off1) within the segment, and because the segments in Linux
have base 0x0, this offset is equal to the linear address.
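
The offset reconstruction can be sketched in plain C. This is only an
illustration: gate_offset() is a hypothetical helper name, not part of the
PatchFinder sources, and the fixed-width types are my choice. Note the
parentheses, since in C '+' binds tighter than '<<'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the article's struct idt_gate (8 bytes per gate). */
struct idt_gate {
	uint16_t off1;          /* handler offset, bits 0..15  */
	uint16_t sel;           /* code segment selector       */
	uint8_t  none, flags;
	uint16_t off2;          /* handler offset, bits 16..31 */
} __attribute__ ((packed));

/* Hypothetical helper: rebuild the 32-bit handler offset from a gate. */
static uint32_t gate_offset (const struct idt_gate *g)
{
	assert (g != NULL);
	/* Without the parentheses, off2 << 16 + off1 would be parsed
	   as off2 << (16 + off1). */
	return ((uint32_t) g->off2 << 16) | g->off1;
}
```

With off1 = 0x5678 and off2 = 0xc010 the helper yields 0xc0105678, which on
Linux is also the linear address of the handler.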

The fields 'none' and 'flags' give the processor some additional
information about calling the handler. See [IA32] for details.

The idtr register points to the beginning of the IDT (it specifies a linear
address, not a logical one as in idt_gate):

	struct idtr {
		unsigned short  limit;
		unsigned int    base;	/* linear address of IDT table */
	} __attribute__ ((packed));

Now we see that it is trivial to find the address of system_call in our
Linux kernel. Moreover, it is also easy to change this address to a new
one. Of course we can not do it from userland. That is why we need a kernel
module (see the later discussion about what to do if LKM is disabled),
which changes the address of the 0x80 handler and inserts the new code that
we use as the new system_call. This new code may look like this:

	ENTRY(PF_system_call)		
		pushl %ebx
		movl $-8192, %ebx
		andl %esp, %ebx			# %ebx <-- current
		
		testb $PT_PATCHFINDER,24(%ebx)	# 24 is offset of 'ptrace'
		je continue_syscall
		pushf
		popl %ebx
		orl $TF_MASK, %ebx		# set TF flag
		pushl %ebx
		popf
		
	continue_syscall:
		popl %ebx
		jmp *orig_system_call

As you can see, I decided to use the 'ptrace' field within the process
descriptor to indicate whether a particular process wants to be single
stepped. After setting the TF flag, the original system_call handler is
executed; it calls the specific sys_xxx() function and then returns
execution to userland by means of the 'iret' instruction. Until the 'iret'
every single instruction is traced.
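
The first three instructions of PF_system_call locate the process
descriptor by masking the kernel stack pointer. On Linux 2.4/ia32 the
task_struct lives at the bottom of the 8 KB kernel stack area of each
process, so clearing the low 13 bits of %esp gives 'current'. A minimal C
sketch of that arithmetic follows; current_from_esp() and KSTACK_SIZE are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 8192u       /* stack area size implied by $-8192 */

/* Hypothetical helper mirroring "movl $-8192,%ebx; andl %esp,%ebx". */
static uint32_t current_from_esp (uint32_t esp)
{
	/* the mask trick only works for a power-of-two size */
	assert ((KSTACK_SIZE & (KSTACK_SIZE - 1)) == 0);
	/* ~(8192 - 1) is 0xffffe000, the same bit pattern as -8192 */
	return esp & ~(KSTACK_SIZE - 1);
}
```

For example, a stack pointer of 0xc1235f40 maps to a descriptor at
0xc1234000; the 'ptrace' field is then read at offset 24 from that address.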

Of course we also have to provide our own #DB handler to count all these
instructions (it will replace the system's one):

	ENTRY(PF_debug)
		incl PF_traps 
		iret
		
The PF_traps variable is placed somewhere in the kernel when the module is
loaded.

To be complete, we also need to add a new system call, which can be called
from userland to set the PT_PATCHFINDER flag in the 'ptrace' field of the
current process descriptor, and to reset or return the counter value.

	asmlinkage int sys_patchfinder (int what) {
		struct task_struct *tsk = current;

		switch (what) {
			case PF_START:
				tsk->ptrace |= PT_PATCHFINDER;
				PF_traps = 0;
				break;
			case PF_GET:
				tsk->ptrace &= ~PT_PATCHFINDER;
				break;
			case PF_QUERY:
				return PF_ANSWER;
			default:
				printk ("I don't know what to do!\n");
				return -1;
		}
		return PF_traps;
	}

In this way we have changed the kernel so that it can measure how many
instructions each system call takes to execute. See module.c in the
attached sources for more details.

----[ the tests

Having a kernel which allows us to count instructions in any system call,
we face the question of what to measure. Which kernel functions should we
check?

To answer this question we should think about the main task of every
rootkit. Well, its job is to hide the presence of the attacker's
processes/files/connections in the rooted system. And those things should
be hidden from tools like ls, ps, netstat, etc. These programs collect
system information through some well known system calls.

Even if a backdoor does not touch a syscall directly, like prrf.o, it
modifies some kernel functions which are activated by one of the system
calls. The problem lies in the fact that these modified functions do not
have to be executed during every system call. For example, if we modify
only some pointers to read functions in procfs, then the attacker's code
will be executed only when read() is called in order to read some specific
file, like /proc/net/tcp.

This complicates detection a little, since we have to measure the
instruction count of a particular system call with different arguments. For
example, we test sys_read() by reading "/etc/passwd", "/dev/kmem" and
"/proc/net/tcp" (i.e. reading a regular file, a device and a pseudo
proc-file).

We do not test all system calls (about 230), because we assume that the
routine tasks every backdoor should do, like hiding processes or files,
will use only a small subset of syscalls.

The tests included in PatchFinder are defined in the tests.c file. The
following one tries to find out if somebody is hiding some processes and/or
files in procfs:

	int test_readdir_proc () {
		int fd, T = 0;
		struct dirent de[1];

		fd = open ("/proc", 0, 0);
		assert (fd >= 0);	/* open() returns -1 on failure */

		patchfinder (PF_START);	
		getdents (fd, de,  sizeof (de));
		T = patchfinder (PF_GET);
		
		close (fd);
		return T;	
	}

Of course it is trivial to add a new test if necessary. There is, however,
one problem: false positives. The Linux kernel is a complex program, and
most of the system calls have many if-then clauses, which means that
different paths are executed depending on many factors. These include
caches and the 'internal state of the system', which can be e.g. the number
of open TCP connections. All of this means that sometimes you may see that
more (or fewer) instructions are executed. Typically these differences are
less than 10, but in some tests (like writing to a file) they may be as
large as 200!

This can be minimized by increasing the number of iterations of each test.
If you see that reading "/proc/net/tcp" takes longer, try to reset the TCP
connections and repeat the tests. However, if the differences are
significant (i.e. more than 600 instructions), it is very probable that
somebody has patched your kernel.
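
The decision rule described above can be sketched as a small comparison
function. This is an assumption about how such a check might be written,
not the code from the attached sources: pf_classify() and enum pf_status
are hypothetical names, and the threshold is a parameter that has to be
tuned (the text only says that differences above roughly 600 instructions
are significant).

```c
#include <assert.h>
#include <stdlib.h>

enum pf_status { PF_OK, PF_ALERT };

/* Hypothetical helper: compare a measured instruction count with the
   reference count taken on the clean system.  Both directions count,
   since a test may also execute fewer instructions than before. */
static enum pf_status pf_classify (long current, long clear, long threshold)
{
	assert (threshold >= 0);
	return labs (current - clear) > threshold ? PF_ALERT : PF_OK;
}
```

With a threshold of 600, a pair like 1401/1400 is reported as ok, while a
pair like 6975/1400 raises an alert.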

But even then you must be very careful, because these differences may be
caused by some new modules you have loaded recently, possibly without being
aware of it.

--[ The PatchFinder

Now the time has come to show the working program. A proof of concept is
attached at the end of this article. I call it PatchFinder. It consists of
two parts: a module which patches the kernel so that it allows syscalls to
be debugged, and a userland program which runs the tests and shows the
results. First you must generate a file with test results taken on a clean
('clear') system, i.e. generated right after you installed a new kernel.
Then you can check your system any time you want; just remember to insert
the patchfinder.o module before you run the tests. After the tests you
should remove the module. Remember that it replaces Linux's native debug
exception handler!

The results on a clean system may look like this (observe the small
differences in the 'diff' column):

	    test name      | current | clear | diff  | status 
	------------------------------------------------------
	open_file          |    1401|    1400|      1|  ok 
	stat_file          |    1200|    1200|      0|  ok 
	read_file          |    1825|    1824|      1|  ok 
	open_kmem          |    1440|    1440|      0|  ok 
	readdir_root       |    5784|    5774|     10|  ok 
	readdir_proc       |    2296|    2295|      1|  ok 
	read_proc_net_tcp  |   11069|   11069|      0|  ok 
	lseek_kmem         |     191|     191|      0|  ok 
	read_kmem          |     322|     321|      1|  ok 

The tests on the same system, run while adore was loaded, show the
following:

	    test name      | current | clear | diff  | status 
	------------------------------------------------------
	open_file          |    6975|    1400|   5575| ALERT!
	stat_file          |    6900|    1200|   5700| ALERT!
	read_file          |    1824|    1824|      0|  ok 
	open_kmem          |    6952|    1440|   5512| ALERT!
	readdir_root       |    8811|    5774|   3037| ALERT!
	readdir_proc       |   14243|    2295|  11948| ALERT!
	read_proc_net_tcp  |   11063|   11069|     -6|  ok 
	lseek_kmem         |     191|     191|      0|  ok 
	read_kmem          |     321|     321|      0|  ok 

Everything will become clear when you analyze the adore source code :).
Similar results can be obtained for other popular rootkits like knark or
palmers' prrf.o (please note that prrf.o does not change the syscall table
directly).

A funny thing happens when you try to check a kernel which was backdoored
by SucKIT. You should see something like this:

				---== ALERT! ==--
	It seems that module patchfinder.o is not loaded. However if you
	are sure that it is loaded, then this situation means that
	with your kernel is something wrong! Probably there is a rootkit
	installed!

This is caused by the fact that SucKIT copies the original syscall table to
a new location, changes the copy in the same fashion as knark or adore, and
then alters the address of the syscall table in the system_call code so
that it points to this new copy. Because the copied syscall table does not
contain the patchfinder system call (patchfinder's module is inserted just
before the tests), the testing program is unable to talk to the module and
thinks it is not loaded. Of course this situation easily betrays that
something is wrong with the kernel (or that you forgot to load the
module :)).
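
The handshake that produces this alert can be sketched as follows. The
constants are the ones from module.h in the appendix; everything else
(pf_module_answers(), the injected syscall_fn type and the two fake_*
stand-ins) is hypothetical and exists only so that the logic can be tried
without the module. A real client would presumably issue the call through
'int 0x80' or syscall(2).

```c
#include <assert.h>
#include <stddef.h>

#define NR_PATCHFINDER 250              /* __NR_patchfinder in module.h */
#define PF_QUERY       0xdefaced
#define PF_ANSWER      0xaccede

/* A function that issues a system call; passed in as a parameter so the
   check can be exercised in userland (hypothetical interface). */
typedef long (*syscall_fn) (long nr, long arg);

/* Returns 1 if the module answered the query, 0 otherwise.  A 0 means
   that the module is not loaded, or that the syscall was redirected to a
   table which does not contain our entry. */
static int pf_module_answers (syscall_fn do_syscall)
{
	assert (do_syscall != NULL);
	return do_syscall (NR_PATCHFINDER, PF_QUERY) == PF_ANSWER;
}

/* Stand-in for a kernel whose syscall table contains our entry. */
static long fake_with_module (long nr, long arg)
{
	return (nr == NR_PATCHFINDER && arg == PF_QUERY) ? PF_ANSWER : -1;
}

/* Stand-in for a kernel using a copied table without our entry. */
static long fake_without_module (long nr, long arg)
{
	(void) nr;
	(void) arg;
	return -1;
}
```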

Note that if patchfinder.o is loaded you can not start SucKIT. This is due
to its installation method, which makes assumptions about what the binary
code of system_call should look like. SucKIT is very surprised to see
PF_system_call instead of the original Linux 0x80 handler...

There is one more thing to explain. Before beginning the tests, the testing
program sets the SCHED_FIFO scheduling policy with the highest rt_priority.
In effect, during the tests only the patchfinder process has the CPU (only
hardware interrupts are serviced) and it is never preempted until it
finishes the tests. There are three reasons for this approach.

First, the TF bit is set at the beginning of system_call and is cleared
when the 'iret' instruction is executed at the end of the exception
handler. While the TF bit is set, sys_xxx() is called, but after that some
scheduling related code is also executed, which can lead to a process
switch. This is not good, because it causes more instructions to be
executed (in the kernel; we do not care about instructions executed in the
switched-to process, of course).

There is also a more important issue. I observed that, when I allow process
switching with the TF bit set, it may cause a processor restart(!) after a
few hundred switches. I did not find any explanation for this behavior.
This problem does not occur when SET_SCHED is set.

The third reason to use the realtime policy is to keep the system state as
stable as possible. For example, if our test was run in parallel with some
process which opens and reads lots of files (like grep), this could affect
the tests connected with sys_open()/sys_read().

The only disadvantage of this approach is that your system is inaccessible
during the tests. However, that does not last long, since a typical test
session (depending on the number of iterations per test) takes less than
15 seconds to complete.
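
Switching to the realtime policy needs only the standard scheduler calls.
The sketch below is not taken from the attached sources; the function names
pf_enter_realtime() and pf_leave_realtime() are hypothetical, and the call
is expected to fail for a process without root privileges.

```c
#include <assert.h>
#include <sched.h>
#include <string.h>

/* Switch the calling process to SCHED_FIFO at the highest priority.
   Returns 0 on success, -1 on failure (errno is set by the kernel). */
static int pf_enter_realtime (void)
{
	struct sched_param sp;
	int max = sched_get_priority_max (SCHED_FIFO);

	assert (max > 0);
	memset (&sp, 0, sizeof (sp));
	sp.sched_priority = max;
	return sched_setscheduler (0, SCHED_FIFO, &sp);
}

/* Return to the default time-sharing policy after the tests. */
static int pf_leave_realtime (void)
{
	struct sched_param sp;

	memset (&sp, 0, sizeof (sp));	/* SCHED_OTHER requires priority 0 */
	return sched_setscheduler (0, SCHED_OTHER, &sp);
}
```

A test program would call pf_enter_realtime() once before the first
measurement and pf_leave_realtime() after the last one, keeping the time
spent under SCHED_FIFO as short as possible.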

And a technical detail: the attached source code uses LKM to install the
described kernel extensions. At the beginning of the article I said that on
some systems LKM is not compiled into the kernel; then we can use only
/dev/kmem. I also said that we can not rely on /dev/kmem, since we use
syscalls to access it. However, this should not be a problem for a tool
like patchfinder, because if a rootkit interferes with the loading of our
extensions we should see that the testing program is not working. See also
the discussion in the next section.

--[ Cheating & hardening patchfinder program

Now I will discuss possible methods of defeating the presented method in
general and the attached patchfinder program in particular. I will also try
to show how to defend against such attacks, describing the properties of
the next generation patchfinder...

The first thing malicious code can do is to check whether it is being
traced. It may simply execute:

		pushf
		popl %ebx
		testl $0x100, %ebx		# is TF set?
		jne i_am_traced
		# continue executing
		...

	i_am_traced:
		# deinstall for
		# a moment
		...
	
When malicious code realizes that it is being traced, it may uninstall
itself from the specific syscall. Before that, however, it will settle in
the timer interrupt handler, so that after e.g. 1 minute it can return to
that syscall.

How to defend against such a trick? Well, remember that we (i.e.
patchfinder) are tracing the code all the time. So the debug handler (which
is provided by us) can detect that a 'pushf' instruction has just been
executed. Then it may alter the 'eflags' value saved on the stack (by the
just executed 'pushf'), so that to the traced code it looks like the TF
flag was cleared.
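
The core of that countermeasure fits in a few lines of C. This is only a
sketch under simplifying assumptions: pf_hide_tf() is a hypothetical
helper, it recognizes only the plain one-byte 'pushf' encoding (0x9c,
no prefixes), and it assumes the handler already knows the address of the
instruction that was just stepped (for instance by remembering the return
address seen at the previous trap).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TF_MASK      0x100u     /* TF bit in EFLAGS */
#define OPCODE_PUSHF 0x9c       /* one-byte encoding of pushf */

/* 'insn' points at the instruction that has just been single-stepped,
   'stack_top' at the top of the traced code's stack.  If the instruction
   was pushf, clear TF in the copy of EFLAGS it pushed.
   Returns 1 if a copy was patched, 0 otherwise. */
static int pf_hide_tf (const uint8_t *insn, uint32_t *stack_top)
{
	assert (insn != NULL && stack_top != NULL);
	if (*insn != OPCODE_PUSHF)
		return 0;
	*stack_top &= ~TF_MASK;
	return 1;
}
```

A pushed value of 0x346 becomes 0x246, so the 'testl $0x100' check in the
rootkit sees a cleared TF bit while the real EFLAGS keep it set.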

At the moment I do not see any other general attack against execution trace
analysis. However, there are ways of cheating a specific implementation,
like patchfinder, which I am going to discuss now.

The attacker can alter sys_write() and check whether a process named
'patchfinder' is trying to write a specific line to stdout - a line with
some test result. It will then change write()'s buffer and the user will
see faked results. Similarly, the rootkit may notice that the file with the
clean system's characteristics is about to be read and change the buffer
returned from sys_read(), so that patchfinder gets false references.

Moreover, when patchfinder uses a fixed number for its system call, the
malicious rootkit can change the results returned by that system call and
thereby deceive the testing process about the number of instructions
executed.

We can defend against such attacks by giving patchfinder's executable a
different name on each system and by encrypting the file with the clean
system's characteristics. Moreover, we should use different numbers for
patchfinder's system call; otherwise the rootkit could easily trace the
patchfinder process by noticing that it calls a specific system call.

We are not limited to unused system call numbers, since we can replace any
existing system call and first check whether some magic numbers are present
in the arguments. If there are no magic arguments we simply call the
original sys_xxx().
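
The magic-argument idea can be sketched in userland C with a function
pointer standing in for the syscall table entry. All names here are
hypothetical (PF_MAGIC, orig_sys, pf_counter, pf_hooked_sys, demo_sys) and
the magic value is arbitrary; a real module would save the pointer from
sys_call_table and would have to pick magic values that normal callers
never pass.

```c
#include <assert.h>
#include <stddef.h>

#define PF_MAGIC 0x50464e44L    /* hypothetical magic value */

typedef long (*sys_fn) (long a, long b, long c);

static sys_fn orig_sys;         /* saved original handler */
static long   pf_counter;       /* stand-in for PF_traps */

/* Replacement handler: serve patchfinder requests only when the magic
   is present, otherwise behave exactly like the original call. */
static long pf_hooked_sys (long a, long b, long c)
{
	assert (orig_sys != NULL);
	if (a == PF_MAGIC)
		return pf_counter;
	return orig_sys (a, b, c);
}

/* Example stand-in for an original sys_xxx(). */
static long demo_sys (long a, long b, long c)
{
	return a + b + c;
}
```

The design point is that ordinary callers must never notice the hook: any
call without the magic value has to reach the original handler with its
arguments untouched.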

Although using a random number for patchfinder's syscall makes it a little
bit harder for the rootkit to falsify the results returned to the
patchfinder process, it is still possible for the malicious code to find
out which syscall is being used. This can be achieved by pattern searching
for the specific binary instructions. It will be easy, since the attacker
knows everything about the source code (and binary) of the patchfinder
program.

Another method can exploit the fact that patchfinder marks a process to be
traced in a specific way (i.e. by setting a bit in the 'ptrace' field of
the process descriptor). A malicious rootkit can replace the system_call
routine with its own version. This new version will check whether the
process is marked by patchfinder, and if so it will use the original
syscall table. If the process is not marked, another syscall table will be
used (one which has some sys_xxx() functions replaced). It will be hard for
the #DB exception handler to find out whether the rootkit is trying to
check e.g. the 'ptrace' field, since the code doing this can take many
forms.

The debug exception handler's code can also betray where the counter
variable (PF_traps) is located in memory. Knowing this address, a smart
rootkit can decrease this variable at the end of its 'operational' code by
the number of instructions in this additional code.

The only remedy I can see for the above weaknesses is strong polymorphism.
The idea is to add a polymorphic code generator to the patchfinder
distribution which, for every system it is installed on, will create a
different binary image of patchfinder's kernel code. This generation could
be based on a passphrase the administrator provides at installation time.

I have not yet implemented the polymorphic approach, but it looks
promising...

--[ Other solutions

The presented technique is a proposal for a general approach to detecting
kernel based rootkits. The main problem in such efforts is that we want to
use the kernel to help us detect malicious code which has full control of
that kernel. In fact we can not trust the kernel, but on the other hand we
want to get some reliable information from it.

Debugging the execution path of the system calls is probably not the only
solution to this problem. Before I implemented patchfinder, I had been
working on another technique, which tries to exploit differences in the
execution time of some system calls. The tests were actually the same as
those included with patchfinder. However, I was using the processor's
'rdtsc' instruction to calculate how many cycles a given piece of code took
to execute. It worked well on processors up to 500 MHz. Unfortunately, when
I tried the program on a 1 GHz processor I noticed that the execution time
of the same code can be very different from one test to another. The
variation was too big, causing lots of false positives. And the differences
were not caused by the multitasking environment, as you might think, but
lie deep in the micro-architecture of modern processors. As Andy Glew
explained to me, these beasties have a tendency to stabilize the execution
time in one of several possible states, depending on the initial
conditions. I have no idea how to make the initial state the same for each
test, or even how to explore the whole space of these initial states.
Therefore I switched to stepping the code with the hardware debugger.
However, the method of measuring the times of syscalls could be very
elegant... if it worked. Special thanks to Marcin Szymanek for the initial
idea of this timing-based method.
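
For readers who want to observe this variation themselves, here is a
minimal timing sketch. It is not the abandoned program mentioned above:
pf_ticks() and pf_time_syscall() are hypothetical names, getppid() is just
an arbitrary cheap system call, and on non-x86 machines the sketch falls
back to clock_gettime(), which is a different clock and not rdtsc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* read the processor's time stamp counter */
static uint64_t pf_ticks (void)
{
	return __rdtsc ();
}
#else
/* fallback for other architectures: nanoseconds, NOT rdtsc */
static uint64_t pf_ticks (void)
{
	struct timespec ts;

	clock_gettime (CLOCK_MONOTONIC, &ts);
	return (uint64_t) ts.tv_sec * 1000000000ull + (uint64_t) ts.tv_nsec;
}
#endif

/* Time 'runs' invocations of one system call and report the smallest
   and the largest tick count seen. */
static void pf_time_syscall (int runs, uint64_t *min, uint64_t *max)
{
	int i;

	assert (runs > 0 && min != NULL && max != NULL);
	*min = UINT64_MAX;
	*max = 0;
	for (i = 0; i < runs; i++) {
		uint64_t t0 = pf_ticks ();
		uint64_t dt;

		(void) getppid ();
		dt = pf_ticks () - t0;
		if (dt < *min)
			*min = dt;
		if (dt > *max)
			*max = dt;
	}
}
```

The gap between the reported minimum and maximum gives a rough feeling for
the spread that made a fixed alert threshold impractical for this method.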

Although there are (possibly) many techniques for finding rootkits in the
kernel, it seems that the general approach should exploit polymorphism, as
it is probably the only way to get reliable information from a compromised
kernel.

--[ Credits

Thanks to software.com.pl for allowing me to test the program on different
processors. 

--[ References

[HALF97] halflife, "Abuse of the Linux Kernel for Fun and Profit",
	 Phrack 50, 1997. 

[KNAR01] Cyberwinds, "Knark-2.4.3" (Knark 0.59 ported to Linux 2.4), 2001.

[ADOR01] Stealth, "Adore v0.42",
	 http://spider.scorpions.net/~stealth, 2001.

[SILV98] Silvio Cesare, "Runtime kernel kmem patching",
	 http://www.big.net.au/~silvio, 1998.

[SUKT01] sd, devik, "Linux on-the-fly kernel patching without LKM"
	 (SucKIT source code), Phrack 58, 2001.

[PALM01] palmers, "Sub proc_root Quando Sumus (Advances in Kernel Hacking)"
	 (prrf source code), Phrack 58, 2001.

[MMAP02] Guillaume Pelat, "Grsecurity problem - modifying
	 'read-only kernel'",
	 http://securityfocus.com/archive/1/273002, 2002.

[IA32]   "IA-32 Intel Architecture Software Developer's Manual", vol. 1-3,
	 www.intel.com, 2001.

--[ Appendix: PatchFinder source code

This is PatchFinder, the proof of concept of the described technique. It
does not implement polymorphism. LKM support is needed in order to run this
program. If you notice strange behavior during the tests (like a system
Oops), this probably means that somebody has rooted your system. On the
other hand it could be my bug... And remember to remove the patchfinder
module after the tests.

<++> ./patchfinder/Makefile
MODULE_NAME=patchfinder.o
PROG_NAME=patchfinder

all: $(MODULE_NAME) $(PROG_NAME) 

$(MODULE_NAME) : module.o traps.o
	ld -r -o $(MODULE_NAME) module.o traps.o

module.o : module.c module.h
	gcc -c module.c -I /usr/src/linux/include 

traps.o : traps.S module.h
	gcc -D__ASSEMBLY__ -c traps.S


$(PROG_NAME): main.o tests.o libpf.o
	gcc -o $(PROG_NAME) main.o tests.o libpf.o

main.o: main.c main.h
	gcc -c main.c -D MODULE_NAME='"$(MODULE_NAME)"'\
			  -D PROG_NAME='"$(PROG_NAME)"'
tests.o: tests.c main.h
libpf.o: libpf.c libpf.h


clean:
	rm -fr *.o $(PROG_NAME) 
<--> ./patchfinder/Makefile
<++> ./patchfinder/traps.S
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#include <linux/linkage.h>
#define __KERNEL__
#include "module.h"

tsk_ptrace  = 24 			# offset into the task_struct

ENTRY(PF_system_call)
	pushl %ebx
	movl $-8192, %ebx
	andl %esp, %ebx			# %ebx <-- current
	
	testb $PT_PATCHFINDER,tsk_ptrace(%ebx)
	je continue_syscall
	pushf
	popl %ebx
	orl $TF_MASK, %ebx		# set TF flag
 	pushl %ebx
	popf
	
continue_syscall:
	popl %ebx
	jmp *orig_system_call

ENTRY(PF_debug)
	incl PF_traps 
	iret
	
	
<--> ./patchfinder/traps.S
<++> ./patchfinder/module.h
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#ifndef __MODULE_H
#define __MODULE_H

#define PT_PATCHFINDER	0x80	/* should not conflict with PT_xxx
				   defined in linux/sched.h */

#define TF_MASK		0x100	/* TF mask in EFLAGS */

#define SYSCALL_VECTOR	0x80
#define DEBUG_VECTOR	0x1

#define PF_START	0xfee
#define PF_GET		0xfed
#define PF_QUERY	0xdefaced
#define PF_ANSWER	0xaccede

#define __NR_patchfinder 250


#endif

<--> ./patchfinder/module.h
<++> ./patchfinder/module.c
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#define MODULE
#define __KERNEL__
#ifdef MODVERSIONS
#include <linux/modversions.h>
#endif

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched.h>
#include "module.h"

#define DEBUG 1

MODULE_AUTHOR("Jan Rutkowski");
MODULE_DESCRIPTION("The PatchFinder module");
	
asmlinkage int PF_system_call(void);
asmlinkage int PF_debug (void);
int (*orig_system_call)();
int (*orig_debug)();
int (*orig_syscall)(unsigned int);
extern void *sys_call_table[];
int PF_traps;		

/* this one comes from arch/i386/kernel/traps.c */
#define _set_gate(gate_addr,type,dpl,addr) \
do { \
  int __d0, __d1; \
  __asm__ __volatile__ ("movw %%dx,%%ax\n\t" \
	"movw %4,%%dx\n\t" \
	"movl %%eax,%0\n\t" \
	"movl %%edx,%1" \
	:"=m" (*((long *) (gate_addr))), \
	 "=m" (*(1+(long *) (gate_addr))), "=&a" (__d0), "=&d" (__d1) \
	:"i" ((short) (0x8000+(dpl<<13)+(type<<8))), \
	 "3" ((char *) (addr)),"2" (__KERNEL_CS << 16)); \
} while (0)

struct idt_gate {
        unsigned short  off1;
        unsigned short  sel;
        unsigned char   none, flags;
        unsigned short  off2;
} __attribute__ ((packed));

struct idtr {
        unsigned short  limit;
        unsigned int    base;
} __attribute__ ((packed));

struct idt_gate * get_idt () {
	struct idtr idtr;
        asm("sidt %0" : "=m" (idtr));
	return (struct idt_gate*) idtr.base;
}

void * get_int_handler (int n) {
	struct idt_gate * idt_gate = (get_idt() + n);
	/* cast before shifting: off2 is promoted to a signed int, and
	   shifting it by 16 overflows for addresses like 0xc01xxxxx */
	return (void*)(((unsigned long) idt_gate->off2 << 16)
			+ idt_gate->off1);
}

static void set_system_gate(unsigned int n, void *addr) {
	printk ("setting gate for int %u -> %p\n", n, addr);
	_set_gate(get_idt()+n,15,3,addr);
}

asmlinkage int sys_patchfinder (int what) {
	struct task_struct *tsk = current;

	switch (what) {
		case PF_START:
			tsk->ptrace |= PT_PATCHFINDER;
			PF_traps = 0;
			break;
		case PF_GET:
			tsk->ptrace &= ~PT_PATCHFINDER;
			break;
		case PF_QUERY:
			return PF_ANSWER;
		default:
			printk ("I don't know what to do!\n");
			return -1;
	}
	return PF_traps;
}

int init_module () {
		
    	EXPORT_NO_SYMBOLS;

	orig_system_call = get_int_handler (SYSCALL_VECTOR);
	set_system_gate (SYSCALL_VECTOR, &PF_system_call);

	orig_debug = get_int_handler (DEBUG_VECTOR);
	set_system_gate (DEBUG_VECTOR, &PF_debug);

	orig_syscall = sys_call_table[__NR_patchfinder];
	sys_call_table [__NR_patchfinder] = sys_patchfinder;
	
	printk ("Kernel PatchFinder has been successfully "
		"inserted into your kernel!\n");
#ifdef DEBUG
	printk (" orig_system_call : %p\n", orig_system_call);
	printk (" PF_system_call   : %p\n", PF_system_call);
	printk (" orig_debug       : %p\n", orig_debug);
	printk (" PF_debug         : %p\n", PF_debug);
	printk (" using syscall    : %d\n", __NR_patchfinder);

#endif
	return 0;
}

int cleanup_module () {
	set_system_gate (SYSCALL_VECTOR, orig_system_call);
	set_system_gate (DEBUG_VECTOR, orig_debug);
	sys_call_table [__NR_patchfinder] = orig_syscall;
	
	printk ("PF module safely removed.\n");
	return 0;
}




<--> ./patchfinder/module.c
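
A note on get_int_handler() above: the handler address is split across two
16-bit fields of the gate descriptor, and the flags byte carries the gate
type and DPL that _set_gate() encodes. The following user-space sketch is
not part of PatchFinder; the helpers gate_offset() and gate_dpl() are
hypothetical names introduced here only to illustrate how those fields can
be decoded.

```c
#include <stdint.h>

/* Same layout as struct idt_gate in module.c (32-bit gate, 8 bytes). */
struct idt_gate {
	uint16_t off1;		/* handler offset, bits 0..15  */
	uint16_t sel;		/* code segment selector       */
	uint8_t  none, flags;	/* reserved byte, P/DPL/type   */
	uint16_t off2;		/* handler offset, bits 16..31 */
} __attribute__ ((packed));

/* Hypothetical helper: rebuild the 32-bit handler offset. */
uint32_t gate_offset (const struct idt_gate *g) {
	return ((uint32_t) g->off2 << 16) | g->off1;
}

/* Hypothetical helper: descriptor privilege level, bits 5..6 of flags. */
unsigned gate_dpl (const struct idt_gate *g) {
	return (g->flags >> 5) & 3;
}
```

For a gate written with type 15 and DPL 3, as set_system_gate() does, the
flags byte is 0xef (0x8000 + (3<<13) + (15<<8) = 0xef00), so gate_dpl()
yields 3.
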
<++> ./patchfinder/main.h
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#ifndef __MAIN_H
#define __MAIN_H

#define PF_MAGIC     "patchfinder"
#define M_GENTTBL    1	
#define M_CHECK	     2
#define MAX_TESTS    9	
#define TESTNAMESZ   32 

#define WARN_THRESHOLD 	 20
#define ALERT_THRESHHOLD 500
#define TRIES_DEFAULT	 200


typedef struct {
	int t;		
	double ft;
	char name[TESTNAMESZ];
	int (*test_func)();
} TTEST;

typedef struct {
	char  magic[sizeof(PF_MAGIC)];
	TTEST test [MAX_TESTS];	
	int   ntests;
	int   tries;
} TTBL;

#endif


<--> ./patchfinder/main.h
<++> ./patchfinder/main.c
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include "main.h"
#include "libpf.h"

void die (char *str) {
	if (errno) perror (str);
	else printf ("%s\n", str);
	exit (1);
}

void usage () {
 printf ("(c) Jan K. Rutkowski, 2002\n");
 printf ("email: [email protected]\n");
 printf ("%s [OPTIONS] <filename>\n", PROG_NAME);
			
 printf ("  -g save current system's characteristics to file\n");
 printf ("  -c check system against saved results\n");
 printf ("  -t change number of iterations per each test\n");
 exit (0);	

}

void write_ttbl (TTBL* ttbl, char *filename) {
	int fd;
	/* O_CREAT requires an explicit mode argument */
	fd = open (filename, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) die ("can not create file");
	strcpy (ttbl->magic, PF_MAGIC);
	if (write (fd, ttbl, sizeof (TTBL)) != sizeof (TTBL))
			die ("can not write to file");
	close (fd);
}

void read_ttbl (TTBL* ttbl, char *filename) {
	int fd;
	fd = open (filename, O_RDONLY);
	if (fd < 0) die ("can not open file");
	if (read (fd, ttbl, sizeof (TTBL)) != sizeof(TTBL))
			die ("can not read file");
	if (strncmp(ttbl->magic, PF_MAGIC, sizeof (PF_MAGIC)))
		die ("bad file format");
	close (fd);
}

int main (int argc, char **argv) {
	TTBL current, clear;	
	int tries = 0, mode = 0;
	int  opt, max_prio, i, j, T1, T2, dt;
	char *ttbl_file;
	struct sched_param sched_p;

	while ((opt = getopt (argc, argv, "hg:c:t:")) != -1)
		switch (opt) {
			case 'g':
				mode = M_GENTTBL;
				ttbl_file = optarg;
				break;
			case 'c':
				ttbl_file = optarg;
				mode = M_CHECK;
				break;
			case 't':
				tries = atoi (optarg);
				break;
			case 'h':
			default :
				usage();
		}
	
	if (getuid() != 0)
		die ("You have to be root to run this program");
			
	if (!mode) usage();

	if (patchfinder (PF_QUERY) != PF_ANSWER) {
		printf (
			"\n			---== ALERT! ==--\n"
			"It seems that module %s is not loaded. "
			"However, if you are\nsure that it is loaded, "
			"then something is wrong with your\n"
			"kernel! Probably there is "
			"a rootkit installed!\n", MODULE_NAME);
		exit (1);
	}		
	
	current.tries = (tries) ? tries : TRIES_DEFAULT;
	if (mode == M_CHECK) {
		read_ttbl (&clear, ttbl_file);
		current.tries = (tries) ? tries : clear.tries;
		
	}
	
	max_prio = sched_get_priority_max (SCHED_FIFO);
	sched_p.sched_priority = max_prio;
	if (sched_setscheduler (0, SCHED_FIFO, &sched_p) < 0)
		die ("Setting realtime policy");

	fprintf (stderr, "* FIFO scheduling policy has been set.\n");	
		
	generate_ttbl (&current);
	
	sched_p.sched_priority = 0;
	if (sched_setscheduler (0, SCHED_OTHER, &sched_p) < 0)
		die ("Dropping realtime policy");
	fprintf (stderr, "* dropping realtime scheduling policy.\n\n");

	if (mode == M_GENTTBL) {
		write_ttbl (&current, ttbl_file);
		exit (0);
	}

	printf (
	"    test name      | current |  clear  |  diff   | status\n");
	printf (
	"---------------------------------------------------------\n");

	for (i = 0; i < current.ntests; i++) {
		if (strncmp (current.test[i].name,
				clear.test[i].name, TESTNAMESZ))
			die ("ttbl entry name mismatch");

		T1 = current.test[i].t;
		T2 = clear.test[i].t;	
		dt = T1 - T2;
		printf ("%-18s | %7d | %7d | %7d |",
			current.test[i].name, T1, T2, dt);
	
		dt = abs (dt);	
		if (dt < WARN_THRESHOLD) printf (" ok");
		else if (dt < ALERT_THRESHHOLD) printf (" (?)");
		else printf (" ALERT!");
		
		printf ("\n");
	}

	return 0;
}




<--> ./patchfinder/main.c
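
The status column printed by main() comes from comparing the absolute
difference of the two averaged instruction counts against the two
thresholds from main.h. A minimal sketch of that decision, pulled out into
a standalone function, might look as follows. The function name classify()
is hypothetical and does not exist in PatchFinder; the threshold values are
copied from main.h.

```c
#define WARN_THRESHOLD   20	/* values copied from main.h */
#define ALERT_THRESHHOLD 500

/* Hypothetical helper: map a pair of instruction counts to a status. */
const char *classify (int current, int clear) {
	int dt = current - clear;

	if (dt < 0) dt = -dt;		/* only the magnitude matters */
	if (dt < WARN_THRESHOLD)   return "ok";
	if (dt < ALERT_THRESHHOLD) return "(?)";
	return "ALERT!";
}
```

With these values a difference of up to 19 instructions counts as noise,
20 to 499 is flagged as suspicious, and 500 or more raises an alert.
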
<++> ./patchfinder/tests.c
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <linux/types.h>
#include <linux/dirent.h>	
#include <linux/unistd.h>
#include <assert.h>
#include <string.h>
#include "libpf.h"
#include "main.h"

void die (char *str);		/* defined in main.c */

int test_open_file () {
        int tmpfd, T = 0;

	patchfinder (PF_START);
	tmpfd = open ("/etc/passwd", 0, 0);
	T = patchfinder (PF_GET);

	close (tmpfd);
	return T;	
}

int test_stat_file () {
	int T = 0;
	/* raw buffer standing in for struct stat (larger than needed),
	   since we don't include sys/stat.h */
	char buf[0x100];
	
	patchfinder (PF_START);
	stat ("/etc/passwd", &buf);
	T = patchfinder (PF_GET);

	return T;	
}

int test_read_file () {
	int fd, T = 0;
	char buf[0x100];

	fd = open ("/etc/passwd", 0, 0);
	if (fd < 0) die ("open");

	patchfinder (PF_START);
	read (fd, buf , sizeof(buf)); 
	T =  patchfinder (PF_GET);

	close (fd);
	return T;	
}

int test_open_kmem () {
	int tmpfd;
	int T = 0;

	patchfinder (PF_START);
	tmpfd = open ("/dev/kmem", 0, 0);
	T = patchfinder (PF_GET);

	close (tmpfd);
	return T;
}

_syscall3(int, getdents, int, fd, struct dirent*, dirp, int, count)
int test_readdir_root () {
	int fd, T = 0;
	struct dirent de[1];

	fd = open ("/", 0, 0);
	if (fd < 0) die ("open");

	patchfinder (PF_START);
	getdents (fd, de,  sizeof (de));
	T = patchfinder (PF_GET);

	close (fd);
	return T;	
}

int test_readdir_proc () {
	int fd, T = 0;
	struct dirent de[1];

	fd = open ("/proc", 0, 0);
	if (fd < 0) die ("open");

	patchfinder (PF_START);
	getdents (fd, de,  sizeof (de));
	T = patchfinder (PF_GET);
	
	close (fd);
	return T;	
}

int test_read_proc_net_tcp () {
	int fd, T = 0;
	char buf[32];

	fd = open ("/proc/net/tcp", 0, 0);
	if (fd < 0) die ("open");
	
	patchfinder (PF_START);
	read (fd, buf , sizeof(buf)); 
	T = patchfinder (PF_GET);

	close (fd);
	return T;	
}

int test_lseek_kmem () {
	int fd, T = 0;

	fd = open ("/dev/kmem", 0, 0);
	if (fd <0) die ("open");

	patchfinder (PF_START);
	lseek (fd, 0xc0100000, 0); 
	T = patchfinder (PF_GET);

	close (fd);
	return T;
}

int test_read_kmem () {
	int fd, T = 0;
	char buf[256];

	fd = open ("/dev/kmem", 0, 0);
	if (fd < 0) die ("open");
	lseek (fd, 0xc0100000, 0);
	
	patchfinder (PF_START);
	read (fd, buf , sizeof(buf)); 
	T = patchfinder (PF_GET);

	close (fd);
	return T;	
}

int generate_ttbl (TTBL *ttbl) {
	int i = 0, t;
	
#define set_test(testname) {				\
	ttbl->test[i].test_func = test_##testname;	\
        strcpy (ttbl->test[i].name, #testname);		\
	ttbl->test[i].t = 0;				\
	ttbl->test[i].ft = 0;				\
	i++;						\
}

	set_test(open_file)
	set_test(stat_file)
	set_test(read_file)
	set_test(open_kmem)
	set_test(readdir_root)
	set_test(readdir_proc)
	set_test(read_proc_net_tcp)
	set_test(lseek_kmem)
	set_test(read_kmem)

	assert (i <= MAX_TESTS);
	ttbl->ntests = i;
#undef set_test
	
	fprintf (stderr, "* each test will take %d iterations\n",
		       	ttbl->tries);
	usleep (100000);		
	for (i = 0; i < ttbl->ntests; i++) {	
		for (t = 0; t < ttbl->tries; t++)  
			ttbl->test [i].ft +=
			       	(double)ttbl->test[i].test_func();
		
		fprintf (stderr, "* testing... %d%%\r",
			       	i*100/ttbl->ntests);
		usleep (10000);
	}

	for (i = 0; i < ttbl->ntests; i++) 
		ttbl->test [i].t =
	       	(int) (ttbl->test[i].ft/(double)ttbl->tries);	

	fprintf (stderr, "\r* testing... done.\n");	
	
	return i;

}


<--> ./patchfinder/tests.c
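
generate_ttbl() above smooths out run-to-run jitter by repeating every test
`tries` times and storing the truncated mean of the instruction counts. The
sketch below isolates that step. It is only an illustration: average_count()
and fake_test() are hypothetical names, and fake_test() merely stands in for
a real test function that would call patchfinder (PF_START) and
patchfinder (PF_GET).

```c
/* Hypothetical stand-in for a test: alternates between 100 and 101. */
static int fake_test (void) {
	static int n;
	return 100 + (n++ % 2);
}

/* Hypothetical helper: run one test 'tries' times, return the mean. */
int average_count (int (*test_func)(void), int tries) {
	double sum = 0;
	int t;

	for (t = 0; t < tries; t++)
		sum += (double) test_func ();
	return (int) (sum / (double) tries);	/* truncates, as in tests.c */
}
```

Accumulating in a double and truncating only once at the end mirrors the
ft and t fields of TTEST.
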
<++> ./patchfinder/libpf.h
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#ifndef __LIBPF_H
#define __LIBPF_H

#include "module.h"

int patchfinder(int what);

#endif

<--> ./patchfinder/libpf.h
<++> ./patchfinder/libpf.c
/*                                                            */
/*            The Kernel PatchFinder version 0.9              */
/*                                                            */
/* (c) 2002 by Jan K. Rutkowski <[email protected]>  */
/*                                                            */

#include <asm/unistd.h>
#include <errno.h> 
#include "libpf.h"

_syscall1(int, patchfinder, int, what)


<--> ./patchfinder/libpf.c