RootKits 101 (I). The Basics
A rootkit is a tool or set of tools intended to keep root access to a compromised machine. The rootkit becomes relevant once a machine has been compromised (by exploiting some vulnerability, social engineering, password cracking...); it is at that point that the rootkit can be deployed. Sometimes rootkits are combined with an exploit so that both steps (exploitation of the machine and installation of the rootkit) happen in one go.
In general, a rootkit has two main functions:
- Make some activities on the compromised machine invisible to other users, including the system administrators.
- Easily enable root access to the machine via some pre-defined shortcut, i.e. without having to use an exploit again.
A rootkit can, of course, be used for other purposes. From my point of view, in those cases there is probably a better name (backdoor, virus, zombie...), so I will follow the classical definition in this text and stick to the two functions mentioned above.
There are many different kinds of rootkits, from very simple user space rootkits (which modify monitoring tools to hide some information) to sophisticated ones targeting specific firmware at the hardware level. You can read more about the different kinds of rootkits in this Wikipedia page.
In this series we are going to look into kernel mode rootkits. Those are the most common ones, and they will give us enough fun to keep us busy for a while.
The first thing a rootkit has to do is hide some activities (network connections, running processes) and items (files, including itself). To do that, the rootkit's developer needs a deep knowledge of how a system is monitored and how the monitoring tools work. I'm not going to cover all possible monitoring tools out there. I couldn't even if I wanted to.
Instead I'm going to introduce the very basics. This covers quite a few monitoring tools and, hopefully, will also give you enough background to investigate by yourself how other tools work. Note that, at some level, monitoring tools can go quite far and extract information out of heuristics or simulations. That is far beyond the point where we (I) are.
So, on a GNU/Linux system, most of the standard monitoring tools make use of the /proc pseudo-filesystem. It is called a pseudo-filesystem because it only exists in memory; none of the files you see under /proc are actually on your hard drive.
The /proc pseudo-filesystem is the place where the kernel puts relevant information about the system so that user space applications can easily access it. Actually it is not the only one, but for our current discussion that does not really matter.
Now it is time to open a terminal and type man proc. Browse through the man page to get an idea of what we are talking about. After reading it, you will figure out for yourself how netstat, ss, ps, top or lsof may work (who and last are a slightly different story: they rely on the utmp and wtmp files rather than on /proc).
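To make the idea concrete, here is a minimal sketch of a ps-like listing built only on /proc. It assumes each process shows up as a numeric directory containing a comm file with the command name, as described in man proc; the function name list_processes and its proc_root parameter are made up for this example, and real tools read many more files than this.

```python
import os


def list_processes(proc_root="/proc"):
    """Return a sorted list of (pid, command name) found under proc_root."""
    processes = []
    for entry in os.listdir(proc_root):
        # Only the numeric directories under /proc describe processes.
        if not entry.isdecimal():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            # The process may have exited between listdir() and open().
            continue
        processes.append((int(entry), name))
    return sorted(processes)
```

Calling list_processes() with no arguments on a GNU/Linux machine should give roughly the same PIDs and names as ps -e. The point to keep in mind is that the tool only knows what the directory listing of /proc tells it.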
So how do you find out how a given tool works? That's a very good question. There is no single answer, but these are some hints:
- Read the man page about proc and also the kernel documentation. Go fetch the kernel sources, uncompress them and go straight to Documentation/filesystems/proc.txt (proc.rst in recent kernel versions). Then keep reading all the stuff under the Documentation directory. You will find a lot of interesting information in there.
- Study the source code of the open source monitoring tools. That is also a gold mine of useful information.
- In case you do not have access to the source code of the tool you are interested in, remember: strace is your friend (you know... man strace).
Regarding this last point, you may need a little more help. Take a very simple tool, for instance ifstat, and run it with strace:
strace ifstat -i eth0 1 1
Then look at the output... Do you recognize something? OK, this is not one of the tools an intruder will want to fool with a rootkit, but it is a relevant example of what you will find when looking at more complex ones.
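On many systems the trace will show the tool opening /proc/net/dev (this is an assumption about your particular ifstat build; some versions get their numbers through other interfaces, so check your own strace output). The sketch below parses the content of that file in the layout documented in man proc: two header lines, then one line per interface with the received bytes in the first column and the transmitted bytes in the ninth. The function name parse_net_dev is made up for this example.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 9:
            continue  # not the layout we expect, skip the line
        # Column 0 is received bytes, column 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters
```

To try it on a live system, read the file with open("/proc/net/dev").read() and pass the text to parse_net_dev. Sampling it twice, one second apart, and subtracting the counters gives a rough equivalent of what a bandwidth monitor prints.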
The next thing a rootkit's developer needs to know is the internals of the system. Yes, from a hacking point of view, coding a rootkit is a goal in itself, in the sense that you really need to understand your system at a very detailed level, even to create a very simple one.
One of the things the rootkit's developer needs to know in detail is the GNU/Linux (UNIX) filesystem. We are not talking about the names of the standard folders. We are talking about the filesystem guts. And those guts are called inodes (well, there is more stuff in there, but those are the ones we will be interested in).
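A quick way to see that file names and inodes are different things is to create a hard link and compare inode numbers. This is only a user space sketch using os.stat; the helper name inode_of and the file names are made up for the example, and it assumes the temporary directory lives on a filesystem that supports hard links.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number the given name points to."""
    return os.stat(path).st_ino


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        original = os.path.join(directory, "original.txt")
        alias = os.path.join(directory, "alias.txt")
        with open(original, "w") as f:
            f.write("hello\n")
        # A hard link is a second directory entry for the same inode.
        os.link(original, alias)
        print("original inode:", inode_of(original))
        print("alias inode:   ", inode_of(alias))
        print("link count:    ", os.stat(original).st_nlink)
```

Both names should report the same inode number. A directory is little more than a list of (name, inode) pairs, which is why a directory listing is such an interesting place from the rootkit's point of view.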
Knowing the details of how the filesystem works is important because:
- The very first thing a rootkit has to do is to hide its own file in the system, and also hide any system modification required to survive a reboot (boot scripts or whatever).
- Well, you have read the previous section, haven't you? Monitoring tools basically read pseudo-files.
We will come back to this in a future post when we'll look at a real rootkit.
Another important part of the system that a rootkit's developer needs to know about is how to add code to the kernel. Well, that is because we are talking about kernel mode rootkits. On a GNU/Linux platform this is usually done by writing an LKM (Loadable Kernel Module). You can find a pretty neat introduction in this tutorial. It gives a good overview from the user/administrator point of view. We will go deeper into this topic later in this series.
Back to the topic. An LKM is a small program intended to be executed inside the kernel. Normally LKMs are used to add drivers for different hardware to the kernel, but that is just one possible use.
Go to the console now and type lsmod. Depending on what distribution you are running and your hardware, you may see quite a long list.
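As far as I know, lsmod gets its list from /proc/modules (you can confirm it on your system with strace lsmod). The sketch below parses that file, assuming the layout from man proc: module name, size, reference count and a comma-separated list of dependent modules (or "-" when there are none). The function name parse_modules and the dictionary keys are made up for this example.

```python
def parse_modules(text):
    """Return a list of dicts describing the lines of /proc/modules."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a module line in the expected layout
        deps = fields[3]
        # The dependency list ends with a trailing comma, or is just "-".
        used_by = [] if deps == "-" else [d for d in deps.split(",") if d]
        modules.append(
            {"name": fields[0], "size": int(fields[1]), "used_by": used_by}
        )
    return modules
```

Feeding it open("/proc/modules").read() should give the same module names that lsmod prints. Once again, the tool shows whatever the kernel decides to put in that file, nothing more.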
There are many good tutorials out there to get introduced to this topic. In case you do not want to read them and would rather go straight to modifying somebody else's code... Well, do not do that. That will not help you master the technology. Furthermore, you will probably crash your machine many times and probably end up losing data. It may be a bit frustrating. Better to start from the basics and go ahead slowly, understanding what you are doing at each step... But that is up to you.
To give you a little taste of what goes on within the kernel, these are some of the differences with a normal user space application:
- There is no main function and you need some preparation just to compile (the kernel headers as a minimum).
- Your code is within the kernel... There is no standard C library (no printf, fopen), no math library. Nowadays, many common functions are available inside the kernel itself, but do not expect to use libcurl within your module. You are on your own with whatever the kernel exports (printk, kmalloc and friends). The kernel sources and their Documentation directory are your best friends here; sections 2 and 3 of the manual describe the user space side (system calls and libraries) and are of little use inside a module.
- Your code is running in kernel space: it shares the kernel's address space, and nothing protects the rest of the kernel from a stray pointer in your module. OK, we will get to that later.
Developing LKMs (or drivers if you prefer) requires quite some training. It may not be your thing, but I suggest you at least build the "Hello World Module" example (you will find that one in every single LKM tutorial out there) and add a proc interface to it, to get a minimal understanding of how information flows from user space into the kernel and back. Then you will have an idea of the topic.
I will finish this first introduction with the very minimum list of actions a rootkit has to perform, which may provide some structure for future posts (if the topic is of interest to the community).
- Get loaded into memory as an LKM (we have to create an LKM)
- Hide the LKM file on disk so it cannot be seen with ls
- Hide itself so it cannot be seen with lsmod
- Do whatever manipulations the rootkit is intended for (hide files, hide network connections, add a special file that whenever you write some secret string into it, you'll get, automagically, root access :o)
That is it for now. Let me know if there is some interest in the topic and at what level... For instance, do you want to see the hello world LKM, or would you guys prefer to go straight into the whole thing?