Windows Exploit Development (primer) : Debugging Threads and Analyzing Memory

Hi folks, I thought it's about time to start blogging about the little experience I have in low-level exploitation and analysis, so here goes. To start off on your Windows exploitation journey you need to get to grips with a debugger and a few tricks that let you look at your target the right way. In this post I cover some things that may help a ton!

Debugging Threads

To get started you are probably going to need a couple of things sorted out first, namely a simple Windows VM set up with debugging tools (there are tons of tutorials out there on the internet) and a target to exploit. The target used in this post is Easy MPEG to DVD.

Before we can start crashing programs and controlling EIP we need to make sure we have the right view of the target we are exploiting. The Windows debugger (WinDbg) is actually pretty useful in this regard, so open it up, start the target program and attach the debugger to it (File -> Attach to a Process) like so:

Once you've got your debugger hooked up, let's go over a few basic WinDbg commands. When you're debugging something, especially for exploit development purposes, you're essentially trying to get memory to look a certain way most of the time: either loading up arguments for a function call you are crafting, or tracking a certain variable's values in memory. Before we can do any of that fancy stuff, the debug target (Easy MPEG to DVD) has to hand control over to the debugger. This is done by issuing a break command or by hitting a breakpoint. Once you're connected, go to the menu bar in WinDbg and choose Debug -> Break.

If you're attaching to a process that has already started, the break should happen automatically, like so:

On Windows systems a process does not execute on its own. More precisely, code only ever runs in the context of a thread, so each process is merely an environment offered by the operating system to host execution in one or more threads. One process can have multiple threads. To check out the threads available to you (while your program is broken in), you issue the shorthand "~*", which produces output like this for the program we are looking at:

The output has a number of details for each thread; the only ones that will benefit us now are:
  • The list itself shows the threads that exist in the process we are debugging - here it has 4 threads. A process could have 100s! The "." marks the thread whose context we are currently in, which means all the stack dumping and memory inspection will happen with regard to the registers and memory configuration of this thread.
  • The leftmost column shows the cardinal number of the thread (Microsoft's engineers probably chose the word "cardinal" because they like reminding folks of how insane Godel's and Cantor's work was).
The command line prompt "0:000>" shows the thread context we are in as well: the number to the right of the colon ("000") is the current thread, while the number to the left of it is the process. To change thread we issue the "~[CARDINAL NUMBER]s" command, like so:

Notice how the "." moved next to cardinal number "1", and the thread number in the prompt (the part after the colon) changed to "001" as well? Okay schweet!! We can jump between threads, so let's check out the stack of each one. Here's how I did that (I'm just listing the function stack contents here purely as an example to show each thread has its own unique stack executing its own nonsense. Don't freak out too much about what it means, I explain that in the very next section!):

It could be pretty annoying to change threads manually every time you need to do this. What if you wanted to fire off commands for each thread from a single thread context? You can do that by using the ~* command and giving it something to do per thread, for example ~* k. In the general sense you can issue a command against every thread using the ~* [command string] format. Comes in very handy!
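If you want to play with the "every thread has its own stack" idea outside a debugger, here is a rough Python analogy of ~* k. It is only a sketch: the worker function and the thread names are made up for illustration, and it leans on sys._current_frames(), which is a CPython-specific helper, so treat it as a toy rather than anything WinDbg actually does.

```python
import sys
import threading
import traceback


def worker(ready, release):
    """Park inside a function so the thread has a recognisable call stack."""
    ready.set()
    release.wait()


def dump_all_stacks():
    """Return {thread name: [function names, outermost first]} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    stacks = {}
    for ident, frame in sys._current_frames().items():
        summary = traceback.extract_stack(frame)
        stacks[names.get(ident, str(ident))] = [entry.name for entry in summary]
    return stacks


def demo():
    """Start three parked threads, snapshot every stack, then let them finish."""
    release = threading.Event()
    threads, ready_flags = [], []
    for i in range(3):
        ready = threading.Event()
        t = threading.Thread(target=worker, args=(ready, release), name=f"worker-{i}")
        t.start()
        threads.append(t)
        ready_flags.append(ready)
    for ready in ready_flags:
        ready.wait()
    stacks = dump_all_stacks()
    release.set()
    for t in threads:
        t.join()
    return stacks


if __name__ == "__main__":
    for name, funcs in demo().items():
        # Innermost function first, roughly the order a stack summary uses
        print(name, " <- ".join(reversed(funcs)))
```

Each thread reports a different chain of functions, which is the same point the per-thread stack listings above make.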

Memory Analysis and the Stack

We know how to do the thread thing, so let's take a look at the stack in more detail. To list the stack contents we use the k command (I chose thread 0 here; if you want to see the exact same output you should switch to the thread 0 context too!):

What we are looking at is a summary of the current call stack. Each line describes a function that has some stack presence (a frame) in the thread being analyzed. In the leftmost column you will notice a stack address for each function being called: the stack pointer (Child-SP) in 64-bit output, or the frame base pointer (ChildEBP) in 32-bit output. To the right of that is the return address for each function, and lastly, in the rightmost column, the actual place where each function was executing when it called the function above it in the stack.

You might not know how this data structure works or why it is so fundamental to computer architectures, so below I give a few hints about why it works the way it does. As much as Intel's engineers might want to claim they pioneered these things, they are doing nothing more than putting into practice principles of computer science that existed before computers did. Here's the lowdown on stacks and how they work (in a broad view).

A little bit about stacks

Taking a look at the stack in the dumps above, something to note is how the stack "unwinds". The values in stack memory are there in order to operate a certain data structure. The "stack" is named so because it actually is a stack, akin to the one studied in computer science. It's just that here the stack is used to track which functions are currently executing.

In general: computation is built on recursion. Each instruction you give a computer, no matter how fundamental or atomic, is at lower levels a composition of other functions. What's more, the language you describe it in (even if spoken by humans) is also defined on this kind of fundamental recursion. We notice this more obviously in structures like proto-language trees (why didn't the linguists use a simple list?), context-free grammars, pushdown automata etc. It's a property of language to some extent, and it's what computation demands of our implementation of it today. When it comes to enumerating and keeping track of recursive operations, a stack is as complex as you need to get.

The function stack here is composed of each function being executed and the functions that called it, as well as any variables and memory contents local to each one. The machine needs this structure in order to determine whose turn it is and in which order. If you've ever played Magic: The Gathering you will be very thankful things like stacks exist, especially if you're going up against a person who is a rockstar with instant spells lol

Think about it for a second. If you want to impose strict rules that isolate a variable so it exists only within the execution of a given function, the simplest way is to give it a stack address and treat any reference to it from outside its function context as invalid (here is where the von Neumann and Harvard machine designs take the lead in realising the principles of computation that come to the fore in things like function stacks and memory address lists). But how do you set it up so that you can address each function's memory according to its unique contents? This is partly achieved by using CPU registers with certain values to locate stuff on the stack. Two are instrumental in this: rsp/esp, the stack pointer, which marks the top of the stack, and rbp/ebp, the base (or frame) pointer, which marks the base of the current function's frame.

In a number of machine architectures today, the juggling of the CPU registers that handle stack addressing is governed by the calling convention as implemented by the compiler. The compiler adds instructions to each function so that, when it begins executing or calls another function, all the stack niceness it needs is automatically set up and later torn down.
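To make the esp/ebp bookkeeping a bit more concrete, here is a toy model in Python. It is a sketch under stated assumptions, not real x86: the ToyStack class, its 32-bit downward-growing layout and the addresses used are all invented for illustration. It mimics what a call plus a typical frame-pointer prologue does, and then walks the saved-ebp chain the way a frame-pointer based stack summary can be produced.

```python
import struct


class ToyStack:
    """A tiny 32-bit, downward-growing stack with ESP/EBP registers (a model only)."""

    def __init__(self, size=256, top=0x1000):
        self.base = top - size      # lowest valid address of the toy stack
        self.mem = bytearray(size)
        self.esp = top              # empty stack: ESP sits at the highest address
        self.ebp = 0                # 0 is used here to mean "no caller frame"

    def push(self, value):
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp - self.base, value)

    def read(self, addr):
        return struct.unpack_from("<I", self.mem, addr - self.base)[0]

    def call(self, return_addr, local_bytes=0):
        """Model `call` plus a typical prologue: push ret; push ebp; mov ebp,esp; sub esp,N."""
        self.push(return_addr)
        self.push(self.ebp)
        self.ebp = self.esp
        self.esp -= local_bytes

    def ret(self):
        """Model a typical epilogue plus `ret`: mov esp,ebp; pop ebp; pop return address."""
        self.esp = self.ebp
        self.ebp = self.read(self.esp)
        self.esp += 4
        return_addr = self.read(self.esp)
        self.esp += 4
        return return_addr

    def walk(self):
        """Follow the saved-EBP links and return [(frame_ebp, return_addr), ...], newest first."""
        frames, ebp = [], self.ebp
        while ebp != 0:
            frames.append((ebp, self.read(ebp + 4)))
            ebp = self.read(ebp)
        return frames


if __name__ == "__main__":
    stack = ToyStack()
    stack.call(0x401005, local_bytes=8)    # hypothetical: main calls f
    stack.call(0x402010, local_bytes=16)   # hypothetical: f calls g
    for frame_ebp, ret_addr in stack.walk():
        print(f"{frame_ebp:08x} {ret_addr:08x}")
```

The design choice worth noticing is that every frame stores the previous frame's ebp right next to its return address, so the whole call chain can be recovered from a single register value.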

So let's look at how the stack is unwound in memory using WinDbg. You might want some kind of proof that the addresses in the leftmost column actually do follow the order I describe, so let's find a way to unpack these values and show that they do relate to each other in this way. Here I do this by using the "?" command, which evaluates an expression and prints the result in both hex and decimal (makes it easier for us users to read).
I also make use of the "u" (unassemble) command, which disassembles the memory at an address or range you give it (not to be confused with "dc", which only dumps memory as double words and ASCII). Let's look at the memory around the function that supposedly called our current one and see if it adds up:

UPDATE: In the previous version of this post I got it slightly wrong. I forgot that the actual call to the function listed in the stack summary appears a little higher up, because a return address points at the instruction just after the call (the screenshot was updated accordingly). Notice I needed to subtract a small number from the address I disassembled from.
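The "subtract a small number" step can be sketched in Python. This assumes the call in question is the common 5-byte direct near call (opcode E8 followed by a signed 32-bit relative offset). Indirect calls use other encodings with different lengths, so this is only an illustration, and the function name, code bytes and addresses below are made up. In WinDbg itself the ub command (unassemble backwards) saves you the arithmetic.

```python
import struct


def decode_call_before(code, code_base, return_addr):
    """If the 5 bytes before return_addr are `E8 rel32`, return (call_addr, target), else None."""
    call_addr = return_addr - 5
    offset = call_addr - code_base
    if offset < 0 or offset + 5 > len(code) or code[offset] != 0xE8:
        return None
    (rel32,) = struct.unpack_from("<i", code, offset + 1)
    # A relative call targets "address of the next instruction + rel32",
    # and the next instruction is exactly where the return address points.
    target = (return_addr + rel32) & 0xFFFFFFFF
    return call_addr, target


if __name__ == "__main__":
    # Hypothetical code bytes loaded at 0x401000: nop, nop, call 0x401100, nop
    base = 0x401000
    code = b"\x90\x90" + b"\xE8" + struct.pack("<i", 0x401100 - 0x401007) + b"\x90"
    call_addr, target = decode_call_before(code, base, 0x401007)
    print(f"call at {call_addr:08x} -> {target:08x}")
```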

As you can see in the above screenshot, the "u" command shows very clearly that the function above it is actually called in the code pointed to by its function name in the stack summary! Proof END! hehe. So we know a little about the way the stack is displayed. If we ever crash a program, some of the first things we're going to want to know are:
  1. What was executing when the crash happened (you can find this out by looking at the stack and disassembling memory around the instruction pointer).
  2. If it's a stack overflow, what does the stack look like as a result of the crash?
Make sure you can whip out WinDbg and scry out some stack weirdness. You're gonna need it if you want to write exploits!

Looking at variables in stack memory 

We might want to know what is in memory pointed to by the values on the stack, or which values appear in the stack directly. Functions love using the stack directly as variable storage, sometimes writing and transferring values between variables right there on the stack. That means we need to know how to locate values and chase down where they are stored elsewhere in memory. There are a few different kinds of variables. Loosely speaking, a variable in memory can have these properties:
  • Local Scope - variables that can only live inside a local function scope.
  • Extra-local Scope - every other place a variable appears except the stack
    • Global scope variables - are meant to be referenceable by every function in the process. For the most part, references to their actual locations are stored on the stack if a function is dealing with them (this is also partly because the objects or arrays that programs process tend to be pretty dynamic in size, so it's a bit silly to put them on a stack - heaps are a waaay better place for them). Different platforms have more or less distinct regions in a process's memory for each kind of global variable.
      • Uninitialized variables - variables that hang around in a special area of memory until you actually fill them with contents.
      • Static immutable variables - usually variables that hold values you don't want to change during execution of the program.
    • Heap - usually used as a place to store variable-length data requested during runtime.

Let's check out some of the stack contents being used by our function. First we need to dump the contents of the current function's stack; you do that by issuing the dc command and giving it the esp register (optionally you can add a second address to specify where it must stop reading). The dc command displays memory as double words (32-bit values) with the matching ASCII characters alongside. Because many of the functions we are poking at here (for lack of a better example) are actually part of a 64-bit library, we are better off looking at their memory the way they do, in nice neat 64-bit units, which is what the dq (quad-word) command gives you.

What if we want to look at some of the things detailed in the stack summary? What about the arguments passed to each function on the stack, how do we see them? (Once you've subverted control of EIP you will often need to trigger and craft certain execution in the machine. Viewing the stack and seeing which arguments are being used is a quick way to confirm you are stuffing bytes into memory in the right way.)
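If the difference between the double-word and quad-word views feels abstract, this small Python sketch formats the same bytes both ways. It only imitates the general shape of dc/dq output (real dq output is laid out a little differently), and the dump_words helper, the sample bytes and the address are all hypothetical. The thing to notice is that x86 is little-endian, so the bytes of each word appear reversed compared to the ASCII column.

```python
import struct


def dump_words(data, base_addr, word_size=4, per_line=4):
    """Format bytes as little-endian words plus an ASCII column (4 = dc-like, 8 = dq-like)."""
    fmt = {4: "<I", 8: "<Q"}[word_size]
    width = word_size * 2               # hex digits needed per word
    step = word_size * per_line         # bytes shown per output line
    lines = []
    for start in range(0, len(data), step):
        chunk = data[start:start + step]
        chunk = chunk[:len(chunk) - len(chunk) % word_size]   # drop a trailing partial word
        if not chunk:
            break
        words = [struct.unpack_from(fmt, chunk, i)[0] for i in range(0, len(chunk), word_size)]
        hex_part = " ".join(f"{w:0{width}x}" for w in words)
        text = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
        lines.append(f"{base_addr + start:08x}  {hex_part}  {text}")
    return lines


if __name__ == "__main__":
    sample = b"PATH=C:\\Windows\x00"    # hypothetical 16 bytes of memory
    for line in dump_words(sample, 0x0012FF00):
        print(line)
    for line in dump_words(sample, 0x0012FF00, word_size=8, per_line=2):
        print(line)
```

The first call prints `0012ff00  48544150 5c3a433d 646e6957 0073776f  PATH=C:\Windows.`, where 48544150 is "PATH" read as one little-endian double word.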

Dumping the arguments for each function on the stack works by issuing the kvn command (k with the v and n options) like so:

Excuse the small picture - it's much better to get a look at it yourself anyway :)

Let's chase down a couple of these values and see what they point to. What happens if we call dc on some of the arguments under "Args to Child" for the topmost function? (I would guess the "child" in ARGS TO CHILD and CHILD-SP is the function in that row, which was called by the previous function in the stack, i.e. it is the "child" of the previous function.) Anyway, here's what the arguments look like when dereferenced using the "dc" command:

For now this mostly looks like nonsense. This is because sometimes the values pointed to are structures or linked lists that don't lend themselves to being understood at first sight; you understand them by unpacking their relation to other contents in memory, or by watching how they are worked on during runtime. We have been given a little mercy here by being able to easily identify what look like some ASCII strings in memory (there are d* commands for dumping memory as strings btw: da for ASCII and du for Unicode). It goes without saying that the strings shown here are pretty common byte patterns for Windows executables. My guess is that they are environment variables, but confirming that will again require more fiddling!
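Spotting strings by eye in a dump gets old quickly, so here is a minimal Python sketch of the same idea: scan a blob of bytes for runs of printable ASCII. The ascii_strings function name, the minimum length of 4 and the sample bytes are my own assumptions for illustration. It works on bytes you have already copied out of the debugger and it is not a WinDbg feature.

```python
import re


def ascii_strings(data, min_len=4):
    """Return (offset, text) for each run of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical memory contents: binary junk with two strings in it
    blob = b"\x00\x01PATH=C:\\Windows\x00\xff\xfeabc\x00TEMP=C:\\Temp\x00"
    for offset, text in ascii_strings(blob):
        print(f"+0x{offset:02x}  {text}")
```

Short runs like "abc" are skipped on purpose, since random binary data contains plenty of accidental two or three character matches.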

And so we've dumped the contents of the stack and shown how the stack summary unwinds. Pretty straightforward stuff! That's it for this post folks, look out for a follow-up to this one soon!