Introduction to The ELF Format (Part VI): The Symbol Table and Relocations (Part 2)


This post is part of a series on the ELF format. If you haven't checked out the other parts of the series, here they are:

  1. (Part I) : ELF Header  https://blog.k3170makan.com/2018/09/introduction-to-elf-format-elf-header.html
  2. (Part II) : Program Headers  https://blog.k3170makan.com/2018/09/introduction-to-elf-format-part-ii.html
  3. (Part III) : Section Header Table  https://blog.k3170makan.com/2018/09/introduction-to-elf-file-format-part.html 
  4. (Part IV) : Section Types and Special Sections https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-iv.html
  5. (Part V) : C Start up https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-v.html 
  6. (Part VI) : The Symbol Table and Relocations Part 1 https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-vi.html 
  7. (Part VI) : The Symbol Table and Relocations Part 2 (this post)

In this post I'm going to explain a little more about how relocations and symbols work. We talked about the symbol table specifically in the previous post, but didn't do justice to why relocations are needed and how they are used.

Introduction

"The real is what resists symbolization absolutely" - Jacques Lacan (1)

When a program is compiled and linked, the code and data used in each component object are placed at some offset away from their original position in the final object. The ELF format records this offset, and a mechanism for its resolution, in relocation records. Relocation records hold information used by various utilities to help aim at the right part of the ELF file containing the definition of a symbol. They also allow compilers and C developers to extend the functionality of symbol resolution - with extra hooks and plugins and what have you, so like exploit dev, except you actually want to write to a function pointer with data lol. So: symbol information, but for symbols themselves!

Relocations can take on a number of types, subsets of which are defined and implemented per architecture, so many archs will have their own symbol resolution mechanisms applied to the relocation record format discussed here. Besides this already sparse field of definitions, relocation records (referred to as "relocs" from here on out...sometimes) are used for various reasons throughout a program's life cycle. Some relocs are used to prep the runtime, others for plain old dynamic linking and lazy loading, and there are very likely more functions that are defined but not mentioned here.

Let's take a look at how this relocation format works and which sections are meant to hold information for it.

The Relocation Table (.rel, .rela.dyn and friends)

To give an overview of how complex these fields are here's a small cheat sheet:



As we know, the section header table points us at different parts of an ELF file and says what they are meant for. Some of the sections it describes are specifically for holding relocation information, and because relocations can serve multiple purposes, there are multiple relocation sections. The naming scheme stays pretty much the same: the [name] in .rel(a).[name] says more or less what the relocation entries are for:

  • .rel.dyn .rela.dyn - relocation entries for dynamic symbols 
  • .rel.plt .rela.plt  - relocation entries for PLT meta-data (usually prepping JMP gadgets)
  • other types exist but I find they are rarely used or hard to find examples for.

The "a" at the end of rela indicates that the section uses relocation records with an addend field. Relocation with addend is the one commonly used on x86_64, it seems, although in the entries I looked at the actual r_addend field is almost always 0 (types such as R_X86_64_RELATIVE are an exception and do carry a meaningful addend). glibc also maintains some flags to configure whether this field is used as part of the relocation.

Basically that means you will see relocation with addend used as the record "format", but the actual addend will most likely be set to 0. In effect that is just a normal reloc with a NULL word at the end of each one. Potentially useful, depending on how much code trusts that NULL at the end when it loops through records.

Anyway, seeing that there could be any number of different .rela.[name] sections serving just about any crazy purpose, I decided to go looking for some weird [name] values. So I scanned my own machine quite liberally for ELF objects and found that few of them use any wilder form of .rel section compared to the common .rel(a).dyn and .rel(a).plt:

The .fffff sections are from me doing research for this blog series but I freaked out a little when I saw them at first lol
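
A scan like the one described above can be sketched roughly as follows. This is not the script I used, just a minimal illustration that assumes a little-endian ELF64 file and walks the section header table looking for sections of type SHT_REL or SHT_RELA. The function name relocation_sections is made up for this example.

```python
import struct
from typing import List

SHT_RELA = 4  # relocation entries with addends
SHT_REL = 9   # relocation entries without addends


def relocation_sections(elf: bytes) -> List[str]:
    """Return the names of all SHT_REL / SHT_RELA sections.

    Assumes a little-endian ELF64 image (hypothetical helper, not a library API).
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")

    # e_shoff lives at 0x28; e_shentsize, e_shnum, e_shstrndx start at 0x3A.
    (e_shoff,) = struct.unpack_from("<Q", elf, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", elf, 0x3A)

    # Each Elf64_Shdr: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
    # sh_size, sh_link, sh_info, sh_addralign, sh_entsize.
    headers = [
        struct.unpack_from("<IIQQQQIIQQ", elf, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    strtab_off = headers[e_shstrndx][4]  # sh_offset of the section name table

    names = []
    for sh_name, sh_type, *_ in headers:
        if sh_type in (SHT_REL, SHT_RELA):
            start = strtab_off + sh_name
            end = elf.index(b"\0", start)  # names are NUL-terminated
            names.append(elf[start:end].decode())
    return names
```

Feeding it the bytes of any dynamically linked x86_64 binary should list names along the lines of .rela.dyn and .rela.plt.
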
Moving on, we should probably look at the struct the C runtime and glibc use to handle relocation records:
(extract from glibc-2.28/elf/elf.h)



This is what I've gathered each of the fields in the struct are meant for:
  • Elf64_Addr r_offset (8 bytes wide) - the location the relocation applies to. This can hold a number of different kinds of address values or offsets that aid relocation resolution: in an executable or shared object it is the virtual address of the slot to be patched (a GOT entry, for example), while in a relocatable object it is an offset into the section being relocated. I expand on these a little later in this post, but to be fair to them please check out the documentation on this.
  • Elf64_Xword r_info (8 bytes wide) - a bit field the runtime will pull through some macros to determine the kind of relocation being defined. The field holds typing information for the reloc entry as well as the symbol index it is meant to refer to. Quite a crucial field, because if you can write to it you can make relocs point to different symbols, which is pretty powerful depending on context.
  • Elf64_Sxword r_addend (8 bytes wide) - the addend, a constant included in the calculation of the relocation. It is pretty much always 0 (and so effectively ignored) in the x86_64 binaries I'm using. I will explore how true this claim is later on.
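
To make the layout of the three fields concrete, here is a small sketch that mirrors the record with Python's struct module. It assumes a little-endian ELF64 target; the names ELF64_REL, ELF64_RELA, Rela and unpack_rela are mine, not glibc's.

```python
import struct
from typing import NamedTuple

# Elf64_Rela: r_offset (unsigned 64), r_info (unsigned 64), r_addend (signed 64).
ELF64_RELA = struct.Struct("<QQq")
# Elf64_Rel is the same record without the trailing addend.
ELF64_REL = struct.Struct("<QQ")


class Rela(NamedTuple):
    r_offset: int  # where the relocation is applied
    r_info: int    # symbol index and relocation type, packed together
    r_addend: int  # signed constant used when computing the relocated value


def unpack_rela(raw: bytes) -> Rela:
    """Decode one 24-byte Elf64_Rela record."""
    return Rela(*ELF64_RELA.unpack(raw))


print(ELF64_REL.size, ELF64_RELA.size)  # 16 24
```

So a .rela section is just an array of 24-byte records, and a .rel section an array of 16-byte ones.
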
To expand on how the r_info field is used for determining type information for the reloc, here's an annotated screenshot:


Nothing too fancy: the r_info field (as with many C-esque ELF metadata *_info fields) is just a bit field that gets pulled through some shifting / masking operations to isolate the bits that encode certain properties of the record.
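
The shifting and masking can be written out in a few lines. The three functions below are Python renderings of the ELF64_R_SYM, ELF64_R_TYPE and ELF64_R_INFO macros from elf.h; the example r_info value is made up for illustration.

```python
R_X86_64_JUMP_SLOT = 7  # relocation type number from the x86_64 ABI


def elf64_r_sym(r_info: int) -> int:
    """Upper 32 bits: index into the symbol table."""
    return r_info >> 32


def elf64_r_type(r_info: int) -> int:
    """Lower 32 bits: relocation type."""
    return r_info & 0xFFFFFFFF


def elf64_r_info(sym: int, rtype: int) -> int:
    """Pack a symbol index and a type back into one r_info value."""
    return (sym << 32) + rtype


info = 0x0000000200000007  # hypothetical r_info value
print(elf64_r_sym(info), elf64_r_type(info))  # 2 7
```

Read this way, an Info column value of 000200000007 in a readelf dump means "symbol index 2, type 7 (R_X86_64_JUMP_SLOT)".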

The ELF64_R_SYM macro is actually for pulling out the symbol that this relocation applies to (I hinted at that in the cheat sheet at the beginning of the section - because I got them foreshadowing skills yo). Here's an example from a random binary I pulled off my machine (notice the Info field in the readelf dump and how it correlates with the symbol indexes):




Some more insight into how this is probably meant to be used inside the C runtime can be seen in an extract from glibc-2.28_afl/glibc-2.28/elf/do-rel.h:




We know what a C program will most likely use in terms of its own terminology, but what does the format actually look like in raw hex?



One can see the extra 8 NULL bytes at the end; this is the r_addend set to 0. Now you know why readelf mentions the addend value but it's almost always 0.
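
If you want to decode such a record by hand, a sketch like this does the job. The 24 bytes below are a made-up record (not the one from my screenshot), laid out the way a little-endian x86_64 .rela.plt entry would be.

```python
import struct

# A hypothetical 24-byte Elf64_Rela record as it would appear in a hex dump.
raw = bytes.fromhex(
    "1810600000000000"  # r_offset, little-endian
    "0700000001000000"  # r_info: type in the low 4 bytes, symbol index in the high 4
    "0000000000000000"  # r_addend: the 8 NULL bytes at the end
)

r_offset, r_info, r_addend = struct.unpack("<QQq", raw)
print(hex(r_offset), r_info >> 32, r_info & 0xFFFFFFFF, r_addend)
# 0x601018 1 7 0
```

Because everything is little-endian, the bytes of each field read "backwards" in the dump: the type (07) comes first and the symbol index (01) sits four bytes further in.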

I mentioned that the [name] part of .rel(a).[name] describes the purpose of the section, so I thought I could cook up an example of this in use. We can look at a large sample of R_X86_64_JUMP_SLOT entries; I grabbed this from a random binary on my machine (literally used a bash script that takes a list and passes it through shuf lol):

Color choice was on point with this one.  

Clearly this section provides some insight into how the JMP instructions that point to the GOT work. I believe that R_X86_64_JUMP_SLOT entries are specifically for preparing the PLT jump gadgets.

Anyway, all these beautiful fairy tales about Elfs make great bedtime stories for unquestioning children, but let's see if the format is really treated this way. The next section looks at some of the horrible things that could happen when someone messes with the reloc metadata.

Relocation hex sorcery

Let's see which evil spirits we can summon by flipping some bits in the reloc format for an Elf.

r_info mangling


First off, let's see what happens when we change the r_info field up. Here I have two symbols that have reloc records in the .rela.plt, and I'm mangling the r_info field so they both point to the same function, namely puts, and then seeing what happens (I'm changing the byte in the r_info field that indicates the symbol pointed to by the reloc record):





In the screenshot I'm trying to show what the picture was before and after editing the relocation metadata. We can see here that gdb actually feels the effect of the reloc change, because it used the edited record on the symbol for putchar.

What happened here is when gdb tried to resolve the function it made use of the index value we changed. So we made the reloc point to a different index in the symbol table and it used this to resolve its definition resulting in the puts function being targeted instead.
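
The edit itself boils down to rewriting one byte. Here is a sketch of it on an in-memory record, assuming (purely for illustration) that puts is symbol index 1 and putchar is symbol index 2; the offsets and the retarget helper are made up too.

```python
import struct

RELA = struct.Struct("<QQq")  # r_offset, r_info, r_addend (little-endian ELF64)


def retarget(record: bytes, new_sym: int) -> bytes:
    """Return a copy of the record whose r_info names a different symbol index."""
    r_offset, r_info, r_addend = RELA.unpack(record)
    r_type = r_info & 0xFFFFFFFF  # keep the relocation type untouched
    return RELA.pack(r_offset, (new_sym << 32) | r_type, r_addend)


# Hypothetical putchar record: symbol index 2, type 7 (R_X86_64_JUMP_SLOT).
putchar_reloc = RELA.pack(0x601020, (2 << 32) | 7, 0)
patched = retarget(putchar_reloc, 1)  # now claims to be about symbol 1 (puts)

changed = [i for i in range(len(patched)) if patched[i] != putchar_reloc[i]]
print(changed)  # [12]
```

Only byte 12 of the record differs, which is the low byte of the symbol index inside r_info; that is the same single-byte change I made in the hex editor.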

So we've learned that the r_info field is pretty powerful when it comes to driving function identification in some contexts (2). Beyond that, we can also look at how malformed r_offset values affect execution.

r_offset mangling

Another thing I can show here is how repointing the r_offset values at the same location affects resolving GOT and PLT stuff. Because we are repointing a symbol's relocation record here, it affects how the runtime recognizes that a symbol has been resolved, and as a result the runtime is invoked every time we use the affected dynamic symbol in code. This is me editing the r_offset values for puts and putchar to point to the same value:



And this is the result in gdb:



On the right we have the binary that was edited; on the left we have the original. In this gdb session I set a breakpoint on the call in the PLT at 0x400420; this invokes _dl_runtime_resolve, which handles looking up symbols and patching. As you can see by comparing the two, messing with the symbol's r_offset causes the _dl_runtime_resolve call to happen one more time than in the original.
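
Why one more call? Here is a toy model of lazy binding that reproduces the effect. To be clear, this is my own simplification and not glibc's code: the slot addresses, the call sequence and the function count_resolver_calls are all invented for the illustration. The idea is that each PLT stub jumps through its own GOT slot, while the resolver patches whichever slot the reloc's r_offset names.

```python
def count_resolver_calls(r_offsets, calls):
    """Count how often the (simulated) runtime resolver gets invoked.

    r_offsets maps a symbol name to the GOT slot its reloc record says to patch.
    """
    # Hypothetical GOT slots that each symbol's PLT stub jumps through.
    own_slot = {"puts": 0x601018, "putchar": 0x601020}
    # None means "slot still points back into the PLT", i.e. unresolved.
    got = {slot: None for slot in own_slot.values()}

    resolver_calls = 0
    for name in calls:
        if got[own_slot[name]] is None:
            resolver_calls += 1
            # The resolver writes the resolved address to r_offset,
            # which may or may not be the slot the stub actually uses.
            got[r_offsets[name]] = name
    return resolver_calls


calls = ["puts", "putchar", "putchar"]
original = {"puts": 0x601018, "putchar": 0x601020}
mangled = {"puts": 0x601018, "putchar": 0x601018}  # both relocs patch the same slot

print(count_resolver_calls(original, calls))  # 2
print(count_resolver_calls(mangled, calls))   # 3
```

In the mangled case putchar's own GOT slot never gets patched, so every call to it lands back in the resolver.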

r_type mangling

As for the r_type value (which lives at a certain bit offset in r_info), I pretty much tried injecting other values, but learned that the runtime has consistency checks on the types. There are many other kinds of reloc sections that may allow for arbitrary r_types and all kinds of symbol remapping. If they exist, and when I find them, I'll dedicate a blog post to them.

For now let's look at how miserably I failed:


As you can see, whatever I try is outright rejected by the runtime; it won't have any of this nonsense lol. Anyway, that's it for this one folks, stay tuned for the next post in this series covering some of the internals of dynamic linking and lazy loading ;).

Footnotes:

  1. To expand Lacan's quote here (purely for the Elf Format Philosophiles): Reality is never what we symbolize it to be; it is what always escapes our symbolization. It is what is left over from our inevitable failure to symbolize it completely, perfectly, absolutely, without mistakes, exactly right, clearly - you get it (why are there so many perfected works for perfection itself)? In this post I will essentially show some ways in which symbols can profoundly betray the functions/variables they are meant to point to: this is because even the symbols themselves must have symbols that point to their own meanings! So there seems to be a contingency on symbols having meaning, but nothing that cements their right to point to anything as a specific meaning. They are free to point to any meaning or function (he says as he repoints Lacan's philosophy at the Elf world). But in the practical world in which we use them, of course, they can, as an aggregated collection of symbols, in some way expose a singular function; we can "recognize" that symbols in a certain category can be "summed up" or "replaced for" (in a context-free grammatical sense) more or less by a collective theme. Such themes are symbols too! But if, under an already assumed theme, a collection of symbols misses consistency or paradoxes in a certain way with this theme (which it will always inevitably do, because the hosting theme to every theme is reality itself, which always paradoxes), the whole picture is broken; the theme becomes absurdity instead of what it originally hoped to be. In Lacan's case he argued that this is what in some sense defines our access to "the real" reality, and that addressing this too directly causes a reflexive denial of how reality works (we cannot accept the realism of complete non-fantasy, or the extreme fantasy either).
In the case of Linux executable formats it means we need to get some person to reverse engineer the whole binary to determine what functions do from the ground up, which is maddening in and of itself! lol
  2. This means there must be other things we can do with it: perhaps inject functions into the binary, re-point functions at a key time in their lazy loading life cycle, or force a re-invocation of the runtime in a way that side-channels information about the functions being called and therefore the data being processed? Maybe, maybe lol. Probably better addressed in a separate post.
