Introduction to the ELF Format (Part VI) : The Symbol Table and Relocations (Part 1)

This post is part of a series on the ELF format. If you haven't checked out the other parts of the series, here they are:

  1. (Part I) : ELF Header
  2. (Part II) : Program Headers
  3. (Part III) : Section Header Table 
  4. (Part IV) : Section Types and Special Sections
  5. (Part V) : C Start up 
  6. this

In this and the next post I'm going to explore how ELF files manage to pull off the magic of symbol resolution, as well as the format, offsets and records in the ELF file that represent this information. There are many facets to this mechanism in the format, and before I get into each of them I'd like to provide a gentle intro to frame your thinking around why things work the way they do.


Symbolically locating the purpose of relocation

If you are already pretty clued up on why this is important, feel free to move on to the next section.

I know you probably want to jump right in and look at all the awesome C definitions and byte offsets, but I found that it's much easier to understand how all these obscure offsets and hex values work if you know a little bit of the intention behind them and appreciate the real complexity of the problem being solved. So let's talk about why C programs need relocation and symbol tables.

We know that code can become pretty big, and to make things more refactorable and reusable we split it up into smaller parts(1). A natural need develops: to be able to break code up into smaller sub-classes/files or general organizational units. In C/C++ these units take the form of object files and shared libraries, and the ELF file format supports them through relocation, symbols and dynamic linking. That is to say, the "things" being relocated are symbols, and symbols are for the most part variables and functions of different flavors.

Suffice it to say "relocations" will be found in the relocation tables, and the symbols these refer to will be found in the ELF's symbol tables. I should also mention that there is more than one symbol table and more than one relocation table, for no other reason than efficiency and extended capability in configuring symbol resolution.

The object of compiling and linking objects

It also helps to picture what the compiler and linker do to prepare this information. Knowing what the final goal is helps us suffer through the annoyingly complex and obscure steps that lead towards it.

When throwing together different shared libraries and object files, the linker decouples the actions of resolving symbols from linking the files together. So essentially there will be a first "sweep"(2) that slaps the different shared libraries and object files into a contiguous sequence.
That action means that in the final ELF object file, we are simply adding an offset to the original addresses of symbols, which displaces them from where they originally appear in their own files.

An even simpler way of thinking about it would be to say it's basically like grabbing a bunch of arrays and sticking them inside another array to build an array of arrays (which is a common action in many languages). I've depicted a minimal link and compile process with gcc commands included and even stuck in the real offsets some of the functions got mapped to:

Basic compile and link workflow with gcc. To get myself out of trouble I've included a fake relocate() function here to say "when this gets relocated it produces this address for the object's mapping in the final ELF file".
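To make the "add an offset" idea concrete, here is a minimal sketch in C. It is not what ld actually does: real linkers merge section by section and respect alignment. The names local_symbol, object_base() and this relocate() are made up for illustration, and the sketch assumes objects are simply placed back to back starting at some image base.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one symbol as seen inside its own object file. */
struct local_symbol {
    const char *name;
    uint64_t    offset;   /* offset from the start of its own object */
};

/* Assumption: objects are laid out back to back, with no padding.
 * Returns the address at which object number `index` begins. */
uint64_t object_base(uint64_t image_base, const uint64_t *object_sizes,
                     size_t index)
{
    uint64_t base = image_base;
    for (size_t i = 0; i < index; i++)
        base += object_sizes[i];
    return base;
}

/* Final address = where the object landed + the symbol's original offset. */
uint64_t relocate(uint64_t base, const struct local_symbol *sym)
{
    return base + sym->offset;
}
```

With three objects of sizes 0x40, 0x20 and 0x10 placed at 0x400000, a symbol at offset 0x8 in the second object would end up at 0x400048 under these assumptions.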

So this is essentially the workflow of the compiler at a very high level. What should focus your investigation further is what those deeper details and hex obscurities are that achieve this aggregated behavior. Why does this picture appear to work so smoothly? Well, it must be hiding its hideous details away!

In closing, what you need to imagine here is that for each attribute there must be some bits and bytes that allow quick determination of the settings for that attribute, as well as how the symbol managed to end up in its place in the final ELF object file. In the next section we cover how this format works and what allows it to offer this amazing functionality, and we are going to show how horrible it can get when this breaks!

  1. This has nothing to do with development and more to do with the burden of processing language as a whole. 
  2. (borrowing some terms that foreshadow your journey into the world of compilers should you get crazy enough for that ride) 

Symbol Table and friends (.symtab, .dynsym)

So the ELF format needs to find a clever, compact way to bundle the plethora of things that determine the type and scope/binding of a symbol, and what must be done to resolve it as well. The symbol table is meant to show us the symbols we want to relocate.

I should mention that there are two symbol tables: the main symbol table (.symtab in the section headers) and .dynsym, the dynamic symbol table, which holds a smaller subset of the entries in the main symbol table. This smaller copy is relevant only to the dynamic linker. It follows exactly the same encoding and format as the main one, but I won't discuss it here; I'll give it a full swing in a later post about dynamic linking instead.

Before we dig into things, here's a cheat sheet showing you the scope and break down of the Symbol Table:

Symbol Table Entry Field Cheat Sheet

The following struct is used in libelf; it should expose some important information about how symbol table entries work (extract from glibc/elf/elf.h:529-536):

typedef struct
{
  Elf64_Word    st_name;   /* (4 bytes) Symbol name */
  unsigned char st_info;   /* (1 byte)  Symbol type and binding */
  unsigned char st_other;  /* (1 byte)  Symbol visibility */
  Elf64_Section st_shndx;  /* (2 bytes) Section index */
  Elf64_Addr    st_value;  /* (8 bytes) Symbol value */
  Elf64_Xword   st_size;   /* (8 bytes) Symbol size */
} Elf64_Sym;

I've added the type size so you don't need to scratch through the typedefs to figure this out, you're welcome!

So the way I like to think about this is: because of the order and sizes of these fields, we can quickly notice that the first 8 bytes (st_name, st_info, st_other, st_shndx) act as a kind of metadata header. They determine the attributes of the symbol, and everything after them describes the actual value that the symbol holds (its address, offset, etc. - how to read it depends somewhat on the values in the first 8 bytes).
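If you want to convince yourself of that 8-byte "header" split, a small sketch like the following checks it at compile time. The sym64 type is a stand-in I've written with fixed-width types, assuming it matches the layout of Elf64_Sym quoted above; on Linux you could run the same checks against the real Elf64_Sym from <elf.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Elf64_Sym (assumed to have the same layout). */
typedef struct {
    uint32_t      st_name;   /* 4 bytes */
    unsigned char st_info;   /* 1 byte  */
    unsigned char st_other;  /* 1 byte  */
    uint16_t      st_shndx;  /* 2 bytes */
    uint64_t      st_value;  /* 8 bytes */
    uint64_t      st_size;   /* 8 bytes */
} sym64;

/* The "metadata header" is everything before st_value. */
_Static_assert(offsetof(sym64, st_info)  == 4,  "st_info follows st_name");
_Static_assert(offsetof(sym64, st_other) == 5,  "st_other follows st_info");
_Static_assert(offsetof(sym64, st_shndx) == 6,  "st_shndx follows st_other");
_Static_assert(offsetof(sym64, st_value) == 8,  "header is 8 bytes");
_Static_assert(offsetof(sym64, st_size)  == 16, "st_size follows st_value");
_Static_assert(sizeof(sym64)             == 24, "one entry is 24 bytes");
```

Because every field sits on its natural alignment, the compiler adds no padding, which is why the struct can be overlaid directly on the bytes of a 64-bit symbol table entry.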

Okay so what do these fields mean?
  • st_name - the index in the .strtab of the first byte of the symbol's null-terminated name. Not all symbols have names; when they don't, this field holds 0.
  • st_info - A bit field that determines a few attributes of the symbol, namely its "scope" (binding) and the type of thing in the C program it is meant to aid relocation for. It indicates whether the symbol is a function, a variable or something else. This works pretty much like every bit field: in true C style, it gets passed through a macro that applies bitmasks and shifts to isolate the bits dedicated to each attribute. The high 4 bits hold the binding and the low 4 bits hold the type. Here's the code for processing this field on 64-bit architectures (extract from glibc/elf/elf.h:570-579):
    /* How to extract and insert information held in the st_info field.  */
    #define ELF32_ST_BIND(val)        (((unsigned char) (val)) >> 4)
    #define ELF32_ST_TYPE(val)        ((val) & 0xf)
    #define ELF32_ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))

    /* Both Elf32_Sym and Elf64_Sym use the same one-byte st_info field.  */
    #define ELF64_ST_BIND(val)        ELF32_ST_BIND (val)
    #define ELF64_ST_TYPE(val)        ELF32_ST_TYPE (val)
    #define ELF64_ST_INFO(bind, type) ELF32_ST_INFO ((bind), (type))

  • st_other - A bit field used to determine the visibility of the symbol, an attribute that controls how code in other contexts is allowed to reference it. Here's the macro glibc uses to pull out the visibility value:
    /* How to extract and insert information held in the st_other field.  */
    #define ELF32_ST_VISIBILITY(o)   ((o) & 0x03)
    /* For ELF64 the definitions are the same.  */
    #define ELF64_ST_VISIBILITY(o)   ELF32_ST_VISIBILITY (o)

    Visibility types for symbols include (also available from the diagram above):

    • STV_DEFAULT 0x00 - the default visibility rules apply
    • STV_INTERNAL 0x01 - Processor specific hidden class
    • STV_HIDDEN 0x02 - this symbol is not available for reference in other modules
    • STV_PROTECTED 0x03 - Documentation refers to this as a protected symbol. I believe the only difference between this and a normal STV_DEFAULT symbol is that references to it from within its own shared library cannot be overridden by a definition in another module.
  • st_shndx - The index of the section this symbol is associated with. Symbols are tied to sections this way because almost everything defined as a symbol lives in some section - for instance, where would variable values be defined? Probably the .data-esque sections, no? A few special section index values (SHN_UNDEF, SHN_ABS, SHN_COMMON and so on) say something else about the symbol instead of naming a real section; please check out glibc/elf.h:414+ for the range of these values.
  • st_value - Value of the symbol. This has different interpretations depending on the file and symbol type:
    • In executable files and shared objects this field holds the virtual address of the symbol's definition.
    • For relocatable files this value will for the most part be the offset, from the start of the section named by st_shndx, at which the symbol is defined.
    • For symbols whose st_shndx is SHN_COMMON, st_value holds the alignment constraint to honor when the symbol is relocated.
  • st_size - Size of the symbol. It indicates how many bytes are occupied by what this symbol represents, depending again on symbol type - for the most part either the size of the data for a variable or the size of the code for a function.
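The bit twiddling for st_info and st_other can be sketched as plain functions. The arithmetic is the same as the glibc macros quoted above; the function names and the name tables are mine, and the tables only cover the common binding and type values from the ELF specification (anything else comes back as "OTHER").

```c
/* Same arithmetic as ELF64_ST_BIND / ELF64_ST_TYPE / ELF64_ST_INFO. */
unsigned st_bind(unsigned char info) { return info >> 4; }
unsigned st_type(unsigned char info) { return info & 0xf; }

unsigned char st_info_pack(unsigned bind, unsigned type)
{
    return (unsigned char)((bind << 4) + (type & 0xf));
}

/* Same arithmetic as ELF64_ST_VISIBILITY: only the low 2 bits count. */
unsigned st_visibility(unsigned char other) { return other & 0x03; }

/* Hypothetical helpers: readable names for the common values. */
const char *bind_name(unsigned bind)
{
    switch (bind) {
    case 0:  return "LOCAL";    /* STB_LOCAL  */
    case 1:  return "GLOBAL";   /* STB_GLOBAL */
    case 2:  return "WEAK";     /* STB_WEAK   */
    default: return "OTHER";
    }
}

const char *type_name(unsigned type)
{
    switch (type) {
    case 0:  return "NOTYPE";   /* STT_NOTYPE  */
    case 1:  return "OBJECT";   /* STT_OBJECT  */
    case 2:  return "FUNC";     /* STT_FUNC    */
    case 3:  return "SECTION";  /* STT_SECTION */
    case 4:  return "FILE";     /* STT_FILE    */
    default: return "OTHER";
    }
}
```

For example, an st_info byte of 0x12 splits into binding 1 (GLOBAL) and type 2 (FUNC), which is what you would typically expect to see for an exported function.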

Let's take a look at how this information is represented on disk in raw binary:

I've skipped the first record because its always going to be a null symbol (same goes for the .dynsym). For the symbol highlighted here we can see the following:

  • st_value of the symbol is set to 0x400238, which means it will appear at this virtual address.
  • st_size is set to 0, which means it won't take up any space in the binary during execution and probably doesn't define a variable.
  • st_info is set to 0x03, which means the symbol type is SECTION (low 4 bits are 3), so it's a symbol associated with a section. The bind type is then LOCAL (high 4 bits are 0), which means it is defined in the current object file.
  • st_other is set to 0x00, which means its visibility will be STV_DEFAULT.
  • st_name is set to 0x00000000, which means this symbol has no name in the string table.
  • st_shndx is set to 0x01, which means it is associated with the section defined at index 1 in the section table. If you haven't guessed, this is the .interp section.
I took the first non-null symbol entry and expanded on it, but there are always more elaborate examples to draw on. Make sure to pop open hexdump and reverse engineer some of these structures yourself ;)
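If you'd rather let code do the squinting, here is a rough sketch of decoding one 24-byte entry by hand. It assumes a little-endian ELF64 file (as on x86-64). The example_entry bytes are reconstructed from the field values listed above rather than copied from the hexdump, and parsed_sym, read_le() and parse_sym64() are names I made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the decoded fields of one entry. */
typedef struct {
    uint32_t st_name;
    uint8_t  st_info;
    uint8_t  st_other;
    uint16_t st_shndx;
    uint64_t st_value;
    uint64_t st_size;
} parsed_sym;

/* Read an n-byte little-endian integer starting at p (n <= 8). */
uint64_t read_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = n; i > 0; i--)
        v = (v << 8) | p[i - 1];
    return v;
}

/* Decode one 24-byte Elf64_Sym record, field by field. */
parsed_sym parse_sym64(const unsigned char *raw)
{
    parsed_sym s;
    s.st_name  = (uint32_t)read_le(raw + 0, 4);
    s.st_info  = raw[4];
    s.st_other = raw[5];
    s.st_shndx = (uint16_t)read_le(raw + 6, 2);
    s.st_value = read_le(raw + 8, 8);
    s.st_size  = read_le(raw + 16, 8);
    return s;
}

/* Reconstructed from the values above, not copied from the real dump. */
const unsigned char example_entry[24] = {
    0x00, 0x00, 0x00, 0x00,                         /* st_name  = 0        */
    0x03,                                           /* st_info  = 0x03     */
    0x00,                                           /* st_other = 0x00     */
    0x01, 0x00,                                     /* st_shndx = 1        */
    0x38, 0x02, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, /* st_value = 0x400238 */
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  /* st_size  = 0        */
};
```

Reading byte by byte like this avoids depending on the host's endianness or on struct padding, at the cost of a bit more typing than a plain cast to a struct pointer.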

We are not going to cover relocations just yet; I thought the post might get a bit lengthy and bloated. For now we are going to treat the symbols as a piece of metadata on their own, and worry later about how the dynamic linker might make use of them.

That's pretty much it as far as the symbol table goes. Let's see if we can pull off some tricks!

Elf Symbol Sorcery

"Signs and symbols rule the world, not words nor laws" - Confucius

So we know that there are some programs that rely on symbol information; these are things like objdump and gdb. What we're going to do is replace the symbol for one function with another, and then see what objdump and gdb make of this.

So this is me replacing the address of the main method in the symbol table with the one for never_call:
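The same kind of edit can be sketched in C instead of a hex editor. This is only an illustration under a few assumptions: the .symtab and .strtab contents have already been loaded into memory, the sym64 type mirrors Elf64_Sym, and find_symbol() and redirect_symbol() are hypothetical helpers (reading and writing the file itself is left out).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for Elf64_Sym (assumed to have the same layout). */
typedef struct {
    uint32_t      st_name;
    unsigned char st_info;
    unsigned char st_other;
    uint16_t      st_shndx;
    uint64_t      st_value;
    uint64_t      st_size;
} sym64;

/* Return the index of the symbol called `name`, or -1 if there is none.
 * st_name is an offset into the string table, as described earlier. */
long find_symbol(const sym64 *syms, size_t count, const char *strtab,
                 const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(strtab + syms[i].st_name, name) == 0)
            return (long)i;
    return -1;
}

/* Make `victim` claim the address of `donor`. Only the symbol table entry
 * changes; no code is touched. Returns 0 on success, -1 if either symbol
 * is missing. */
int redirect_symbol(sym64 *syms, size_t count, const char *strtab,
                    const char *victim, const char *donor)
{
    long v = find_symbol(syms, count, strtab, victim);
    long d = find_symbol(syms, count, strtab, donor);
    if (v < 0 || d < 0)
        return -1;
    syms[v].st_value = syms[d].st_value;
    return 0;
}
```

Calling redirect_symbol(syms, count, strtab, "main", "never_call") on the loaded table and writing it back would mimic the edit described here: tools that trust the table will now look for main where never_call lives.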

Just in case you're curious, yes the binary does still run completely as intended; never_call() is uhm never called, but something interesting happens when we disassemble main in gdb:

Huh? I ask it to disassemble main and it gives me some code for never_call? I never called for that! (I'm milking this too hard aren't I? hehe). Anyway, gdb fell victim to that old symbol magic!

We can also see that if we ask objdump about the main method, it doesn't seem to have any code for it (if you run this grep on an unedited never_call.elf it will show the main() method of course; here it only shows the stub code for __libc_start_main, which eventually calls main itself but is a fundamentally different function):

When I was trying out tricks for this one I accidentally replaced _start, so just to confirm that objdump does completely trust the symbol table, check out what it says about _start.

Now you might wonder how main still gets called? If in my mistake and the previous example we are replacing the symbol pointers for main, why does the proper main still get called?

Well, if you look at the screenshot above you'll see some of the instruction encoding data in the second output column. Look closely at the one at 0x40046d (which reads c7 c7 30 04 40 00). This shows that the address of main (0x400430, stored little-endian as 30 04 40 00), which gets loaded into rdi, is baked into the binary: it lives in the code of _start itself, outside of the potentially broken symbol table. So it will happily march on calling the real main instead of the redirected one in the symbol table.

Anyway, that's it for this post, stay tuned for the next one! I'll extend our discussion of symbols and include a breakdown of how relocations work.
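To see how those trailing bytes turn into an address, here is a tiny sketch that decodes a 32-bit little-endian immediate. It assumes, as the bytes quoted above suggest, that the two c7 bytes are the opcode and ModRM byte and the four bytes after them are the immediate; imm32_le() is a name I made up.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian immediate from four consecutive bytes. */
uint32_t imm32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Feeding it the bytes 30 04 40 00 gives 0x400430, the address of main that survives no matter what the symbol table claims.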

References and Reading: