Introduction to the ELF Format (Part VI) : More Relocation tricks - r_addend execution (Part 3)


So I lied a little about what would be next in the series. I realized there was something I should have added to the previous post, which ironically was the addendum about the r_addend field :) So here it is: the section on mangling r_addend fields, along with some other tricks I left out.

Some things you might need are:

  1. Executable code we will dissect; this is the definition for never_call.c: https://gist.github.com/k3170makan/c7712b7aa14f1c2e7c0e7ae725f2fac1
  2. binutils 
  3. GCC
An average Linux distro will have these things ready to roll already, besides maybe hexedit/hexdump.

Mangling dynamic symbol relocation r_addends

 r_addend you glad it didn't say 0xAAAA...?

In the previous post, I covered the basics of the relocation entry format, showed how complex relocations can become, and how one ELF object can have a bunch of different .rela.[name] sections. These sections hold relocs that are applied at different stages of the ELF's life cycle (for instance, when calling functions), and they can also help the runtime perform initialization. For the first example we are going to focus on the .rela.dyn section and what happens when we are too liberal with the values in r_addend.

The r_addend, if you weren't aware, is a field in relocation entries that specifies an additional auxiliary parameter (the addend) to a relocation calculation. I also mentioned that this field is not actually used much on the x86_64 platform and for the most part (as far as I can see) is nulled out. So you will have .rela.* sections ('a' meaning with r_addend) in your binary, but their r_addend fields will be set to 0 most of the time.
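To make the layout concrete, here is a minimal sketch of decoding one Elf64_Rela entry: three 8-byte little-endian fields, with r_info split into a symbol index (high 32 bits) and a relocation type (low 32 bits). The helper name parse_rela and the sample values are made up for illustration; they do not come from the never_call binary.

```python
import struct

# Elf64_Rela on little-endian x86_64:
#   r_offset (unsigned 64), r_info (unsigned 64), r_addend (signed 64)
RELA_FMT = "<QQq"
RELA_SIZE = struct.calcsize(RELA_FMT)  # 24 bytes per entry


def parse_rela(raw: bytes, index: int = 0) -> dict:
    """Decode the index-th Elf64_Rela entry from the raw bytes of a .rela.* section."""
    r_offset, r_info, r_addend = struct.unpack_from(RELA_FMT, raw, index * RELA_SIZE)
    return {
        "r_offset": r_offset,
        "r_sym": r_info >> 32,          # same split as the ELF64_R_SYM macro
        "r_type": r_info & 0xFFFFFFFF,  # same split as the ELF64_R_TYPE macro
        "r_addend": r_addend,
    }


# Hypothetical entry: symbol index 3, type 6, addend nulled out.
sample = struct.pack(RELA_FMT, 0x200FF0, (3 << 32) | 6, 0)
entry = parse_rela(sample)
print(entry)
```

Feeding it the real bytes of .rela.dyn (pulled out with hexdump or similar) should show mostly zero r_addend values, if the observation above holds for your binary.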

Poking and prodding these r_addend fields as they appear in some binaries, I found that you can actually get the runtime to execute from the r_addend value if you make it non-zero. Here's the proof of concept:



In this screenshot I am changing the r_addend value in the relocation for __gmon_start__(2), which appears at address 0x3B0. It's not so important where it gets called; I am pretty sure it's just after _start and before the main function.

What's good to know is that, going by the source, the never_call function should never be called. We can pretty much bet there is no simple logical progression leading to never_call's execution, because the code for this binary is only written to print two strings and then exit.

Now, you should check the readelf output as well (in the screenshot); it confirms that we are changing this field correctly. Also notice that we have only edited the r_addend value of this entry in .rela.dyn, meaning the actual symbol value for __gmon_start__ is untouched in both the dynamic symbol table (.dynsym) and the full symbol table (.symtab).
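If you would rather script the edit than do it by hand in a hex editor, something like the following sketch would work. It assumes 0x3B0 is the file offset of the 8-byte r_addend field itself (check this against your own binary), and the helper name patch_addend is made up. The demo runs on a zeroed in-memory buffer rather than a real file.

```python
import struct


def patch_addend(image: bytearray, addend_offset: int, new_value: int) -> int:
    """Overwrite the 8-byte little-endian r_addend at addend_offset.

    Returns the old (signed) value so the change can be undone.
    """
    (old,) = struct.unpack_from("<q", image, addend_offset)
    # Mask to 64 bits so values like 0xAAAA... pack as unsigned.
    struct.pack_into("<Q", image, addend_offset, new_value & 0xFFFFFFFFFFFFFFFF)
    return old


# Demo on a stand-in buffer; a real run would load a COPY of the binary instead.
image = bytearray(0x400)
old = patch_addend(image, 0x3B0, 0xAAAAAAAAAAAAAAAA)
print(hex(old), image[0x3B0:0x3B8].hex())
```

For a real binary, read the file into a bytearray, patch it, write it back out under a new name, and then run readelf -r on the result to confirm that only the addend changed.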

This pretty much straight up executes the r_addend value. I've confirmed this in other ways too (for instance, we can see that the segfault consistently happens with the instruction pointer at this value):




It is of course implied that I am forcing it to take the completely unnatural instruction pointer values of 0xaa..., 0xbb..., etc.


This behavior is isolated to a couple of relocation types (r_types). I investigated further which r_types allow for this in some capacity, and I got execution using the following relocation types:

  • R_X86_64_64  0x01 - Direct 64 bit Reloc
  • R_X86_64_IRELATIVE 0x25 - Adjust indirectly by program base 
  • R_X86_64_RELATIVE 0x08 - Adjust by program base
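For reference, the x86-64 psABI defines the calculations for these three types as S + A, B + A, and indirect (B + A), where S is the symbol value, A is the addend, and B is the load base. The sketch below just restates those formulas; the function name and the example base address are made up, and it is not a model of what the dynamic loader actually does internally. It does hint at why IRELATIVE is the most direct route: the loader calls the function at B + A and stores whatever it returns. It does not settle why the other two types ended up executing in my experiments.

```python
R_X86_64_64 = 0x01
R_X86_64_RELATIVE = 0x08
R_X86_64_IRELATIVE = 0x25

MASK64 = 0xFFFFFFFFFFFFFFFF


def reloc_value(r_type: int, base: int, sym_value: int, addend: int):
    """Return (address, called_by_loader) per the psABI formulas.

    For IRELATIVE, address is the resolver the loader calls; the value
    finally stored is that function's return value, not the address itself.
    """
    if r_type == R_X86_64_64:
        return (sym_value + addend) & MASK64, False  # S + A
    if r_type == R_X86_64_RELATIVE:
        return (base + addend) & MASK64, False       # B + A
    if r_type == R_X86_64_IRELATIVE:
        return (base + addend) & MASK64, True        # indirect (B + A)
    raise ValueError(f"r_type {r_type:#x} not covered by this sketch")


# Hypothetical load base, with a mangled addend.
base = 0x555555554000
print([hex(reloc_value(t, base, 0, 0xAAAA)[0])
       for t in (R_X86_64_64, R_X86_64_RELATIVE, R_X86_64_IRELATIVE)])
```

In all three cases the addend feeds straight into the resulting address, which is at least consistent with the behavior described above.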

I'll get into deep detail about exactly why these end up getting executed, but it's going to take a little more research before I can confidently talk about that lol.

We know of course that the rela sections will appear in the live memory image (this is because they form part of a PT_LOAD segment(1)), so we know they will potentially be "referenceable" from inside running code. This means they offer data to target that could potentially affect execution flow.

Footnotes

  1. Not directly because the section they appear in is marked ALLOC, as some would refer to it.
  2. Which is, to cut a long story short, afaik a profiling function that gets called during runtime initialization from _init().

References and Reading


This post is part of a series on the ELF format; if you haven't checked out the other parts of the series, here they are:

  1. (Part I) : ELF Header  https://blog.k3170makan.com/2018/09/introduction-to-elf-format-elf-header.html
  2. (Part II) : Program Headers  https://blog.k3170makan.com/2018/09/introduction-to-elf-format-part-ii.html 
  3. (Part III) : Section Header Table  https://blog.k3170makan.com/2018/09/introduction-to-elf-file-format-part.html 
  4. (Part IV) : Section Types and Special Sections https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-iv.html
  5. (Part V) : C Start up https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-v.html 
  6. (Part VI) 
    1. The Symbol Table and Relocations Part 1 https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-vi.html 
    2. Symbols and Relocs Part 2 https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-vi_18.html
    3. (Part VI) : this
So if these sound like another language to you, try starting a little further up in the chain ;)
