ELF Tricks
[AD REMOVED]
Program Headers
The describe to the loader how to load the ELF into memory:
readelf -lW lnstat
Elf file type is DYN (Position-Independent Executable file)
Entry point 0x1c00
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R 0x8
INTERP 0x000238 0x0000000000000238 0x0000000000000238 0x00001b 0x00001b R 0x1
[Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x003f7c 0x003f7c R E 0x10000
LOAD 0x00fc48 0x000000000001fc48 0x000000000001fc48 0x000528 0x001190 RW 0x10000
DYNAMIC 0x00fc58 0x000000000001fc58 0x000000000001fc58 0x000200 0x000200 RW 0x8
NOTE 0x000254 0x0000000000000254 0x0000000000000254 0x0000e0 0x0000e0 R 0x4
GNU_EH_FRAME 0x003610 0x0000000000003610 0x0000000000003610 0x0001b4 0x0001b4 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x00fc48 0x000000000001fc48 0x000000000001fc48 0x0003b8 0x0003b8 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.build-id .note.ABI-tag .note.package .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .dynamic .got .data .bss
04 .dynamic
05 .note.gnu.build-id .note.ABI-tag .note.package
06 .eh_frame_hdr
07
08 .init_array .fini_array .dynamic .got
The previous program has 9 program headers, then, the segment mapping indicates in which program header (from 00 to 08) each section is located.
PHDR - Program HeaDeR
Contains the program header tables and metadata itself.
INTERP
Indicates the path of the loader to use to load the binary into memory.
LOAD
These headers are used to indicate how to load a binary into memory.\ Each LOAD header indicates a region of memory (size, permissions and alignment) and indicates the bytes of the ELF binary to copy in there.
For example, the second one has a size of 0x1190, should be located at 0x1fc48 with permissions read and write and will be filled with 0x528 from the offset 0xfc48 (it doesn't fill all the reserved space). This memory will contain the sections .init_array .fini_array .dynamic .got .data .bss
.
DYNAMIC
This header helps to link programs to their library dependencies and apply relocations. Check the .dynamic
section.
NOTE
This stores vendor metadata information about the binary.
GNU_EH_FRAME
Defines the location of the stack unwind tables, used by debuggers and C++ exception handling-runtime functions.
GNU_STACK
Contains the configuration of the stack execution prevention defense. If enabled, the binary won't be able to execute code from the stack.
GNU_RELRO
Indicates the RELRO (Relocation Read-Only) configuration of the binary. This protection will mark as read-only certain sections of the memory (like the GOT
or the init
and fini
tables) after the program has loaded and before it begins running.
In the previous example it's copying 0x3b8 bytes to 0x1fc48 as read-only affecting the sections .init_array .fini_array .dynamic .got .data .bss
.
Note that RELRO can be partial or full, the partial version do not protect the section .plt.got
, which is used for lazy binding and needs this memory space to have write permissions to write the address of the libraries the first time their location is searched.
TLS
Defines a table of TLS entries, which stores info about thread-local variables.
Section Headers
Section headers gives a more detailed view of the ELF binary
objdump lnstat -h
lnstat: file format elf64-littleaarch64
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 0000001b 0000000000000238 0000000000000238 00000238 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.gnu.build-id 00000024 0000000000000254 0000000000000254 00000254 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .note.ABI-tag 00000020 0000000000000278 0000000000000278 00000278 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .note.package 0000009c 0000000000000298 0000000000000298 00000298 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .gnu.hash 0000001c 0000000000000338 0000000000000338 00000338 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .dynsym 00000498 0000000000000358 0000000000000358 00000358 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .dynstr 000001fe 00000000000007f0 00000000000007f0 000007f0 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .gnu.version 00000062 00000000000009ee 00000000000009ee 000009ee 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .gnu.version_r 00000050 0000000000000a50 0000000000000a50 00000a50 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .rela.dyn 00000228 0000000000000aa0 0000000000000aa0 00000aa0 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .rela.plt 000003c0 0000000000000cc8 0000000000000cc8 00000cc8 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
11 .init 00000018 0000000000001088 0000000000001088 00001088 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .plt 000002a0 00000000000010a0 00000000000010a0 000010a0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .text 00001c34 0000000000001340 0000000000001340 00001340 2**6
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .fini 00000014 0000000000002f74 0000000000002f74 00002f74 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
15 .rodata 00000686 0000000000002f88 0000000000002f88 00002f88 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .eh_frame_hdr 000001b4 0000000000003610 0000000000003610 00003610 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
17 .eh_frame 000007b4 00000000000037c8 00000000000037c8 000037c8 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
18 .init_array 00000008 000000000001fc48 000000000001fc48 0000fc48 2**3
CONTENTS, ALLOC, LOAD, DATA
19 .fini_array 00000008 000000000001fc50 000000000001fc50 0000fc50 2**3
CONTENTS, ALLOC, LOAD, DATA
20 .dynamic 00000200 000000000001fc58 000000000001fc58 0000fc58 2**3
CONTENTS, ALLOC, LOAD, DATA
21 .got 000001a8 000000000001fe58 000000000001fe58 0000fe58 2**3
CONTENTS, ALLOC, LOAD, DATA
22 .data 00000170 0000000000020000 0000000000020000 00010000 2**3
CONTENTS, ALLOC, LOAD, DATA
23 .bss 00000c68 0000000000020170 0000000000020170 00010170 2**3
ALLOC
24 .gnu_debugaltlink 00000049 0000000000000000 0000000000000000 00010170 2**0
CONTENTS, READONLY
25 .gnu_debuglink 00000034 0000000000000000 0000000000000000 000101bc 2**2
CONTENTS, READONLY
It also indicates the location, offset, permissions but also the type of data it section has.
Meta Sections
- String table: It contains all the strings needed by the ELF file (but not the ones actually used by the program). For example it contains sections names like
.text
or.data
. And if.text
is at offset 45 in the strings table it will use the number 45 in the name field. - In order to find where the string table is, the ELF contains a pointer to the string table.
- Symbol table: It contains info about the symbols like the name (offset in the strings table), address, size and more metadata about the symbol.
Main Sections
.text
: The instruction of the program to run..data
: Global variables with a defined value in the program..bss
: Global variables left uninitialized (or init to zero). Variables here are automatically intialized to zero therefore preventing useless zeroes to being added to the binary..rodata
: Constant global variables (read-only section)..tdata
and.tbss
: Like the .data and .bss when thread-local variables are used (__thread_local
in C++ or__thread
in C)..dynamic
: See below.
Symbols
Symbols is a named location in the program which could be a function, a global data object, thread-local variables...
readelf -s lnstat
Symbol table '.dynsym' contains 49 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000001088 0 SECTION LOCAL DEFAULT 12 .init
2: 0000000000020000 0 SECTION LOCAL DEFAULT 23 .data
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strtok@GLIBC_2.17 (2)
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND s[...]@GLIBC_2.17 (2)
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strlen@GLIBC_2.17 (2)
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fputs@GLIBC_2.17 (2)
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND exit@GLIBC_2.17 (2)
8: 0000000000000000 0 FUNC GLOBAL DEFAULT UND _[...]@GLIBC_2.34 (3)
9: 0000000000000000 0 FUNC GLOBAL DEFAULT UND perror@GLIBC_2.17 (2)
10: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterT[...]
11: 0000000000000000 0 FUNC WEAK DEFAULT UND _[...]@GLIBC_2.17 (2)
12: 0000000000000000 0 FUNC GLOBAL DEFAULT UND putc@GLIBC_2.17 (2)
[...]
Each symbol entry contains:
- Name
- Binding attributes (weak, local or global): A local symbol can only be accessed by the program itself while the global symbol are shared outside the program. A weak object is for example a function that can be overridden by a different one.
- Type: NOTYPE (no type specified), OBJECT (global data var), FUNC (function), SECTION (section), FILE (source-code file for debuggers), TLS (thread-local variable), GNU_IFUNC (indirect function for relocation)
- Section index where it's located
- Value (address sin memory)
- Size
Dynamic Section
readelf -d lnstat
Dynamic section at offset 0xfc58 contains 28 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-aarch64.so.1]
0x000000000000000c (INIT) 0x1088
0x000000000000000d (FINI) 0x2f74
0x0000000000000019 (INIT_ARRAY) 0x1fc48
0x000000000000001b (INIT_ARRAYSZ) 8 (bytes)
0x000000000000001a (FINI_ARRAY) 0x1fc50
0x000000000000001c (FINI_ARRAYSZ) 8 (bytes)
0x000000006ffffef5 (GNU_HASH) 0x338
0x0000000000000005 (STRTAB) 0x7f0
0x0000000000000006 (SYMTAB) 0x358
0x000000000000000a (STRSZ) 510 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000015 (DEBUG) 0x0
0x0000000000000003 (PLTGOT) 0x1fe58
0x0000000000000002 (PLTRELSZ) 960 (bytes)
0x0000000000000014 (PLTREL) RELA
0x0000000000000017 (JMPREL) 0xcc8
0x0000000000000007 (RELA) 0xaa0
0x0000000000000008 (RELASZ) 552 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000000000001e (FLAGS) BIND_NOW
0x000000006ffffffb (FLAGS_1) Flags: NOW PIE
0x000000006ffffffe (VERNEED) 0xa50
0x000000006fffffff (VERNEEDNUM) 2
0x000000006ffffff0 (VERSYM) 0x9ee
0x000000006ffffff9 (RELACOUNT) 15
0x0000000000000000 (NULL) 0x0
The NEEDED directory indicates that the program needs to load the mentioned library in order to continue. The NEEDED directory completes once the shared library is fully operational and ready for use.
Relocations
The loader also must relocate dependencies after having loaded them. These relocations are indicated in the relocation table in formats REL or RELA and the number of relocations is given in the dynamic sections RELSZ or RELASZ.
readelf -r lnstat
Relocation section '.rela.dyn' at offset 0xaa0 contains 23 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000001fc48 000000000403 R_AARCH64_RELATIV 1d10
00000001fc50 000000000403 R_AARCH64_RELATIV 1cc0
00000001fff0 000000000403 R_AARCH64_RELATIV 1340
000000020008 000000000403 R_AARCH64_RELATIV 20008
000000020010 000000000403 R_AARCH64_RELATIV 3330
000000020030 000000000403 R_AARCH64_RELATIV 3338
000000020050 000000000403 R_AARCH64_RELATIV 3340
000000020070 000000000403 R_AARCH64_RELATIV 3348
000000020090 000000000403 R_AARCH64_RELATIV 3350
0000000200b0 000000000403 R_AARCH64_RELATIV 3358
0000000200d0 000000000403 R_AARCH64_RELATIV 3360
0000000200f0 000000000403 R_AARCH64_RELATIV 3370
000000020110 000000000403 R_AARCH64_RELATIV 3378
000000020130 000000000403 R_AARCH64_RELATIV 3380
000000020150 000000000403 R_AARCH64_RELATIV 3388
00000001ffb8 000a00000401 R_AARCH64_GLOB_DA 0000000000000000 _ITM_deregisterTM[...] + 0
00000001ffc0 000b00000401 R_AARCH64_GLOB_DA 0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
00000001ffc8 000f00000401 R_AARCH64_GLOB_DA 0000000000000000 stderr@GLIBC_2.17 + 0
00000001ffd0 001000000401 R_AARCH64_GLOB_DA 0000000000000000 optarg@GLIBC_2.17 + 0
00000001ffd8 001400000401 R_AARCH64_GLOB_DA 0000000000000000 stdout@GLIBC_2.17 + 0
00000001ffe0 001e00000401 R_AARCH64_GLOB_DA 0000000000000000 __gmon_start__ + 0
00000001ffe8 001f00000401 R_AARCH64_GLOB_DA 0000000000000000 __stack_chk_guard@GLIBC_2.17 + 0
00000001fff8 002e00000401 R_AARCH64_GLOB_DA 0000000000000000 _ITM_registerTMCl[...] + 0
Relocation section '.rela.plt' at offset 0xcc8 contains 40 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000001fe70 000300000402 R_AARCH64_JUMP_SL 0000000000000000 strtok@GLIBC_2.17 + 0
00000001fe78 000400000402 R_AARCH64_JUMP_SL 0000000000000000 strtoul@GLIBC_2.17 + 0
00000001fe80 000500000402 R_AARCH64_JUMP_SL 0000000000000000 strlen@GLIBC_2.17 + 0
00000001fe88 000600000402 R_AARCH64_JUMP_SL 0000000000000000 fputs@GLIBC_2.17 + 0
00000001fe90 000700000402 R_AARCH64_JUMP_SL 0000000000000000 exit@GLIBC_2.17 + 0
00000001fe98 000800000402 R_AARCH64_JUMP_SL 0000000000000000 __libc_start_main@GLIBC_2.34 + 0
00000001fea0 000900000402 R_AARCH64_JUMP_SL 0000000000000000 perror@GLIBC_2.17 + 0
00000001fea8 000b00000402 R_AARCH64_JUMP_SL 0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
00000001feb0 000c00000402 R_AARCH64_JUMP_SL 0000000000000000 putc@GLIBC_2.17 + 0
00000001feb8 000d00000402 R_AARCH64_JUMP_SL 0000000000000000 opendir@GLIBC_2.17 + 0
00000001fec0 000e00000402 R_AARCH64_JUMP_SL 0000000000000000 fputc@GLIBC_2.17 + 0
00000001fec8 001100000402 R_AARCH64_JUMP_SL 0000000000000000 snprintf@GLIBC_2.17 + 0
00000001fed0 001200000402 R_AARCH64_JUMP_SL 0000000000000000 __snprintf_chk@GLIBC_2.17 + 0
00000001fed8 001300000402 R_AARCH64_JUMP_SL 0000000000000000 malloc@GLIBC_2.17 + 0
00000001fee0 001500000402 R_AARCH64_JUMP_SL 0000000000000000 gettimeofday@GLIBC_2.17 + 0
00000001fee8 001600000402 R_AARCH64_JUMP_SL 0000000000000000 sleep@GLIBC_2.17 + 0
00000001fef0 001700000402 R_AARCH64_JUMP_SL 0000000000000000 __vfprintf_chk@GLIBC_2.17 + 0
00000001fef8 001800000402 R_AARCH64_JUMP_SL 0000000000000000 calloc@GLIBC_2.17 + 0
00000001ff00 001900000402 R_AARCH64_JUMP_SL 0000000000000000 rewind@GLIBC_2.17 + 0
00000001ff08 001a00000402 R_AARCH64_JUMP_SL 0000000000000000 strdup@GLIBC_2.17 + 0
00000001ff10 001b00000402 R_AARCH64_JUMP_SL 0000000000000000 closedir@GLIBC_2.17 + 0
00000001ff18 001c00000402 R_AARCH64_JUMP_SL 0000000000000000 __stack_chk_fail@GLIBC_2.17 + 0
00000001ff20 001d00000402 R_AARCH64_JUMP_SL 0000000000000000 strrchr@GLIBC_2.17 + 0
00000001ff28 001e00000402 R_AARCH64_JUMP_SL 0000000000000000 __gmon_start__ + 0
00000001ff30 002000000402 R_AARCH64_JUMP_SL 0000000000000000 abort@GLIBC_2.17 + 0
00000001ff38 002100000402 R_AARCH64_JUMP_SL 0000000000000000 feof@GLIBC_2.17 + 0
00000001ff40 002200000402 R_AARCH64_JUMP_SL 0000000000000000 getopt_long@GLIBC_2.17 + 0
00000001ff48 002300000402 R_AARCH64_JUMP_SL 0000000000000000 __fprintf_chk@GLIBC_2.17 + 0
00000001ff50 002400000402 R_AARCH64_JUMP_SL 0000000000000000 strcmp@GLIBC_2.17 + 0
00000001ff58 002500000402 R_AARCH64_JUMP_SL 0000000000000000 free@GLIBC_2.17 + 0
00000001ff60 002600000402 R_AARCH64_JUMP_SL 0000000000000000 readdir64@GLIBC_2.17 + 0
00000001ff68 002700000402 R_AARCH64_JUMP_SL 0000000000000000 strndup@GLIBC_2.17 + 0
00000001ff70 002800000402 R_AARCH64_JUMP_SL 0000000000000000 strchr@GLIBC_2.17 + 0
00000001ff78 002900000402 R_AARCH64_JUMP_SL 0000000000000000 fwrite@GLIBC_2.17 + 0
00000001ff80 002a00000402 R_AARCH64_JUMP_SL 0000000000000000 fflush@GLIBC_2.17 + 0
00000001ff88 002b00000402 R_AARCH64_JUMP_SL 0000000000000000 fopen64@GLIBC_2.17 + 0
00000001ff90 002c00000402 R_AARCH64_JUMP_SL 0000000000000000 __isoc99_sscanf@GLIBC_2.17 + 0
00000001ff98 002d00000402 R_AARCH64_JUMP_SL 0000000000000000 strncpy@GLIBC_2.17 + 0
00000001ffa0 002f00000402 R_AARCH64_JUMP_SL 0000000000000000 __assert_fail@GLIBC_2.17 + 0
00000001ffa8 003000000402 R_AARCH64_JUMP_SL 0000000000000000 fgets@GLIBC_2.17 + 0
Static Relocations
If the program is loaded in a place different from the preferred address (usually 0x400000) because the address is already used or because of ASLR or any other reason, a static relocation corrects pointers that had values expecting the binary to be loaded in the preferred address.
For example any section of type R_AARCH64_RELATIV
should have modified the address at the relocation bias plus the addend value.
Dynamic Relocations and GOT
The relocation could also reference an external symbol (like a function from a dependency). Like the function malloc from libC. Then, the loader when loading libC in an address checking where the malloc function is loaded, it will write this address in the GOT (Global Offset Table) table (indicated in the relocation table) where the address of malloc should be specified.
Procedure Linkage Table
The PLT section allows to perform lazy binding, which means that the resolution of the location of a function will be performed the first time it's accessed.
So when a program calls to malloc, it actually calls the corresponding location of malloc
in the PLT (malloc@plt
). The first time it's called it resolves the address of malloc
and stores it so next time malloc
is called, that address is used instead of the PLT code.
Program Initialization
After the program has been loaded it's time for it to run. However, the first code that is run isn't always the main
function. This is because for example in C++ if a global variable is an object of a class, this object must be initialized before main runs, like in:
#include <stdio.h>
// g++ autoinit.cpp -o autoinit
class AutoInit {
public:
AutoInit() {
printf("Hello AutoInit!\n");
}
~AutoInit() {
printf("Goodbye AutoInit!\n");
}
};
AutoInit autoInit;
int main() {
printf("Main\n");
return 0;
}
Note that these global variables are located in .data
or .bss
but in the lists __CTOR_LIST__
and __DTOR_LIST__
the objects to initialize and destruct are stored in order to keep track of them.
From C code it's possible to obtain the same result using the GNU extensions :
__attributte__((constructor)) //Add a constructor to execute before
__attributte__((destructor)) //Add to the destructor list
From a compiler perspective, to execute these actions before and after the main
function is executed, it's possible to create a init
function and a fini
function which would be referenced in the dynamic section as INIT
and FIN
. and are placed in the init
and fini
sections of the ELF.
The other option, as mentioned, is to reference the lists __CTOR_LIST__
and __DTOR_LIST__
in the INIT_ARRAY
and FINI_ARRAY
entries in the dynamic section and the length of these are indicated by INIT_ARRAYSZ
and FINI_ARRAYSZ
. Each entry is a function pointer that will be called without arguments.
Moreover, it's also possible to have a PREINIT_ARRAY
with pointers that will be executed before the INIT_ARRAY
pointers.
Initialization Order
- The program is loaded into memory, static global variables are initialized in
.data
and unitialized ones zeroed in.bss
. - All dependencies for the program or libraries are initialized and the the dynamic linking is executed.
PREINIT_ARRAY
functions are executed.INIT_ARRAY
functions are executed.- If there is a
INIT
entry it's called. - If a library, dlopen ends here, if a program, it's time to call the real entry point (
main
function).
Thread-Local Storage (TLS)
They are defined using the keyword __thread_local
in C++ or the GNU extension __thread
.
Each thread will maintain a unique location for this variable so only the thread can access its variable.
When this is used the sections .tdata
and .tbss
are used in the ELF. Which are like .data
(initialized) and .bss
(not initialized) but for TLS.
Each variable will hace an entry in the TLS header specifying the size and the TLS offset, which is the offset it will use in the thread's local data area.
The __TLS_MODULE_BASE
is a symbol used to refer to the base address of the thread local storage and points to the area in memory that contains all the thread-local data of a module.
[AD REMOVED]