Satori's tech blog

A tech blog I always forget to update

Creating an iOS Debugger: Part 1: Backtracing

Before we begin, this tutorial assumes that you are reasonably familiar with assembly language (specifically ARM).

In most blog series, the writer tends to start from the bottom, the very beginning of their content, and work up from there. However, we do things differently over at satorify. We won’t be starting with a basic skeleton for our debugger. Instead, we’re going to start by investigating different aspects of what a debugger does.

![](http://i1.wp.com/sd.keepcalm-o-matic.co.uk/i/keep-calm-and-hack-the-system-3.png?resize=74%2C86)The cringe pics have begun

For the first part of the series, we are going to be investigating something known as backtracing (or frame unwinding).

What is backtracing?

Let’s start with the very basics. If you’ve never heard of backtracing before, chances are you have still seen it without knowing what it was.

Essentially, backtracing is viewing the chain of function calls that led to the current point of execution. This is something most people will only see in Xcode, after a breakpoint or exception is triggered.

Backtracing is useful because seeing which function called which gives you a sense of how the program reached a given point.

Well that’s great, but Satori, how do we make our own backtrace tool?

I’m glad you asked. To perform a backtrace in your own application, there are 2 methods. You could either use the easy method, or the correct method.

The easy method

The vast majority of people won’t need to backtrace their own applications, and those that do will likely use this method. On Unix-like systems, there is a function that will backtrace for you called:

int backtrace(void **buffer, int size)

backtrace() takes a void* array to fill with return addresses (pointers into the previously called functions), and a size integer giving the maximum number of entries to save. It returns the number of entries it actually stored.

Here is some crappy untested example code I came up with for backtrace(), please don’t judge it:

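#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

void Callstack() {
    int num = 10;
    /* room for up to num return addresses */
    void **addresses = malloc(sizeof(void*) * num);
    if (addresses == NULL) return;
    /* backtrace() returns how many entries it actually filled in */
    int count = backtrace(addresses, num);
    printf("there were %d previous functions called\n", count);
    for (int i = 0; i < count; i++) {
        printf("function called at %p\n", addresses[i]);
    }
    free(addresses);
}
int main() {
    Callstack();
}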

The above code would output something like:

there were 4 previous functions called
function called at 0x23E0
function called at 0x2204
function called at 0x7EEE09F0 etc...

Using backtrace() is quite alright if you only want to see what functions were previously called on your own thread. For a debugger however, this is not enough. We need to be able to backtrace other threads that are part of entirely different processes, and backtrace() simply cannot do that.

If you want to read more on backtrace() please visit the libc man page.
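If raw addresses are hard to read, the same header also declares backtrace_symbols(), which turns each address into a descriptive string. Here is a minimal sketch of using it. The helper name print_symbolized_backtrace and the 64-frame cap are made up for this example, and what each string contains depends on your platform (on Linux with glibc you typically need to link with -rdynamic to see function names).

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libc): prints up to max_frames entries of
 * the calling thread's backtrace and returns how many were captured. */
int print_symbolized_backtrace(int max_frames) {
    void *addresses[64];               /* arbitrary cap chosen for this sketch */
    if (max_frames > 64) max_frames = 64;
    if (max_frames < 1) return 0;

    /* Fill the array with return addresses, innermost frame first. */
    int count = backtrace(addresses, max_frames);

    /* One malloc'd array of strings; only the array itself must be freed. */
    char **symbols = backtrace_symbols(addresses, count);

    for (int i = 0; i < count; i++) {
        printf("#%d %p %s\n", i, addresses[i], symbols ? symbols[i] : "?");
    }
    free(symbols);
    return count;
}
```

Calling print_symbolized_backtrace(10) from anywhere in your program prints one line per frame. It still only works on the calling thread, so the limitation described above applies here too.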

The correct method

Thankfully, there is another way to backtrace. Before we get too deep into this, we first have to understand the basics of stack-based memory. A stack is a region of memory that functions use to hold things like local variables and other misc. data. Each thread has its own stack, and each function call gets its own portion of it (a stack frame).

![](https://upload.wikimedia.org/wikipedia/commons/thumb/2/29/Data_stack.svg/200px-Data_stack.svg.png)A common representation of a stack

In most modern architectures, there is a dedicated stack pointer register, which points to the top of the current stack (the lowest used address if the stack is descending). I will be using the ARM architecture for the following stack examples, and in ARM, the stack is descending (meaning the stack pointer starts off pointing to a high part of memory, and moves down). At the beginning of a function, the stack pointer is decremented by a predetermined size. This predetermined size is the number of bytes required by the data that will eventually be stored on the stack.

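Here is an example of a typical prologue from a function in ARM:

PUSH {R4, R7, LR}
ADD R7, SP, #4
SUB SP, SP, #0x28

The PUSH instruction pushes some registers on to the stack. R4 is just a general purpose register. R7 I will come back to. LR, or link register, holds the return address: a pointer to the instruction in the calling function where execution will resume. LR is key to backtracing.

The stack should now look something like this:

SP + 8: saved LR
SP + 4: saved R7
SP + 0: saved R4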

The ADD instruction sets the value of R7 to SP + 4. This is important but I will come to that later.

The SUB instruction decrements the stack pointer by 0x28 bytes. This means there are 0x28 (40) bytes of memory for the function to store local variables in. But this is unimportant. The real golden goose here is the register R7.

R7 for ARM on iOS is known as the base pointer (or frame pointer). While SP points to the top of the stack, R7 points near the bottom of the current function's frame. (On a descending stack, which ARM stacks usually are, the "top" is the lowest address.) This means that R7 now points to where SP pointed just before the SUB. Actually, that was a lie. Because 4 was added, R7 points to where SP + 4 pointed, which in the above example is the stack slot holding the previous value of R7. The word stored at address R7 + 4 is the saved value of LR, which as stated earlier, is the return address into the function that called this one.

This is unsurprisingly very confusing, so I will try to summarize it. The values contained in R4, R7 and LR are pushed onto the stack, then R7 is set to point to the stack slot holding the original value of R7. This is key. Each saved R7 points to the caller's saved R7, so we can use R7 as a makeshift backtracer.
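If you would rather watch this happen than take my word for it, here is a small simulation in plain C. It is only a sketch: FakeCpu, the 0x1000 base address and the helper names are all invented for illustration, and it models just the three prologue steps described above (push R4, R7 and LR, set R7 to SP + 4, subtract 0x28 from SP) on a fake descending stack.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Invented for this sketch: a tiny model of the registers we care about,
 * plus 64 words of "stack memory" living at simulated address `base`. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t base;
    uint32_t sp, r4, r7, lr;
} FakeCpu;

/* Starts with an empty descending stack: SP sits just past the highest word. */
FakeCpu fake_cpu_init(uint32_t r4, uint32_t r7, uint32_t lr) {
    FakeCpu cpu = {0};
    cpu.base = 0x1000;
    cpu.sp = cpu.base + STACK_WORDS * 4;
    cpu.r4 = r4;
    cpu.r7 = r7;
    cpu.lr = lr;
    return cpu;
}

void store(FakeCpu *cpu, uint32_t addr, uint32_t value) {
    cpu->mem[(addr - cpu->base) / 4] = value;
}

uint32_t load(const FakeCpu *cpu, uint32_t addr) {
    return cpu->mem[(addr - cpu->base) / 4];
}

/* Mimics: PUSH {R4, R7, LR} / ADD R7, SP, #4 / SUB SP, SP, #0x28 */
void run_prologue(FakeCpu *cpu) {
    cpu->sp -= 12;                     /* make room for three words */
    store(cpu, cpu->sp + 0, cpu->r4);  /* lowest register, lowest address */
    store(cpu, cpu->sp + 4, cpu->r7);
    store(cpu, cpu->sp + 8, cpu->lr);
    cpu->r7 = cpu->sp + 4;             /* R7 now points at the saved R7 */
    cpu->sp -= 0x28;                   /* space for local variables */
}
```

After run_prologue(), load(&cpu, cpu.r7) gives back the old R7 and load(&cpu, cpu.r7 + 4) gives back LR, which is exactly the pair a backtracer needs.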

From now on, we will think of R7 like a structure.

In C/C++ R7 would look like this:

struct R7 {
    struct R7 *next_r7;             /* the caller's saved R7 */
    unsigned int function_address;  /* saved LR: return address into the caller */
};

To sum up the block of text above, if we can gain access to R7 from a thread, we can backtrace that thread from anywhere (as long as the memory R7 points to remains).
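For a debugger, "gaining access to R7" means reading another thread's registers and memory rather than dereferencing pointers directly. Here is a hedged sketch of what that walk could look like. The read_mem_fn callback, walk_frames, FakeMemory and fake_read are all hypothetical names; a real debugger would plug in whatever its platform offers for reading target memory. The sanity check (each caller's frame must sit at a higher address) is my own assumption for a descending stack, added so a corrupt chain cannot loop forever.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: copy len bytes from target address addr into out.
 * Returns 0 on success, nonzero if the memory cannot be read. */
typedef int (*read_mem_fn)(void *ctx, uint32_t addr, void *out, size_t len);

/* Follows the chain of {saved R7, saved LR} pairs starting at r7 and stores
 * up to max return addresses in out. Returns how many were stored. */
size_t walk_frames(uint32_t r7, read_mem_fn read_mem, void *ctx,
                   uint32_t *out, size_t max) {
    size_t n = 0;
    while (r7 != 0 && n < max) {
        uint32_t record[2];                 /* [0] = saved R7, [1] = saved LR */
        if (read_mem(ctx, r7, record, sizeof record) != 0) break;
        out[n++] = record[1];
        /* The stack grows down, so the caller's frame must be higher up.
         * This also stops on a null link or a chain that loops. */
        if (record[0] <= r7) break;
        r7 = record[0];
    }
    return n;
}

/* Stand-in for a target process: 256 bytes of memory at address base. */
typedef struct {
    uint32_t base;
    uint8_t bytes[256];
} FakeMemory;

int fake_read(void *ctx, uint32_t addr, void *out, size_t len) {
    FakeMemory *m = ctx;
    if (len > sizeof m->bytes || addr < m->base ||
        addr - m->base > sizeof m->bytes - len) {
        return -1;                          /* outside the fake address space */
    }
    memcpy(out, m->bytes + (addr - m->base), len);
    return 0;
}
```

Keeping the memory access behind a callback means the same walker works whether the frames live in your own process, another process, or a saved crash dump.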

Here is a quick example I have created to show how to backtrace using R7. It functions very similarly to backtrace() in unix (probably because backtrace() does the same thing).

#include <stdint.h>
#include <stdio.h>

/* note: the inline assembly below only builds for 32-bit ARM targets */
#define noinline \
  __attribute__((noinline))  // tell the compiler not to inline functions

typedef struct Frame {
  struct Frame* next;  // saved R7 of the calling function
  uint32_t offset;     // saved LR: return address into the caller
} Frame;

noinline int A();
noinline int B();

void show_stack() {
  /* volatile tells the compiler not to optimize */
  volatile Frame* frame = 0;
  /* this assembly snippet moves the value of R7 into "frame" */
  asm volatile("mov %0, r7\n" : "=r"(frame));
  /* while frame points to something */
  while (frame) {
    printf("Function call at 0x%x\n", (unsigned int)frame->offset);
    frame = frame->next;
  }
  /* printing the address of A() and B() */
  printf("%p-%p\n", (void*)A, (void*)B);
}

noinline int A() {
  return B();
}

noinline int B() {
  show_stack();
  return 0;
}

int main() {
  A();
}

The above will have an output similar to:

Function call at 0x4E004
Function call at 0x4A326
Function call at 0x12345678
...
0x4E000-0x4DF90

It would be a good idea to note that the addresses returned through backtracing will not be the starting addresses of functions, but rather return addresses: each one points just after the instruction that called the next function. Also, for x86, the implementation is very similar, but instead of R7 you would use EBP (RBP on x86-64), which essentially does the same job as R7 in ARM.
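Since a backtrace gives you addresses in the middle of functions, you need one more step to get names: find the function whose start address is the closest one at or below the return address. Here is a minimal sketch using a binary search over a sorted table. The Symbol struct and find_symbol are invented for this example, and picking the nearest preceding symbol is a simplification (it can guess wrong when symbols are missing or there are gaps between functions).

```c
#include <stddef.h>
#include <stdint.h>

/* Invented for this sketch: one entry per function, sorted by start address. */
typedef struct {
    uint32_t start;
    const char *name;
} Symbol;

/* Returns the entry with the greatest start address that is <= addr,
 * or NULL if addr lies below every entry. The table must be sorted ascending. */
const Symbol *find_symbol(const Symbol *table, size_t count, uint32_t addr) {
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].start <= addr) {
            lo = mid + 1;   /* candidate found, look for a later one */
        } else {
            hi = mid;       /* this function starts after addr */
        }
    }
    return lo == 0 ? NULL : &table[lo - 1];
}
```

With the addresses from the output above, a table containing B at 0x4DF90 and A at 0x4E000 would resolve the return address 0x4E004 to A, which makes sense because A is the function that called B.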

Real world usage

There are many examples of where this can be used in the real world. Debuggers are an obvious example. Debuggers will use this technique (referred to as frame pointer unwinding) to view the previously called functions of an application. This is why I decided to investigate backtracing, so I could implement it in my debugger.

Backtracing isn’t just limited to debuggers though. Crash log generators are another example of where backtracing is useful. A crash logger can catch an exception thrown in an application and use frame pointer unwinding to find out what the previously called functions were. This is really useful for crash logs, as it tells the developer which chain of calls led to the crash. A great example of a crash logger doing this is KSCrash.

Conclusion

So then, today we have learned what backtracing is, how to backtrace using predefined functions, and then explored how to manipulate a stack to backtrace manually. I hope some of this has been informative for you, and I promise that not all posts in this series will be as heavy as this one.

Oh, and please share this series with your friends/colleagues. The more the merrier.
Oh, oh, also follow me on Twitter for updates and other iOS dickery @Razzilient

Part 2: Breakpoints
