Sunday, September 12, 2010

Buffer Overflow Vulnerability : Unleashed

Today our topic of discussion is: BUFFER OVERFLOW!!!!
Believe me, these words have fascinated me right from the start of my career, because I know that one of the major roads into hacking runs through here.

So today I will be discussing Buffer Overflow: what we mean by a buffer, what BUFFER OVERFLOW actually stands for, and what benefits I can get (NOTE: AS AN ETHICAL HACKER) from this vulnerability.

First of all, let me tell you that BUFFER OVERFLOW itself has 2 variations: one named STACK OVERFLOW and the other named HEAP OVERFLOW (just wait, they will become crystal clear to you within some time).
Here I will be discussing STACK OVERFLOW; just forget about HEAP OVERFLOW for now.

The first important thing here is: WHAT THE HECK DOES THIS BUFFER STAND FOR? Well, a buffer, as we use the word in everyday life too, is a storage area of fixed capacity. From a coding point of view, I may say that a buffer is an array of fixed length.

So if I write:

char buffer[100];

Here the variable “buffer” has a storage area with a maximum length of 100 bytes.
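To make "fixed capacity" concrete, here is a minimal sketch in C. The helper names fits_in_buffer and buffer_capacity are my own, invented for illustration; the only thing taken from the text above is the 100-byte size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFFER_SIZE 100  /* same capacity as `char buffer[100]` above */

/* Hypothetical helper: returns 1 if the string `s`, including its
   terminating '\0', fits into a buffer of BUFFER_SIZE bytes, else 0. */
int fits_in_buffer(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1 <= BUFFER_SIZE;
}

/* Reports the capacity of a 100-byte buffer as the compiler sees it. */
size_t buffer_capacity(void)
{
    char buffer[BUFFER_SIZE];
    return sizeof buffer;   /* one byte per char, so BUFFER_SIZE bytes */
}
```

Note that a C string of 100 visible characters does not fit: the terminating '\0' needs a byte too, so the longest string this buffer can hold has 99 characters.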

Now let’s dive into the system’s internals. The best part of this vulnerability is that it is not tied to any one OS platform: the bug itself is a missing bounds check in the program, and what makes it exploitable is the microprocessor architecture and the way it stores data on the stack while executing code.

So as long as the OS version and the version of the vulnerable program (where we found the BUFFER OVERFLOW issue) remain the same, the exploit will keep working.


IMP: I will explain very soon why the OS version is so important; for now, just proceed.


IMP: This attack will be discussed on the INTEL X86 ARCHITECTURE, and STACK/OPERATIONS below means the STACK/OPERATIONS of the X86 PROCESSOR.

Now let’s see how the processor provides room for this storage/buffer.
The processor works with a region of memory called the STACK (don’t worry for now; it will become clear to you soon). This STACK holds all the data that we save inside a local buffer or storage area.

In the above code:

char buffer[100];

Internally, when this is declared as a local variable, the processor reserves 100 bytes inside the STACK storage area to hold all this data.

Now let’s focus on the STACK. As we have all read, it is a LIFO data structure, which means that the last item to go in is the first one to come out, and the 2 operations a STACK supports are PUSH (to insert data) and POP (to take out data).
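If the LIFO idea is new to you, this small sketch in C may help. It is a plain data-structure illustration and not the processor's stack; the type IntStack and its functions are names I made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

/* A tiny fixed-capacity LIFO stack of ints (illustrative only). */
typedef struct {
    int items[STACK_CAPACITY];
    int count;                  /* number of items currently stored */
} IntStack;

void stack_init(IntStack *s)
{
    assert(s != NULL);
    s->count = 0;
}

/* PUSH: returns 1 on success, 0 if the stack is already full. */
int stack_push(IntStack *s, int value)
{
    assert(s != NULL);
    if (s->count == STACK_CAPACITY)
        return 0;
    s->items[s->count++] = value;
    return 1;
}

/* POP: stores the most recently pushed value in *out and returns 1,
   or returns 0 if the stack is empty. */
int stack_pop(IntStack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->count == 0)
        return 0;
    *out = s->items[--s->count];
    return 1;
}
```

Pushing 1, 2, 3 and then popping three times hands the values back as 3, 2, 1: last in, first out.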

In pictorial form the operations are:


                 PUSH Operation (DATA INSERTION)



                    POP Operation (DATA RETRIEVAL)

Now let’s have a look at how this STACK is implemented in the X86 PROCESSOR family:
The first very important thing to notice is that in the X86 family, the STACK always grows downwards (from higher memory addresses to lower memory addresses). The second important thing is that two registers are used to keep track of the stack. These are ESP and EBP (or SP and BP in 16-bit code). They stand for Stack Pointer and Base Pointer respectively.
Now let’s discuss these registers:
ESP: This register points to the top of the stack. ESP can be changed in a number of ways, both directly and indirectly.
When something is PUSHed onto the stack, the stack grows and ESP is adjusted accordingly. When something is POPped off the stack, the stack shrinks and ESP is adjusted again. The PUSH and POP operations modify ESP indirectly.
You can also manipulate ESP directly, like
SUB esp, 04h
This makes the stack larger by four bytes, or one doubleword (a word on X86 is two bytes). (Remember I told you that the top of the stack is at a lower memory address than the bottom. The stack grows by adding to the top of the stack, which means going to a lower memory address.)
So you may visualise this growth of stack as



You can see that the STACK is growing towards the lower area of memory
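The downward growth can be sketched in C with a simulated stack. This is only a model under my own assumptions: SimStack, sim_push and sim_pop are invented names, and "esp" here is a byte offset into an array, not the real register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Simulated stack: a block of memory plus an ESP-like byte offset.
   The stack starts empty with esp at the HIGH end of the block. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;
} SimStack;

void sim_init(SimStack *s)
{
    assert(s != NULL);
    s->esp = STACK_BYTES;
}

/* PUSH: first move esp DOWN by 4, then store the value there. */
void sim_push(SimStack *s, uint32_t value)
{
    assert(s != NULL && s->esp >= 4);
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* POP: read the value at esp, then move esp UP by 4. */
uint32_t sim_pop(SimStack *s)
{
    uint32_t value;
    assert(s != NULL && s->esp + 4 <= STACK_BYTES);
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}
```

Every push lowers esp by 4 and every pop raises it by 4, which is the same bookkeeping that SUB esp, 04h does by hand.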



EBP: This holds the memory address of the bottom of the current stack frame. More accurately, it points to a base point in the stack that we can use as a reference point within our function. Let’s visualize this concept with these images.



Let’s start with some data in the stack:




Now suppose we start some operation (a function call) for which data has to be pushed onto the stack. In this scenario EBP is moved from the old stack bottom (0x99999999) to the new relative stack bottom (0x55555551).



And the value stored at 0x55555551 will be 0x99999999 (the old bottom of the stack).
So suppose our stack grows due to PUSH operations and now looks like this:



Remember that the value at location 0x55555551 is 0x99999999 and the value of EBP is 0x55555551, so in this manner we keep track of both the current bottom (0x55555551) and the previous bottom (0x99999999) of the stack.
Now, when the function finishes, the stack is unwound: ESP is simply reverted by copying EBP into ESP, which discards everything pushed after that point, and the saved value is then POPped back into EBP.





Now ESP has the value 0x55555551, and EBP gets the value stored there:
VAL [0x55555551] = 0x99999999.
So after this operation the values would be:




This makes the stack look exactly like before, except with some junk in lower memory addresses (which are ignored since the computer thinks the top of the stack is at 0x555555).
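The save-and-restore dance of EBP can be sketched with a simulated CPU in C. Everything here (SimCpu, enter_frame, leave_frame) is a hypothetical model of my own; the comments name the X86 instructions each step stands for, and the offsets are small array offsets rather than the addresses used in the figures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 32

typedef struct {
    uint32_t mem[STACK_WORDS];  /* simulated stack memory, 4 bytes per slot */
    uint32_t esp;               /* byte offset of the top of the stack */
    uint32_t ebp;               /* byte offset of the current frame base */
} SimCpu;

void push32(SimCpu *c, uint32_t value)
{
    assert(c != NULL && c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

uint32_t pop32(SimCpu *c)
{
    uint32_t value;
    assert(c != NULL && c->esp + 4 <= STACK_WORDS * 4);
    value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* Function entry: push ebp / mov ebp, esp / sub esp, local_bytes */
void enter_frame(SimCpu *c, uint32_t local_bytes)
{
    assert(c != NULL && local_bytes % 4 == 0);
    push32(c, c->ebp);          /* save the old bottom on the stack */
    c->ebp = c->esp;            /* the new bottom is the current top */
    assert(c->esp >= local_bytes);
    c->esp -= local_bytes;      /* make room for local variables */
}

/* Function exit: mov esp, ebp / pop ebp */
void leave_frame(SimCpu *c)
{
    assert(c != NULL);
    c->esp = c->ebp;            /* throw away the locals */
    c->ebp = pop32(c);          /* restore the old bottom */
}
```

After leave_frame both registers hold exactly the values they had before enter_frame; the local data is still sitting in lower memory as junk, but nothing points at it any more.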




Apart from these, there is one more register of interest to us: EIP.

EIP: This stands for Instruction Pointer. Whenever we call a function, the address of the instruction that follows the call (the value EIP must have after the function returns) is saved on the stack for later use. When the function returns, this saved address determines the location of the next instruction to execute.

So now let’s start our main discussion of why this STACK is so important to us. The reason is its implementation in the X86 FAMILY.

After every function call the stack looks like:



Using our assembly knowledge to turn it into a generic layout, it would look like:




(Small Fix: the first local variable would be EBP – 0X00000008 and not EBP-0X00000004)


And here comes the real power for us. We now know exactly how a function gets laid out in memory on an X86 FAMILY PROCESSOR whenever the code calls it (I bet you are all enjoying this by now).

So let’s have a quick look at my simple demo code:

IMP: WRITTEN IN VC++, VISUAL STUDIO 2008; run this in DEBUG MODE

#include <stdio.h>

void test(int i, int j, int k)
{
    int buffer = 9;   /* local variable we will look for on the stack */
    printf("%d == %d ==%d", i, j, k);
}

int main(void)
{
    test(10, 20, 30);
    return 0;
}

I put a breakpoint at the line:

printf("%d == %d ==%d",i,j,k);

When execution stopped there, I dumped the memory and got this valuable information.
The debugger's Registers window shows:


EAX = CCCCCCCC EBX = 7FFD8000 ECX = 00000000 EDX = 00000001 ESI = 00000000
EDI = 0026F710 EIP = 008B13D5 ESP = 0026F638 EBP = 0026F710 EFL = 00000216

Now it’s time to see the magic!!!!!

I passed the parameters as:
10 (HEX: 0a), 20 (HEX: 14), 30 (HEX: 1e)

As per the discussion above, if I add 0x00000008 to EBP, I should get the address of the 1st function argument (10).
So let’s try it with the same code. When I entered EBP+8 in the memory window to view this location, I was taken to:




We can see that the value at memory location 0x0026F718 is 0a (10). Bingo!!!!!
Now let’s try to find the other parameters!!!!

When I entered the address:

0x0026F710 + 0x0000000C [Address(EBP) + 12 decimal, i.e. 0x0C]
OR
0x0026F718 + 0x00000004 [Address(First Parameter) + 4, since each int parameter takes 4 bytes]

I was able to see the next parameter value, which is 14 (20 in decimal). I also declared a local variable inside the function, so let’s try to find that local variable’s value using EBP.
When I entered:
EBP - 0x00000008, I was taken to the location of the first local variable (09).
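The address arithmetic used above can be written down as a small C sketch. It assumes what the demo assumes: a 32-bit frame where [EBP] holds the saved EBP, [EBP+4] holds the return address, and each int argument takes 4 bytes starting at EBP+8. The function names are mine, invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the slot holding the saved return address (saved EIP). */
uint32_t return_address_slot(uint32_t ebp)
{
    assert(ebp % 4 == 0);   /* the frame base is 4-byte aligned here */
    return ebp + 4u;
}

/* Address of the n-th argument (0-based), assuming 4-byte arguments:
   argument 0 is at EBP+8, argument 1 at EBP+12 (0x0C), and so on. */
uint32_t arg_address(uint32_t ebp, unsigned n)
{
    assert(ebp % 4 == 0);
    return ebp + 8u + 4u * n;
}
```

With the EBP value from the register dump (0x0026F710), arg_address gives 0x0026F718 for the first argument and 0x0026F71C for the second, matching what the memory window showed.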




So now you have a pretty clear idea of how to navigate to all the passed parameter values and local variable values using the registers and the standard system STACK.

But the attack has not started yet, has it? The basis of the STACK OVERFLOW attack is this: if I can somehow overwrite the saved EIP (the return address) in the current stack frame, I can redirect code execution to some arbitrary memory area or, in a more sophisticated attack, to my own memory area where my exploit code is waiting for its turn (I hope by this time the STACK OVERFLOW ATTACK IS CLEAR TO ALL OF YOU).
Now the question is: in how many ways can I do it? Pretty simple, in 2 ways. Either I keep writing into a local variable past its limit, overwrite the old EBP value and then finally overwrite the saved EIP value. Or I go through the input parameters of the function: I pass in more data than the function expects, so that the write runs past the limit and reaches the saved EIP. In professional and outside attacks, the approach of exceeding the limit through the input parameters is generally the one applied.
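The first route (local variable, then old EBP, then saved EIP) can be watched safely in a simulation. The sketch below is my own model, not a real stack frame: SimFrame is just a 16-byte array laid out like a tiny frame, and unchecked_copy stands in for any copy routine that does not check the buffer length. All names and sizes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated frame as plain bytes, from low address to high:
   offset  0..7   local buffer (8 bytes)
   offset  8..11  saved EBP
   offset 12..15  saved return address (saved EIP) */
enum { BUF_OFF = 0, BUF_LEN = 8, EBP_OFF = 8, RET_OFF = 12, FRAME_LEN = 16 };

typedef struct { uint8_t bytes[FRAME_LEN]; } SimFrame;

void frame_init(SimFrame *f, uint32_t saved_ebp, uint32_t ret)
{
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_LEN);
    memcpy(f->bytes + EBP_OFF, &saved_ebp, 4);
    memcpy(f->bytes + RET_OFF, &ret, 4);
}

/* Copies `len` bytes into the local buffer WITHOUT checking BUF_LEN.
   The clamp to FRAME_LEN exists only so that the simulation never
   writes outside its own array. */
void unchecked_copy(SimFrame *f, const uint8_t *input, size_t len)
{
    assert(f != NULL && input != NULL);
    if (len > FRAME_LEN)
        len = FRAME_LEN;
    memcpy(f->bytes + BUF_OFF, input, len);
}

/* Reads back the simulated saved return address. */
uint32_t frame_ret(const SimFrame *f)
{
    uint32_t ret;
    assert(f != NULL);
    memcpy(&ret, f->bytes + RET_OFF, 4);
    return ret;
}
```

Copying 4 bytes leaves the simulated return address alone; copying 16 bytes of 'A' runs through the buffer and the saved EBP, and frame_ret then reports 0x41414141.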

In my first example I will show how I tamper with memory by exceeding the limit of a local variable, and then I will proceed to a more sophisticated attack.
I wrote a very simple piece of code:

#include <stdio.h>

void crash_code(void)
{
    char buffer[10];
    /* Deliberately unsafe: "%s" has no length limit, so input longer
       than 9 characters is written past the end of "buffer". */
    scanf("%s", buffer);
}

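Now, whenever I give an input of up to 9 characters (scanf also stores a terminating NUL byte, so that fills the 10-byte buffer), this code works perfectly. With anything longer, scanf writes past the end of the buffer and corrupts the neighbouring stack data, and once the saved EBP and return address are damaged the code finally crashes.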
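For contrast, here is a sketch of the same read done with a limit. The function name read_word_bounded is mine, and it reads from a string with sscanf so that it is easy to try out; the same "%9s" field width works with scanf on standard input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 10

/* Reads one whitespace-delimited word from `input` into `buffer`,
   storing at most BUFFER_SIZE - 1 characters plus the terminating '\0'.
   Returns the number of characters stored, or 0 if there was no word. */
size_t read_word_bounded(const char *input, char buffer[BUFFER_SIZE])
{
    assert(input != NULL && buffer != NULL);
    buffer[0] = '\0';
    /* The width 9 is BUFFER_SIZE - 1; keep the two in sync. */
    if (sscanf(input, "%9s", buffer) != 1)
        return 0;
    return strlen(buffer);
}
```

However long the input is, at most 9 characters reach the buffer, so nothing next to it on the stack is touched.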





But so what???? How is this beneficial for me!!!!
The answer lies in a well-crafted STACK OVERFLOW attack.

To Be Continued …...  




After a long time I got a chance to complete this document, so let’s jump directly to the point.

Today I will write a very simple piece of code to demonstrate a buffer overflow exploit.

#include <stdio.h>

void exploit(void)
{
      printf("BO Attack is successful");
}

int main(int argc, char* argv[])
{
      /* Print where exploit() lives so we know which address to aim for. */
      printf(" The address of function to execute is: %p\n", (void *)exploit);
      char buff[2];
      /* Deliberately unsafe: no length limit on the input. */
      scanf("%s", buff);
      return 0;
}

When executed with a single-character input (one character plus the terminating NUL byte that scanf stores exactly fills the 2-byte buffer), this code works perfectly.
Just have a look at the normal program execution:





My main aim is to crash this program using a BO and then execute the function “void exploit()” by overwriting the program’s saved IP with the function’s memory address, which is: 0x00401000.

Now I will try to overflow this buffer and insert the address of exploit() so that the function runs via the buffer overflow.

When I run this program with an input long enough to run well past the 2-byte buffer, it crashes.




In the crash information, the “Exception Offset” is the value I have to control in order to execute my function. In simple words, I have to overwrite it with the address of the function “exploit”, i.e.: 0x00401000



After some trial and error I found the exact length of string needed to overwrite this location.




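You can see that the offset is now overwritten with 41414141, which is AAAA in ASCII. Now all I have to do is replace the last 4 A’s in my input (because it is exactly these last 4 A’s that overwrote the offset) with the address of the exploit code,
i.e.: 0x00401000. The program input for this would be:

“AAAAA………………..AAAA\x00\x10\x40” (Remember the little-endian and big-endian formats :). X86 is little-endian, so the bytes of the address go in lowest byte first: 00 10 40 00. Only three bytes are written out here because scanf appends a terminating NUL byte, which supplies the final 00.)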
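The byte order can be checked with a small C sketch. The helper names encode_le32 and decode_le32 are my own; they only show how a 32-bit value maps to four bytes in little-endian order and back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes `value` into out[0..3], least significant byte first. */
void encode_le32(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)((value >> 8) & 0xFFu);
    out[2] = (uint8_t)((value >> 16) & 0xFFu);
    out[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Rebuilds the 32-bit value from four little-endian bytes. */
uint32_t decode_le32(const uint8_t in[4])
{
    assert(in != NULL);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Encoding 0x00401000 gives the bytes 00 10 40 00, and decoding the four bytes of "AAAA" gives 0x41414141, which is the value that showed up in the crash information.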

And when I ran this code :)

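Appending the address to the input string is very simple. Note that strcat cannot do it, because the address begins with a \x00 byte and strcat treats that as the end of the string; copy the raw bytes with memcpy instead:
char function_address[] = "\x00\x10\x40";
memcpy(inputstring + strlen(inputstring), function_address, 3); appends the 3 address bytes at the end of the string.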

(I am not discussing this any further here, because there should be some effort from your side also :) isn’t it? Moreover, if you are able to do this, then welcome to the world of Buffer Overflow exploit finding; otherwise you need to learn “C” again.)
The one thing to keep in mind is the difference between little-endian and big-endian architectures; here the address bytes should be “\x00\x10\x40”.

So when I run this program with this crafted string



With this we have come to the end of our discussion of the Stack Overflow attack (one type of Buffer Overflow).
I will be discussing Heap Overflow with you guys soon.
Comments and suggestions are welcome.