How To: Security-Oriented C Tutorial 0x09 - More on Strings

What's up guys! It's time to discuss strings in more detail.

Review

Just a quick review in case you have forgotten what buffers (arrays) are. A buffer is a container that holds data items adjacent to each other in memory (we saw this in the previous tutorial on memory). A buffer has a fixed size which cannot be changed, so it is important to know how much space you need before declaring the array.

Strings

What is the definition of a string in C? A string is any sequence of characters followed by a null terminating byte (\0). This means it can be ten characters, or one character, or even no characters at all (just the null terminator)! In source code, a string is written as a sequence of characters within a pair of double quotation marks, e.g. "This is a string", and the compiler appends the null terminator for you.

Defining Strings

The following sections detail two common ways to define strings, when each one applies, and how they differ.

Character Array

We have seen the character array definition of a string before: an array of char initialized with a string in double quotes and declared with empty square brackets.

Just a reminder: the empty square brackets mean that the compiler automatically works out the number of elements needed to store the string, including the null byte. You may also first declare the array with a predefined number of elements and then manually set each element to a character, remembering to finish with the null byte.

Pointer to a Character

We have yet to talk about pointers; however, for the sake of completeness, we will go over how to define a string with a character pointer.

The variable "str" is declared as a "pointer to a character", so technically it points to the letter "T", the first character of the string. Printing still works because printf keeps reading characters until it reaches the null terminator. The difference between this and the character array is that the string content should not be modified. Attempting to do so may result in what we call a segmentation fault: the content usually sits in read-only memory, so the write is an invalid memory access and the program crashes. This type of string is called a "string literal".

You'll find that it's possible to reference an element with the character pointer variable just like an array.

The String Library

The string.h header file contains prototypes (I will explain these in a tutorial on functions) of many string manipulation functions, such as comparing, copying and concatenating strings. If you're interested, head to Google, search it up and see what you can find.

Conclusion

Have another look over arrays and strings and get familiar with how they work, especially the material I covered on memory in the last tutorial. When we go over buffer overflows, we will be analyzing the contents of memory and seeing exactly what is happening under the hood.

dtm.

2 Comments

Sorry

Hello there ... here I am again, trying to share my thoughts while I am learning C++ programming by reading this excellent tutorial. Along the way I have had some ideas on how to better illustrate for myself the concepts that DTM (the author) presents. Maybe they can be helpful for you too, maybe not; maybe you already know this stuff. But just in case somebody feels kind of lost, like me as a total beginner, maybe it could be helpful.

In the section Pointer to a Character, this code is presented:

#include <stdio.h>

int main(void) {
    char *str = "This is a string";
    printf("%s\n", str);
    return 0;
}

I will try to explain how a pointer works from a memory storage perspective:

We first compile and debug it, as recommended in the tutorial Security-Oriented C Tutorial 0x08 - A Trip Down Memory Lane.

Also, pointer.c is the name I gave to my program and the executable file is testpointer.

Compile:
root@kali:~/Desktop/c_files# gcc -m32 -gdwarf-2 pointer.c -o testpointer
And Debug:
root@kali:~/Desktop/c_files# gdb -q ./testpointer
Reading symbols from ./testpointer...done.
(gdb)

Then we list the program statements to have a quick look at where we should set our breakpoint (at least that is what I think DTM is doing). We then set the breakpoint on line 4 and run the executable.

(gdb) list
1 #include <stdio.h>
2 int main(void){
3 char *str = "This is a string" ;
4 printf("%s\n", str);
5 return 0;
6 }
(gdb) break 4
Breakpoint 1 at 0x5b5: file pointer.c, line 4.
(gdb) run
Starting program: /root/Desktop/c_files/testpointer

Breakpoint 1, main () at pointer.c:4
4 printf("%s\n", str);
(gdb)

Question for an expert: is the following reasoning correct?

I am not sure, but I think the program has run up to, but not including, the printf call on line 4. In other words, the program has set aside a space in memory that holds the char-data string, and the way to reach it is through a pointer holding its memory address, which is str; after that the program stops.

Well, we want to know the addresses held in ESP (top of the stack) and EBP (base of the current stack frame) in order to know where the program's data starts and ends in memory:

(gdb) info registers esp ebp
esp 0xffffd3a0 0xffffd3a0
ebp 0xffffd3b8 0xffffd3b8

Correct me if I'm wrong, but what I understand from the output of the above command is that the running program's data is stored from memory address 0xffffd3a0 to 0xffffd3b8. Doing the math :D, 0xb8 - 0xa0 + 1 = 0x19, which equals 25 memory addresses containing the data from our program (that rounds up to 7 words on the stack starting from 0xffffd3a0). Note that 0xffffd3b8 is the first byte (00) of the last word shown, and the memory addresses from 0xffffd3b9 to 0xffffd3bb hold the rest of that 0x00000000.

(gdb) x /7xw 0xffffd3a0
0xffffd3a0: 0x00000001 0xffffd464 0xffffd46c 0x56555660
0xffffd3b0: 0xffffd3d0 0x00000000 0x00000000

Our pointer should be in the above memory range. We look for it:

(gdb) print &str
$6 = (char *) 0xffffd3ac

I think the above command shows that str is a pointer (*) to char data, and that the pointer itself is stored at memory address 0xffffd3ac. Let's examine it.

(gdb) x 0xffffd3ac
0xffffd3ac: 0x56555660

The above command (x alone) shows the entire data word (by word I mean 4 bytes = 32 bits) stored starting at memory address 0xffffd3ac, and guess what: what is stored at memory address 0xffffd3ac is another memory address, 0x56555660. If we take a look at, let's say, the 4 words starting from memory address 0x56555660, we get:

(gdb) x /4xw 0x56555660
0x56555660: 0x73696854 0x20736920 0x74732061 0x676e6972

Then our char data "This is a string" should be there, starting at memory address 0x56555660. Counting the characters of our string, there should be 17 memory positions (including the null terminator). So we take a look at it.

(gdb) x /17cb 0x56555660
0x56555660: 84 'T' 104 'h' 105 'i' 115 's' 32 ' ' 105 'i' 115 's' 32 ' '
0x56555668: 97 'a' 32 ' ' 115 's' 116 't' 114 'r' 105 'i' 110 'n' 103 'g'
0x56555670: 0 '\000'

Then you will see that 0x54 = 84, 0x68 = 104, 0x69 = 105, and so on ...

Well, I hope this explanation of a pointer is useful to somebody.

Blinko
