Security-Oriented C Tutorial 0x10 - Pointers and Addresses
In previous tutorials we have encountered these things called pointers and addresses but we still don't know what they really are. Let's learn a bit more to clear things up.
A pointer is a type of variable which is declared with an asterisk (*) between the data type and the variable name. The same character is also used as the dereference operator, which we will get to shortly. When you declare a variable like this, it means that it points to a specific area in memory, and the size of the data that it refers to depends on the data type.
Addresses are numbers which identify locations in memory where data is held, like an index identifies an element in an array or like your house address identifies your house in your street. Addresses in memory are (and should) always be referred to by their hexadecimal value. If you try to show an address in its decimal form, no one will like you! I'm kidding, it's a standard convention, like how you use metric in science and imperial in everyday life (only if you live in places which still use this system!). You can get the address of a variable by tagging it with the address-of operator (&).
Bringing it all together, pointers are used to point to locations in memory, which means they hold an address value. Why use pointers? What's the point? Well, pointers are used where we require dynamic memory allocation and data structures, and where we need to handle these efficiently since they may contain large amounts of data. We've already discussed in Tutorial 0x0F that calling a function requires its arguments to be copied. However, if our data structures contain a lot of data, copying them wastes a lot of time and space. We will be discussing this in another tutorial so for now, let's just focus on the basics.
Let's write up a few lines of code to show how this works.
We've declared two pointers, intPtr and charPtr, and initialized two variables, n and c. We then set the pointer variables to the addresses of the variables with the matching data types, so now intPtr has the value of the address of n and charPtr has the value of the address of c. It's pretty much the same thing as n holding the value of 12 and c holding the value of 'A'. Try not to think too much of it or it'll mess you around (like it still does to me sometimes). We then call printf a few times to compare some values. Let's see the results.
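Note: The %p format specifier is used for pointer types. It prints the address that the corresponding pointer variable holds, usually in hexadecimal (the exact format is up to the implementation). Strictly speaking, %p expects a void * argument.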
We compile with the -g flag, which adds debugging information for GDB (the GNU Debugger), so we can analyze the information in memory for a better understanding.
As you can see, the two comparisons are the same. But what's this I'm passing to printf as an argument? These... *intPtr and *charPtr? This is called dereferencing a pointer, and what it does is ask for the value stored where the pointer is pointing. So, *intPtr means give me the value of whatever is inside the address of the variable n. The same thing applies to *charPtr. So let's see what this looks like in memory.
Just to clarify before we begin, a reminder that the blocks of data are the values the variable holds.
By printing out the address of each of the variables, we can see that the pointer variables hold the addresses of the variables they point to (blue and purple), like how n holds the value 12 and how c holds the value 'A'.
Then by dereferencing our pointer variables, we see the values stored at the addresses they contain. The amount of data that the pointer refers to depends on the declared data type of the pointer variable. For example, on most current platforms an int pointer refers to 32 bits whereas a char pointer refers to 8 bits. In this case, intPtr can reference 32 bits of information whereas charPtr can only reference 8 bits. If I were to have a char pointer point to an integer variable's address (yes, it's possible through a method called typecasting) and then dereference the pointer, it would only show me 8 bits of information: the single byte stored at that address.
So to clear things up, we declare a pointer variable by putting an asterisk in front of its name in the declaration. When we want to use the pointer variable itself, we only need to use the name of the variable like we would with normal, non-pointer variables. Only when you want the value of whatever is inside the address that it contains would you use the dereference operator.
Back in Tutorial 0x0B - User Input, we discussed the usage of taking in values with the scanf function. When I informed you guys that you should not use the address-of operator with the char array, I didn't tell you the exact details of the reasoning. In fact, this applies to arrays in general.
As we can see, when we use the name of an array on its own, it acts as a reference to the address of its own element 0.
The comparison gives 1, which is equivalent to true. This is why we do not require the address-of operator in the scanf function: the array's name already gives scanf the address it needs.
If all of this is pretty confusing, then I can understand. It still slips my mind sometimes too. Just need to practise and play around until you're familiar with what's going on. Here is an awesome lecture on YouTube where the lecturer explains how pointers work and it looks like it's helped a lot of people so I thought it would help you guys too. Very entertaining!