Comp 15

Pointers Review

Pointers!?

Pointers and references are key to a thorough understanding of how to program in C++. This page is a review of pointers and references, and covers what they are, why we need them, and how they are used.

What are pointers and references?

We should start by distinguishing between pointer variables and pointer values. As you might recall, every variable has both an address and a value. The address identifies where the variable is stored in memory, and the value identifies what is actually being stored.

A pointer value is the address of some spot in memory. These look like 0x7fff3889b4b4 or 0x602010, and are hexadecimal (base 16) numbers that identify where something is. If you think of memory as one huge array, a pointer value is an index in that array. Pointer variables are variables that hold a pointer value. In the array analogy, a pointer variable is like an integer variable you declared to store an index. A pointer variable points to something else; the pointer value says where that something else is.

A reference variable refers to another variable that already exists. It is basically a new name for the same spot in memory, meaning that the value is shared: changing the value through either name changes it for both. We most often see these as function parameters, but might see them elsewhere as well.

It is important to remember the following as we continue:

Why use pointers?

So it sounds like pointers just add an unnecessary level of complexity. Why should you care where something is? There are a number of things that can only be done using pointers, and there are other things that are much easier or simpler when pointers are used rather than some workaround.

Call by Reference

Call by reference (CBR) as opposed to call by value (CBV) can be useful when you want to return multiple things from a function, or when you have a function parameter that is very big and would be inefficient to copy.

A standard function is CBV, meaning all the arguments in the function call are copied into new variables on the stack frame where the function runs. One result of this is that any changes the function makes to those variables are not kept once it returns. Imagine a situation where you want a function both to get an input string and to tell you whether it was able to do so successfully. In this case, you want to return two things: a string containing the input, and a boolean indicating whether the operation succeeded. A C++ function has a single return value, so returning both would mean bundling them into a struct. Instead, it is often nicer to pass a string variable by reference and return the boolean. In fact, you will likely use a function that does this very thing when dealing with I/O.

Another reason to use CBR is when a function argument is big and it would be inefficient to copy the whole thing every time the function is called. In fact, arrays are effectively passed by reference automatically: the function receives the address of the first element rather than a copy of the whole array. However, if you have a really big struct (maybe it contains multiple arrays as well as other things), by default the whole thing is copied for every function call. Using call by reference, the new stack frame has a reference variable that lets you bypass copying the whole structure over.

Dynamic Memory

C++ doesn't let you change the size of a variable or array once you have made it. If you declare an array int arr[10], you have space for 10 integers, and if you realize you have 11, you can't ask C++ to add another spot to the end. Because of this, programming in C++ requires you to use dynamic memory if you don't know how big a structure you need. We use pointers to interact with a structure whose size we determine at runtime.

How to use pointers

Okay, at this point, you have been convinced that you do in fact need to learn how to use pointers, no matter how annoying they may seem.

To use them, you need to understand:

& is the address-of operator, and is used to declare reference variables. As you might expect, the address-of operator is applied to variables that already exist and returns their memory address.

address_of.cpp

&, when used in a variable declaration, creates a variable with the type reference-to-a-char or reference-to-an-int etc. It is used when you already have a variable and want a new way to refer to it. This is most often seen in functions, where you might have a reference parameter to keep from copying your data over again.

ref_variable.cpp

* is the dereference operator, and it is used to declare a pointer variable. The dereference operator takes a memory address and gets the value stored there. Pointer variables have a value of a memory address. When used in a declaration, the * will make a pointer-to-a-char or pointer-to-an-int etc.

There is a special pointer value which specifically points to 'nothing'. This is known as the null pointer (written nullptr in C++). We compare against it to see if a pointer variable is meaningful (you often set a pointer to nullptr once you are done using it, to signal that the memory address shouldn't be used). Attempts to dereference the null pointer will usually crash your program, which is better than perhaps modifying random memory locations.

It is also worth mentioning the -> syntax. This is used when you want to access a field or a function of a structure that you have a pointer to. For example, if you have a pointer named pc to a class object with a member variable courseNum, you would access it using the syntax pc->courseNum. (This does exactly the same thing as (*pc).courseNum, but is easier to read, so you should use the arrow.)

pointers.cpp

Warning:
There is considerable confusion out there about the * symbol. It is not part of the type, though people often pronounce it, and even worse, write it that way. A declaration consists of a type followed by one or more variables, each written as a sample expression that would produce a value of that type: in int *p, the expression *p is an int.

      int  *p, x;  // declares a pointer to an int and an int
      int*  q, y;  // SAME!!  EVIL EVIL EVIL — NEVER WRITE THIS!!
    
Some will call the second example above a difference in style — it isn't. It's an abomination! I have seen experienced programmers waste hours because of it. The * goes with the variable, not the type, so write it there! (My theory is that it's a habit picked up when the person didn't understand what was going on, and when/if they did figure it out, they continued it out of a perverse notion of style and as a way to haze newcomers.)

Pointers and Dynamic Memory

Pointers are the way we interface with dynamic memory. To that end, there is some pointer syntax that's necessary to use that memory.

First, creating and destroying variables dynamically:

Creating and destroying arrays dynamically:

Examples:

Warnings/Common Bugs