Before we go any further, it may be useful to discuss how the computer uses memory.
You may be aware that Nintendo game systems use cartridges. All of the instructions for a game are stored on its cartridge. This idea goes all the way back to systems like the Atari 2600, released in 1977, but the idea of reading instructions from a dedicated store, separate from working memory, goes back much further. Each of these systems also had random-access memory (RAM) which could be used during execution (for example, storing your score, the position of your characters, etc.). Such a separation between memory for instructions and memory used during execution is called a Harvard architecture. Such an architecture has benefits, such as being able to access both instructions and RAM simultaneously, which can reduce execution time, and many microcontrollers continue to use such an approach.
Most computers today, however, do not have this divide between memory used for instructions and memory used for execution. Instead, when you begin executing a program, the operating system loads that program into the same RAM that is simultaneously used for execution. Such an approach is called the von Neumann architecture, after John von Neumann, one of the original designers of such an approach. This simplifies the design, as both instructions and memory for execution are located in the same RAM, so if you increase your RAM by 1 GiB, the additional memory can be used either to execute larger programs or to execute more memory-intensive programs.
We will now discuss how a program is executed in the von Neumann architecture. When you execute a program, either by clicking on an icon, or typing the name of an executable program at the command-line console, the operating system loads the program into memory:
Program instructions
⋮
Constants (literals) appearing in program
⋮
Other available memory
⋮
(highest address: 0xffffffff)
Now, your program makes function calls, which make other function calls, which themselves may call still other functions. Each of these function calls requires memory for parameters, local variables, and other information (such as where to continue executing once the function call ends).
One could just "find" memory for the next function call, and the parameters and local variables would be stored in that new-found memory.
There is, however, one convenience about function calls that can be exploited: when a function calls another function, the only means of communication is either through arguments to or return values from the function that is being called. One function call cannot access or manipulate the parameters or local variables of a different function call.
Also, when a function returns, it always returns to the function that called it. In this case, you can think of it as a stack of paper:
For example, suppose you are working out a design, but your cell phone's battery has died and you don't have access to a computer. At some point in the design, you need to compute $594.32 \times 3.1416$. You could do this calculation on the piece of paper you're currently working on; however, as all you really need is the result, you could instead start on a new piece of paper, work out that $594.32 \times 3.1416 = 1867.115712 \approx 1867$, and copy that value back to your original page.
Similarly, compilers handle function calls in the way this analogy suggests, but instead of a stack of paper, the compiled program uses a stack of memory, known as the call stack:
Program instructions
⋮
Constants (literals) appearing in program
⋮
Other available memory
⋮
↕
Call stack
⋮
(highest address: 0xffffffff)
The memory for any parameters or local variables of main() is located at the bottom of the stack:
Other available memory
⋮
↕
Memory for main()
⋮
(highest address: 0xffffffff)
When main() calls another function (let us say, abs(...)), the memory for that function call's parameters and local variables is put on top of that allocated to main():
Other available memory
⋮
↕
Memory for abs(...)
⋮
Memory for main()
⋮
(highest address: 0xffffffff)
When abs(...) exits and its return value has been passed back to the calling function, the call is finished. The memory that was allocated to that function call is no longer required, and if another function is called, that memory can be reused.