This is a quick review of run-time analysis.
To begin, let us consider the following relationship between people: their age. Given two people, Ali and Bailey, either:
- Ali is younger than Bailey,
- Ali and Bailey are the same age, or
- Ali is older than Bailey.
This is a weak ordering. It allows us to group everyone into age-classes (equivalence classes) where everyone in the same age-class is the same age: for example, a typical 2nd-year cohort may look like what is shown in Figure 1.
Figure 1. A typical fall 2B cohort broken up into age-classes.
If all we care about is the age of an individual, we don't need the entire group. Instead, we can choose a representative (representative element) for each age-class, as is shown in Figure 2.
Figure 2. A typical fall 2B cohort with a representative chosen from each age-class.
Now, we don't have to worry about the entire cohort, but rather—so long as age is our primary concern—we can just consider the representatives, as is shown in Figure 3.
Figure 3. The representatives chosen from each age-class.
Now, in order to compare the age of two arbitrary people in the cohort, I just have to find the representatives for each of them and compare the age of the representatives. For example, suppose the seven representatives chosen above are:
Ali, Bailey, Casey, Dylan, Emerson, Finley, and Douglas
respectively. Leslie is the same age as Dylan and Morgan is the same age as Bailey; thus, as Dylan is older than Bailey, it follows that Leslie is older than Morgan. Similarly, if Riley and Sidney are both the same age as Emerson, we may conclude that Riley is the same age as Sidney.
We will use the following symbolism: having chosen our representatives, we write Leslie = Θ(Dylan), Morgan = Θ(Bailey), and Riley = Θ(Emerson) = Θ(Sidney); and since Dylan is older than Bailey, we also write Morgan = o(Dylan), meaning that Morgan is strictly younger than anyone in Dylan's age-class.
It is also likely true that most people in this cohort are o(Douglas).
This example demonstrates all of the properties of a weak ordering:
- any two people in the cohort are comparable: one is younger than the other, or they are the same age;
- "is the same age as" is an equivalence relation, partitioning the cohort into age-classes; and
- the age-classes themselves can be linearly ordered from youngest to oldest.
Why do you care? Well, for example, almost every container in the Standard Template Library (STL) in C++ assumes that the objects stored are related using a weak ordering.
Understanding weak orderings is also necessary for a proper understanding of run-time analysis.
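As a small illustration in C (the struct, the names, and the ages below are invented for this sketch): the C library's qsort, much like the STL containers, only asks for a comparison that behaves like a weak ordering, returning a negative, zero, or positive value. Two people of the same age compare as equivalent even though they are different people.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example: qsort, like the STL, only needs a weak ordering.
 * The comparator orders people by age; two people of the same age are
 * equivalent even though they are different people.                     */
struct person {
    char const *name;
    int         age;
};

int compare_by_age( void const *a, void const *b ) {
    struct person const *p = a;
    struct person const *q = b;

    return p->age - q->age;   /* negative, zero, or positive */
}

int main( void ) {
    struct person cohort[] = {
        { "Leslie", 21 }, { "Morgan", 19 }, { "Riley", 20 }, { "Sidney", 20 }
    };
    size_t const N = sizeof( cohort )/sizeof( cohort[0] );

    qsort( cohort, N, sizeof( cohort[0] ), compare_by_age );

    for ( size_t i = 0; i < N; ++i ) {
        printf( "%s (%d)\n", cohort[i].name, cohort[i].age );
    }

    return 0;
}

Sorting here only distinguishes people up to their age-class, which is all a weak ordering promises.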
We will begin analyzing run-times by looking at a few simple examples of code where we can deduce the run-time.
Now, if you consider the C code that pops the front entry off of a queue, it looks like this:
void pop( struct Queue *q ) {
    if ( q->queue_size != 0 ) {
        q->head = (q->head + 1) % q->array_capacity;
        --(q->queue_size);
    }
}
Here, all we do is make one comparison, and if it is true, we update two variables. In either case, only a fixed number of instructions are executed and the number of instructions executed does not depend on the number of objects in the queue: thus, we say the run-time is Θ(1). Any operation on a queue that takes a fixed number of instructions regardless of the number of items in the queue can be said to be in the same run-time class and the representative we choose to represent all of these is the member representing one instruction: 1.
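The struct Queue itself never appears in these notes; a minimal sketch of the fields that pop (and the count function below) appear to assume, for an array-based circular queue of void pointers, might look like the following (the actual definition used in the course code may well differ):

struct Queue {
    void **array;           /* the circular buffer of stored pointers  */
    int    array_capacity;  /* the capacity of that buffer             */
    int    queue_size;      /* the number of objects currently stored  */
    int    head;            /* index of the front object               */
    int    tail;            /* index of the back object                */
};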
Next, consider the following C code that counts the number of times that a particular pointer appears in a queue:
int count( struct Queue *q, void *ptr ) {
    if ( q->queue_size == 0 ) {
        return 0;
    }

    int c = 0;

    for ( int i = q->head; true; i = (i + 1) % q->array_capacity ) {
        if ( q->array[i] == ptr ) {
            ++c;
        }

        if ( i == q->tail ) {
            return c;
        }
    }

    assert( false );   // we should never get here...
}
The for loop checks every entry currently stored in the queue; it will never end early, as every entry must be checked. Consequently, the number of instructions that are run is something of the form an + b, where a and b are constants and n is the number of entries in the queue. If the size of the queue is doubled, the amount of time this function takes is also approximately doubled. If the number of entries in the queue is increased by a factor of 10, the run-time will also be increased by a factor of 10. These are the same characteristics as any algorithm that takes exactly n operations to run. Thus, we will say that the above function runs in Θ(n) time, where n is the number of entries in the queue.
If f(n) describes the run-time of one algorithm and g(n) describes the run-time of another algorithm, and f(n) grows faster than g(n), then the first algorithm requires more instructions to execute, and therefore we will say that the first algorithm is slower than the second. We will on occasion mix terminology: "a faster algorithm has a run-time that grows slower than another" and "a slower algorithm has a run-time that grows faster than that of a faster algorithm".
At this point, we have noted that associated with each function is an algebraic expression that gives the number of operations that must be computed. For any algorithm where the run-time or memory depends on a parameter n (perhaps an array of size n or an n × n matrix), we will represent the algebraic expression associated with the algorithm by the mathematical function f(n). Interestingly enough, for the most part, we are not all that interested in the exact details of this function. Just as we may not be interested in anything about the students other than their age, we are only interested in comparing functions based on their growth characteristics.
Thus, we will say that two functions f(n) and g(n) both grow at the same rate if

    0 < lim_{n→∞} f(n)/g(n) < ∞.

That is, two functions grow at the same rate if, in the limit, the ratio f(n)/g(n) goes to neither 0 nor infinity. In most cases, it actually goes to a non-zero constant.
For example, all of the following functions grow at the same rate:

    n², 13n², 0.35n², n² + n, n² + 100n, 3n² + 5n + 7, n² + n ln(n), and n² + √n.

As they all grow at the same rate, we will pick a representative, n², and we will say that, for example, n² + 100n = Θ(n²). In this case,

    lim_{n→∞} (n² + 100n)/n² = 1.
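A quick numerical sanity check in C (not a proof, and the specific functions are just examples): the ratio of one function in a class to the chosen representative settles toward a non-zero constant as n grows.

#include <stdio.h>

/* A numerical sanity check, not a proof: the ratio of one member of the
 * class (n^2 + 100n) to the chosen representative (n^2) settles toward 1. */
int main( void ) {
    for ( double n = 10.0; n <= 1.0e7; n *= 10.0 ) {
        double f = n*n + 100.0*n;   /* a member of the class */
        double g = n*n;             /* the representative    */

        printf( "n = %10.0f   f(n)/g(n) = %f\n", n, f/g );
    }

    return 0;
}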
The following doubly nested for-loop multiplies a vector by a matrix:

double *matrix_vector_multiplication( double **M, double *v, int n ) {
    int i, j;
    double *u = malloc( n*sizeof( double ) );

    for ( i = 0; i < n; ++i ) {
        u[i] = 0.0;

        for ( j = 0; j < n; ++j ) {
            u[i] += M[i][j]*v[j];
        }
    }

    return u;
}
This requires n² multiplications and n² additions. Consequently, if the value of n is doubled, then the algorithm will take four times as long to run. If the value of n is increased by a factor of 10, the run-time will be increased by a factor of 100. Consequently, we will say that this algorithm runs in Θ(n²) time.
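If you want to run the function above, a hypothetical driver might look like the following; the 3 × 3 matrix, the vector of ones, and all of the values are made up purely for illustration, and the function above is assumed to be in the same file.

#include <stdio.h>
#include <stdlib.h>

/* A hypothetical driver, assuming matrix_vector_multiplication() above is
 * in the same file: multiply a small made-up matrix by a made-up vector.  */
int main( void ) {
    int const n = 3;

    double **M = malloc( n*sizeof( double * ) );
    double  *v = malloc( n*sizeof( double ) );

    for ( int i = 0; i < n; ++i ) {
        M[i] = malloc( n*sizeof( double ) );
        v[i] = 1.0;

        for ( int j = 0; j < n; ++j ) {
            M[i][j] = ( i == j ) ? 2.0 : 0.0;   /* twice the identity matrix */
        }
    }

    double *u = matrix_vector_multiplication( M, v, n );

    for ( int i = 0; i < n; ++i ) {
        printf( "u[%d] = %f\n", i, u[i] );      /* expect 2.0 in each entry  */
    }

    for ( int i = 0; i < n; ++i ) {
        free( M[i] );
    }

    free( M );
    free( v );
    free( u );

    return 0;
}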
Without proof, the matrix-matrix multiplication algorithm you learned in your linear algebra course runs in Θ(n³) time. There is an algorithm that will multiply two n × n matrices slightly more quickly, namely the Strassen algorithm, which runs in Θ(n^(log₂ 7)) time, where log₂ 7 ≈ 2.807.
We have now seen five run-times:

    Θ(1), Θ(n), Θ(n²), Θ(n^(log₂ 7)), and Θ(n³).
In the long run, an algorithm that is Θ(n^(log₂ 7)) will run faster than an algorithm that is Θ(n³). In fact, consider any two run-times where the dominant terms are n^a and n^b where a < b. Then

    lim_{n→∞} n^a/n^b = lim_{n→∞} n^(a - b) = 0

because a - b < 0, and thus

    n^a = o(n^b).
Therefore, we can order all polynomial-time algorithms from the fastest run-times (Θ(1)) to ever slower-running algorithms (at least, slower running in the long term). To say that one function runs strictly faster than another function in the long term, we will say

    f(n) = o(g(n)),

which says that

    lim_{n→∞} f(n)/g(n) = 0.
For example, 1 = o(n), n = o(n²), n² = o(n³), and n^(log₂ 7) = o(n³).
It is reasonably easy to show that 1 = o(ln(n)) by the definition of the limit:

    lim_{n→∞} 1/ln(n) = 0.
More interesting is that the logarithm grows slower than any polynomial n^d where d > 0, as we must use l'Hôpital's rule:

    lim_{n→∞} ln(n)/n^d = lim_{n→∞} (1/n)/(d·n^(d - 1)) = lim_{n→∞} 1/(d·n^d) = 0,

where the last step follows because d > 0.
Therefore, we may say that, for example, ln(n) = o(n), ln(n) = o(√n), and ln(n) = o(n^0.1).
Similarly, n = o(n ln(n)), but n ln(n) = o(n²) and n² ln(n) = o(n³).
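Another quick numerical illustration (again, not a proof): the ratio ln(n)/√n heads to zero, consistent with ln(n) = o(√n).

#include <stdio.h>
#include <math.h>

/* A numerical illustration: ln(n)/sqrt(n) heads to zero, so the logarithm
 * grows more slowly than even n^0.5.                                      */
int main( void ) {
    for ( double n = 10.0; n <= 1.0e12; n *= 100.0 ) {
        printf( "n = %14.0f   ln(n)/sqrt(n) = %f\n", n, log( n )/sqrt( n ) );
    }

    return 0;
}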
Now, a binary search on a sorted list runs in Θ(ln(n)) time and three very fast sorting algorithms, quick sort, heap sort, and merge sort, all run in Θ(n ln(n)) time. Selection sort, however, runs in Θ(n²) time, and therefore selection sort is significantly slower than any one of quick, heap, or merge sort.
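For reference, here is a minimal sketch of a binary search on a sorted array of int (this particular implementation is not taken from any course code): each iteration halves the range still under consideration, which is exactly why the run-time is Θ(ln(n)).

/* A minimal sketch of binary search on a sorted array: each pass halves the
 * range still under consideration.  Returns the index of 'key', or -1 if it
 * is not present.                                                           */
int binary_search( int const *array, int n, int key ) {
    int low  = 0;
    int high = n - 1;

    while ( low <= high ) {
        int mid = low + (high - low)/2;

        if ( array[mid] == key ) {
            return mid;
        } else if ( array[mid] < key ) {
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }

    return -1;
}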
Therefore, our representative elements, from slowest growing (fastest running) to fastest growing (slowest running), are
1, ln(n), n, n ln(n), n², n^(log₂ 7), n³, 2^n, and n!.
Note, however, that because log_b(n) = ln(n)/ln(b), it follows that all logarithms, regardless of the base b so long as b > 1, grow at the same rate:

    lim_{n→∞} log_b(n)/log_c(n) = ln(c)/ln(b),

which is a constant so long as b and c are constants.
Exercise: Normally, multiplying two n-digit numbers requires an algorithm that runs in Θ(n²) time. The Karatsuba algorithm multiplies two n-digit numbers in Θ(n^(log₂ 3)) ≈ Θ(n^1.585) time. Where in the above list does this run-time fit?
Suppose you write an algorithm that runs in exponential time: Θ(2^n). The solution is very simple: in the words of Mr. Trump, "You're fired!" Any polynomial-time algorithm grows more slowly than any exponential function: n^100 = o(2^n); just apply l'Hôpital's rule one hundred times to confirm this. Also, if 1 < a < b, then a^n = o(b^n). Finally, a^n = o(n!) for any value of a.
There was a story of a subject who was offered a reward by a king. The subject asked that the king take a chess board and place one grain of barley on the first square, two grains of barley on the second, 2² = 4 on the third, 2³ = 8 on the fourth, 2⁴ = 16 on the fifth, and so on. The king agreed, but when he realized he would have to pay out 2^64 - 1 grains of barley (approximately 1.2 × 10^12 metric tonnes, or the current production of barley for eight-and-a-half millennia), he promptly had the subject killed. There is, however, no reasonable moral to this story.
Similarly, a programmer told his manager that his algorithm was really fast: it took four nanoseconds to run on a problem of size 2, 8 ns to run on a problem of size 3, 16 ns to run on a problem of size 4, 32 ns to run on a problem of size 5, and so on. The manager was greatly impressed and earmarked a bonus for the programmer; however, when the algorithm was fully implemented and it turned out it would take 2^50 ns, or about thirteen days, to run on a problem of size 50, the programmer was promptly fired. You may draw your own lesson.
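The manager's arithmetic is easy to verify; a throw-away program along the following lines (the problem sizes are chosen arbitrarily) shows how quickly 2^n nanoseconds turns into days:

#include <stdio.h>
#include <math.h>

/* Checking the manager's arithmetic: at 2^n nanoseconds, how long do
 * various (arbitrarily chosen) problem sizes take?                    */
int main( void ) {
    for ( int n = 10; n <= 60; n += 10 ) {
        double seconds = pow( 2.0, n )*1.0e-9;

        printf( "n = %2d   %12.4g seconds = %10.4g days\n",
                n, seconds, seconds/86400.0 );
    }

    return 0;
}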
Now, the next question is, how do we analyze an algorithm?
First, any single operator in C that does not involve a function call runs in Θ(1) time.
Next, if two blocks of instructions are run serially (i.e., one after the other), and the run-times of the two blocks are Θ(f(n)) and Θ(g(n)), it follows that the run-time of the two blocks run in series is Θ(f(n) + g(n)). This expression, however, may often be simplified.
For example, without further knowledge of the details of two blocks, say that one runs in Θ(n) time and the other runs in Θ(n²) time. Thus, the two blocks run in series in Θ(n + n²) time. However, n = o(n²), and therefore we may simplify this to saying the blocks running serially run in Θ(n²) time.
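A sketch of what such a pair of blocks might look like inside one (hypothetical) function:

/* A hypothetical pair of blocks run serially inside one function: the
 * first loop is Theta(n) and the second is Theta(n^2), so the whole
 * function runs in Theta(n + n^2) = Theta(n^2) time.                  */
void two_blocks( double *array, int n ) {
    for ( int i = 0; i < n; ++i ) {        /* Theta(n) block   */
        array[i] = 0.0;
    }

    for ( int i = 0; i < n; ++i ) {        /* Theta(n^2) block */
        for ( int j = 0; j < n; ++j ) {
            array[i] += (double) (i*j);
        }
    }
}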
As a different example, suppose that we have four blocks of code that run in Θ(1), Θ(M), Θ(N), and Θ(m + n) time, respectively.
If these four blocks of code are run serially, the run-time would be:

    Θ(1 + M + N + m + n).

Examining this, with no further knowledge of the values of M, N, m, and n, we can still determine, for example, that 1 = o(M), and thus, we can simplify this to

    Θ(M + N + m + n).

If any one of these variables is increased in size, the run-time of the block of code would be increased. If, in addition, you are given the additional information that m ≥ M and n ≥ N, then we can also deduce that M + N grows no faster than m + n, and therefore the run-time simplifies to

    Θ(m + n).
Now, you may object: where would such an algorithm ever come up and how could such apparently bizarre requirements appear? These are actually taken from standard assumptions about sparse matrices and the above is a run-time analysis of adding two sparse matrices: matrix_matrix_operations.cpp. Any time you use finite-element analysis to simulate a physical system, the software will be using sparse systems of linear equations.
One consequence of the previous observation is that any fixed number of C instructions run serially must also run in Θ(1) time. For example, the following C++ member function
template <typename Type>
void Splay_tree<Type>::Node::zig_zag( typename Splay_tree<Type>::Node * &ptr_to_this ) {
    Node *l   = left;
    Node *lr  = left->right;
    Node *lrl = left->right->left;
    Node *lrr = left->right->right;

    ptr_to_this = lr;
    lr->left    = l;
    lr->right   = this;
    l->right    = lrl;
    left        = lrr;
}
is nothing more than a handful of assignments (the four initializations and five updates above) run serially. Consequently, we may say the run-time is Θ(1).
A for-loop of the form:
for ( i = 0; i < n; ++i ) {
    // Some Theta(1) code
}
runs in Θ(n) time. A for-loop of the following type:
for ( i = 1; i < n; i *= 2 ) {
    // Some Theta(1) code
}
runs in Θ(ln(n)) time. Actually, the body will run ⌈log₂(n)⌉ times, but

    lim_{n→∞} log₂(n)/ln(n) = 1/ln(2),

which is finite.
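If you would like to convince yourself of this, a small counting program (purely illustrative) shows the doubling loop iterating roughly log₂(n) times:

#include <stdio.h>
#include <math.h>

/* Counting how many times the doubling loop iterates and comparing
 * against log2(n) for a few (arbitrary) values of n.                */
int main( void ) {
    for ( int n = 10; n <= 1000000; n *= 10 ) {
        int count = 0;

        for ( int i = 1; i < n; i *= 2 ) {
            ++count;
        }

        printf( "n = %7d   iterations = %2d   log2(n) = %f\n",
                n, count, log2( (double) n ) );
    }

    return 0;
}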
Now, these are the two prototypical cases. Suppose you have a more complex loop:
for ( i = 0; i < n; ++i ) {
    for ( j = 0; j < i; ++j ) {
        // Some Theta(1) code
    }
}
So, the first time the outer loop is run, the inner loop does nothing; the second time, the inner loop iterates once; but the last time the outer loop iterates, the inner loop iterates n - 1 times.
To solve such cases, we will simply use Maple. If the body of the inner loop runs in Θ(1) time,
> sum( sum( 1, j = 0..(i - 1) ), i = 0..(n - 1) );
This represents the double sum Σ_{i=0}^{n-1} Σ_{j=0}^{i-1} 1, where the summand 1 is a representative element of the run-time equivalence class Θ(1). The limits of summation are the limits of the for loop. These similarities are given in Figure 4. If the loop condition is i < upper, then the upper limit of the summation should be upper - 1.
Figure 4. The relationship between the run-times, summation notation, and Maple sum function.
From above, we know that this sum is n(n - 1)/2 = Θ(n²), and thus we may say that this loop runs in Θ(n²) time: double the value of n and this doubly nested for-loop will take four times as long to run.
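Again, an empirical check rather than a derivation: the following throw-away program counts the iterations of the triangular double loop and compares the count against n²/2, the dominant term of n(n - 1)/2.

#include <stdio.h>

/* Counting the iterations of the triangular double loop above and
 * comparing the count against n^2/2, the dominant term of n(n - 1)/2. */
int main( void ) {
    for ( long n = 10; n <= 10000; n *= 10 ) {
        long count = 0;

        for ( long i = 0; i < n; ++i ) {
            for ( long j = 0; j < i; ++j ) {
                ++count;
            }
        }

        printf( "n = %5ld   count = %9ld   n*n/2 = %9ld\n",
                n, count, n*n/2 );
    }

    return 0;
}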
If the body of the inner loop had code that ran in Θ(m + j) time instead of Θ(1), we would change our calculation as follows:
> sum( sum( m + j, j = 0..(i - 1) ), i = 0..(n - 1) );
Examining this, we see that the run-time is therefore Θ(mn² + n³).
Now, you should be asking yourself the following question: why can I take just m + j? Could it not be that the actual code has a significantly longer run-time than m + j? This goes back to our weak ordering: the expression m + j represents all functions that grow at the rate Θ(m + j). We only need to choose one representative element from that class to observe how all members in the class will run. We could, for example, make a different choice, say 5324m + 1323j + 83715, and calculate
> sum( sum( 5324*m + 1323*j + 83715, j = 0..(i - 1) ), i = 0..(n - 1) );
but in the end, careful observation will show that the result is still

    Θ(mn² + n³).
Suppose our code looks like
for ( i = 0; i < n; ++i ) {
    // Some Theta(1) code
    // Some Theta(ln(n)) code

    for ( j = 0; j < i; ++j ) {
        // Some Theta(ln(j)) code

        for ( k = 0; k < j; ++k ) {
            // Some Theta(1) code
        }
    }
}
In Maple (bounding the Θ(ln(j)) body of the middle loop by ln(n), which does not change the final asymptotic behaviour here),
> sum( 1 + ln(n) + sum( ln(n) + sum( 1, k = 0..(j - 1) ), j = 0..(i - 1) ), i = 0..(n - 1) );
represents the sum Σ_{i=0}^{n-1} ( 1 + ln(n) + Σ_{j=0}^{i-1} ( ln(n) + Σ_{k=0}^{j-1} 1 ) ).
You can see that, in each case, each for loop is represented by a sum running over the limits of the for loop. Then, for each block of code that has a particular run-time, we add a representative entry into the sum in the appropriate location.
By looking at the result from Maple, we can see that this is Θ(n³); however, if there were any question, we could always ask Maple what the asymptotic behaviour is:
> asympt( %, n ); # In Maple, % refers to the last calculated value.
As you can see, the dominant term appears first.
Exercise:
What are the run-times of the following two loops?
for ( i = 0; i < n; ++i ) {
    for ( j = 0; j < i*i; ++j ) {
        // Some Theta(1) code
    }
}

for ( i = 0; i < n*n; ++i ) {
    for ( j = 0; j < i^2; ++j ) {
        // Some Theta(1) code
    }
}
A free lunch at the Grad House for the first student in a course who correctly answers this question: What is the run-time of
for ( i = 0; i < n; ++i ) {
    for ( j = 0; j < i^2; ++j ) {
        // Some Theta(1) code
    }
}
? Only one guess per student and your answer should be sent to dwharder@uwaterloo.ca. If you do not get a response, you were either very late or your answer was incorrect; however, I will try to respond to all queries.
The only other really interesting case to look at is divide-and-conquer algorithms: algorithms that take a big problem, say, a problem of size n, divide this big problem into a sub-problems each of size n/b, and then take the solutions to the sub-problems and come up with a solution to the bigger problem. The only issue is that the work required to either break the problem into sub-problems or to take those sub-problems and assemble a solution to the larger problem is Θ(n^k). Now, the three constants a, b, and k are fixed for any specific problem. For example:

1. Binary search examines the middle entry of a sorted array and then searches only one half of it: there is a = 1 sub-problem of size n/2 (b = 2), and the work to split is Θ(1) = Θ(n⁰), so k = 0.
2. Merge sort sorts each half of an array and then merges the two sorted halves: there are a = 2 sub-problems of size n/2 (b = 2), and the merge takes Θ(n) work, so k = 1.
3. Karatsuba's algorithm multiplies two n-digit numbers by performing a = 3 multiplications of numbers with n/2 digits (b = 2) together with Θ(n) additional additions and shifts, so k = 1.
Calculating the run-time of such divide-and-conquer algorithms is straightforward given the three constants:

    If a < b^k, the run-time is Θ(n^k).
    If a = b^k, the run-time is Θ(n^k ln(n)).
    If a > b^k, the run-time is Θ(n^(log_b(a))).
In the first case, there are not that many sub-problems: a is sufficiently small that the total effort is dominated by the effort required to either break the problem into sub-problems or to recombine the solutions to the sub-problems into a solution to the larger problem.
Binary search and merge sort (Examples 1 and 2 above) both satisfy the second condition:
With binary search, a = 1, b = 2, and k = 0, so a = 1 = 2⁰ = b^k, and thus the run-time is Θ(n⁰ ln(n)) = Θ(ln(n)).
With merge sort, a = 2, b = 2, and k = 1, so a = 2 = 2¹ = b^k, and thus the run-time is Θ(n ln(n)).
In the third case, Karatsuba's algorithm has a = 3, b = 2, and k = 1, so a = 3 > 2¹ = b^k, and thus the run-time is given by the third statement, Θ(n^(log₂ 3)) ≈ Θ(n^1.585), which is significantly faster than the Θ(n²) run-time of normal long multiplication.
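As a closing, purely numerical sketch (assuming the base case T(1) = 1, which is my choice here, not something fixed by the theorem), one can evaluate the recurrence T(n) = a·T(n/b) + n^k directly for the three sets of constants above and compare each result against its predicted growth rate; the values differ by constant factors, but the orders of growth agree.

#include <stdio.h>
#include <math.h>

/* A numerical sketch of the three cases: evaluate T(n) = a T(n/b) + n^k
 * with T(1) = 1 (an assumption made here) and compare against the
 * predicted growth.  The constants are those of binary search, merge
 * sort, and Karatsuba multiplication.                                   */
double T( int a, int b, int k, double n ) {
    if ( n <= 1.0 ) {
        return 1.0;
    }

    return a*T( a, b, k, n/b ) + pow( n, k );
}

int main( void ) {
    double n = 4096.0;   /* 2^12, kept small so the recursion stays cheap */

    printf( "binary search:  T(n) = %12.0f   ln(n)    = %12.0f\n",
            T( 1, 2, 0, n ), log( n ) );
    printf( "merge sort:     T(n) = %12.0f   n ln(n)  = %12.0f\n",
            T( 2, 2, 1, n ), n*log( n ) );
    printf( "Karatsuba:      T(n) = %12.0f   n^(lg 3) = %12.0f\n",
            T( 3, 2, 1, n ), pow( n, log( 3.0 )/log( 2.0 ) ) );

    return 0;
}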