Introduction

In attempting to solve the system Mx = b, we must be able to account for errors in both the matrix M and the vector b. Unfortunately, the error propagation is not linear: a 10% relative error in either M or b does not translate to a 10% relative error in the solution. We will show how the error propagation involves a property of the matrix called the condition number defined using the norms of the matrix.

Please note that |x|, ||v||, and |||M||| represent the absolute value of a real number, the norm of a vector, and the norm of a matrix, respectively. Numerous texts use ||M|| to represent the norm of a matrix; however, the author uses three bars for clarity.

References

• Dennis S. Bernstein, Matrix Mathematics, Prinston University Press, 2005.
• Gerald and Wheatly, Applied Numerical Analysis, 6th Ed., Addison Wesley, p.160-4.
• Wikipedia, Ill-conditioned Matrices.

Theory

In some cases, the solution to a system of linear equations Mx = b may be very sensitive to small changes in either the matrix M or the vector b—a relatively change in either can result in a significant change in the solution x. This can cause problems in engineering as any physical system is made of components which are not exact: a silver-banded 120 Ω resistor could have an actual resistance anywhere on the range [108 Ω, 132 Ω].

A Well Conditioned Example

In order to motivate this discussion, we will look at two matrices: the first is well conditioned—small changes in either M or b result in correspondingly small changes in x.

Consider this first matrix M:

To determine the behaviour of this matrix, we will look at the image of the unit circle, as is shown in Figure 1.

Figure 1. The image of the unit circle under M.

Solving Mx = b means that we want to find that point x which maps to b, as is shown in Figure 2 which shows the pre-image of b = (1, 1)T. The solution to this system of linear equations is approximately the point x = (0.1765, 0.2353)T.

Figure 2. The pre-image of the point b = (1, 1)T.

Suppose the vector b is shifted slightly. Figure 3 shows a circle of points around b which have a 10% relative error when compared to b. The pre-image of each of these points forms an ellipse around the point x.

Figure 3. The pre-image of the point b = (1, 1)T.

For this matrix, a small error in b translates to a small error in x, but we have more information: the length of the major axes of the ellipse yeilds the maximum error while the length of the minor axes yeilds also the minimum error. Suppose, however, that there is an error in the matrix M. Consider this perturbed matrix:

Unlike 2-dimensional vectors, it is not possible to represent all matrices which have a particular relative error. The above matrix has a relative error of approximately 11.9%. If you look at Figure 4, you will see the image of the unit circle under this perturbed matrix.

Figure 4. The image of the unit circle under a perturbed M.

The pre-image of b = (1, 1)T under the perturbed M is close to the original, too. As we can see in Figure 5, the pre-image is x = (0.1493, 0.2239)T which has a relative error of only 11% as compared to x = (0.1764, 0.2353)T.

Figure 5. The image of the unit circle under a perturbed M.

Again, a small change in M results in a small change in x.

Figure 6 demonstrates how the images of the unit circle under both M and the perturbed M are close.

Figure 6. The images of the unit circle under M and a perturbed M.

An Ill-conditioned Example

The second is ill conditioned—small changes in either M or b result in very significant changes in x.

Consider this second matrix M:

To determine the behaviour of this second matrix, we will look at the image of the unit circle, as is shown in Figure 7.

Figure 7. The image of the unit circle under M.

This is very different from the image shown in Figure 1: the unit circle is stretched into a ellipse which almost approximates a straight line. Recall that a singular matrix would map the unit circle into either a line or a point; however, the determinant of this matrix M is 1—the same determinant as the identity matrix. (What is critical here is that the determinant cannot be used to determine the conditioning of a matrix.)

Solving Mx = b means that we want to find that point x which maps to b, as is shown in Figure 2 which shows the pre-image of b = (1, 1)T. The solution to this system of linear equations is the point x = (1, -1)T.

Figure 8. The pre-image of the point b = (1, 1)T.

Suppose the vector b is shifted slightly. Figure 9 shows a circle of points around b which have a 10% relative error when compared to b. The pre-image of each of these points forms an elongated ellipse around the point x.

Figure 9. The pre-image of the point b = (1, 1)T.

What is significantly different here is that a small error could translate into a relative error in x which is almost 100%. This is not the worst case, either—there are other vectors where a 10% relative error in b can translate into a relative error in x almost as large as 120%.

For this matrix, a small error in b translates to a significantly larger error in x. Suppose, similarly, that there is an error in the matrix M. Consider this perturbed matrix:

The relative error of this perturbed matrix is only 4.2% compared to the relative error of the previous perturbed matrix (11.9%). Thus, it is, in a relative sense, a smaller perturbation.

If you look at Figure 10, you will see the image of the unit circle under this perturbed matrix.

Figure 10. The image of the unit circle under a perturbed M.

The pre-image of b = (1, 1)T under the perturbed M is significatly different: Figure 11 shows the pre-image is x = (-1.735, 2.233)T which has a relative error close to 301% as compared to x = (1, -1)T.

Figure 11. The image of the unit circle under a perturbed M.

Again, a small change in M results in a massive change in x.

Figure 12 demonstrates how the images of the unit circle under both M and the perturbed M are significantly different as compared to the previous well conditioned example.

Figure 12. The images of the unit circle under M and a perturbed M.

The Condition Number

To quantify this effect, it is necessary to consider the image of the unit circle. In the first case (Figure 1), most vectors are stretched by a similar amount: the most a vector is streched (|||M|||2) is 5.390 is very close to the least amount that a vector is stretched, approximately 3.154. This second value may be described as 1/|||M-1|||2. These two values in Figure 1 are approximately 5.390 and 3.154, respectively. In Figure 7, however, these two values are signifantly different: 12.08 versus 0.08276. The ratio between these two values is defiend to be the condition number of a matrix, and in the first case, it is cond2(M) = 5.390/3.154 = 1.709 and in the second cond2(M) = 12.08/0.08276 = 146.0.

Results

If Δb is the error in b then ||Δb||2/||b||2 is the relative error in b and with this value and the condition number cond2(M) = |||M|||2|||M-1|||2, we may bound the relative error of x:

As the reader may have noticed, we are focusing on the 2-norm; however, these bounds are equally valid with both the 1- and ∞-norms so long as the same norm is used in all cases. The appropriate condition numbers are given as cond1(M) and cond(M), respectively.

The condition number of a non-invertable matrix is defined as ∞.

HOWTO

Given a system of linear equations Mx = b, the condition number of M, cond(M) = |||M|||·|||M-1|||, will give bounds on the error propagation: The relative error of the solution x is bounded above by the relative error of b multiplied by the condition number:

At the cost of introducing a more complex formula, we can also find a bound which accounts for errors in the matrix M. From Bernstein, we have the formula:

Error Analysis

Unlike most sections on error analysis, as this entire section already deals with error analysis, the author will devote this section to a different type of error propagation.

The formula above which relates the relative error of both b and M relate to the relative error of the solution is some times misinterpreted as the simpler

This statement is incorrect as the following counter example demonstrates:

```>> M = [5 2; 1 4];
>> dM = [-0.32805 0.16204; 0.24522 -0.46095];
>> b = [-0.49766 0.86737]';
>> x = M \ b;
>> a = (M + dM) \ b;
>> cond( M ) * norm( dM )/norm( M )   % 2-norm by default
ans =  0.20732
>> norm( x - a )/norm( x )    % should be <= 0.20732
ans = 0.26148
```

The correct result is

or the result from Gerald and Wheatley

which relates the relative error of the erroneous result x + Δx to the relative error of M.

Examples

1. The condition number of the matrix [5 1; -2 10] has a condition number of 2. When b = (0.0054784, 0.9009365)T, x = (-0.016272, 0.086839)T while a perturbed b = (0.095572, 0.901546)T has a relative error of 10% and the corresponding solution x = (0.0010417, 0.0903629)T has a corresponding relative error of almost 20%.

2. If M is a digaonal matrix, then the condition number is the ratio of the largest absolute diagonal entry over the smallest absolute diagonal entry. Therefore, the condition number of M = [5 0; 0 4] is 1.25.

3. One property of the 1-, 2-, and ∞-norms is that |||AB||| ≤ |||A|||·|||B|||. Consequently,
cond(AB) = |||AB|||·|||(AB)-1|||
= |||AB|||·|||B-1A-1|||
≤ |||A|||·|||B|||·|||B-1|||·|||A-1|||
= |||A|||·|||A-1|||·|||B|||·|||B-1|||
= cond(A)·cond(B).

As a consequence, scaling a matrix or multiplying it by a rotation matrix should not affect the condition number.

Questions

1. What is the condition number of the matrix [5 0; 0 2]?

2. What is the condition number of a rotation matrix [cos(θ) -sin(θ); sin(θ) cos(θ)]?

3. Given the matrix M = [2 3; 1 2] and given

```>> inverse( M )
ans =
2   -3
-1    2
```

what are the 1- and ∞-condition numbers of the matrix? (Both are 25)

4. What are the 1-, 2-, and ∞-condition numbers of the elementary matrices Ei,j (swap rows i and j) and Ei; c (multiply row i by c); and what are the 1- and ∞-condition numbers of the elementary matrix Ei,j; c (add c times row j onto row i)?

Applications to Engineering

Each circuit element, be it a resistor, capacitor, inductor, or transistor, has some amount of error associated with it. If each entry of a conductance matrix G is off by 10%, the relative error of the |||ΔG|||/|||G|||| could be as large as 10%. With dense matrices and random errors, this bound appears less likely to be achieved; however, for diagonally dominant matrices, perturbations on the diagonal entries will directly affect the relative error.

Similarly, errors in any currents will similarly affect the result. Thus, having a conductance matrix which has as small a condition number as possible will minimize the effects of these errors.

Matlab

The condition number of a matrix M is found in Matlab with the cond(M) command which uses the 2 norm by default. The condition number using the 1- or &infty;-norm may be found using cond(M, 1) or cond(M, Inf), respectively.

Matlab also provides a function condest(M) which provides an approximation to the condition number using the 1-norm.