
Laboratory 9

This laboratory will investigate two topics:

  1. MATH 211: 4th-order Runge-Kutta and the Dormand-Prince method, and
  2. MATH 215: Least Squares.

9.1 4th-order Runge-Kutta and the Dormand-Prince method

9.1.1 Additional Matlab Functionality

In this topic, we will discuss higher-order methods for solving initial-value problems; however, to start, we will begin with additional features in Matlab. Rather than teaching 1001 different Matlab commands in the first class, it is more useful to introduce commands when their use is apparent. It's like learning bridge and the various contract bridge bidding conventions: until you have played for a while, it's very difficult to understand the purpose of, for example, a negative double.

We will begin by defining functions in Matlab directly (as opposed to creating explicit .m files). For example, consider defining the function tex:$$y : t \mapsto 3.2 t \cos(1.52t) e^{-0.74t}$$:

y = @(t)(3.2*t.*cos(1.52*t).*exp(-0.74*t));

You can now treat this as any other user-defined function:

y(1)
  ans =
     0.0775
t = linspace( 0, 10, 1000 );
plot( t, y(t), 'r' )

It is also possible to define multi-variate functions but, in this case, the variables should be specified:

f9a = @(t, y)(-t*y + t - 1)
  f9a =
       @(t, y)(-t*y + t - 1)

f9a(0,1)                         % the slope at (0,1)
  ans =
      -1

Other useful commands:

  clc
Clear the command window: begin working with a blank screen.
  clear M
Unassign the variable M. This is often useful if you accidentally assign a value to the name of a useful function, for example, plot = [1 2 3]':
plot = [1 2 3]';
plot( x, y )             % fails
  ??? Subscript indices must either be real positive integers
  or logicals.
 
clear plot
plot( x, y )             % okay
Without any arguments, clear unassigns all variables.

  diary
Records all subsequent commands entered in Matlab, together with their corresponding output, in a file.

You can always use help cmd to get more information about the Matlab function cmd. For example, looking at help diary, you will notice that you can also call diary( 'filename.txt' ) to explicitly specify the file in which the diary will be kept (the default filename is diary).

9.1.2 4th-order Runge-Kutta

Recall that Heun's method is as follows:

  • Find the slope at tex:$$(t_0, y_0)$$ by calculating tex:$$K_0 = f_{9a}(t_0, y_0)$$,
  • Find the slope at tex:$$(t_0 + h, y_0 + hK_0)$$ by calculating tex:$$K_1 = f_{9a}(t_0 + h, y_0 + hK_0)$$, and
  • Use the average of the slopes to find the next approximation tex:$$y_1 = y_0 + h\frac{K_0 + K_1}{2}$$.

We have now sampled the slope at two points: tex:$$(t_0, y_0)$$ and tex:$$(t_0 + h, y_0 + hK_0)$$. One may reasonably expect that further samples may be used to reduce the error even further. The 4th-order Runge-Kutta algorithm proceeds as follows:

  • tex:$$K_0 = f_{9a}(t_0, y_0)$$,
  • tex:$$K_1 = f_{9a}\left(t_0 + \frac{1}{2}h, y_0 + \frac{1}{2}hK_0\right)$$,
  • tex:$$K_2 = f_{9a}\left(t_0 + \frac{1}{2}h, y_0 + \frac{1}{2}hK_1\right)$$, and
  • tex:$$K_3 = f_{9a}\left(t_0 + h, y_0 + hK_2\right)$$.

The approximation of tex:$$y_1$$ is found by taking a weighted average of the four slopes:

tex:$$y_1 = y_0 + h \frac{K_0 + 2K_1 + 2K_2 + K_3}{6}$$.

You will notice that the coefficients in the numerator add up to tex:$$1 + 2 + 2 + 1 = 6$$.
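For example, taking a single step of size tex:$$h = 0.5$$ for the initial-value problem tex:$$y^{(1)}(t) = -y(t)$$ with tex:$$y(0) = 2$$ (the test problem used below), so the slope function is tex:$$f(t, y) = -y$$, the four sampled slopes are

  • tex:$$K_0 = f(0, 2) = -2$$,
  • tex:$$K_1 = f(0.25, 2 + 0.25(-2)) = f(0.25, 1.5) = -1.5$$,
  • tex:$$K_2 = f(0.25, 2 + 0.25(-1.5)) = f(0.25, 1.625) = -1.625$$, and
  • tex:$$K_3 = f(0.5, 2 + 0.5(-1.625)) = f(0.5, 1.1875) = -1.1875$$,

and therefore tex:$$y_1 = 2 + 0.5\frac{-2 + 2(-1.5) + 2(-1.625) - 1.1875}{6} \approx 1.2135417$$, which differs from the exact value tex:$$2e^{-0.5} \approx 1.2130613$$ by less than tex:$$5 \times 10^{-4}$$.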

We could compare Euler's method, Heun's method and 4th-order Runge-Kutta, each taking a single step of size tex:$$h = 1$$, for example:

  Method                   Value
  Euler                     0
  Heun                     -0.5
  4th-order Runge-Kutta    -0.2557977040
  Exact                    -0.2484078630

however, this comparison would be unfair: Runge-Kutta uses four evaluations of the slope per step while Euler uses only one and Heun uses two. Thus, it would be more reasonable to allow four steps with tex:$$h = 0.25$$ for Euler's method and two steps with tex:$$h = 0.5$$ for Heun's method:

  Method        Steps                                              Value
  Euler         0.75, 0.51953125, 0.2407703400, -0.1389852180      -0.1389852180
  Heun          0.4687500000, -0.3137178270                        -0.3137178270
  Runge-Kutta   -0.2557977040                                      -0.2557977040
  Exact                                                            -0.2484078630

Similarly, we can approximate tex:$$y(0.5)$$, and this yields

  Method        Steps                                                      Value
  Euler         0.8750000000, 0.7600097656, 0.6452477295, 0.5221857591     0.5221857591
  Heun          0.7597656250, 0.5063127134                                 0.5063127134
  Runge-Kutta   0.5187200043                                               0.5187200043
  Exact                                                                    0.5187154179

The per-step error of 4th-order Runge-Kutta is actually tex:$$O(h^5)$$; however, as with Euler and Heun, when the method is used over multiple steps, the accumulated error is reduced by one order to tex:$$O(h^4)$$.

To test your code for rk4, which implements the 4th-order Runge-Kutta algorithm, you could try the following. The calling sequence should be the same as for heun.m from the previous laboratory.
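The following is a minimal sketch of one possible rk4.m, assuming the calling sequence [ts, ys] = rk4( f, [t0, tf], y0, n ), where n is the number of points (and hence n - 1 steps are taken); the details of your own implementation may differ.

function [ts, ys] = rk4( f, tspan, y0, n )
    % n equally spaced points from tspan(1) to tspan(2), so n - 1 steps
    ts = linspace( tspan(1), tspan(2), n );
    h  = (tspan(2) - tspan(1))/(n - 1);

    ys    = zeros( 1, n );
    ys(1) = y0;

    for k = 1:(n - 1)
        % The four sampled slopes
        K0 = f( ts(k),       ys(k)          );
        K1 = f( ts(k) + h/2, ys(k) + h*K0/2 );
        K2 = f( ts(k) + h/2, ys(k) + h*K1/2 );
        K3 = f( ts(k) + h,   ys(k) + h*K2   );

        % Weighted average of the slopes
        ys(k + 1) = ys(k) + h*(K0 + 2*K1 + 2*K2 + K3)/6;
    end
end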

In order to ensure that your function is working, you must run the following tests with the differential equation tex:$$y^{(1)}(t) = -y(t)$$:

2*exp(-0.5)
  ans =
     1.213061319425267

g = @(t, y)(-y);
[ts, ys] = rk4( g, [0, 0.5], 2, 2 )          % y(0) = 2 and approximate y(0.5) with two points (one step)
  ts =
                     0   0.500000000000000
  ys =
     2.000000000000000   1.213541666666667

[ts, ys] = rk4( g, [0, 0.5], 2, 3 )          % y(0) = 2 and approximate y(0.5) with three points (two steps)
  ts =
                     0   0.250000000000000   0.500000000000000
  ys =
     2.000000000000000   1.557617187500000   1.213085651397705


[ts, ys] = rk4( g, [0, 0.5], 2, 5 )          % y(0) = 2 and approximate y(0.5) with five points (four steps)
  ts =
     0                 0.125000000000000 0.250000000000000 0.375000000000000 0.500000000000000
  ys =
     2.000000000000000 1.764994303385417 1.557602445491486 1.374579721615834 1.213062689100529

[ts, ys] = rk4( g, [0, 0.5], 2, 1025 ); % y(0) = 2 and approximate y(0.5) with 1025 points (1024 steps)
ys(end)
  ans =
     1.213061319425268

Consider the following attempt to approximate tex:$$y(0.1)$$ for the same initial-value problem described above:

[ts, y1] = euler( g, [0, 0.1], 2, 2 )
  ts =
     0                   0.100000000000000
  y1 =
     2.000000000000000   1.800000000000000

[ts, y2] = heun( g, [0, 0.1], 2, 2 )
  ts =
     0                   0.100000000000000
  y2 =
     2.000000000000000   1.810000000000000

The correct solution is tex:$$y(0.1) = 2e^{-0.1}$$. Consider the values

dy1 = abs( y1(end) - 2*exp(-0.1) )
dy2 = abs( y1(end) - y2(end) )

The first is the absolute error of the approximation of tex:$$y(0.1)$$ using Euler's method. Recall that Heun's method is significantly more accurate than Euler's method, and therefore the second value, dy2, is an approximation of dy1. What is the relative error of dy2 as an approximation to dy1?
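As a reminder, the relative error of dy2 as an approximation of dy1 may be computed with something like

abs( dy2 - dy1 ) / abs( dy1 )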

Would it be reasonable to use a higher-order approximation to approximate the error of a lower-order approximation in the manner which has been described here?

9.2 Linear Regression and Least Squares

In this topic, we will investigate the problem of finding the best-fitting line or curve to data with noise.

9.2.1 Derivation

Suppose that we have a collection of data points which were read from an input source with noise, for example

tex:$$(0.3280, 2.4076), (2.4646, 3.2033), (2.7174, 3.3907), (2.7363, 3.5598), (2.9535, 3.7492),$$
tex:$$(3.4276, 4.1927), (3.9822, 4.5568), (3.9857, 3.9233), (4.2696, 4.5111), (4.4138, 4.9624),$$
tex:$$(4.5440, 4.9250), (5.7820, 5.5075), (7.1648, 6.4641), (7.3318, 5.9572), (7.4680, 6.6160),$$
tex:$$ (7.7770, 6.9075), (7.9168, 6.7510), (9.0514, 7.6340), (9.4930, 8.1167), (9.9049, 8.3345)$$.

A plot of this data is in Figure 1.


Figure 1. Twenty noisy data points.

The following is the Matlab code for the data.

xs = [0.3279750853 2.464595439 2.717370800 2.736334037 2.953520867...
3.427599402 3.982231181 3.985682362 4.269594141 4.413798254...
4.544029287 5.781998567 7.164764166 7.331776715 7.467984805...
7.777045621 7.916769076 9.051432635 9.492962523 9.904912835]';

ys = [2.407577618 3.203292359 3.390732556 3.559771990 3.749224638...
4.192705730 4.556825783 3.923328666 4.511085438 4.962406558...
4.925002895 5.507543465 6.464099937 5.957215200 6.616048328...
6.907485826 6.750967379 7.633998055 8.116679158 8.334500002]';

Clearly the data appears to be approximately linear, and if we wanted to approximate this data, we could find the interpolating polynomial. Unfortunately, this turns out to be the interpolating polynomial shown in Figure 2.


Figure 2. The polynomial of degree 19 interpolating the 20 points shown in Figure 1.
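For reference, the interpolating polynomial in Figure 2 could be computed and plotted with something like the following sketch (the names p19 and xp are introduced here only for illustration; Matlab will warn that the polynomial is badly conditioned, which is itself a hint that interpolation is the wrong tool for noisy data):

p19 = polyfit( xs, ys, 19 );      % interpolating polynomial of degree 19
xp  = linspace( 0, 10, 1000 );
plot( xs, ys, 'o', xp, polyval( p19, xp ), 'r' )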

As you may guess, this is completely inappropriate—just the constant coefficient of the interpolating polynomial is on the order of tex:$$7.76191 \times 10^9$$. Consequently, we would like to find an approximating line. The question is, which is the best possible line? For example, Figure 3 shows two possible lines which appear to be a reasonable approximation of the data.


Figure 3. Two possible lines approximating the data in Figure 1.

The question is, what criterion will we use to find the best possible line? First, the line we are finding is of the form tex:$$\alpha x + \beta$$ (we will use the Greek letters alpha and beta as these are standard in the literature). Given any line, there is an error associated with how well that line approximates our points. Figure 4 shows a zoom of the points tex:$$(x_2, y_2)$$ through tex:$$(x_8, y_8)$$, and let us suppose our approximating line is shown in blue. In this case, the point tex:$$(x_2, \alpha x_2 + \beta)$$ on the line is the approximation of the data point tex:$$(x_2, y_2) = (2.4646, 3.2033)$$. Similarly, the point tex:$$(x_8, \alpha x_8 + \beta)$$ approximates the point tex:$$(x_8, y_8) = (3.9857, 3.9233)$$.


Figure 4. A zoom on some of the points in the right-hand plot of Figure 3.

The black lines in Figure 4 represent the errors. We could, for example, attempt to choose a line which minimizes the sum of all the absolute errors, that is, we could minimize

tex:$$SE = \sum_{k = 1}^{20} |\alpha x_k + \beta - y_k|$$.

Unfortunately, for theoretical and practical reasons (for example, the absolute value is not differentiable at zero), this choice is difficult to work with. Instead, we choose to minimize the sum of the squares of the errors:

tex:$$SSE = \sum_{k = 1}^{20} (\alpha x_k + \beta - y_k)^2$$.

Here we have a function of only two variables: tex:$$\alpha$$ and tex:$$\beta$$—remember that all the tex:$$x_k$$ and tex:$$y_k$$ are already given: they are constants.

From first-year calculus, you have learned how to find a minimum: differentiate, equate to zero, and solve for the variable. In this case, we have two variables, and therefore we must take the two partial derivatives, equate each partial derivative to zero, and simultaneously solve for the two variables. Using the chain rule, we get:

tex:$$\frac{\partial SSE}{\partial \alpha} = \sum_{k = 1}^{20} 2(\alpha x_k + \beta - y_k)(x_k) = 0$$

and

tex:$$\frac{\partial SSE}{\partial \beta} = \sum_{k = 1}^{20} 2(\alpha x_k + \beta - y_k) = 0$$.

Expanding these two equations and dividing by 2, we get

tex:$$\sum_{k = 1}^{20}\left(  \alpha x_k^2 + \beta x_k - x_k y_k\right ) = 0$$

tex:$$\sum_{k = 1}^{20}\left(  \alpha x_k + \beta - y_k\right) = 0$$.

Because tex:$$\alpha$$ and tex:$$\beta$$ do not depend on the index tex:$$k$$, we may factor them out of the sums to get

tex:$$\alpha \sum_{k = 1}^{20} x_k^2 + \beta \sum_{k = 1}^{20} x_k = \sum_{k = 1}^{20} x_k y_k$$

tex:$$\alpha \sum_{k = 1}^{20} x_k + \beta \sum_{k = 1}^{20} 1 =  \sum_{k = 1}^{20} y_k$$.

Now we come back to Math 215 Linear Algebra: this is the system of linear equations

tex:$$\left( \matrix { \sum_{k = 1}^{20} x_k^2 & \sum_{k = 1}^{20} x_k \cr \sum_{k = 1}^{20} x_k & \sum_{k = 1}^{20} 1 } \right ) \left( \matrix { \alpha \cr \beta } \right ) =  \left( \matrix{ \sum_{k = 1}^{20} x_k y_k \cr \sum_{k = 1}^{20} y_k }\right )$$.

Without proof, the determinant of the matrix is zero if and only if tex:$$x_1 = x_2 = x_3 = \cdots = x_{20}$$.
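For instance, the entries of this linear system can be assembled directly from the sums; a quick check in Matlab (assuming the column vectors xs and ys defined above) might look like

A = [ sum(xs.^2)  sum(xs);
      sum(xs)     length(xs) ];
b = [ sum(xs.*ys); sum(ys) ];
A \ b                            % solves for alpha and beta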

In order to quickly calculate the matrix in Matlab, notice that if we define

M = [xs xs.^0];
M' * M
  ans =
    719.2291  107.7124
    107.7124   20.0000

M'*ys
  ans =
    657.5218
    105.6705

p = (M' * M) \ (M' * ys)
  p =
      0.6355
      1.8609

The last result is the solution to our minimization problem; that is, the best-fitting line (according to our criterion of minimizing the sum of the squares of the errors) is

tex:$$0.6355x + 1.8609$$.

This line is called the best-fitting least-squares line and the process to find it is called linear regression. A plot of the line together with the points is shown in Figure 5.


Figure 5. The best-fitting least-squares line passing through our data.

The line can be plotted by creating, for example, 100 points using xp = linspace( 0, 10, 100 ) and plotting those points against polyval( p, xp ); use a new name such as xp so that the data vector xs is not overwritten.
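For example:

xp = linspace( 0, 10, 100 );
plot( xs, ys, 'o', xp, polyval( p, xp ), 'b' )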

It is also possible to calculate the sum of the squares of the errors and demonstrate that even a slight change in the coefficients will produce a line which has a larger sum of squares of the errors:

yp = polyval( p, xs );
norm(yp - ys)^2
  ans =
     1.033686445547913

% A slightly different polynomial
ypp = polyval( [0.6354 1.8608]', xs );
norm(ypp - ys)^2
  ans =
     1.033698337287500

9.2.2 Matlab and Least Squares

You will notice that calculating tex:$${\bf M}^T {\bf M}$$ requires multiplying a tex:$$2 \times n$$ matrix by an tex:$$n \times 2$$ matrix, and thus it is only an tex:$$O(n)$$ calculation; however, Matlab provides an even shorter route:

With the above matrix M, suppose you asked

M \ ys

What would you expect? From linear algebra, we have a system of 20 equations but only two unknowns. Except in the most degenerate cases, no exact solution exists for such a system. Rather than returning an error, Matlab assumes you are looking for the best-fitting least-squares line:

(M' * M) \ (M' * ys)
  ans =
     0.635520678531010
     1.860852407821792

M \ ys
  ans =
     0.635520678531010
     1.860852407821796

There is a slight difference in the answers, but it is only in the last decimal place.

9.2.3 Best-Fitting Least-Squares Curves

What happens, however, if we don't want to find the best straight line? Consider the data in Figure 6.


Figure 6. Another set of ten data points with noise.

The data points are

tex:$$ (0.0078147, 1.5437), (0.0088483, 1.5578), (0.045021, 1.4204), (0.078259, 1.4023), (0.31008, 1.3384),$$
tex:$$(0.32167, 1.2961), (0.33596, 1.3065), (0.60890, 1.8055), (0.80118, 2.5618), (0.80787, 2.5122)$$

and, to full precision, the data points are

xs = [0.007814731122 0.008848343076 0.04502115842 0.07825884434 0.3100781521...
0.3216720831 0.3359594404 0.6089038628 0.8011831826 0.8078659323]';
ys = [1.543701192    1.557800839    1.420389441   1.402311968   1.338448518...
1.296099571 1.306500894 1.805512249 2.561774915 2.512208861]';

An important observation is that you can fit a straight line to any data; what is important is whether that line has any significance whatsoever.

For example, the best-fitting line is given by

M = [xs xs.^0];
p = M \ ys
  p =
      1.2307
      1.2652

and this is plotted in Figure 7; however, is this really the best curve we could fit to the data? Probably not.


Figure 7. The best-fitting least-squares line passing through the data in Figure 6.

Note that the data initially dips and then appears to rise again. We could attempt to fit a quadratic function to this data:

M = [xs.^2 xs xs.^0];
p = M \ ys
  p =
      4.0835
     -2.0560
      1.5472

If we plot this quadratic function tex:$$p(x) = 4.0835x^2 - 2.0560x + 1.5472$$, as is shown in Figure 8, we see that it appears to be a good fit.


Figure 8. The best-fitting least-squares quadratic function passing through the data in Figure 6.

Is this the best-fitting curve? Should we try a cubic function, or a quartic? With each extra term, the fit will be better and we know that if we try a polynomial of degree nine, the error will be zero—it will be a perfect fit.
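For instance, a cubic fit only requires one more column in the matrix; a sketch (whether such a fit is meaningful is exactly the modelling question discussed next) would be

M = [xs.^3 xs.^2 xs xs.^0];
p = M \ ys                       % coefficients of the best-fitting cubic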

This is a question about having a model for the data: does the model suggest a quadratic function? Are we taking the data from a system which suggests that the sampled data would behave quadratically as the variable tex:$$x$$ is changed?

As another example, assume we are testing a resistor by placing various currents through the resistor and measuring the voltages. For example,

tex:$$(0.03465 {\rm A}, 4.379 {\rm V}), (0.02765 {\rm A}, 3.301 {\rm V}), (0.08053 {\rm A}, 10.02 {\rm V}), (0.09868 {\rm A}, 11.62 {\rm V}), (0.08721 {\rm A}, 10.33 {\rm V}),$$
tex:$$(0.06688 {\rm A}, 7.831 {\rm V}), (0.08706 {\rm A}, 10.55 {\rm V}), (0.04057 {\rm A}, 4.940 {\rm V}), (0.04320 {\rm A}, 5.419 {\rm V}), (0.05856 {\rm A}, 7.111 {\rm V})$$

These points are given in terms of tex:$${\rm A}$$ and tex:$${\rm V}$$:

current = [0.03465445112 0.02764957747 0.08053249405 0.09868266998 0.08720882767 ...
           0.06688377664 0.08705955749 0.04056978469 0.04319976668 0.05855719319]';
voltage = [4.379276551   3.301032219  10.01909723  11.62485446    10.32856468 ...
           7.831413695  10.55234404    4.939755206  5.418582132    7.110929634]';


Figure 9. Current-voltage readings taken across a resistor.

You should note that the data, in this example, is not ordered, nor need it be.

If we find the best-fitting least-squares line, shown in Figure 10, we have

M = [current current.^0];
p = M \ voltage
  p =
    117.3677
      0.2151


Figure 10. The best-fitting least-squares line tex:$$v = 117.3677 {\Omega} i + 0.2151 {\rm V}$$.

This says that the voltage across the resistor is defined by

tex:$$v = 117.3677 {\Omega} i + 0.2151 {\rm V}$$

which should instantly raise the alarm in the mind of any electrical or computer engineering student (if it didn't, go back and review ECE 140). If the current is zero, the voltage must be zero.

Thus, rather than using the model tex:$$v = Ri + V$$, we should use the model tex:$$v = Ri$$; in other words, we want to find the best-fitting line which passes through the data and also passes through the origin.

M = [current];
r = M \ voltage
  r =
    120.3697

Thus, the estimate on the resistance of the resistor is tex:$$120.37 {\rm \Omega}$$ and this best-fitting line passing through the origin is shown in Figure 11.


Figure 11. The best-fitting least-squares line tex:$$v = 120.37 {\Omega} i$$.

To practice linear regression, you could try the following: find the best-fitting line (using linear regression) which passes through the points

x1 = [1.3862 1.4929 2.5751 5.4722 6.9908 8.4072 8.9090 9.5929];
y1 = [4.6006 4.8822 5.4627 6.9211 7.9681 8.7056 9.3139 9.4706];

Plot both the eight points and the best-fitting straight line you found, using the polyval function. Use an interval slightly larger than the range of the x-values of the given points.

Find the best-fitting line (using linear regression) which passes through the points

x2 = [1.3862 1.4929 2.5751 5.4722 6.9908 8.4072 8.9090 9.5929];
y2 = [0.4191 1.1449 4.5667 0.7619 4.1291 2.6917 4.9807 0.3909];

Plot both the eight points and the best-fitting straight line you found, using the polyval function. Use an interval slightly larger than the range of the x-values of the given points.

Two questions:

  • Is this the best-fitting line passing through these points? (yes or no)
  • Just because you can fit a line to data, does that necessarily mean the line has any significance? (yes or no)

Find the best-fitting line which passes through the points

x3 = [1.3862 1.4929 2.5751 5.4722 6.9908 8.4072 8.9090 9.5929];
y3 = [3.5101 6.6099 2.1850 -16.1321 -33.2007 -47.7779 -55.9318 -67.5378];

Plot both the eight points and the best-fitting straight line you found, using the polyval function. Use an interval slightly larger than the range of the x-values of the given points.

Now, find the best-fitting quadratic polynomial which passes through the eight points. Plot both the eight points and the best-fitting quadratic polynomial you found, using the polyval function. Use an interval slightly larger than the range of the x-values of the given points.