Introduction
Theory
HOWTO
Error Analysis
Examples
Questions
Applications in Engineering
Matlab
Maple

# Introduction

The previous topic examined the most common form of
linear regression, that of finding the best fitting straight
line to data which is known to be linear in behaviour.

This next topic shows how we can find an arbitrary linear
combination of basis functions of the form

y(*x*) = *c*_{1} f_{1}(*x*) + *c*_{2} f_{2}(*x*) + ⋅⋅⋅ + *c*_{n} f_{n}(*x*)
which best fits a given set of data.

This technique is a straightforward generalization of the Vandermonde
method.

# Theory

# General Linear Regression

Just as we may generalize interpolation to use
a linear combination of basis functions, we may also
do so with least squares. The technique is identical
to the generalization of finding an interpolating
straight line passing through two points to the
finding of the best fitting least-squares line which
passes through *n* points.

If we are trying to fit the linear combination of
*m* basis functions:

y(*x*) = *c*_{1} f_{1}(*x*) + *c*_{2} f_{2}(*x*) + ⋅ ⋅ ⋅ + *c*_{m} f_{m}(*x*)
to a set of points, we define the generalized Vandermonde
matrix **V** whose *j*th column (*j* = 1, 2, ..., *m*) is the
*j*th basis function evaluated at each of the *x* values,
and then we solve:

**V**^{T}**V****c** = **V**^{T}**y**
Solving this system gives the vector of coefficients **c** which
defines the best fitting curve.

# HOWTO

# Problem

Given data (*x*_{i}, *y*_{i}),
for *i* = 1, 2, ..., *n* which is known to approximate
a curve described by the following linear combination of
basis functions:

y(*x*) = *c*_{1} f_{1}(*x*) + *c*_{2} f_{2}(*x*) + ⋅ ⋅ ⋅ + *c*_{m} f_{m}(*x*)
find the coefficients which define that best fitting curve.

# Assumptions

We will assume the model is correct and that the data
is defined by two vectors **x** = (*x*_{i}) and
**y** = (*y*_{i}).
Additionally, we must assume that the number of
unique values of *x* is at least as great as *m* and
that the *m* basis functions are linearly independent.

# Tools

We will use linear algebra.

# Process

To find the best fitting curve of the given form, we define the
generalized Vandermonde matrix **V** with entries *v*_{ij} = f_{j}(*x*_{i}),

where the first column is the function f_{1}(*x*) evaluated at
each of the *x* values, the second column is the function
f_{2}(*x*) evaluated at each of the *x* values, and
so on.

Hence, we solve the linear system **V**^{T}**Vc** = **V**^{T}**y**.

Having found the coefficient vector **c**, we now
associate the appropriate entries with the appropriate basis function:
*y*(*x*) = *c*_{1} f_{1}(*x*) + *c*_{2} f_{2}(*x*) + ⋅ ⋅ ⋅ + *c*_{m} f_{m}(*x*)
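The process above can be sketched in Python with NumPy (a sketch, not part of the original notes; the basis functions, `x`, and `y` below are illustrative and chosen so the fit is exact):

```python
import numpy as np

# Hypothetical basis: f1(x) = x^2, f2(x) = x, f3(x) = 1
basis = [lambda x: x**2, lambda x: x, lambda x: np.ones_like(x)]

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 0.0, 1.0, 4.0, 9.0])   # exactly (x - 1)^2

# Generalized Vandermonde matrix: column j is f_j evaluated at the x values
V = np.column_stack([f(x) for f in basis])

# Solve the normal equations V^T V c = V^T y
c = np.linalg.solve(V.T @ V, V.T @ y)
```

Because the data here lies exactly on (*x* − 1)², the computed coefficients are (1, −2, 1).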

# Error Analysis

The study of the error associated with a linear regression
is beyond the scope of this class. See any text on linear
regression, such as Draper and Smith, *Applied Regression Analysis*, 2nd Ed.

# Examples

# Example 1

The following data is known to be quadratic in nature:

(1, -0.3), (2, -0.2), (3, 0.5), (4, 2), (5, 4),

(6, 6), (7, 9), (8, 13), (9, 17), (10, 22)
This data is shown in Figure 1.

Figure 1. The given data points.

Using the technique of least squares, we define the generalized Vandermonde
matrix **V** whose *i*th row is (*x*_{i}^{2}, *x*_{i}, 1), together with the
vector **y** of the given *y* values.

We now solve **V**^{T}**V****c** = **V**^{T}**y** to get **c** = (0.29280, -0.75659, 0.18833)^{T}.

Therefore, the best fitting quadratic curve using the least-squares
technique is *y*(*x*) = 0.29280*x*^{2} - 0.75659*x* + 0.18833, which
is shown in Figure 2.

Figure 2. The best-fitting quadratic function using least squares.
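As a check, the coefficients in this example can be reproduced with NumPy's least-squares polynomial fit (a sketch added for verification, not part of the original example):

```python
import numpy as np

x = np.arange(1, 11)
y = np.array([-0.3, -0.2, 0.5, 2, 4, 6, 9, 13, 17, 22])

# Least-squares quadratic; coefficients come back in descending
# order of degree, matching (c1, c2, c3) above
c = np.polyfit(x, y, 2)   # c is approximately (0.29280, -0.75659, 0.18833)
```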

# Questions

1. Find the least-squares quadratic polynomial which fits the data:

**x** = (-2, -1, 0, 1, 2)^{T}

**y** = (3, 1, 0, 1, 5)^{T}

Answer: *y* = *x*^{2} + 0.4 *x* + 0.
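The answer to Question 1 can be checked with a quadratic least-squares fit (a NumPy sketch):

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
y = np.array([3, 1, 0, 1, 5])

# Least-squares quadratic; highest-degree coefficient first
c = np.polyfit(x, y, 2)   # c is approximately (1, 0.4, 0)
```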

2. Find the least-squares curve of the form
*y* = *a*sin(0.4*x*) + *b*cos(0.4*x*) for
the data

>> x = (0:10)';
>> y = [2.29 1.89 1.09 0.230 -0.801 -1.56 -2.18 -2.45 -2.29 -1.75 -1.01]';

Answer: 2.321cos(0.4*x*) − 0.6921sin(0.4*x*).

3. Given the same data in Question 2, find the best fitting curve
of the form *y* = *a*sin(0.4*x*) + *b*cos(0.4*x*) + *c*. Would you consider the constant
coefficient to be significant? (While this is posed as a thought-provoking
question here, there are statistical techniques to determine if a
particular coefficient is significant: determine the standard deviation of
each parameter and see if 0 falls within 1.96 standard deviations
of the coefficient.)

Answer: 2.318cos(0.4*x*) − 0.6860sin(0.4*x*) − 0.006986.
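The answers to Questions 2 and 3 can be checked with a least-squares solve over the stated basis functions (a NumPy sketch; the column ordering of the Vandermonde matrices below is an assumption):

```python
import numpy as np

x = np.arange(11.0)
y = np.array([2.29, 1.89, 1.09, 0.230, -0.801, -1.56, -2.18,
              -2.45, -2.29, -1.75, -1.01])

# Question 2: y = a sin(0.4 x) + b cos(0.4 x)
V2 = np.column_stack([np.sin(0.4 * x), np.cos(0.4 * x)])
a2, b2 = np.linalg.lstsq(V2, y, rcond=None)[0]

# Question 3: y = a sin(0.4 x) + b cos(0.4 x) + c
V3 = np.column_stack([np.sin(0.4 * x), np.cos(0.4 * x), np.ones_like(x)])
a3, b3, c3 = np.linalg.lstsq(V3, y, rcond=None)[0]
```

Comparing the two fits shows how little the constant term changes the sinusoidal coefficients, consistent with the constant being insignificant.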

# Applications in Engineering

The behaviour of engineering systems is most often
linear or quadratic. In these cases, linear
regression to curves of the form:

*y*(*x*) = *c*_{1} *x* + *c*_{2}
*y*(*x*) = *c*_{1} *x*^{2} + *c*_{2} *x* + *c*_{3}

may be appropriate. In certain cases, however, additional
information may be available: for example, the data may be
known to pass through the origin. This is the case for the
voltage across a resistor (which is zero when the current
is 0) and for the voltage produced by the angle of a
joystick (which is zero when the joystick is upright).
In these two cases, it would be more appropriate to
choose the following curves,

*y*(*x*) = *c*_{1} *x*
*y*(*x*) = *c*_{1} *x*^{2}

respectively.
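For a line through the origin, the normal equations collapse to the single equation *c*_{1} = (Σ *x*_{i} *y*_{i}) / (Σ *x*_{i}^{2}). A minimal NumPy sketch, using hypothetical resistor-style current/voltage measurements:

```python
import numpy as np

# Hypothetical measurements for a resistor (v = R*i, so the
# fitted line must pass through the origin)
i = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
v = np.array([0.48, 1.02, 1.51, 1.98, 2.51])

# One-column Vandermonde matrix, so V^T V and V^T y are scalars
c1 = (i @ v) / (i @ i)   # least-squares slope, roughly 5 ohms
```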

# Linear Prediction

Given a sequence of points *x*_{k}, we may wish to
predict *x*_{n} by taking a linear combination of
the previous *p* values, that is, we would like to find parameters
*c*_{k} for *k* = 1, ..., *p* such that:

*x*_{n} ≈ *c*_{1} *x*_{n − p} + *c*_{2} *x*_{n − p + 1} + ⋅ ⋅ ⋅ + *c*_{p} *x*_{n − 1}

Given a known sequence of values, say, the first *N* values
*x*_{1}, ..., *x*_{N}, this creates a sequence of
*N* − *p* equations in *p* unknowns, which must
therefore be solved using a least-squares approximation.

For example, given the points

0.0, 0.6234, 0.4990, 0.05738, -0.2279, -0.2140, -0.04619, 0.08045, 0.08975, 0.02770, -0.02709
which are sampled from a decaying sinusoid (the solution of a 2nd-order differential
equation, *e*.*g*., the result of a RLC circuit) and we want to predict future
values using the two previous values, applying the above general equation nine times, we get
the following equations:

0.0 *c*_{1} + 0.6234 *c*_{2} = 0.4990

0.6234 *c*_{1} + 0.4990 *c*_{2} = 0.05738

0.4990 *c*_{1} + 0.05738 *c*_{2} = -0.2279

0.05738 *c*_{1} − 0.2279 *c*_{2} = -0.2140

-0.2279 *c*_{1} − 0.2140 *c*_{2} = -0.04619

⋮
In Matlab, we can now calculate:

>> x = [0.0 0.6234 0.4990 0.05738 -0.2279 -0.2140 -0.04619 0.08045 0.08975 0.02770 -0.02709]';
>> n = length( x );
>> V = [x(1:n - 2) x(2:n - 1)];
>> y = x(3:n);
>> c = V \ y
c =
  -0.548760
   0.800500

Thus, we will predict *x*_{n} by using the formula
0.8005 *x*_{n − 1} − 0.54876 *x*_{n − 2}.

What may astound you is the result. The next table shows the values
*x*_{k} for *k* = 3, ..., 11 and their estimations based
on the previous least-squares best-fitting curve:

| Prediction | Actual Value |
|---|---|
| 0.4990320398 | 0.4990 |
| 0.0573530258 | 0.05738 |
| -0.2278983283 | -0.2279 |
| -0.2139219011 | -0.2140 |
| -0.0462447995 | -0.04619 |
| 0.08045943823 | 0.08045 |
| 0.08974747563 | 0.08975 |
| 0.02769721260 | 0.02770 |
| -0.02707731066 | -0.02709 |
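The same least-squares solve and predictions can be reproduced with NumPy (a sketch of the computation above):

```python
import numpy as np

x = np.array([0.0, 0.6234, 0.4990, 0.05738, -0.2279, -0.2140,
              -0.04619, 0.08045, 0.08975, 0.02770, -0.02709])

# Nine equations: row k is (x_{k-2}, x_{k-1}), predicting x_k
V = np.column_stack([x[:-2], x[1:-1]])
y = x[2:]

# Least-squares coefficients, approximately (-0.54876, 0.80050)
c = np.linalg.lstsq(V, y, rcond=None)[0]

pred = V @ c   # the "Prediction" column of the table
```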

This leads to a general observation: a sequence of points which are periodic
samples of the solution to a *p*th-order ordinary differential equation
with constant coefficients may be predicted using a linear predictor using
*p* prior terms. We will see a justification for this in a future
topic when we look at divided difference methods for estimating derivatives.

Thanks to Salah Ameer for suggesting this example.

# Matlab

Finding the coefficient vector in Matlab is very simple:

x = [1 2 3 4 5 6 7 8 9 10]';
y = [2.50922 2.12187 1.88092 1.94206 2.25718 2.79674 3.22682 4.09267 4.98531 6.37534]';
V = [x.^2 x x.^0];
c = V \ y; % same as c = (V' * V) \ (V' * y)

To plot the points and the best fitting curve, you can enter:

xs = (0:0.1:11)';
plot( x, y, 'o' )
hold on
plot( xs, polyval( c, xs ) );

Be sure to issue the command `hold off` if you want to
start with a clean plot window.

# Maple

The following commands in Maple:

with(CurveFitting):
pts := [[1, 2.5092], [2, 2.1219], [3, 1.8809], [4, 1.9421], [5, 2.2572], [6, 2.7967], [7, 3.2268], [8, 4.0927], [9, 4.9853], [10,6.3753]];
fn := LeastSquares( pts, x, curve = a*x^2 + b*x + c );
plots[pointplot]( pts );
plots[display]( plot( fn, x = 0..11 ), plots[pointplot]( pts ) );

calculates the least-squares quadratic of best fit for the given data points,
a plot of those points, and a plot of the points together with the
best-fitting curve.

For more help on the least squares function or on the CurveFitting package, enter:

?CurveFitting,LeastSquares
?CurveFitting