Scientific error analysis

In secondary school, you likely learned that if you had two values $x$ and $y$, each with an uncertainty that is normally distributed, say $s_x$ and $s_y$, respectively, and you were adding, subtracting, multiplying or dividing the values, you'd have to have some way of determining the the error of the result. The rules you may have learned include:

When adding them, $z = x + y$, the error is $s_z = \sqrt{ s_x^2 + s_y^2 }$, and when subtracting them, $z = x - y$, the error is also $s_z = \sqrt{ s_x^2 + s_y^2 }$.

When adding a fixed constant $c$ to a value, say $z = x + c$, the error remains unchanged, so $s_z = s_x$. When multiplying a value by a fixed constant $c$, say $z = cx$, this multiplies the error by the absolute value of that constant, so $s_z = |c|s_x$.

If we multiply two values, say $z = xy$, then the relative error of the result is related to the relative errors of the values being multiplied:

$\left( \frac{s_z}{z}\right)^2 = \left( \frac{ s_x }{x} \right)^2 + \left( \frac{ s_y }{y} \right)^2$, or $\frac{s_z}{|z|} = \sqrt{ \left( \frac{ s_x }{x} \right)^2 + \left( \frac{ s_y }{y} \right)^2}$.

The formula for the error when calculating $z = x \div y$ is identical.

In many simple calculations, for example, in calculating $PV = nRT$, if you have three of the values and are trying to estimate the error in the last (remember, as of 2019, the molar gas constant $R$ is a constant exactly defined as $8.31446261815324$), then you can generally use the above techniques by combining the errors one at a time, so suppose we have measured pressure, volume and temperature and are attempting to estimate the number of moles:

$n = \frac{PV}{RT}$.

You would find the error is:

$\frac{s_n}{n} = \sqrt{ \left( \frac{ s_P }{P} \right)^2 + \left( \frac{ s_V }{V} \right)^2 + \left( \frac{ s_T }{T} \right)^2}$.

From this, you notice that there is no contribution to the error on the right-hand side by the scalar multiple $R$. This is not however gone, for $n = \frac{PV}{RT}$ so multiplying both sides by $n$ is equivalent to calculating

$s_n = \frac{PV}{RT} \sqrt{ \left( \frac{ s_P }{P} \right)^2 + \left( \frac{ s_V }{V} \right)^2 + \left( \frac{ s_T }{T} \right)^2}$,

and thus, the error is multiplied by the reciprocal of $R$.

However, suppose you blindly subtracted $z = 2x - x$. If you simply applied the rules above, you would find that the error is $s_z = 3s_x$, but this is nonsensical, because the error of $2x$ is correlated with the error of $x$, so the correct answer is $s_z = s_z$. Now this may be obvious, but suppose your calculation had you determine:

$z = \frac{3x + 2y + 1}{|x + y|}$ or $z = \cos(0.05x + 0.07y)$.

What is the error of either $z$? The issue here is that there are many correlations going on simultaneously, and this will affect the result. Consequently, a reasonable step is to assume that the errors of $x$ and $y$ are significantly smaller than that of $|x|$ and $|y|$, respectively, and what do is linearize the expression, because we know that the error of $cx$ is $|c|s_x$. More generally, if there are a greater number of variables, we have the formula

$_z^2 = \left( \frac{\partial f(x,y)}{\partial x} \right)^2 s_x^2 + \left( \frac{\partial f(x,y)}{\partial y} \right)^2 s_y^2$

with the obvious generalization if there are more than two variables. In the two above examples, in the first expression, if $x = 1 \pm 0.13$ and $y = 2 \pm 0.8$, the error of the result can now be calculated as:

$s_z^2 \approx 0.6261^2\cdot 0.13^2 + (-0.5367)^2 0.8^2 = 0.1909$

Thus, the result is $3.58 \pm 0.44$. For the second expression, with the same values and errors, the error can now be calculated as:

$s_z^2 \approx (0.009443)^2\cdot 0.13^2 + (0.01322)^2 0.8^2 = 0.0001134$

Thus, the result is $0.982 \pm 0.011$. Note that the argument of $\cos$ in the second example is small, and thus, we are close to the maximum of the cosine function, and there the function is very flat, hence, the error is actually reduced in magnitude.

This can be done with the ScientificErrorAnalysis package:

[> with( ScientificErrorAnalysis ): 
[> x := Quantity( 1.0, 0.13 ): 
[> y := Quantity( 2.0, 0.8 ): 
[> combine( (3*x + 2*y + 1)/abs( x + y ), 'errors' );

$\textrm{Quantity}(3.577708765, 0.4369723113)$

[> # Round the error to two digits, and round the value to the same decimal point 
[> ApplyRule( %, round[2] );

$\textrm{Quantity}(3.58, 0.44)$

For the second example,

[> combine( \cos(0.05x + 0.07y), 'errors' );

$\textrm{Quantity}(0.9820042351, 0.01064710341)$

[> ApplyRule( %, round[2] );

$\textrm{Quantity}(0.982, 0.011)$

Of course, the scientific error analysis package makes the assumption that the error is relatively small when compared to the quantities that are being measured and the concavity of the expression in the neighborhood of the point at which it is being evaluated is also not very significant, in which case, a linear approximation of the expression being calculated is quite reasonable; however, if either of these conditions do not hold, this can cause serious issues, possibly underestimating the consequent error significantly. For this, there is the tolerances package.

You'll note that we have the same issues with Newton's method, which creates a linear approximation of the expression at the point where we have an approximation of the root. Newton's method works well if the value of the expression at the point is small, and the second derivative is also not very large, but if either of these is false, then the next approximation may be quite far away from the actual root.

Example of the force between the Earth and the Moon

You may recall that the scientific constants package also has errors associated with at least those constants that are unknown. This package can combine those errors, so for example, suppose we wanted to know what the force between the Earth and the Moon is when the Moon is at perigee (that is, closest to the Earth). There is an an error in the mass of the Earth, an error in the mass of the Moon, but also, the distance between the Earth and Moon at perigee changes, so we can only estimate this value. On top of all this, the gravitational constant $G$ also has an error associated with it. However, Maple makes this easy:

[> with( ScientificConstants ): 
[> m[Moon] := Quantity( 7.3458e22, 0.0007e22 ): 
[> m[Earth] := Quantity( 5.9722e24, 0.0006e24 ): 
[> r[Perigee] := Quantity( 363.4e6, 7.0e6 ): 
[> combine( Constant( 'G' )*m[Moon]*m[Earth]/r[Perigee]^2, 'errors' );

$\textrm{Quantity}(0.2217149638 \cdot 10^{21}, 0.8541639518 \cdot 10^{19} )$

[> ApplyRule( %, round[2] );

$\textrm{Quantity}(0.2217 \cdot 10^{21}, 0.85 \cdot 10^{19} )$

Example using the ideal gas law

Returning to our ideal gas law example above, suppose that we know that the conditions are $P = (85315 \pm 456) \textrm{ Pa}$ (we're in a slight vacuum), $V = ( 0.00003814 \pm 0.00000075 ) \textrm{ m}^3$ (around $38.14 \textrm{ mL}$), $T = (292.5 \pm 0.3) \textrm{ K}$ (around $18^\circ \textrm{C}$), and we'd like to estimate the number of moles. If we use a previous version of Maple, before 2019, the ideal gas constant $R$ had an error, but in 2019, the ideal gas constant was defined as an exact value. Let's do both using the older version of Maple:

[> R_2019 := 8.31446261815324: 
[> evalf( Constant( 'R' ) ); # The new value above is slightly larger

$8.3144598$

[> P := Quantity( 85315, 456 ): 
[> V := Quantity( 0.00003814, 0.00000075 ): 
[> T := Quantity( 292.5, 0.3 ): 
[> combine( Constant( 'R' )*T/(P*V), 'errors' );

$\textrm{Quantity}(747.4012580, 15.24970176)$

[> combine( R_2019*T/(P*V), 'errors' );

$\textrm{Quantity}(747.4015113, 15.24970692)$

As you can tell, the redefinition of the ideal gas constant does little to affect the propagated error, as the two constants match each other for the first five digits. For your interest, if you just use the old value of the ideal gas constant without the error provided for in the ScientificConstants package, you get:

[> combine( 8.3144598*T/(P*V), 'errors' );

$\textrm{Quantity}(747.4012580, 15.24970175)$

So, as you can see, in this case, there is a negligible impact on the error (only the last digit is affected). However, better safe than sorry.