
Lesson 172: Unions



Suppose data may be stored in one of several formats, but only one of those formats is in use at any given time. For example,

  • a floating-point number may be stored as a float or as a double,
  • a position in an array may be defined by either an index (size_t) or a pointer (type *), or
  • an honorific may be stored either as an unsigned integer identifying a standard honorific (e.g., "Ms", "Mr", "Prof", "Dr", "MCpl", "Miss", "Mrs") or as a string holding a preferred but non-standard honorific.

In each case, only one of the two will be used, so there is no point in allocating memory for both. In most general applications, such frugal allocation of memory is seldom necessary; however, in operating systems and embedded systems, where memory may be at a premium, wasted memory may be a significant disadvantage.

The union data structure allows you to store any one of several different types in a single data structure, and the memory allocated to the union is only that required by the largest of its members. There is one caveat, however: you, the programmer, must know what is being stored at any time. This may, for example, require an additional variable, or reliance on a relevant state.
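The "additional variable" mentioned above can be sketched as follows. This is only one possible arrangement, and every name in it (position_kind, position_data_t, position_t, make_index_position, make_pointer_position, value_at) is hypothetical, chosen for this illustration: a structure pairs the union with an enumeration that records which member was last assigned.

```cpp
#include <cstddef>

// All names here are hypothetical and chosen only for this sketch.
enum class position_kind { by_index, by_pointer };

union position_data_t {
	std::size_t index;    // meaningful only when the kind is by_index
	int const  *pointer;  // meaningful only when the kind is by_pointer
};

struct position_t {
	position_kind   kind; // records which member of 'data' was last assigned
	position_data_t data;
};

// Build a position that is stored as an index.
position_t make_index_position( std::size_t index ) {
	position_t p;
	p.kind = position_kind::by_index;
	p.data.index = index;
	return p;
}

// Build a position that is stored as a pointer.
position_t make_pointer_position( int const *pointer ) {
	position_t p;
	p.kind = position_kind::by_pointer;
	p.data.pointer = pointer;
	return p;
}

// Return the array entry at the given position, reading only the
// member of the union that 'kind' says is in use.
int value_at( int const *array, position_t const &p ) {
	if ( p.kind == position_kind::by_index ) {
		return array[p.data.index];
	}

	return *p.data.pointer;
}
```

Note that the extra variable itself occupies memory, so this arrangement only pays off when the members of the union are large relative to the tag.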

In the definition of a union, each separate type is given its own identifier. If you declare a variable x to be of a specific union type, then to treat x as an instance of the type associated with the identifier type_value, you access x.type_value.

As an example of a definition of a union:

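union honorific_t {
	size_t id;
	// std::string has a non-trivial constructor and destructor, so (since C++11)
	// this union needs its own user-provided constructor and destructor
	std::string str;
};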

or

union float_t {
	float  float_value;
	double double_value;
};
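The claim that a union occupies only as much memory as its largest member can be checked with sizeof. The sketch below uses the hypothetical name float_or_double_t rather than float_t, only to avoid any clash with the float_t type alias declared by <cmath>; the helper functions union_size and members_share_address are likewise made up for this illustration.

```cpp
#include <cstddef>

// Hypothetical name; it has the same members as the union defined above.
union float_or_double_t {
	float  float_value;
	double double_value;
};

// The union must be able to hold its largest member.
static_assert( sizeof( float_or_double_t ) >= sizeof( double ),
               "a union is at least as large as its largest member" );

// The number of bytes occupied by the union.
std::size_t union_size() {
	return sizeof( float_or_double_t );
}

// Both members begin at the same address, which is why only one of them
// can hold a meaningful value at any one time.
bool members_share_address() {
	float_or_double_t x;

	return static_cast<void const *>( &x.float_value )
	    == static_cast<void const *>( &x.double_value );
}
```

On a typical platform where float is four bytes and double is eight, union_size() is expected to return 8 rather than 12.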

For example, the following stores π first as a single-precision and then as a double-precision floating-point value:

#include <iostream>
#include <iomanip>

int main();

union float_t {
	float  float_value;
	double double_value;
};

int main() {
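	float_t x;

	// Print the reference digits of pi for comparison
	std::cout << "3.14159265358979323846264338328"
	          <<  std::endl;

	// Store pi in the float member; the literal is rounded to float
	x.float_value = 3.14159265358979323846264338328;
	std::cout << std::setprecision( 30 )
	          <<  x.float_value << std::endl;

	// Store pi in the double member, overwriting the float
	x.double_value = 3.14159265358979323846264338328;
	std::cout << std::setprecision( 30 )
	          <<  x.double_value << std::endl;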

	return 0;
}

In the first case, because float maintains only 24 bits of precision, it is accurate to only about seven significant decimal digits, while the double-precision floating-point format stores 53 bits of precision, and thus is accurate to about sixteen significant decimal digits:

$\frac{24}{\log_2(10)} \approx 7.22$ and $\frac{53}{\log_2(10)} \approx 15.95$.
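The two ratios above can be reproduced with a short function. This is a sketch in which decimal_digits is a hypothetical helper name; the bit counts 24 and 53 assume the usual IEEE 754 formats, for which std::numeric_limits<float>::digits and std::numeric_limits<double>::digits report those same values.

```cpp
#include <cmath>

// Approximate number of decimal digits that can be represented
// with the given number of bits of precision.
double decimal_digits( int bits ) {
	return bits / std::log2( 10.0 );
}
```

Calling decimal_digits( 24 ) and decimal_digits( 53 ) gives the values for float and double, respectively.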

Thus, the output is

3.14159265358979323846264338328
3.1415927410125732421875
3.14159265358979311599796346854

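If you attempt to access a union as a type other than the one it was last assigned, you will get very bizarre results. Strictly speaking, C++ leaves the behavior of such an access undefined, so what follows is simply what one compiler on one platform produced: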

	x.float_value = 3.141592653589793;
	std::cout << std::setprecision( 30 )
	          <<  x.double_value << std::endl;

	x.double_value = 3.141592653589793;
	std::cout << std::setprecision( 30 )
	          <<  x.float_value << std::endl;

The output (the first line is the reference string, printed as in the previous program) is

3.14159265358979323846264338328
5.32864626443881739544466726114e-315
3370280550400

Thus, attempting to interpret a union as a type other than the one it was last assigned can lead to peculiar results.
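If the goal is to inspect the bits of a value deliberately, one commonly used alternative to reading the wrong member of a union is to copy the bytes with std::memcpy. The following is a minimal sketch, assuming a 32-bit IEEE 754 float; bits_of is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>

// Return the 32 bits that represent 'value', as an unsigned integer.
std::uint32_t bits_of( float value ) {
	static_assert( sizeof( float ) == sizeof( std::uint32_t ),
	               "this sketch assumes a 32-bit float" );

	std::uint32_t bits;
	std::memcpy( &bits, &value, sizeof( bits ) ); // copy bytes, no conversion

	return bits;
}
```

Under the stated assumption, bits_of( 1.0f ) is 0x3f800000.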

As another example:

#include <iostream>

int main();

union numeric_t {
	long   long_value;
	double double_value;
};

int main() {
	numeric_t x;

	x.double_value = 1;
	std::cout << x.long_value << std::endl;

	x.long_value = 1;
	std::cout << x.double_value << std::endl;

	return 0;
}

Executing this produces the output

4607182418800017408
4.94066e-324

Thus, assuming a 64-bit long, 0x0000000000000001 interpreted as a double is a very small floating-point value, and 0x3ff0000000000000 (the representation of the double 1.0) interpreted as a long is a very large integer value. You may note that:

$\sum_{k = 52}^{61} 2^k = 4607182418800017408$

and that

$2^{-1074} \approx 4.9406564584124654417656879286822137236506 \times 10^{-324}$.

After your first course in numerical analysis, where you examine the double-precision floating-point format, you will understand both of these interpretations.
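In the meantime, the two identities above can be checked numerically. In this sketch, sum_of_powers_of_two and two_to_the are hypothetical helper names, and the second function assumes an IEEE 754 double, for which $2^{-1074}$ is the smallest positive value.

```cpp
#include <cmath>
#include <cstdint>

// Sum 2^k for k = low, ..., high using 64-bit unsigned arithmetic;
// this assumes 0 <= low <= high <= 63.
std::uint64_t sum_of_powers_of_two( int low, int high ) {
	std::uint64_t sum = 0;

	for ( int k = low; k <= high; ++k ) {
		sum += std::uint64_t{ 1 } << k;
	}

	return sum;
}

// Return 2^exponent as a double.
double two_to_the( int exponent ) {
	return std::ldexp( 1.0, exponent );
}
```

Here, sum_of_powers_of_two( 52, 61 ) gives the integer printed by the program, which is also 0x3ff0000000000000, and two_to_the( -1074 ) gives the small floating-point value.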

Questions and practice:

1.

