
C Structures: `struct`

What is a `struct`?

A struct allows the programmer to create a composite type or record of fields, an instance of which would consist of a fixed set of labeled objects. Such a type would be used to represent an actual or abstract object that is associated with more than one parameter.

For example, a real-valued coordinate in the plane is made up of two variables, $(x, y)$. Suppose we are using this coordinate in a search algorithm that requires each point to be visited. Consequently, we could associate with each coordinate a Boolean value indicating whether or not the point has been visited (this requires #include <stdbool.h>). Thus we could define the structure:

struct Coord2d {
	double x, y;
	bool visited;
};

To define a variable of such a type, we simply declare it as such:

int main() {
	struct Coord2d point;

	// ...
}

Alternatively, we could declare a pointer to such a data type, and then allocate sufficient memory for such a structure (this requires #include <stdlib.h>):

int main() {
	struct Coord2d *p_point = malloc( sizeof( struct Coord2d ) );

	// ...

	free( p_point );   // Release the memory once it is no longer needed
}

The operator sizeof will return how much memory its argument requires. Note that I call it an operator and not a function: no function takes a type as an argument; instead, sizeof is recognized by the compiler as a directive to replace it with the number of bytes that type occupies. In this case, we could run the following program:

% more struct.1.c 
#include <stdio.h>
#include <stdbool.h>

struct Coord2d {
        double x, y;
        bool visited;
};

int main() {
        printf( "The memory required by 'struct Coord2d' is %zu\n",
                 sizeof( struct Coord2d ) );

        return 0;
}
% gcc struct.1.c 
% ./a.out 
The memory required by 'struct Coord2d' is 24

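You may immediately notice that each double is 8 bytes and a bool is only one byte, for a total of 17 bytes; however, the entire structure occupies 24 bytes. This is because the compiler pads the structure with unused bytes so that every field, and every entry in an array of such structures, is aligned for efficient access by the processor, as is shown in Figure 1. The exact size depends on the compiler and the platform.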

Figure 1. Memory allocated for the struct Coord2d. The white bytes are unused padding.
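The padding described above can be measured rather than guessed. The following is a minimal sketch, not part of the original program: the function names coord2d_padding and coord2d_visited_offset are hypothetical, and offsetof comes from the standard header <stddef.h>.

```c
#include <stdbool.h>
#include <stddef.h>   /* offsetof, size_t */

struct Coord2d {
	double x, y;
	bool visited;
};

/* Number of bytes in the structure that belong to no field (the padding). */
size_t coord2d_padding( void ) {
	return sizeof( struct Coord2d )
	     - ( 2*sizeof( double ) + sizeof( bool ) );
}

/* Byte offset of the field 'visited' from the start of the structure. */
size_t coord2d_visited_offset( void ) {
	return offsetof( struct Coord2d, visited );
}
```

On a platform matching the output above (24 bytes in total, with 8-byte doubles and a 1-byte bool), coord2d_padding() would return 7. Other compilers and processors may give different values, which is why code should use sizeof rather than a hard-coded size.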

To access the fields of a variable defined to be of this type, we use the . operator:

	point.x =   5.3425;
	point.y = -73.2325;
	point.visited = false;

	printf( "The coordinate is (%f, %f)\n", point.x, point.y );

To access or modify the fields of a structure through a pointer that has been assigned its address, we can de-reference the pointer and then use the . operator:

	(*p_point).x = 29.8521;
	(*p_point).y = 18.9505;
	(*p_point).visited = false;

	printf( "The coordinate is (%f, %f)\n",
                 (*p_point).x, (*p_point).y );

Recall that p_point was declared to be a pointer; that is, it stores the address of a struct Coord2d. Thus, *p_point indicates that you mean to refer to the coordinate stored at that memory location, and then you can use the . operator to indicate which field you wish to access.

The notation (*p_point).x is somewhat cumbersome; therefore, it can be replaced by the equivalent -> operator:

	p_point->x = -5.5993;
	p_point->y = 19.7851;
	p_point->visited = false;

	printf( "The coordinate is (%f, %f)\n",
                 p_point->x, p_point->y );
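Putting allocation and the -> operator together, a minimal sketch might look as follows. The helper names coord2d_new and coord2d_visit are hypothetical (they do not appear elsewhere in this text); the sketch also checks the result of malloc, which returns NULL when the memory cannot be allocated.

```c
#include <stdbool.h>
#include <stdlib.h>   /* malloc, free */

struct Coord2d {
	double x, y;
	bool visited;
};

/* Hypothetical helper: allocate one coordinate and initialize its fields.
 * Returns NULL if the allocation fails; the caller must free() the result. */
struct Coord2d *coord2d_new( double x, double y ) {
	struct Coord2d *p_point = malloc( sizeof( struct Coord2d ) );

	if ( p_point == NULL ) {
		return NULL;   /* Allocation failed; there is nothing to initialize */
	}

	p_point->x = x;
	p_point->y = y;
	p_point->visited = false;

	return p_point;
}

/* Mark a coordinate as visited through a pointer.
 * Writing (*p_point).visited = true would have exactly the same effect. */
void coord2d_visit( struct Coord2d *p_point ) {
	p_point->visited = true;
}
```

A caller would store the result of coord2d_new( 1.5, -2.0 ) in a pointer, confirm that it is not NULL before using it, and pass it to free once the coordinate is no longer needed.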

An array of structures can be created just as easily:

% more struct.1.c 
#include <stdio.h>
#include <stdbool.h>

#define N 20

struct Coord2d {
        double x, y;
        bool visited;
};

int main() {
	int i;
	struct Coord2d point_array[N];

	for ( i = 0; i < N; ++i ) {
		point_array[i].x = 0.0;
		point_array[i].y = 0.0;
		point_array[i].visited = false;
	}

	// Each of the N coordinates is now initialized

	// ...

	return 0;
}

More intelligently, however, we would define an init function that would initialize each of the entries:

/**************************************************
 * Initialize the entries of an array of Coord2d. *
 **************************************************/

void init( struct Coord2d *array, int n ) {
	int i;

	for ( i = 0; i < n; ++i ) {
		array[i].x = 0.0;
		array[i].y = 0.0;
		array[i].visited = false;
	}
}

Because the name of an array, when passed to a function, is converted to the address of its first entry (always whatever is at location 0, that is, &array[0]), this function would be called as follows:

int main() {
	struct Coord2d point_array[N];

	init( point_array, N );

	// ...

	return 0;
}
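Since the function receives only the address of the first entry, it cannot recover the number of entries on its own, which is why n is passed separately. A short sketch of a second function written in the same style follows; the name count_unvisited is hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

struct Coord2d {
	double x, y;
	bool visited;
};

/* Count the entries that have not yet been visited.
 * 'array' is a pointer to the first entry, so passing point_array
 * and passing &point_array[0] are equivalent. */
int count_unvisited( const struct Coord2d *array, int n ) {
	int i;
	int count = 0;

	for ( i = 0; i < n; ++i ) {
		if ( !array[i].visited ) {
			++count;
		}
	}

	return count;
}
```

The const qualifier documents that the function only reads the array. Passing array + 1 together with n - 1 would examine everything except the first entry, which is a direct consequence of the argument being an address.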

We will now look at two examples: the first is motivational (why do we use struct) while the second is a more detailed and complex example.

A Motivating Example

Suppose you have numerous records you would like to keep, where every record has the same fields. For example, suppose you were designing some form of on-line Dungeons and Dragons™ game where you have a number of players. You come up with the characteristics of each individual:

	char *name;
	int x_coord, y_coord;
	short gender;
	int hit_points;

Suppose you expected twenty players. You could therefore create an array of size twenty for each of these characteristics:

#define MAX_PLAYERS 20

	char *name[MAX_PLAYERS];
	int x_coord[MAX_PLAYERS], y_coord[MAX_PLAYERS];
	short gender[MAX_PLAYERS];
	int hit_points[MAX_PLAYERS];

Each function that requires this information must be passed every relevant field as a separate argument:

void upgrade_experience( int *hit_points ) {
	// Some code...
}

void combat(
	char *name_1, int *x_coord_1, int *y_coord_1, int *hit_points_1,
	char *name_2, int *x_coord_2, int *y_coord_2, int *hit_points_2
) {
	// Some more code...
}

void move_character( int *x_coord, int *y_coord ) {
	// Some more code...
}

Suppose, however, that your game suddenly takes off and now you decide to add additional features: vertical coordinates, experience points, etc. Now you have to add even more arguments to more functions and hope you keep them straight. You'd have to change every function call to add the new coordinate:

void combat(
	char *name_1, int *x_coord_1, int *y_coord_1, int *z_coord_1, int *hit_points_1,
	char *name_2, int *x_coord_2, int *y_coord_2, int *z_coord_2, int *hit_points_2
) {
	// Some more code...
}

As you can imagine, this would lead to a maintenance nightmare ending with the ultimate collapse of the system.

However, all this information is about a player in the game: should we not be able to associate all fields that are associated with a player together, rather than in separate arrays? This is the purpose of a structure, or struct in C:

struct Player {
	char *name;
	int x_coord, y_coord;
	short gender;
	int hit_points;
};

Now, you could create an array of structures:

#define MAX_PLAYERS 20

struct Player player_list[MAX_PLAYERS];

If you wanted to initialize these, a for-loop would be quite nice:

for ( i = 0; i < MAX_PLAYERS; ++i ) {
	player_list[i].hit_points = 100;
}

This would initialize the hit_points field of each player to 100.

Now, each function that requires any information at all about one or more of the players simply requires us to pass a reference to the appropriate entry in the array:

void upgrade_experience( struct Player *plyr ) {
	// Some code...
}

void combat( struct Player *plyr_1, struct Player *plyr_2 ) {
	// Some more code...
}

void move_character( struct Player *plyr ) {
	// Some more code...
}

Now, if you're successful and you want to expand your game, if you add more features describing each player, you just add the additional fields into the structure:

struct Player {
	char *name;
	int x_coord, y_coord, z_coord;
	short gender;
	int hit_points;
	int experience_points;
};

Updating the initialization would be straight-forward:

for ( i = 0; i < MAX_PLAYERS; ++i ) {
	player_list[i].hit_points = 100;
	player_list[i].experience_points = 0;
}

Inside the combat function, to reference the new coordinate, one would simply have to refer to plyr_1->z_coord and plyr_2->z_coord. Nothing else would change in the parameters or in the calls to the function.

This also, incidentally, reduces the time to make function calls: rather than copying eight or ten or twelve parameters, only two parameters need be passed to the combat function, at least, with respect to the players.
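To make the benefit concrete, here is a minimal sketch of what the body of combat might look like. The game rules are invented purely for illustration and are assumptions, not part of the original design: combat only takes place when both players are at the same location, each player then loses 10 hit points, and hit points never drop below zero. The helper take_damage is likewise hypothetical.

```c
struct Player {
	char *name;
	int x_coord, y_coord, z_coord;
	short gender;
	int hit_points;
	int experience_points;
};

/* Hypothetical rule: remove 'damage' hit points, but never go below zero. */
void take_damage( struct Player *plyr, int damage ) {
	plyr->hit_points -= damage;

	if ( plyr->hit_points < 0 ) {
		plyr->hit_points = 0;
	}
}

/* Hypothetical rule: combat requires both players to share a location. */
void combat( struct Player *plyr_1, struct Player *plyr_2 ) {
	if ( plyr_1->x_coord != plyr_2->x_coord
	  || plyr_1->y_coord != plyr_2->y_coord
	  || plyr_1->z_coord != plyr_2->z_coord ) {
		return;   /* The players are not at the same location */
	}

	take_damage( plyr_1, 10 );
	take_damage( plyr_2, 10 );
}
```

A call such as combat( &player_list[0], &player_list[1] ) passes two addresses regardless of how many fields the structure contains, so adding z_coord changed only the body of the function.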

A More Interesting Example

Suppose you are recording the propagation of heat at various points on an electromechanical device which is initially at the ambient temperature. You have perhaps two dozen sensors (say, NUM_SENSORS) each of which has a location on the device, and you would like to track the last NUM_READINGS temperatures. We begin with a structure that stores the location of a sensor:

struct Coord3d {
	float x, y, z;
};

You could now use this to build a sensor.

struct Temp_sensor {
	struct Coord3d loc;
	float reading_array[NUM_READINGS];
	int current_reading;
};

The field current_reading would be used to indicate where in the array the most recent reading was recorded. When a new reading comes in, current_reading would be incremented, wrapping back to 0 once it reaches NUM_READINGS, and the new reading would be stored in that location, overwriting the oldest one. Thus, this is a cyclic array, as shown in Figure 2.

Figure 2. A sample of the state of an array of temperatures.
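The cyclic update described above can be sketched as follows. The function name record_reading is hypothetical, and the value 8 for NUM_READINGS is an assumption made only so that the example compiles; the text does not fix a value.

```c
#define NUM_READINGS 8   /* Assumed value, for illustration only */

struct Coord3d {
	float x, y, z;
};

struct Temp_sensor {
	struct Coord3d loc;
	float reading_array[NUM_READINGS];
	int current_reading;
};

/* Advance to the next slot, wrapping from the last index back to 0,
 * and store the new temperature there (overwriting the oldest reading). */
void record_reading( struct Temp_sensor *sensor, float temperature ) {
	sensor->current_reading = ( sensor->current_reading + 1 ) % NUM_READINGS;
	sensor->reading_array[sensor->current_reading] = temperature;
}
```

The remainder operator keeps the index in the range 0 to NUM_READINGS - 1, so the array never needs to be shifted or resized.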

We could then declare an array of NUM_SENSORS temperature sensors:

	struct Temp_sensor sensor_array[NUM_SENSORS];

The code to, for example, initialize all of the temperature values to a given value could then be done as follows:

	int i, j;

	for ( i = 0; i < NUM_SENSORS; ++i ) {
		sensor_array[i].current_reading = 0;

		for ( j = 0; j < NUM_READINGS; ++j ) {
			sensor_array[i].reading_array[j] = initial_temperature;
		}
	}

This initial temperature may be the ambient temperature when the device is first being used.

We will now look at two modifications to this structure.

First Modification

Now, suppose you want to add an option that allows the user to assign a colour to each sensor. This would require red, green and blue components:

struct Color {
	unsigned char red, green, blue;   // Each colour is from 0 to 255 allowing 16 777 216 colours
};

Note, if the colours were for printing, it would be better to use CMYK.

Updating the temperature sensor code is, again, straight-forward:

struct Temp_sensor {
	struct Coord3d loc;
	float reading_array[NUM_READINGS];
	int current_reading;
	struct Color display_color;
};

Initializing an array of NUM_SENSORS sensors to red would be as simple as:

	int i;

	for ( i = 0; i < NUM_SENSORS; ++i ) {
		sensor_array[i].display_color.red   = MAX_COLOR;
		sensor_array[i].display_color.green =         0;
		sensor_array[i].display_color.blue  =         0;
	}

where MAX_COLOR would be a constant defined to be 255.
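Because a structure can be assigned and returned as a single value, the three assignments can also be collapsed into one. The following is a sketch under stated assumptions: the helpers make_color and paint_sensors are hypothetical, and NUM_READINGS is given the arbitrary value 8 so that the example is complete.

```c
#define NUM_READINGS 8     /* Assumed value, for illustration only */
#define MAX_COLOR    255

struct Coord3d {
	float x, y, z;
};

struct Color {
	unsigned char red, green, blue;
};

struct Temp_sensor {
	struct Coord3d loc;
	float reading_array[NUM_READINGS];
	int current_reading;
	struct Color display_color;
};

/* Hypothetical helper: build a Color from its three components. */
struct Color make_color( unsigned char red, unsigned char green,
                         unsigned char blue ) {
	struct Color c;

	c.red   = red;
	c.green = green;
	c.blue  = blue;

	return c;
}

/* Hypothetical helper: give every sensor in the array the same colour.
 * Assigning one struct to another copies all of its fields. */
void paint_sensors( struct Temp_sensor *array, int n, struct Color c ) {
	int i;

	for ( i = 0; i < n; ++i ) {
		array[i].display_color = c;
	}
}
```

With these in place, paint_sensors( sensor_array, NUM_SENSORS, make_color( MAX_COLOR, 0, 0 ) ) would set every sensor to red. If a fourth component were ever added to struct Color, only make_color would need to change.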

Note: in today's global market, it is best if you use American spellings--even if you have to swallow your national pride. On occasion, when I use "color", I get mildly annoyed; however, I still do it because it's the right thing to do from an engineering point-of-view. For any employer or client, find out what their preference is.

Second Modification

Suppose now you determine that the readings do not come in at a regular enough cycle. Consequently, you need to record not only the temperature, but the time at which the reading was taken. This would allow you to analyze the data more correctly. Thus, we should add a time array, as well. At this point, the easiest solution would be as follows:

struct Temp_sensor {
	struct Coord3d loc;
	float reading_array[NUM_READINGS];
	int time[NUM_READINGS];
	int current_reading;
};

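This, too, is poor programming practice: the only thing linking a temperature to the time at which it was taken is a shared index into two separate arrays. Instead, because each reading has associated with it a temperature and a time, we should develop a new structure: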

struct Reading {
	float temperature;   // Temperature in degrees Celsius.
	int time;            // Seconds after midnight 2012.
};

In the temperature sensor structure, we can now replace the array of floating-point numbers with an array of readings:

struct Temp_sensor {
	struct Coord3d loc;
	struct Reading reading_array[NUM_READINGS];
	int current_reading;
};

Now, a snapshot of the array of one sensor would be as is seen in Figure 3.

Figure 3. A sample of the state of an array of temperature readings.

Now our initialization code would look as follows:

	int i, j;

	for ( i = 0; i < NUM_SENSORS; ++i ) {
		sensor_array[i].current_reading = 0;

		for ( j = 0; j < NUM_READINGS; ++j ) {
			sensor_array[i].reading_array[j].temperature = initial_temperature;
			sensor_array[i].reading_array[j].time = 0;
		}
	}

Here, sensor_array[i] references the i-th entry in the array. Because it is an array of struct Temp_sensor, each entry has the fields loc, reading_array and current_reading. We are interested in the field reading_array. This is itself an array, and therefore we access the j-th entry of this array by accessing sensor_array[i].reading_array[j]. Because this is an array of struct Reading, we can access the two fields of each entry by using sensor_array[i].reading_array[j].temperature and sensor_array[i].reading_array[j].time.
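The same chain of accesses works through a pointer to a single sensor, with -> replacing the first . operator. A minimal sketch follows; the function names latest_reading and mean_temperature are hypothetical, and NUM_READINGS is again assumed to be 8 for the sake of a complete example.

```c
#define NUM_READINGS 8   /* Assumed value, for illustration only */

struct Coord3d {
	float x, y, z;
};

struct Reading {
	float temperature;   /* Temperature in degrees Celsius */
	int time;
};

struct Temp_sensor {
	struct Coord3d loc;
	struct Reading reading_array[NUM_READINGS];
	int current_reading;
};

/* Return the address of the most recently stored reading of one sensor. */
const struct Reading *latest_reading( const struct Temp_sensor *sensor ) {
	return &sensor->reading_array[sensor->current_reading];
}

/* Average the temperatures of all readings held by one sensor. */
float mean_temperature( const struct Temp_sensor *sensor ) {
	float sum = 0.0f;
	int j;

	for ( j = 0; j < NUM_READINGS; ++j ) {
		sum += sensor->reading_array[j].temperature;
	}

	return sum / NUM_READINGS;
}
```

For example, latest_reading( &sensor_array[i] )->time gives the time of the newest reading of the i-th sensor. Because a temperature and its time travel together in one struct Reading, neither function can accidentally pair a temperature with the wrong time.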

Introduction to Computer Structures and Real-time Systems