Three implementations are provided:
- Pseudo code,
- A basic implemenation of the pseudo code,
- An enhanced implementation, and
- A pointer-based implementation.
When compiled with the -O2 option, the basic implementation
sorts an array of size 200 million in 16-17 seconds, the
enhanced array-based implementation sorts the same list in 12-13 seconds,
and the pointer-based implementation sorts the list in 8-9 seconds.
For now, see the source directory.
Pseudo Code
To radix sort a list of integers:
For each bit, starting with the least-significant bit:
Enqueue all 0s in one queue and all 1s in another.
Dequeue all the 0s and all the 1s back into the array.
Basic Implementation
template <typename Type>
void radix_sort( Type *const array, int const n, int const bits ) {
Type queue_0[n];
Type queue_1[n];
for ( int i = 0, mask = 1; i < bits; ++i, mask <<= 1 ) {
int tail_0 = 0;
int tail_1 = 0;
for ( int i = 0; i < n; ++i ) {
if ( (array[i] & mask) == 0 ) {
queue_0[tail_0] = array[i];
++tail_0;
} else {
queue_1[tail_1] = array[i];
++tail_1;
}
}
int i = 0;
for ( int j = 0; j < tail_0; ++i, ++j ) {
array[i] = queue_0[j];
}
for ( int j = 0; j < tail_1; ++i, ++j ) {
array[i] = queue_1[j];
}
}
}
Implementation on Arrays
This implementation:
- Creates a temporary array,
- Uses the temporary array to enqueue the entries from either side,
- Then alternates between using the array and the temporary array for such queues.
It does not dequeue the elements: for the next iteration, it walks forward in the currently
enqueued array to the zero-count, and then walks back from the end to that same point.
Implementation using Pointers
The enhanced implementation using pointers does away with the unnecessary use of indices
of the array-based version. This reduces the run time by almost 40 percent over the array-based
implementation.