Skip to the content of the web site.

Heap algorithms

These functions are found in the #include <algorithm> library.

A sorted list has the maximum entry at the end of the list. This is necessary if you require knowledge of the relative position of each entry in the list. In some cases, however, it is only necessary to know the maximum entry within the container at any time. In this case, it is much more efficient to use a "binary heap". Inserting data into a sorted list can be expensive, for if you are using an array or any other contiguous data structure, this may require shifting numerous entries. The alternative is to use a balanced tree, but this requires more memory, as each entry must store the locations of its two children.

A max-heap, however, can be stored in an array yet it uses a tree structure within that array as follows:

  1. Assume that the $N$ entries in the array are located in locations $1$ through $N$.
  2. The maximum item is at location $1$.
  3. The child or children of the entry at location $k$ are located at $2k$ and $2k + 1$. If $2k = N$, that entry has only one child, and if $2k > N$, that entry has no children.
  4. Such a tree is a max-heap if for each entry, the value stored at an entry is greater than or equal to any values stored in any children of that entry.

Notice that we use the word "location" instead of indices, as indices generally run from 0 to N - 1.

Unlike a general binary tree, the location of the children in a binary heap under this implementation need not be explicitly stored, thus reducing the memory requirements to no more than that of a sorted list. Also, beyond the scope of these topics, there are very efficient algorithms for taking a max-heap occupying locations $1$ through $N$ and either:

  • inserting ("pushing") a new value while maintaining the max-heap structure so that now the max-heap occupies locations $1$ through $N + 1$, and
  • removing ("popping") the maximum item and rearranging the remaining entries so that they continue to be a max-heap occupying locations $1$ through $N - 1$.

Underlying container

The underlying data structure must have random-access iterators, and thus, we are restricted to arrays, std::array and std::vector.

Examples

Given the integers $1, 2, 2, 3, 4, 5, 5, 6, 7, 7$, each of the following are valid max-heaps:

    {7, 7, 5, 6, 4, 5, 2, 1, 3, 2}
    {7, 5, 7, 4, 2, 5, 6, 2, 3, 1}
    {7, 7, 6, 5, 5, 4, 3, 2, 2, 1}

To visualize the second variation, both as an array and as a binary tree, you can view Figure 1.


Figure 1. A heap of ten items, both as entries in an array with locations 1 through 10, and the corresponding entries in a complete binary heap where, for example, the children of location 4 are at locations 8 and 9.

The only requirements is that each parent is greater than any children. You will note that the last, a reverse-sorted list, is also a max-heap.

The following is not a max-heap, as the entry the 4th location is $3$ (remember, we are starting at location $1$), has children at locations 8 and 9, and the entry at location 9 is $4$ and $4 > 3$:

    {7, 5, 7, 3, 2, 5, 6, 2, 4, 1}

Functions

All of these functions below allow for an optional argument that is of type bool predicate( T lhs, T rhs ) that returns true if the lhs argument is less than the rhs. Specifically, if you want a min-heap, you should use

    []( T m, T n ) {
      return m > n;
    }

or you can use the #include <functional> library with std::greater<T>().

bool std::is_heap( itr_t itr_first, itr_t itr_last )
itr_t std::is_heap_until( itr_t itr_first, itr_t itr_last )

The first returns true if the entries on the range satisfy the conditions of a max-heap, and the second returns an iterator referring to the first entry that is greater than its parent in the heap, returning itr_last if the range is a max-heap.

void std::make_heap( itr_t itr_first, itr_t itr_last )

This is similar to sort(...), except rather than sorting the entries, the entries are reordered to ensure that the resulting order forms a max-heap. Because the constraints on a max-heap are less than that of being sorted, this algorithm is faster.

void std::push_heap( itr_t itr_first, itr_t itr_last )

This function assumes that the range itr_first to itr_last - 1 is a valid max-heap, and then inserts the entry at itr_last - 1 into this max-heap so as to ensure that the entire range itr_first to itr_last is a valid max-heap.

void std::pop_heap( itr_t itr_first, itr_t itr_last )

This function assumes that the range itr_first to itr_last is a valid max-heap and reorders the entries so that what was the maximum entry (at itr_first) is moved to the location itr_last - 1 and the entries on the range itr_first to itr_last - 1 form a new max-heap with one less entry.

void std::sort_heap( itr_t itr_first, itr_t itr_last )

A max-heap can be converted into a sorted list in-place and with a guaranteed run time of $O(n \ln(n))$. This is done by repeatedly calling pop_heap(...) on successively smaller ranges, and with each application of this function, the next largest entry appears in the appropriate location.

Note that the behavior of the last three functions is undefined if the required ranges are not in fact max-heaps.