Multidimensional Exact Pairwise Nearest Neighbour in Θ(n log n) time

A log-log graph of time over data size, showing the execution time of PNNcolourmap.c for the "House" and "Peppers" test images follows an O(n log n) curve rather than an O(n²) curve.

Download: pairwise-nearest-neighbour.tar.gz

The pairwise nearest neighbour algorithm is a quantization algorithm. It is a form of "Ward's method" for hierarchical clustering.
It can be used to generate a palette of few colours from an image of many colours. It can also be used for other things, but palette generation is the one I care about.

The downloadable implementation of pairwise nearest neighbour on this page runs in Θ(n log n) time, n being the number of pixels in the image. It can run for arbitrarily many dimensions (though most images have three (RGB) or four (RGBA)). It always finds nearest neighbours exactly rather than making a best-guess. Previously, the algorithm had been implemented in O(n log n) time with inexact nearest neighbours, in O(n log n) time for only one dimension, and O(n²) time when exact in multiple dimensions, unless a better implementation was well-hidden from me.
My implementation requires the Netpbm library, and reads from/writes to its "PAM" image format.

How it works

The main innovation is the use of a K-D tree to store the points, in which the nearest neighbour of a point, using the algorithm's non-Euclidian distance metric, can be found in Θ(log n) time rather than O(n) (though its worst case would be traversing the entire tree, which has O(n log n) space complexity. Does this amortize out? Unsure.). It is a "midpoint split" K-D tree because other K-D trees would be rather more complicated to insert points into or remove points from.
The nearest neighbour heap and delayed updates from the Lazy PNN method (Kaukoranta et al. 1999) are used, with the addition that the heap items are not explicitly marked out-of-date, but rather maintain copies of the monotonically increasing cluster sizes.
Other optimizations: The partial distortion search is used. An initial histogram is constructed with an Adams-style weight-balanced tree, to avoid redundant colour-space conversions and to quickly detect whether quantization is necessary (i.e. whether the source image has more colours than the desired palette).

Other complexity factors

Technically the time complexity is not only a function of the number of pixels in the image. It is also a function of the number of dimensions/colour-channels, which for regular images is small and can be ignored. It is also a function of the ratio between the greatest and least Euclidian distance between any two points in the image (due to the nature of the midpoint-split K-D tree), the ratio being a function of the bit precision of the image, which for regular images is small and can be ignored.