Besides sorting, another important task in data structures is selection, e.g., to select the \(k\)th smallest element from an unsorted array of size \(n\). Clearly, you can always sort the array first, but that takes \(O(n\log n)\) time. Can we do it faster without sorting? Intuitively, we should be able to, because selection does not necessarily require the full sorted order (I just want the \(k\)th smallest number, and nothing else!), so it should in principle be much easier than sorting. In fact, if \(k=1\) (smallest) or \(k=n\) (largest), a simple \(O(n)\) scan suffices. But what about an arbitrary \(k\)?
Hint: think about quicksort. Can you simplify quicksort a little bit to do selection?
Indeed, we can! And the resulting algorithm is conveniently called “quickselect”. The idea is very simple (to simplify our reasoning, let’s first assume that the array contains distinct numbers):
given the input array `a` (size \(n\)) and index \(k\), first partition `a` into `left` and `right` using `a[0]` as the pivot (the numbers smaller than the pivot go to `left`, and the larger ones to `right`). After this partition, we already have a crucial observation: the pivot ranks |left|+1 in `a`! (here |...| means size). We can use this fact to do a case analysis, by comparing \(k\) with |left|+1:
1. If \(k\) = |left|+1, then we're done: return the pivot, because there are exactly \(k-1\) numbers in `left` that are less than the pivot, so the pivot ranks \(k\)th smallest in this array.
2. If \(k\) < |left|+1, then the \(k\)th smallest element of this array must be in `left`; in fact it must be the \(k\)th smallest element in `left`, so do quickselect on `left` with the same \(k\).
3. If \(k\) > |left|+1, then the \(k\)th smallest element must be in `right`. But is it still the \(k\)th smallest element in `right`? No: there are already |left|+1 numbers (pivot included) that are smaller than our target number, so its rank within `right` must be k-|left|-1. So we should do quickselect on `right` with k-|left|-1.

Notice that `right` is only needed in case 3, so we can make our code a bit faster as follows (this does not matter in the standard in-place implementation):
```python
def qselect(a, k):
    pivot = a[0]  # you can add two lines to enable randomized pivot
    left = [x for x in a if x < pivot]
    remaining = k - len(left) - 1  # 1 is for pivot
    if remaining <= 0:  # cases 1-2: no need to do right!
        return pivot if remaining == 0 else qselect(left, k)
    right = [x for x in a[1:] if x >= pivot]
    return qselect(right, remaining)  # case 3
```
Example:

```
qselect [4, 1, 5, 3, 2]  k=3
  [1, 3, 2] 4 [5]      # pivot rank: |left|+1=4 > k; in left
  qselect [1, 3, 2]  k=3
    [] 1 [3, 2]        # pivot rank: |left|+1=1 < k; in right
    qselect [3, 2]  k=3-1=2: find 2nd smallest
      [2] 3 []         # pivot rank: |left|+1=2 == k: voila!
      return 3
```
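The comment in the code above says you can add two lines to enable a randomized pivot; here is a minimal sketch of what those two lines might look like (the name `qselect_random` and the in-place swap are my own choices for illustration, not prescribed by the original):

```python
import random

def qselect_random(a, k):
    # the two extra lines: swap a random element to the front,
    # so the pivot a[0] is uniformly random (note: this mutates a)
    i = random.randrange(len(a))
    a[0], a[i] = a[i], a[0]
    pivot = a[0]
    left = [x for x in a if x < pivot]
    remaining = k - len(left) - 1  # 1 is for pivot
    if remaining <= 0:  # cases 1-2: no need to do right!
        return pivot if remaining == 0 else qselect_random(left, k)
    right = [x for x in a[1:] if x >= pivot]
    return qselect_random(right, remaining)  # case 3
```

For example, `qselect_random([4, 1, 5, 3, 2], 3)` returns `3` no matter which pivots get picked; only the running time is randomized, never the answer.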
Remarks:
The analysis of quickselect is very similar to that of quicksort, but a little simpler since we only have one-sided recursion. In the most balanced case (with a randomized pivot), each time we throw away about half of the array (analogous to binary search in a sorted array), so:
\[ T(n) = T(n/2) + O(n) \]
This is just a converging geometric series:
\[ T(n) = O(n) + O(n/2) + O(n/4) + \cdots + O(1) = O(n) \]
(Note: even the infinite sum of \(1 + 1/2 + 1/4 + ...\) converges to \(2\), let alone a finite sum; in this analysis we don’t even need to know the height of the recursion tree, but in case you wonder, it is still \(\log n\), like quicksort best case).
In the most unbalanced case where each time we can only reduce the size of the array by one (the pivot), this becomes identical to quicksort worst case:
\[ T(n) = T(n-1) + O(n) = O(n^2) \]
So the best case of quickselect is \(O(n)\), which is faster than quicksort best case, and this makes sense since selection is easier than sorting. However, the worst case of quickselect is as slow as quicksort worst case!
Caveat: what if you're really lucky and the very first partition lands in case 1 (i.e., `k == len(left) + 1`), so that you don't need any further recursion? Well, that's still \(O(n)\) because of the partition itself. So quickselect's best case is always \(O(n)\), unlike binary search in a sorted array, whose best case is actually \(O(1)\) (if you find the query on the first try; its worst case is \(O(\log n)\)).
What about the average case? Similar to quicksort, its average case is the same as its best case: \(O(n)\). If you understand our derivation of quicksort average case, you can derive quickselect average case yourself very easily.
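For the record, here is one standard way to write that derivation down (a sketch; the derivation from lecture may differ in details). With a uniformly random pivot, each rank \(1..n\) is equally likely, and pessimistically assuming we always recurse into the larger side:

\[ T(n) \le n + \frac{1}{n}\sum_{i=1}^{n} T(\max(i-1,\ n-i)) \le n + \frac{2}{n}\sum_{i=\lceil n/2\rceil}^{n-1} T(i) \]

Guessing \(T(i) \le ci\) and substituting, the last sum is at most \(\frac{2c}{n}\cdot\frac{3n^2}{8} = \frac{3cn}{4}\), so \(T(n) \le n + \frac{3cn}{4} \le cn\) holds for any \(c \ge 4\), confirming \(T(n) = O(n)\).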
Quickselect is not worst-case linear-time but rather expected linear-time (meaning the average complexity is linear in the size of the array). In practice, with randomized pivot, this is good enough, because there is no a priori worst case input. However, if you really want a deterministic (i.e., not randomized) worst-case linear-time selection algorithm, there is indeed one, called “median of medians” algorithm, which is rather complicated (though very clever). More importantly, it is actually quite slow (much much slower than quickselect!) due to a high constant factor. This algorithm is beyond the scope of our course; see the Wikipedia article (linked above) or other textbooks such as CLRS. In practice, just use quickselect.
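For the curious, here is a rough sketch of the median-of-medians idea (my own illustrative code, using out-of-place partitioning like `qselect` above; real implementations are in-place and more careful): split the array into groups of 5, take each group's median, recursively find the median of those medians, and use it as the pivot. This pivot is guaranteed to be neither too small nor too large, which is what yields the worst-case \(O(n)\) bound.

```python
def mom_select(a, k):
    # deterministic selection: returns the kth smallest (1-indexed)
    if len(a) <= 5:
        return sorted(a)[k - 1]
    # median of each group of 5 (the last group may be smaller)
    groups = [a[i:i + 5] for i in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # recursively pick the median of the medians as the pivot
    pivot = mom_select(medians, len(medians) // 2 + 1)
    left = [x for x in a if x < pivot]
    right = [x for x in a if x > pivot]
    if k <= len(left):
        return mom_select(left, k)  # target is in left
    if k > len(a) - len(right):     # target is in right, with adjusted rank
        return mom_select(right, k - (len(a) - len(right)))
    return pivot  # otherwise the pivot itself has rank k
```

The grouping guarantees that roughly \(3n/10\) elements fall on each side of the pivot, so the two recursion sizes (\(n/5\) for the medians and at most \(\approx 7n/10\) for the surviving side) sum to less than \(n\), giving a converging series and hence \(O(n)\) worst case; but notice how much more work it does per level than quickselect.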
Similarly, quicksort is quite a bit faster than mergesort in practice, although the former is expected \(O(n\log n)\) while the latter is worst-case \(O(n\log n)\). (Well, to be fair to mergesort, it is still a very simple and useful algorithm while deterministic selection is too complicated and not practical).
Quicksort is faster mainly because its real work happens in the divide step (partitioning, which can be done in-place), with a trivial combine step, whereas mergesort's work is in the \(O(n)\) out-of-place combine step (merging). The following table compares quicksort with two related divide-and-conquer algorithms:
| algorithm | divide | conquer | combine | complexity |
|---|---|---|---|---|
| quicksort | partitioning: \(O(n)\) | \(2\times\): best: \(n/2+n/2\); worst: \((n-1)+0\) | trivial: \(O(1)\) (in-place) or \(O(n)\) (out-of-place) | best/avg: \(O(n\log n)\); worst: \(O(n^2)\) |
| quickselect | partitioning: \(O(n)\) | \(1\times\): best: \(n/2\); worst: \(n-1\) | n/a | best/avg: \(O(n)\); worst: \(O(n^2)\) |
| binary search | split: \(O(1)\) | \(1\times\): always \(n/2\) | n/a | always \(O(\log n)\) |
Quickselect, like quicksort, was also invented by the Turing Award winner Tony Hoare, and is known as Hoare’s selection algorithm.
The deterministic linear-time selection algorithm, "median of medians", was invented by Blum, Floyd, Pratt, Rivest, and Tarjan in 1973 when they were all at Stanford. Among them, Blum, Floyd, Rivest, and Tarjan later received Turing Awards (for other contributions), and Pratt is also a legendary figure in computer science.