Mergesort, like quicksort, is one of the most well-known sorting algorithms, and it too is a typical instance of divide-and-conquer (again, divide-conquer-combine). But the two allocate their work very differently between the divide and combine steps. Quicksort puts most of the work in the divide step (partitioning), while its combine step is simple (list concatenation in our out-of-place implementation) or trivial (no work at all in the conventional in-place implementation). Mergesort, by contrast, puts most of its work in the combine step, while its divide step is trivial.
At a high level, mergesort is divide-conquer-combine, and the main program can be extremely short, again in a functional style:
def mergesort(a):
    if (n := len(a)) <= 1:  # base case: empty or single-element list
        return a
    return mergesorted(mergesort(a[:n//2]), mergesort(a[n//2:]))
Python caveats:

- `:=` is the assignment expression (a.k.a. the walrus operator), a new syntactic feature introduced in Python 3.8, which resembles a similar feature in C/C++ (it saves a line).
- `n//2` is integer division in Python 3 (i.e., \(\lfloor n/2 \rfloor\); `n/2` in Python 2).
- `a[:n//2]` and `a[n//2:]` (slicing) create new lists (i.e., out-of-place) and thus cost \(O(n)\) time.

A more conventional implementation found in most textbooks can reduce splitting to \(O(1)\) by maintaining two indices to indicate the span \([i,j)\) to be sorted (i.e., \(a[i:j]\), or \(a[i]\ldots a[j-1]\)):

def mergesort(a, i, j):
    if j - i <= 1:
        return a[i:j]     # slicing: new copy
    mid = (i + j) // 2    # split point
    left = mergesort(a, i, mid)
    right = mergesort(a, mid, j)
    return mergesorted(left, right)
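As a quick sanity check of the index-based version (assuming the two-pointer `mergesorted` helper developed below; the example values here are arbitrary):

a = [5, 2, 4, 7, 1, 3, 2, 6]
print(mergesort(a, 0, len(a)))  # [1, 2, 2, 3, 4, 5, 6, 7]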
The non-trivial work in mergesort lies in the combine step, i.e., merging two sorted lists. For example, merging [1, 4, 6] and [2, 3, 5], we get [1, 2, 3, 4, 5, 6]. Here we use the very simple idea of a “two-pointer scan”: the left and right pointers start at the first element of each array, respectively, and we repeatedly take the smaller number while advancing the corresponding pointer:
a: [1, 4, 6]   b: [2, 3, 5]    # two sorted arrays
    ^              ^
    *=>                        # left is smaller
c: [1...                       # combined array (1st number)

a: [1, 4, 6]   b: [2, 3, 5]    # advancing the left pointer
       ^           ^
                   *=>         # right is smaller
c: [1, 2...                    # combined array (2nd number)

a: [1, 4, 6]   b: [2, 3, 5]    # advancing the right pointer
       ^              ^
                      *=>      # right is smaller
c: [1, 2, 3...                 # combined array (3rd number)

a: [1, 4, 6]   b: [2, 3, 5]    # advancing the right pointer
       ^                 ^
       *=>                     # left is smaller
c: [1, 2, 3, 4...              # combined array (4th number)

a: [1, 4, 6]   b: [2, 3, 5]    # advancing the left pointer
          ^              ^
                         *=>   # right is smaller
c: [1, 2, 3, 4, 5...           # combined array (5th number)

a: [1, 4, 6]   b: [2, 3, 5]    # advancing the right pointer (right is empty)
          ^                 ^
until one side is empty (in this case, the right array). Then we copy the rest of the other side (in this case, only [6]) to the combined array:

c: [1, 2, 3, 4, 5, 6]          # combined array (complete)
This process takes \(O(n)\) time, where \(n\) is the total length of the two arrays. Why? Count the number of comparisons: there are at most \(n\) of them, because each comparison adds one new number to the resulting array.
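Putting this together, here is one possible rendering of the `mergesorted` helper used above (a straightforward two-pointer scan; the variable names are our choice):

def mergesorted(a, b):
    c = []                      # combined array
    i = j = 0                   # left and right pointers
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:        # left is smaller (or equal): take it
            c.append(a[i])
            i += 1
        else:                   # right is smaller: take it
            c.append(b[j])
            j += 1
    c.extend(a[i:])             # copy the rest of whichever side remains
    c.extend(b[j:])             # (at least one of these is empty)
    return c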
Caveat: if you want to avoid the last “copying” step, you can append a dummy \(+\infty\) to both the left and right arrays.
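For example, a sentinel-based variant might look like this (a sketch; `INF` and the name `mergesorted_sentinel` are our own):

INF = float('inf')

def mergesorted_sentinel(a, b):
    a = a + [INF]               # sentinels are never taken before real numbers
    b = b + [INF]
    c = []
    i = j = 0
    for _ in range(len(a) + len(b) - 2):  # exactly one take per real element
        if a[i] <= b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    return c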
The analysis of mergesort is much simpler than that of quicksort, since:

- Unlike quicksort, which has distinct best- and worst-case scenarios, mergesort's recursion tree is always balanced, so its best and worst cases are the same.
- Whether the splitting is done out-of-place (as in our functional code) or in-place with a pair of indices (as in the conventional code), divide+combine always takes \(O(n)\).
Therefore:
\[T(n) = 2T(n/2) + O(n) = O(n\log n)\]
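To see why, one standard unrolling of the recurrence (a sketch, writing the \(O(n)\) divide+combine work as \(cn\) for some constant \(c\)):

\[T(n) = 2T(n/2) + cn = 4T(n/4) + 2cn = \cdots = 2^k T(n/2^k) + kcn\]

With \(k = \log_2 n\), this gives \(T(n) = n\,T(1) + cn\log_2 n = O(n\log n)\): each of the \(\log_2 n\) levels of the recursion tree does \(cn\) total work.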
Here is a table summarizing this comparison:
| algorithm | divide | conquer | combine |
|---|---|---|---|
| quicksort | partitioning: \(O(n)\) | \(2\times\): best: \(n/2+n/2\); worst: \((n-1)+0\) | trivial: \(O(1)\) (in-place) or \(O(n)\) (out-of-place) |
| mergesort | trivial: \(O(1)\) (in-place) or \(O(n)\) (out-of-place) | \(2\times\): always balanced (\(n/2+n/2\)) | merging two sorted lists: \(O(n)\) |
Mergesort was invented by John von Neumann in 1945.