Topological ordering is one of the most fundamental and widely used concepts in graph theory and graph algorithms. You may not have noticed, but we rely on topological orderings in many real-life scenarios:
figuring out the order of taking classes in a university with respect to pre-requisites;
when you get up in the morning, you have to complete a list of tasks in a certain order which satisfy some constraints such as
You can “interleave” these tasks as long as you satisfy the above constraints; for example, a valid order could be (a) brush teeth, (b) put on socks, (c) put on underclothes, (d) eat breakfast, (e) put on shoes, and (f) put on outer clothes.
the same can be said about cooking: in any recipe, some tasks can be done in arbitrary order, but others must be done in sequence.
when you install a software package (e.g., on Linux), it depends on many other packages, which recursively depend on even more packages. The package installer automatically figures out a valid order of installation with respect to these dependencies.
Now let’s define this concept rigorously. For a directed graph \(G=(V,E)\), an ordering of the nodes is a topological ordering if and only if, for every edge \((u,v)\in E\), \(u\) comes before \(v\) in that ordering. More formally, a topological ordering of \(G\) is an ordering of the nodes \(v_1, v_2, \ldots, v_n\) such that for every edge \((v_i, v_j)\in E\) we have \(i<j\). In other words, all edges must point “forward” in a topological order.
For example, a topological ordering on courses provides an order to take courses that respects the prerequisites: when taking course \(v\), all the courses that are required by it (not just \(v\)’s immediate prerequisites, but also its “recursive ancestors”) have already been taken.
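The definition translates directly into a check. The following is a minimal Python sketch, not part of the original notes: the function name `is_topological_order` and the three-course chart are hypothetical, and each edge `(u, v)` is assumed to mean that `u` must come before `v`.

```python
def is_topological_order(ordering, nodes, edges):
    """Return True iff `ordering` lists every node exactly once and
    every edge (u, v) points forward, i.e. u appears before v."""
    # The ordering must be a permutation of the node set.
    if len(ordering) != len(nodes) or set(ordering) != set(nodes):
        return False
    # position[v] is the index i such that v = v_i in the ordering.
    position = {v: i for i, v in enumerate(ordering)}
    return all(position[u] < position[v] for (u, v) in edges)


# Hypothetical course chart: (u, v) means u is a prerequisite of v.
courses = ["intro", "data_structures", "algorithms"]
prereq_edges = [("intro", "data_structures"), ("data_structures", "algorithms")]

print(is_topological_order(["intro", "data_structures", "algorithms"],
                           courses, prereq_edges))  # True
print(is_topological_order(["algorithms", "intro", "data_structures"],
                           courses, prereq_edges))  # False
```

The check runs in \(O(|V|+|E|)\) time, since each edge is compared once against a precomputed position table.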
One of the most important theorems about directed graphs is the following:
A directed graph \(G\) has a topological ordering if and only if \(G\) is a DAG (directed acyclic graph).
Proof.
Given a directed graph, how do we find a topological order (if it has one)? This process is known as “topological sort” (because, like sorting, it returns an ordering), and there are two classical algorithms for this task: BFS style (bottom-up) and DFS style (recursive top-down).
Imagine you have just become a college freshman. You’re given the course prerequisite chart, which is (guaranteed to be) a DAG. How would you plan the order in which to take courses? Well, which course can you take first? It must be a course without any prerequisites, in other words, a node with in-degree 0. So you take it (say course \(v\)). By doing so, you also satisfy one prerequisite of every course \(u\) for which there is an edge \((v,u)\). Conceptually, you delete all outgoing edges from \(v\) (in practice there is no need to delete them; just decrease the in-degree of each such \(u\)). Then find another course with in-degree 0, and continue until you’ve taken all courses.
This is a simple variant of the BFS algorithm we’ve seen before. The only change is that the queue contains only nodes with in-degree 0; when you update along the edge \((v,u)\), you decrease the in-degree of \(u\), and if it drops to 0, you add \(u\) to the queue.
If at any point the queue becomes empty before you have taken all courses, you know there is a cycle, and hence no topological ordering (there is no course you can take immediately, yet there are still courses you need to take).
Here is the pseudocode:
    def order(V, E):
        Q = {v in V | indeg(v) == 0}
        order = []
        while Q is not empty:
            v = Q.pop()
            order.append(v) # add v to the topological ordering
            for (v, u) in E: # should use adjacency list
                indeg(u) -= 1
                if indeg(u) == 0: # can take course u now
                    Q.push(u)
        if len(order) == len(V):
            return order
        else:
            return None # cyclic, no topological order
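For readers who want something executable, here is one possible Python rendering of the pseudocode, offered as a sketch under stated assumptions: the function name `topological_order_bfs`, the adjacency-list dictionary `successors`, and the small prerequisite chart are illustrative choices, and each edge `(v, u)` is taken to mean that `v` must come before `u`.

```python
from collections import defaultdict, deque


def topological_order_bfs(nodes, edges):
    """Return a topological order of the graph, or None if it has a cycle."""
    successors = defaultdict(list)       # adjacency list: successors[v] = [u, ...]
    indegree = {v: 0 for v in nodes}
    for v, u in edges:
        successors[v].append(u)
        indegree[u] += 1

    # The queue only ever holds nodes whose in-degree has reached 0.
    queue = deque(v for v in nodes if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()              # FIFO, as in BFS
        order.append(v)                  # "take" course v
        for u in successors[v]:
            indegree[u] -= 1             # edge (v, u) is now satisfied
            if indegree[u] == 0:         # u can be taken now
                queue.append(u)

    # If some node was never taken, the remaining nodes contain a cycle.
    return order if len(order) == len(nodes) else None


# Hypothetical prerequisite chart, for illustration only.
nodes = ["calc1", "calc2", "linear_algebra", "ml"]
edges = [("calc1", "calc2"), ("calc1", "linear_algebra"),
         ("calc2", "ml"), ("linear_algebra", "ml")]
print(topological_order_bfs(nodes, edges))
# ['calc1', 'calc2', 'linear_algebra', 'ml']
print(topological_order_bfs(["x", "y"], [("x", "y"), ("y", "x")]))
# None
```

Building the adjacency list up front keeps the total running time at \(O(|V|+|E|)\), which is what the “should use adjacency list” comment in the pseudocode is hinting at.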
For example, consider the following DAG, whose edges are (A,B), (A,D), (D,B), (D,C), (D,E), and (B,C):

         > B ---> C
        /  ^     /
       /   |    /
    A ---> D ---> E

    Q = [A]
    pop A; Q = []; order: [A]
        indeg(B): 2->1
        indeg(D): 1->0; push D; Q = [D]
    pop D; Q = []; order: [A, D]
        indeg(B): 1->0; push B; Q = [B]
        indeg(E): 1->0; push E; Q = [B, E]
        indeg(C): 2->1
    pop B; Q = [E]; order: [A, D, B]
        indeg(C): 1->0; push C; Q = [E, C]
    pop E; Q = [C]; order: [A, D, B, E]
    pop C; Q = []; order: [A, D, B, E, C]

Another algorithm is top-down. Let’s say you want to install software package \(v\), but it depends on \(u_1\) and \(u_2\). You first need to recursively satisfy \(u_1\), and after conquering \(u_1\) (which might involve installing 100+ packages), you conquer \(u_2\) recursively (with memoization, of course, so that nothing is installed twice). Now you can finally install \(v\) (as in a post-order traversal), so you append \(v\) to the end of the topological order built so far. Every node is added to the topological order at the moment it is time to install it. Note that, as in DFS, you need an outer loop over all nodes because (a) you don’t know a priori which node is a “sink” node in the graph, and (b) there could be more than one sink node (e.g., C and E above).
Recursion tree for the above example (it could start at any node; here the outer loop starts at C):

    C
        B
            A
                A is takable; take it; order: [A]
            D
                A (already taken)
                D is takable; take it; order: [A, D]
            B is now takable; take it; order: [A, D, B]
        D (already taken)
        C is now takable; take it; order: [A, D, B, C]
    E
        D (already taken)
        E is now takable; take it; order: [A, D, B, C, E]

Note that on this example the DFS-style algorithm returns a different topological order ([A, D, B, C, E]) from the BFS-style one ([A, D, B, E, C]).
Here is the pseudocode:
    def order(V, E):
        def dfs_order(v):
            if v in visited: # already taken
                return
            visited.add(v)
            for u in prereqs[v]: # satisfy prereqs first
                dfs_order(u)
            order.append(v) # now take v

        visited = set()
        order = []
        prereqs = {v: [] for v in V} # incoming edges: prereqs[v] = every u with (u, v) in E
        for (u, v) in E:
            prereqs[v].append(u)
        for v in V:
            if v not in visited:
                dfs_order(v)
        return order # assumes the graph is a DAG; cycles are not detected
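Unlike the BFS-style version, the DFS pseudocode above does not report cycles. One common way to add that, sketched here in Python as an assumption rather than as part of the original notes, is to distinguish nodes that are still on the recursion stack from nodes that are finished: reaching an in-progress node again means there is a cycle. The function name `topological_order_dfs` and the three state labels are this sketch's own, and the edge list is the example DAG as read off the BFS trace above.

```python
from collections import defaultdict


def topological_order_dfs(nodes, edges):
    """Return a topological order, or None if a cycle is found."""
    prereqs = defaultdict(list)          # prereqs[v] = every u with an edge (u, v)
    for u, v in edges:
        prereqs[v].append(u)

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {v: UNVISITED for v in nodes}
    order = []

    def visit(v):
        if state[v] == DONE:             # already taken (memoization)
            return True
        if state[v] == IN_PROGRESS:      # v is on the recursion stack: cycle
            return False
        state[v] = IN_PROGRESS
        for u in prereqs[v]:             # satisfy all prerequisites first
            if not visit(u):
                return False
        state[v] = DONE
        order.append(v)                  # post-order: now take v
        return True

    for v in nodes:                      # outer loop: any node may be a sink
        if not visit(v):
            return None
    return order


# Example DAG from above (edges inferred from the BFS trace).
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "D"), ("D", "B"), ("D", "C"), ("D", "E"), ("B", "C")]
print(topological_order_dfs(nodes, edges))
# ['A', 'D', 'B', 'C', 'E']
print(topological_order_dfs(["x", "y"], [("x", "y"), ("y", "x")]))
# None
```

Because the sketch is recursive, very deep dependency chains can hit Python's default recursion limit; an explicit stack avoids that at the cost of slightly more bookkeeping.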
In general (as in the above example), a DAG is likely to have more than one topological order, just as in college two students in the same major can take the same set of courses in different orders. Let’s study the following questions: