AI Paradigms and Paradigm Shift

As mentioned in the brief history of AI, there are two main paradigms in AI: Symbolic AI (rational, rule-based, logic-based, hand-coded) and Connectionist AI (empirical, data-driven, learning-based). Their decades-long competition is the central theme of AI history.

Symbolic AI

The core idea is that intelligence arises from explicit reasoning over symbols and rules, much like how humans manipulate concepts and logic. It holds that reasoning must come first and learning can wait; in other words, it cares much more about the knowledge itself and how to reason over it than about where that knowledge comes from.

Characteristics: It uses discrete symbols to represent knowledge (e.g., the rule "if Cat(x) then Animal(x)") and relies on logic, rules, and search to make inferences and decisions (a minimal code sketch appears below).

Strengths:

Transparent, explainable reasoning; works well with small amounts of structured knowledge.

Weaknesses:

Brittleness in the face of real-world messiness; knowledge must be hand-coded by experts, which is costly and hard to scale or maintain.

Typical techniques:

Logic, rules, search, expert systems, and planners.

Era:

Dominant from the 1950s to the late 1980s.
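To make the symbolic style concrete, here is a minimal sketch (not part of the course materials; the facts, predicates, and rules are made up for illustration) of rule-based inference: knowledge is a small set of hand-written facts and rules such as "if Cat(x) then Animal(x)", and a forward-chaining loop applies the rules until no new facts can be derived.

```python
# Minimal sketch of symbolic, rule-based inference (forward chaining).
# Facts are (predicate, entity) tuples; rules are hand-coded implications.

facts = {("Cat", "Tom"), ("Dog", "Rex")}

# Each rule reads: if premise(x) holds, then conclusion(x) holds,
# e.g. ("Cat", "Animal") encodes "if Cat(x) then Animal(x)".
rules = [
    ("Cat", "Animal"),
    ("Dog", "Animal"),
    ("Animal", "LivingThing"),
]

def forward_chain(facts, rules):
    """Repeatedly apply the rules until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, entity in list(derived):
                if predicate == premise and (conclusion, entity) not in derived:
                    derived.add((conclusion, entity))
                    changed = True
    return derived

print(sorted(forward_chain(facts, rules)))
# Derives Animal(Tom), Animal(Rex), LivingThing(Tom), LivingThing(Rex)
# purely by chaining the hand-written rules; nothing is learned from data.
```

Note how everything the system "knows" is written down explicitly by a human and remains fully inspectable.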

We will discuss symbolic AI in Unit 2.

Connectionist AI

By contrast, connectionism's core idea is that learning must come first and reasoning can wait: knowledge can and should be learned automatically from data rather than hand-coded. It holds that intelligence emerges from patterns of activation in networks of simple units (neurons), inspired by the brain.

Characteristics: It learns from data rather than relying on hand-crafted rules, and it uses distributed representations, i.e., knowledge is not stored as symbols but encoded in the network's weights (a minimal code sketch appears below).

Strengths:

Perception and pattern recognition; adapts automatically as more data becomes available.

Weaknesses:

Opaqueness (hard to explain its decisions); weak at rigorous reasoning; requires large amounts of data.

Typical techniques:

Neural networks trained by backpropagation, statistical machine learning, deep learning, and large language models.

Era:

First wave in the 1950s–1960s, revival in the 1980s (backpropagation), starting to take over in the mid-1990s with machine learning, and explosive dominance since the 2010s with deep learning.
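For contrast, here is a minimal connectionist sketch (the toy dataset, learning rate, and number of epochs are illustrative assumptions): a single artificial neuron learns a decision rule from labelled examples by gradient descent, so whatever it ends up "knowing" is encoded in the numbers stored in w and b rather than in any human-readable rule.

```python
import numpy as np

# A single neuron (logistic unit) learning from data: the knowledge it acquires
# lives entirely in the weights w and bias b, not in explicit rules.

rng = np.random.default_rng(0)

# Toy dataset: label is 1 when x1 + x2 > 1 (the neuron is never told this rule).
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)

w = np.zeros(2)   # weights, initially zero
b = 0.0           # bias
lr = 0.5          # learning rate

for epoch in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid activation
    grad_w = X.T @ (p - y) / len(y)          # gradient of the cross-entropy loss
    grad_b = float(np.mean(p - y))
    w -= lr * grad_w                         # gradient-descent update
    b -= lr * grad_b

pred = 1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5
print("learned weights:", w, "bias:", b)
print("training accuracy:", np.mean(pred == y))
```

The learned weights are just numbers: they classify the data well, but they do not explain themselves the way a symbolic rule does.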

We will discuss connectionist AI in Unit 3 (ML) and Unit 4 (DL).

Philosophical and Psychological Connections

You can see that the dichotomy between these two paradigms is philosophical and psychological: symbolic AI is rooted in rationalism (Descartes, Leibniz, Kant) and the belief that knowledge is reasoning, while connectionist AI is rooted in empiricism (Bacon, Hume, Locke) and the belief that knowledge comes from experience, its brain-inspired networks also echoing psychology and neuroscience.

Paradigm Shift from Symbolic to Connectionist

In the late 1980s and early 1990s, symbolic AI hit the “knowledge acquisition bottleneck”: its systems could not scale to handle real-world messiness, could not adapt to an ever-changing world, and were far too costly to build and maintain.

Let’s take machine translation (MT) as a concrete example. Since the 1950s, MT has been one of the focus areas of AI and, due to its complexity, was considered a holy grail of AI. Early MT systems, from the 1950s to the 1990s, were rule-based: bilingual experts had to write translation rules, say, for English-to-Chinese translation. Such an approach has some obvious limitations: qualified bilingual experts are scarce and expensive, hand-written rules never fully cover the messiness of real language, and the rules must be rewritten for every new language pair and constantly maintained as language use changes.

These limitations motivated researchers to consider the alternative, learning-based translation (also known as statistical MT). Starting around 1990, statistical MT gradually became increasingly popular and, by the late 1990s, the dominant approach. Instead of hiring linguists to write translation rules, we extract them from data by machine learning, without any prior knowledge of the two languages. Such data is called “parallel text”: for example, a set of English-Chinese sentence pairs. It is abundant, for instance in the proceedings of the United Nations and the European Commission and in multilingual user manuals. If the English word “apple” and the Chinese word “pingguo” tend to co-occur in many sentence pairs, we can conjecture that they might be translations of each other.
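The sketch below illustrates that intuition (the sentence pairs are made up for illustration, and real statistical MT systems use much more sophisticated probabilistic alignment models): we count how often each English word co-occurs with each Chinese word across sentence pairs, and frequent co-occurrences such as “apple”/“pingguo” become translation candidates.

```python
from collections import Counter
from itertools import product

# Tiny made-up "parallel text": (English sentence, Chinese sentence in pinyin) pairs.
parallel = [
    ("i eat an apple", "wo chi pingguo"),
    ("she likes the apple", "ta xihuan pingguo"),
    ("i eat rice", "wo chi mifan"),
]

# Count how often each (English word, Chinese word) pair appears together
# in the same sentence pair.
cooccurrence = Counter()
for english, chinese in parallel:
    for e_word, c_word in product(english.split(), chinese.split()):
        cooccurrence[(e_word, c_word)] += 1

print(cooccurrence[("apple", "pingguo")])   # 2: co-occur in two sentence pairs
print(cooccurrence[("apple", "mifan")])     # 0: so "pingguo" is the better candidate
```

No knowledge of English or Chinese is hard-coded anywhere; the translation candidates fall out of counting over data.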

The above is just one example, but it reflects the fact that the whole AI field has gradually shifted towards learning-based approaches since the mid-1990s. The growing availability of large datasets also facilitated this shift. The trend continued and culminated in the deep learning revolution that began in the mid-2010s.

The progression from symbolic AI to (pre-deep-learning) machine learning and eventually to deep learning is a double-edged sword: it makes AI systems more and more empirical, robust, adaptable, and human-like, while at the same time less and less formal, reliable, and explainable. Current (as of 2025) DL-based chatbots are remarkably successful, yet they are unable to perform some very basic tasks, such as long multiplication and counting letters or words, and are easily fooled by irrelevant information or small changes in a problem statement. They are very good at high-level intuition but struggle with details and cannot perform rigorous reasoning. Future development of AI will need to combine symbolic and connectionist approaches in creative new ways in order to design AI systems that are not just smarter than humans but also rigorous, reliable, and explainable.

In our course, Units 2–4 reflect this shift: first symbolic (Unit 2), then ML (Unit 3), and finally DL (Unit 4).

Summary

Aspect | Symbolic AI | Connectionist AI
Core principle | Explicit reasoning over symbols | Learning distributed representations
Philosophical belief | Knowledge is reasoning | Knowledge comes from experience
Philosophical roots | Rationalism (Descartes, Leibniz, Kant) | Empiricism (Bacon, Hume, Locke)
Representation | Rules, logic, symbols | Neural activations, weights
Strengths | Reasoning, explainability | Perception, pattern recognition
Weaknesses | Brittleness, manual knowledge | Opaqueness, weak reasoning
Data requirement | Small, structured | Large, unstructured
Golden era | 1950s–1980s | 1995–present
Example systems | Expert systems, planners | Deep neural nets, LLMs