Time and Location | T 11:45am-1:45pm, Room 6496 |
Personnel | Prof. Liang Huang (huang at cs.qc), Instructor; James Cross (jcross at gc.cuny), TA |
Office Hours | Tuesday afternoons in the CS Lab. Additional office hours available before HW due dates and exams. |
Prerequisites | CS: algorithms and data structures (especially recursion and dynamic programming); solid programming skills (in Python); basic understanding of formal language and automata theory. LING: minimal understanding of morphology, phonology, and syntax (we'll review these). MATH: good understanding of basic probability theory. |
Textbooks / MOOCs | This course is self-contained (with slides and handouts), but you may find the following textbooks helpful. You might also find these Coursera courses helpful. |
Grading | |
Week | Date | Topics | Homework/Quiz |
1 | Sep 2 | Intro to NLP; rudiments of linguistic theory; intro to Python for text processing | Ex0 |
Unit 1: Sequence Models and Noisy-Channel: Morphology, Phonology | |||
2 | Sep 9 | Basic automata theory. FSA (DFA/NFA) and FST. | |
3 | Sep 16 | FSAs/FSTs cont'd; the noisy-channel model. | HW1 out: FSA/FSTs, carmel; recovering vowels |
4 | Sep 23 | RELIGIOUS HOLIDAY - NO CLASS | |
5 | Sep 30 | HW1 discussion. Word order: SVO/SOV vs. infix/postfix; advantage of SVO: less case-marking; advantage of SOV: no attachment ambiguity. Simple pluralizer. Language models: basic smoothing (Laplacian, Witten-Bell, Good-Turing). | Quiz 0; Ex1 out |
6 | Oct 7 | Language models (cont'd): information theory, entropy and perplexity, the Shannon game. Viterbi decoding for HMMs; transliteration. | HW2 out: English pronunciation, Japanese transliteration |
7 | Oct 14 | Pluralizer demo; discussion of HW2. More on HMMs/Viterbi; sample code. Intro to HW3 (semi-Markov). | HW3 out: decoding for Japanese transliteration |
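Weeks 6-7 above introduce Viterbi decoding for HMMs. As a study aid, here is a minimal sketch in Python; the toy weather model (states, observations, and all probabilities) is an illustrative assumption, not part of the course materials:

```python
# Minimal Viterbi decoding for an HMM (weeks 6-7).
# The toy weather model below is illustrative only.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for obs."""
    # V[t][s] = (best probability of any path ending in state s at time t, backpointer)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p][0] * trans_p[p][s])
            V[t][s] = (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]], prev)
    # Follow backpointers from the best final state.
    best = max(states, key=lambda s: V[-1][s][0])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return path[::-1]

states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(("walk", "shop", "clean"), states, start_p, trans_p, emit_p))
# → ['Sunny', 'Rainy', 'Rainy']
```

This version multiplies raw probabilities for clarity; for longer sequences (as in the HWs) you would work with log probabilities to avoid underflow.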
Unit 2: Unsupervised Learning for Sequences: Transliteration and Translation | |||
8 | Oct 21 | Korean vs. Japanese writing systems. More on semi-Markov Viterbi. EM for transliteration. | |
9 | Oct 28 | More on EM: forward-backward and theory. | HW4 out: EM for transliteration |
10 | Nov 4 | Machine Translation: IBM Models 1-2 | |
11 | Nov 11 | EM for IBM Model 1 | |
12 | Nov 18 | EM/HMM demo from Jason Eisner. Pointwise mutual information vs. IBM Models 1 and 4. | |
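Unit 2 above culminates in EM training for IBM Model 1 (weeks 10-12; HW5). A minimal sketch of the E/M loop, assuming a tiny made-up bitext for illustration:

```python
# EM for IBM Model 1 translation probabilities t(f|e) (weeks 10-12).
# The toy bitext below is made up for illustration.
from collections import defaultdict

def ibm1_em(bitext, iterations=10):
    """bitext: list of (english_words, foreign_words) sentence pairs."""
    # Default value 1.0 gives a uniform start (normalized away in the E-step).
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        for e_sent, f_sent in bitext:           # E-step
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)  # normalization over alignments
                for e in e_sent:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        for (f, e) in count:                    # M-step
            t[(f, e)] = count[(f, e)] / total[e]
    return t

bitext = [(["the", "house"], ["la", "maison"]),
          (["the", "book"], ["le", "livre"]),
          (["a", "book"], ["un", "livre"])]
t = ibm1_em(bitext)
# t[("livre", "book")] rises toward a high value as EM sharpens the alignment.
```

Because "livre" co-occurs with "book" in both of its sentences but with "the" and "a" only once each, EM concentrates probability mass on t(livre|book), the classic Model 1 behavior.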
Unit 3: Tree Models: Syntax, Parsing, and Semantics | |||
13 | Nov 25 | CFG and CKY | HW5 out: IBM Model 1 |
14 | Dec 2 | Semantics intro; entailment; upward and downward monotonicity. | |
15 | Dec 9 (last class) | Compositional semantics: quantifiers, type raising. | HW6 out: parsing |
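Week 13 covers CFGs and CKY, and HW6 is on parsing. A minimal CKY recognizer for a grammar in Chomsky normal form; the toy grammar below is an illustrative assumption, not the HW6 grammar:

```python
# Minimal CKY recognition for a CNF grammar (week 13).
# The toy grammar below is illustrative only.

def cky(words, grammar):
    """grammar maps an RHS tuple (one terminal or two nonterminals)
    to the set of LHS nonterminals that can produce it."""
    n = len(words)
    # chart[i][j] = set of nonterminals spanning words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(grammar.get((w,), set()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):           # split point
                for B in chart[i][k]:
                    for C in chart[k][j]:
                        chart[i][j] |= grammar.get((B, C), set())
    return "S" in chart[0][n]

grammar = {("the",): {"Det"}, ("dog",): {"N"}, ("cat",): {"N"},
           ("chased",): {"V"},
           ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}
print(cky("the dog chased the cat".split(), grammar))  # → True
```

Extending the chart entries from sets to best-score backpointers turns this recognizer into a Viterbi (max-probability) parser, the dynamic-programming pattern recurring throughout the course.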