Time and Location | M 4:15-6:15pm, Room 5383 |
Personnel | Prof. Liang Huang (huang @ cs.qc), Instructor; Jie Chu (jchu1 @ gc.cuny), TA |
Office Hours (tentative) | LH: M 6:15-6:30pm, CS Lab; JC: F 3-4pm, CS Lab. Additional office hours are available before HW due dates and exams. |
Prerequisites | CS: algorithms and data structures (especially recursion and dynamic programming); solid programming skills; basic understanding of automata theory.
Math: good understanding of basic probability theory. |
Textbooks | This course is self-contained (with slides and handouts) but you may find the following textbooks helpful:
|
Grading |
|
Week | Date | Topics | Homework/Quiz |
1 | Jan 28 | Intro to NLP and rudiments of linguistic theory; intro to Python for text processing | Ex0 |
Unit 1: Sequences and Noisy-Channel |
2 | Feb 4 | Basic automata theory. FSA (DFA/NFA) and FST. | |
3 | Feb 11 | FSAs/FSTs cont'd; the noisy-channel model. | Quiz 0 (Python and trees); HW1 out: FSA/FSTs, carmel. |
President's Day. Class moved to Wednesday. |
4 | W Feb 20 | Probability theory and estimation. | Weighted FSA/FSTs. Noisy-channel model. Help on HW1. |
5 | Feb 25 | Language Models and Smoothing; P(Obama), P( | Bush). | HW1 due | Jie: Discussions of HW1; minilecture on Unix; hash vs. array. |
6 | Mar 4 | Smoothing: pseudocounts, prior/MAP; add-(less-than)-one, Witten-Bell, Good-Turing; backoff and interpolation | Quiz 0' (trees, stack, postfix/SOV, FSA (pluralizer), hash, binary search). |
7 | Mar 11 | Entropy/Perplexity; Shannon Game | HMM and Viterbi; Japanese Transliteration; Ex1 |
8 | Mar 18 | Trigram Viterbi; Excel Demo | More on English and Japanese Phonology; Phonetics/Phonology 101: IPA, emic-etic; help on Ex1. | HW2 out: Shannon Game, English Pronunciation, and Katakana transliteration |
Spring Break | HW2 due on Friday 4/5 |
Unit 2: Trees and Grammars |
11 | Apr 8 | CFGs | Jie: discussions on HW2 | Proposal suggestions out |
12 | Apr 15 | PCFGs and CKY | Bottom-up vs. top-down dynamic programming with memoization; Hypergraphs: generalized topological sort; Viterbi=>CKY; Dijkstra=>Knuth |
13 | Apr 22 | Probabilistic Parsing with Unary Rules | Earley's Algorithm; project proposal due | HW3 out: PCFG and CKY |
Unit 3: Language Learning |
14 | Apr 29 | Unsupervised Learning | EM (slow version). EM slides; help on HW3. |
15 | May 6 | Theory of EM convergence | EM (fast version: DP/forward-backward); HW3 due. HW4 out: EM on Katakana transliteration. |
16 | May 13 | Last week; project mid-way presentations | HW4 due on Thursday (last day of instruction). |
17 | May 20 | |