Language Technology, Fall 2014
Exercise 1 - FUN with morphology and entropy
(Due Monday 10/6 11:59pm on Blackboard)
--------------------------------------------------

1. Redo the pluralization transducer from the quiz. Besides the default +s rule, you should also handle the following rules:

   * {-s, -x, -sh, -ch} + -es (e.g. buses, boxes, bushes, matches),
   * -f + -ves (e.g. leaves), and
   * -y + -ies (e.g. flies).

   You don't need to consider any other rules or irregularities (e.g. tooth => teeth).

   (a) Include a photo of your transducer on paper.
   (b) Include a carmel text file for your transducer, and test it with at least five examples.

2. Rigorously derive this central equation of the noisy-channel model (see slide 4):

   argmax_{t_1..t_n} P(t_1..t_n | w_1..w_n)
     ~ argmax_{t_1..t_n} P(t_1) P(t_2|t_1) P(t_3|t_2 t_1) ... P(t_n|t_{n-1} t_{n-2})
                         P(w_1|t_1) P(w_2|t_2) P(w_3|t_3) ... P(w_n|t_n)

   Here "~" means "approximately equal". Each step of your derivation is either an equality or an approximation:

   * in case of equality, annotate the law/rule used (e.g., Bayes' rule);
   * in case of approximation, explain the reason/assumption behind it.

   Also explain why this model is intuitively called an "HMM".

3. Play the Shannon Game at least once (it's fun!), and report your entropy.

   http://www.math.ucsd.edu/~crypto/java/ENTROPY/

   (If the applet doesn't run: System Preferences -> Java -> Security -> Edit Site List, and add http://www.math.ucsd.edu to that list.)

   (a) Take a screenshot of your result.
   (b) Work out the formula the game used to calculate your entropy, and verify it against your result.
   (c) Write a short paragraph of observations from your game and your result.
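As a reminder of the kind of step-by-step annotation Question 2 expects, the opening equalities typically look like this (the remaining steps, i.e. the chain rule plus the Markov and conditional-independence approximations, are the point of the exercise and are left to you):

```latex
\begin{align*}
\operatorname*{argmax}_{t_1..t_n} P(t_1..t_n \mid w_1..w_n)
 &= \operatorname*{argmax}_{t_1..t_n}
    \frac{P(t_1..t_n)\, P(w_1..w_n \mid t_1..t_n)}{P(w_1..w_n)}
    && \text{(Bayes' rule)} \\
 &= \operatorname*{argmax}_{t_1..t_n}
    P(t_1..t_n)\, P(w_1..w_n \mid t_1..t_n)
    && \text{(denominator does not depend on } t_1..t_n\text{)}
\end{align*}
```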
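As a sanity check for Question 1, the four suffix rules can be sketched as a plain Python function (this is not a transducer and not a substitute for the carmel file; it is just a convenient way to generate input/output test pairs, and the function name is our own):

```python
def pluralize(noun):
    """Apply the suffix rules from Question 1, defaulting to +s.

    Only the rules listed in the exercise are handled; irregular
    nouns (tooth -> teeth) are intentionally out of scope.
    """
    if noun.endswith(("s", "x", "sh", "ch")):
        return noun + "es"        # bus -> buses, match -> matches
    if noun.endswith("f"):
        return noun[:-1] + "ves"  # leaf -> leaves
    if noun.endswith("y"):
        return noun[:-1] + "ies"  # fly -> flies
    return noun + "s"             # default rule: dog -> dogs
```

Running your carmel transducer on the same five-plus examples should reproduce what this function prints.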
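For Question 3(b), one classic entropy estimate from a Shannon-game transcript (due to Shannon's 1951 guessing-game analysis) treats the empirical distribution over guess counts as a probability distribution and computes its entropy. Note this is an assumption about the applet: the UCSD game's exact formula may differ, and working it out is precisely part (b). A minimal sketch:

```python
import math
from collections import Counter

def shannon_game_entropy(guess_counts):
    """Plug-in entropy estimate from a Shannon-game transcript.

    guess_counts[k] is the number of guesses you needed for the
    k-th letter. Let q_i be the fraction of letters guessed on the
    i-th try; return -sum_i q_i log2 q_i bits per letter.
    (Assumed formula -- verify against the applet's output.)
    """
    n = len(guess_counts)
    freqs = Counter(guess_counts)
    return -sum((c / n) * math.log2(c / n) for c in freqs.values())
```

For example, a transcript where every letter is guessed on the first try gives 0 bits, and one split evenly between first- and second-try guesses gives 1 bit.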