Review Questions for Midterm I. Solutions.

Midterm I (Friday Feb 5) will be very similar to these questions.

0. The only Turing-Award winner from this state is: __Douglas Engelbart__
   He invented the __mouse__ (something you use every day).
   He graduated from __OSU, of course!__ (choose one: OSU / UO).

   The only other Turing-Award winner who has lived in this state is: __John Backus__
   He invented the __FORTRAN__ language and co-invented the __ALGOL__ language.
   He also co-invented the BNF notation for the __context-free grammar__ invented by __Noam Chomsky__.

   Lex was co-authored by __Eric Schmidt__ who was the CEO of __Google__.
   He wrote Lex during his internship with __Al Aho__ who is a co-author of the compiler textbook ("dragon book").

1. Write regular expressions for

   (a) Python variable names

       [a-zA-Z_][a-zA-Z0-9_]*

   (b) integers

       (-?)\d+

   (c) simple float numbers (i.e., no scientific notation)

       on week2-3 slides

2. Are the following grammars ambiguous? If so, demonstrate.

   (a) E -> E + E | int

       yes: two parse trees for int + int + int

       (E (E int) + (E (E int) + (E int)))
       (E (E (E int) + (E int)) + (E int))

   (b) E -> E + (E) | int

       no.

3. For this grammar:

       E -> E + E | E * E | int

   (a) demonstrate two types of ambiguities (precedence and associativity).

       actually three kinds of ambiguities here:

       precedence b/w + and *:  int + int * int
       associativity of +:      int + int + int
       associativity of *:      int * int * int

   (b) write a grammar that eliminates the precedence ambiguity only.

       E -> E + E | T
       T -> T * T | int

   (c) write a grammar that eliminates the associativity ambiguity only,
       where both + and * are right-associative, i.e., a+b+c = a+(b+c).

       E -> T + E | T * E | T
       T -> int

       (note this grammar is not ambiguous, but it does not have operator precedence).

   (d) write a grammar that eliminates both ambiguities, where + is left-associative,
       * is right-associative, and + has lower precedence than *.

       E -> E + T | T
       T -> F * T | F
       F -> int

4.
Demonstrate shift-reduce conflict for

   (a) E -> E + E | int

       step | stack . next_token | action (after)
       -----+--------------------+---------------
          0 | . int              | shift
          1 | int . +            | reduce E->int
          2 | E . +              | shift
          3 | E + . int          | shift
          4 | E + int . +        | reduce E->int
          5 | E + E . +          | reduce E->E+E (can also shift!)
          6 | E . +              | shift
          7 | E + . int          | shift
          8 | E + int . $        | reduce E->int
          9 | E + E . $          | reduce E->E+E
         10 | E . $              | accept

       the resulting parse tree is left-associative:

       (E (E (E int) + (E int)) + (E int))

       the above is preferring reduce. if we prefer shift instead, then

          5 | E + E . +          | shift
          6 | E + E + . int      | shift
          7 | E + E + int . $    | reduce E->int
          8 | E + E + E . $      | reduce E->E+E
          9 | E + E . $          | reduce E->E+E
         10 | E . $              | accept

       the resulting parse tree is right-associative:

       (E (E int) + (E (E int) + (E int)))

   (b) E -> E + E | E * E | int

       similar to above

   (c) S -> if E then S | if E then S else S | print "5"
       E -> True | False

       at this step (see LR slides):

       if E then S . else

       you can either shift or reduce

5. The Ls and Rs in LL and LR mean:

   (a) left-to-right scanning
   (b) right-to-left scanning
   (c) left-most derivation
   (d) right-most derivation

   LL = (a)(c)
   LR = (a)(d)

   LL and LR are:

   (a) shift-reduce
   (b) recursive descent
   (c) bottom-up
   (d) top-down

   LL = (b)(d)
   LR = (a)(c)

   In practice, Python is parsed by _LL_ and C/C++/Java by _LR_.

   Why is there no "right-to-left" parsing for programming languages?

   variables have to be defined (or at least declared) first before being referred to.

6. Reduce-reduce conflict is due to ______. Give an example using P_2.

   due to multiple nonterminals deriving the same substring.
   e.g.: bool_expr -> name and int_expr -> name

   or in this grammar:

       E -> E + T | T
       T -> int

   adding E -> int would cause reduce-reduce conflict.

7. If in a shift-reduce conflict, the parser always prefers reduce,
   the result will be left-associative or right-associative? Demonstrate.

   see problem 4.

8.
Redo the Python expr to LISP example from week1 slides,
   using the new ast instead of the old compiler package.

   trivial.

9. Consider the following grammar P_0.8:

       module : stmt+
       stmt : (assign_stmt | print_stmt) NEWLINE
       assign_stmt : name "=" decint
       print_stmt : "print" name

   Now you need to add a simple "while" statement, e.g.:

       a = 0
       while a < 10:
           a += 1
           print a

   For simplicity we only consider:

   (a) variable initialization to an integer outside of loop (but not in a loop)
   (b) variable increment (by an integer) or print_stmt inside loop (i.e., no nested loops)
   (c) single comparison in the form of "variable < integer" for the condition

   Now you need to:

   (1) Write additional CFG rules
   (2) Write additional code for lex
   (3) Write additional code for yacc
   (4) Write additional code to convert to C.

   The solution to P_0.8 (lex, yacc, translation) is in addwhile.py.

   see addwhile_solution.py online.
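
Note on problems 4 and 7: the two traces for E -> E + E | int can be replayed with a small hand-written shift-reduce loop. This is only a minimal sketch, not the course's lex/yacc code; the function name `parse` and the `prefer` flag are made up for illustration. It hard-codes the two rules of this one grammar and resolves the conflict at "E + E . +" according to `prefer`.

```python
def parse(tokens, prefer="reduce"):
    """Shift-reduce parse for E -> E + E | int (illustrative sketch only).

    tokens: list of "int" and "+" strings.
    prefer: "reduce" or "shift"; used only at the conflict  E + E . +
    Returns the parse tree as a bracketed string.
    """
    stack = []                      # holds "+" or finished E subtrees (strings)
    rest = list(tokens) + ["$"]     # "$" marks end of input
    while True:
        nxt = rest[0]
        sum_on_top = (len(stack) >= 3 and stack[-2] == "+"
                      and stack[-1] != "+" and stack[-3] != "+")
        if stack and stack[-1] == "int":
            stack[-1] = "(E int)"                       # reduce E -> int
        elif sum_on_top and (nxt == "$" or prefer == "reduce"):
            right = stack.pop()
            stack.pop()                                 # discard "+"
            left = stack.pop()
            stack.append(f"(E {left} + {right})")       # reduce E -> E + E
        elif nxt == "$":
            break                                       # nothing left to do
        else:
            stack.append(rest.pop(0))                   # shift
    if len(stack) != 1 or stack[0] == "+":
        raise ValueError("input is not in the language")
    return stack[0]                                     # accept


if __name__ == "__main__":
    toks = ["int", "+", "int", "+", "int"]
    print(parse(toks, prefer="reduce"))  # (E (E (E int) + (E int)) + (E int))
    print(parse(toks, prefer="shift"))   # (E (E int) + (E (E int) + (E int)))
```

Preferring reduce groups the leftmost pair first (left-associative); preferring shift delays every reduction until the end of input, so the rightmost pair is grouped first (right-associative), matching the two trees in problem 4(a).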