Lecture Notes
Alex Aiken
CS 164, Fall '94

			        Lecture 10
                                ----------

I.  Course Administration

    - Send me your name and SID or SS# if you are an extension student.
    - WAII is assigned today.

II. Review from last time:
	- A handle is a production X -> A and position in a right-sentential
		form BAw such that S -*> BXw -> BAw via a rightmost 
		derivation.

	- Handles always appear at the top of the stack in shift-reduce
		parsing.
		- In particular, the handle is never to the left of the
			rightmost non-terminal (can't be because 
			the parser is tracing a rightmost derivation)
		- Therefore shift and reduce moves are sufficient (never
			need to move the "#" to the left).

III. A digression: Conflicts

    - General shift-reduce strategy:
	If there is no handle on the stack, shift.  If there is a handle,
	reduce.

    - But what if there is a choice? 
	- if it is legal to either shift or reduce, then there is 	
		a shift-reduce conflict

	- if it is legal to reduce by two or more productions, then there
		is a reduce-reduce conflict.

    - Conflicts generally arise with ambiguous grammars. (All ambiguous
	grammars generate conflicts, but so do other kinds of 
	grammars).  For example:

	E -> E * E | E + E | id

        # id * id + id
        ...
	E * E # + id     reduce
	E # + id         shift
	E + # id	 shift
	...
	E + E		 reduce
	E

	Alternatively, we can also:
        # id * id + id
        ...
	E * E # + id     shift
	E * E + # id     shift
 	...
	E * E + E	 reduce
	E * E		 reduce
	E
		
	In this situation we can either reduce by E * E or
	shift a + in second step shown.  Which one is chosen
	determines the associativity of * and +.

    - One can always rewrite ambiguous grammars of this sort to encode
	precedence in the grammar.

    - However, such grammar hacking is tedious and results in an unnatural
	grammar that is hard to understand.

    - bison permits precedence and associativity declarations.  
	For example, one can say that * has higher precedence than +.

    - The effect of precedence and associativity declarations is to cause 
	bison to resolve conflicts in certain ways.  For example, 
	declaring  * to have higher precedence than + causes the parser
	to reduce in the example above.

    - Be careful with precedence and associativity declarations.  They
	are basically heuristics and, while they usually do the "right"
	thing, you can get into trouble.  Make sure you understand 
	precedence and associativity declarations in bison.

IV.  Detecting Handles

    - Our favorite grammar:
        E -> ( E ) 
          |  E'
          |  E' + E
        E' -> int * E'
          | int

    - To implement shift-reduce parsing, we must detect handles.

    - Important:  Just because there are symbols on the stack that can be 
	reduced does not mean there is a handle.  Reusing an example from
	Lecture 9, the string int # * int + int can be reduced by E' -> int to  
	E' # * int + int.  But such a reduction is not part of any rightmost 
	derivation and therefore is not a handle.

    - Of course, the rhs of some production must appear at the top of the stack,
	 but this is only a necessary and not sufficient for having a handle.

    - Put another way, whether a given stack has a handle can depend on the
	entire contents of the stack. 

    - Fact: If a grammar is unambiguous, then every sentential form has a 
	unique handle.  For simplicity, we assume grammars are 
	unambiguous.  The book takes a more general approach.

    - Definition: The set of "viable prefixes" are all A such that A#w
	is a configuration of a shift-reduce parse and Aw is a right
	sentential form.  Equivalently,	the viable prefixes are the
	prefixes of (rightmost derived) sentential forms that don't extend 
	beyond the right end of the handle.

    - Note: a viable prefix is "viable" because it can be extended by adding
	terminals to form a valid sentential form of some rightmost derivation.
	As long as there is a viable prefix, no parsing error has been detected.

    - Important Fact #3 about bottom-up parsing is:
	The set of viable prefixes is a regular language.

    - Fact #3 is completely non-obvious.  We won't prove it formally,
	but instead show how to construct an automaton that accepts
	viable prefixes.

    - Definition: An item is a production of a grammar with a "." somewhere
	on the rhs.  For example,

		E -> . ( E )
		E -> (. E )
		E -> ( E ) .
	
	are all items.  For epsilon productions X -> \epsilon the only
	item is X -> .  Items are also called LR(0) items.

    - Intuition:
	- The problem in recognizing viable prefixes is that the stack may
	  have only bits and pieces of the rhs of productions (if it had
	  the complete rhs we could reduce).

	- Part of the trick is that the bits and pieces---at least
	  in accepting parses---are always prefixes of the rhs of
	  productions.

	- Example:  Consider the input ( int ).
	   Then the stack   ( E # ) is a state of the shift reduce parser.
	   "( E" is a prefix of the rhs of E -> ( E ), which we will reduce
	   after the next shift.

        - An item  E -> ( E . ) says we have so far seen the prefix ( E of this
	  rhs and hope to see ).

   	- There is more to the story.  If the stack does not have a prefix
   	  of just one rhs R1, then the top of the symbols above the
	  prefix of R1 must be a prefix of some rhs R2 that will eventually
	  be reduced to the remainder of R1.  Recursively, there can be
	  symbols above R2, and so on.

	- Example: consider ( id )
	  The stack  ( int * # int ) is a state of the shift-reduce parser.
	  ( is a prefix of the rhs of E -> ( E ), and int * is a prefix of the
	  rhs of E' -> int * int.

    - The Basic Idea: Recognizing viable prefixes is equivalent to recognizing
	a sequence of prefixes of rhs's of productions, where each
	prefix in the sequence can eventually reduce to part of the missing
	suffix of its predecessor.

    - Algorithm:  Given a grammar G, construct an NFA recognizing the
	viable prefixes of right sentential forms of G.

	(1) Add a dummy production S' -> S to G
 	(2) The states of the NFA are the items of G with the extra prod.
	(3) For each item E -> A . X B and production X -> C, add an
		edge labeled \epsilon:
		E -> A . X B   ===epsilon===>  X -> . C
	(4) For each item E -> A.XB add an edge labelled X:
		E -> A . X B   ===X===> A X . B
        (5) Every state is an accepting state.
	(6) Start state is S' -> . S

    - Example:  The NFA for the example grammar above.

	Item		Eps Transitions		Other Transitions	On
	S' -> . E	E -> . ( E )		S' -> E .		E
			E -> . E'
			E -> . E' + E
			E' -> . int * E'
			E' -> . int		
	E -> . ( E )				E -> ( . E )		(
	E -> . E'	E' -> . int * E'	E -> E' .		E'
			E' -> . int
	E -> . E' + E	E' -> . int * E'	E -> E' . + E		E'
			E' -> . int		
	E' -> .int * E'				E -> int . * E'		int
	E' -> . int				E -> int .		int
	E -> ( . E )	E -> . ( E )		E -> ( E . )		E
			E -> . E'
			E -> . E' + E
			E' -> . int * E'
			E' -> . int
	E -> ( E . )				E -> ( E ) .		)
	E -> E' . + E				E -> E' + . E		+
	E -> E' + . E 	E -> . ( E )		E -> E' + E .		E
			E -> . E'
			E -> . E' + E
			E' -> . int * E'
			E' -> . int
	E -> int . * E'				E -> int * . E'		*
	E -> int * . E'	E' -> . int * E'	E -> int * E' . 	E'
			E' -> . int
	E' -> . int				E' -> int .		int

    - We can translate this NFA into a DFA by taking the \epsilon-closure
	of the NFA states.

	State		Items			Transition 		On
	1		S' -> . E		9			E
			E  -> . ( E )		2			(
			E  -> . E' + E		5			E'
			E  -> . E' 		3			int
			E' -> . int * E'
			E' -> . int	
	2		E  -> ( . E )		7			E
			E  -> . ( E )		2			(
			E  -> . E' + E		5			E'
			E  -> . E'		3			int
			E' -> . int * E'
			E' -> . int	
	3		E' -> int . * E'	4			*
			E' -> int .	
	4		E' -> int * . E'	10			E'
			E' -> . int * E'	3			int
			E' -> . int	
	5		E  -> E' . + E		6			+
			E -> E' .
	6		E -> E' + . E		11			E	
			E  -> . ( E )		2			(
			E  -> . E' + E		5			E'
			E  -> . E' 		3			int
			E' -> . int * E'
			E' -> . int		
	7		E -> ( E . )		8 			)
	8		E -> ( E ) .
	9		S' -> E .
	10		E' -> int * E' .
	11		E -> E' + E .

     - Definition: An item X -> B . C is valid for a viable prefix AB if
	there is a rightmost derivation S' -*> AXw -> ABCw

     - Intuition: the valid items for A are the prefixes of productions we
	might see after A.

     - Fact: An item I is valid for a viable prefix A if the DFA recognizing
	viable prefixes terminates on input A in a state containing I.

     - Notes:
	- An item is generally valid for many prefixes.
	  Example:  The item E -> (. E ) is valid for prefixes
		(
		(( 
		(((...

	- The states of the DFA are called the "canonical collections of items"
	 	or the "canonical collections of LR(0) items"

	- The book gives another way of constructing LR(0) items.

 V SLR Parsing Example

	- A SLR (Simple LR) parsing example was given in class using the
	  automaton above.  The SLR parsing algorithm is given at the
	  start of Lecture 11.

	- Consider int * int.

		Configuration		DFA Halt State		Action
		# int * int $		1			shift
		int # * int $		3 (* not in Follow(E')) shift
					(note we avoid mistake here)
		int * # int $		4			shift
		int * int # $		3 ($ is in Follow(E'))	red E' -> int
		int * E' # $		10			red E' -> int *E'
		E' # $			6			red E -> E'
		-- accept --