EECS 473 - MP5: Semantic Verification using Bison

Due: April 23, 2001 at 11:59 pm (extended from April 19, 2001)

Update 4/16/2001 - Getting it done in two weeks

In order to get a compiler written before the end of the semester, we need to scale back the assignment.
  1. Remove all functions and methods from the grammar. The only function will be the main() function which will have a return type of void. This impacts the rest of the grammar in the following ways:
    1. No prototypes or function calls needed.
    2. No return statement needed.
    3. Classes become the equivalent of C structs. All data members are public, so the private/public keywords are not needed.
    4. The local variables of main() and the global variables can be at the same scope.
    5. A global scope and class scopes are the only scoping levels that need to exist.
  2. We will remove pointer elements (the main purpose was pass-by-reference parameters). This removes the -> operator and the unary * operator.
  3. We will also remove the if statement. The inclusion of the while statement introduces all compiler elements used by the if statement.
  4. We will also remove arrays from the problem (4/18/2001)

Original Write-up

For this program you are to add semantic checking to the parser that was created in MP 4 (see end of this page). The primary checks concern the use of identifiers. Other checks will include basic type checking.
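
The identifier checks can be sketched with a small symbol table. The following is only a minimal sketch in C, assuming the scaled-back scoping from the update above (one global scope plus one scope per class). The names Scope, Symbol, scope_lookup, scope_declare and the fixed capacity are hypothetical and are not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed fixed capacity, enough for a sketch */

/* One declared identifier: its name and the name of its type.
 * The strings are not copied, so the caller must keep them alive. */
typedef struct {
    const char *name;
    const char *type;    /* "int", "char", or a class name */
} Symbol;

/* A flat scope: either the global scope or the member list of one class. */
typedef struct {
    Symbol symbols[MAX_SYMBOLS];
    int count;
} Scope;

/* Return the symbol named `name` in `scope`, or NULL if it is undeclared. */
const Symbol *scope_lookup(const Scope *scope, const char *name)
{
    assert(scope != NULL && name != NULL);
    for (int i = 0; i < scope->count; i++) {
        if (strcmp(scope->symbols[i].name, name) == 0)
            return &scope->symbols[i];
    }
    return NULL;
}

/* Declare `name` in `scope`.
 * Returns 1 on success, 0 if the name is already declared or the scope is full. */
int scope_declare(Scope *scope, const char *name, const char *type)
{
    if (scope_lookup(scope, name) != NULL || scope->count >= MAX_SYMBOLS)
        return 0;
    scope->symbols[scope->count].name = name;
    scope->symbols[scope->count].type = type;
    scope->count++;
    return 1;
}
```

In a parser, an action for a variable declaration might call scope_declare and report a redeclaration when it returns 0, while an action for an identifier used in a statement might call scope_lookup and report an undeclared identifier when it returns NULL.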

If the program input to your parser has no syntax errors but has at least one semantic error, your parser is to print an error message for each semantic error encountered. These error messages should be descriptive: include the line number where the error was found, an element involved in the error, and a short description of what the error is.
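
One possible shape for such a message is sketched below in C. The function name format_semantic_error and the exact message layout are assumptions made for illustration; the assignment does not prescribe a format, only the three pieces of information.

```c
#include <assert.h>
#include <stdio.h>

/* Format one semantic error into `buf`: the line number, the element
 * involved, and a short description. Returns the value of snprintf,
 * i.e. the length the full message has (or would have if truncated). */
int format_semantic_error(char *buf, size_t size, int line,
                          const char *element, const char *description)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "line %d: '%s': %s", line, element, description);
}
```

For example, calling it with line 12, element "count" and description "undeclared identifier" fills the buffer with: line 12: 'count': undeclared identifier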

As soon as a syntax error is encountered, your program may halt execution with some appropriate error message. No recovery is needed for syntax errors.

EECS 473 - MP4: Syntax Verification using Bison

Due: April 9, 2001 at 11:59 pm (extended from April 5, 2001; 10 pts extra credit if turned in by April 6, 2001 at 11:59 pm)

For this assignment, you will use Bison with your Flex program from MP3 to verify the syntax for the following grammar. This program is to do syntax checking only and NO semantic checking (that comes in MP 5).

Bison can be found at the following location on the EECS machines

	/usr/local/gnu/bin/bison
You will need to add this to your path if it is not already there (and you don't want to type out the entire pathname every time you wish to run the program).

The grammar for this assignment is as follows. Note: the grammar is written using regular-expression notation.

LHS of Rule -> RHS of Rule
Start -> (varDecl | classDef | functDef | prototype ) *
varDecl -> type varList SEMI
type -> int | char | ident
varList -> varList, varIdent | varIdent
varIdent -> [STAR] ident [ OPENSB num CLOSESB ]
classDef -> class ident OPENCB ( [ public COLON | private COLON | E ] varDecl | functDef )* CLOSECB
functDef -> returnType ident OPENPAREN FormalParamList CLOSEPAREN OPENCB body CLOSECB
prototype -> returnType ident OPENPAREN ProtoParamList CLOSEPAREN SEMI
returnType -> void | type [STAR]
FormalParamList -> E | type [ STAR ] ident [ COMMA type [STAR] ident ]*
ProtoParamList -> E | type [ STAR ] [ident] [ COMMA type [STAR] [ident] ]*
body -> ( varDecl | prototype | statement )*
statement -> lhsIdent EQUAL expr SEMI |
if OPENPAREN expr CLOSEPAREN statement [ else statement ] |
while OPENPAREN expr CLOSEPAREN statement |
OPENCB statement* CLOSECB |
return expr SEMI |
cout [ DOUBLELESSTHAN expr ]+ SEMI |
cin [ DOUBLEGREATERTHAN lhsIdent ]+ SEMI |
fcall SEMI
fcall -> lhsIdent OPENPAREN CallParamList CLOSEPAREN
CallParamList -> E | expr [ COMMA expr ]*
lhsIdent -> ident | ident OPENSB expr CLOSESB | lhsIdent PERIOD lhsIdent | lhsIdent ARROW lhsIdent | STAR lhsIdent
expr -> fcall | NUM | CHAR | STRING | OPENPAREN expr CLOSEPAREN | expr BOP expr | UOP expr | rhsIdent
rhsIdent -> ident | ident OPENSB expr CLOSESB | rhsIdent PERIOD rhsIdent | rhsIdent ARROW rhsIdent | STAR rhsIdent | AMPERSAND rhsIdent
BOP -> + | - | STAR | / | % | > | >= | < | <= | == | != | && | DOUBLEBAR
UOP -> + | - | !
EQUAL -> =
AMPERSAND -> &
DOUBLEBAR -> ||
COMMA -> ,
PERIOD -> .
ARROW -> ->
SEMI -> ;
STAR -> *
OPENSB -> [
CLOSESB -> ]
OPENPAREN -> (
CLOSEPAREN -> )
COLON -> :
OPENCB -> {
CLOSECB -> }
DOUBLELESSTHAN -> <<
DOUBLEGREATERTHAN -> >>

The bison program is to be given the name of the program/file to analyse through the command line.
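
A minimal sketch in C of reading the file name from the command line is shown below. The helper name open_source and the usage message are assumptions for illustration. In a Flex/Bison program, the returned stream would typically be assigned to yyin before yyparse() is called.

```c
#include <assert.h>
#include <stdio.h>

/* Open the source file named by the first command-line argument.
 * Returns NULL (after printing a message to stderr) when no file name
 * was given or the file cannot be opened. */
FILE *open_source(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc < 2) {
        fprintf(stderr, "usage: %s <source-file>\n",
                argc > 0 ? argv[0] : "parser");
        return NULL;
    }
    FILE *in = fopen(argv[1], "r");
    if (in == NULL)
        perror(argv[1]);   /* report why the file could not be opened */
    return in;
}
```

A main() function might call open_source(argc, argv), exit with a nonzero status when the result is NULL, and otherwise hand the stream to the scanner.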

Your program will be submitted electronically using turnin and must run on the EECS department computers. You must also submit a makefile to compile your program. Your program is to be the result of individual work and is expected to be written using good programming style. You are to use the project name of mp4 when using turnin.

Added 4/4/2001 - Based on Discussion During Class

The following is an attempt at creating an LALR grammar based on the above grammar, as was discussed during class. Note: even though this is listed using almost yacc/bison syntax, do not take this as being 100% correct. It has not been tested (plus I have to leave something to you :) ). Non-terminals will be written starting with an upper-case letter (unless I make a mistake). Also, operator symbols will be given "as is".
Start 	:	Gitem
	|	Start Gitem
	;

Gitem	:	class ident { Classbody }
	|	void ident ( Pplist
	|	void ident ( Fplist
	|	Type Optstar ident Girest
	;

Type	:	int
	|	char
	|	ident
	;

Optstar	:	/* empty */
	|	*
	;

/* See other form at bottom that should be 100% LALR - 4/6/2001 */
Girest	:	Optarray Vardecl
	|	( Pplist
	|	( Fplist
	;

Optarray	:	/* empty */
	|	[ num ]
	;

/* Added Optarray in second rule 4/6/2001 */
Vardecl	:	;
	|	, Optstar ident Optarray Vardecl
	;

Fplist	:	) { Body }
	|	Fplist1 { Body }
	;

Fplist1	:	Type Optstar ident )
	|	Type Optstar ident , Fplist1
	;

Pplist	:	) ;
	|	Fplist1 ;  /* idents always given */
	|	Pplist1 ;  /* idents never given */
	|	Pplist2 ;  /* idents not given, given, opt given */
	|	Pplist3	;  /* idents given, not given, opt given */
	;

Pplist1	:	Type Optstar )
	|	Type Optstar , Pplist1
	;

/* Note: this may not be LALR */
Pplist2	:	Type Optstar , Type Optstar Pplist2a ident Pplist4 )
	;

Pplist2a	:	/* empty */
	|	, Type Optstar Pplist2a
	;

/* Note: this may not be LALR */
Pplist3	:	Type Optstar ident , Type Optstar Pplist3a Pplist4 )
	;

Pplist3a	:	/* empty */
	|	ident , Type Optstar Pplist3a
	;

Pplist4	:	/* empty */
	|	, Type Optstar Pplist4
	|	, Type Optstar ident Pplist4
	;

Classbody	:	Citem
	|	Classbody Citem
	;

Citem	:	Optscope void ident ( Fplist
	|	Optscope Type Optstar ident Cirest
	;

Optscope	:	/* empty */
	|	public :
	|	private :
	;

Cirest	:	Optarray Vardecl
	|	( Fplist
	;

Body	:	Bitem
	|	Body Bitem
	;

/* not sure if this is LALR */
Bitem	:	void ident ( Pplist
	|	Type Optstar ident ( Pplist
	|	Type Optstar ident Optarray Vardecl
	|	Statement
	;

Statement	:	;
	|	Lhsid = Expr ;
	|	Lhsid ( Cplist ;
	|	if ( Expr ) Statement
	|	if ( Expr ) Statement else Statement
	|	while ( Expr ) Statement
	|	return Expr ;
	|	cout Couts ;
	|	cin Cins ;
	|	{ Stlist }
	;

Stlist	:	/* empty */
	|	Statement Stlist
	;

Cplist	:	) 
	|	Cplist1
	;

Cplist1	:	Expr )
	|	Expr , Cplist1
	;

Lhsid	:	Lhsbase
	|	Lhsid -> Lhsbase
	|	Lhsid .  Lhsbase
	|	* Lhsid
	;

Lhsbase	:	ident
	|	ident [ Expr ]
	;

Couts	:	/* empty */
	|	<< Expr Couts
	;

Cins	:	/* empty */
	|	>> Lhsid Cins
	;

Expr	:	Expr || Expr1
	|	Expr1
	;

Expr1	:	Expr1 && Expr2
	|	Expr2
	;

Expr2	:	Expr2 == Expr3
	|	Expr2 != Expr3
	|	Expr3
	;

Expr3	:	Expr3 <  Expr4
	|	Expr3 <= Expr4
	|	Expr3 >  Expr4
	|	Expr3 >= Expr4
	|	Expr4
	;

Expr4	:	Expr4 +  Expr5
	|	Expr4 -  Expr5
	|	Expr5
	;

Expr5	:	Expr5 *  Expr6
	|	Expr5 /  Expr6
	|	Expr5 %  Expr6
	|	Expr6
	;

Expr6	:	! Expr7
	|	+ Expr7
	|	- Expr7
	|	Expr7
	;

/* Changed to "charconst" to distinguish from keyword "char" - 4/6/2001 */
Expr7	:	( Expr )
	|	Lhsid
	|	Lhsid ( Cplist
	|	& Lhsid
	|	num
	|	charconst
	|	string
	;
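
The layering of Expr through Expr5 above is what gives the binary operators their precedence: an operator introduced at a deeper level binds tighter, and the left-recursive rules make each one left-associative. As a small illustration only, the levels can be written as a lookup table in C. The function name binop_level and the numbering (Expr = 0 up to Expr5 = 5) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Precedence level of a binary operator, numbered after the nonterminal
 * whose rule introduces it (Expr = 0, Expr1 = 1, ..., Expr5 = 5).
 * A higher number binds tighter. Returns -1 for anything else. */
int binop_level(const char *op)
{
    static const struct { const char *op; int level; } table[] = {
        { "||", 0 },
        { "&&", 1 },
        { "==", 2 }, { "!=", 2 },
        { "<", 3 }, { "<=", 3 }, { ">", 3 }, { ">=", 3 },
        { "+", 4 }, { "-", 4 },
        { "*", 5 }, { "/", 5 }, { "%", 5 },
    };
    assert(op != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].op, op) == 0)
            return table[i].level;
    }
    return -1;
}
```

For instance, binop_level("*") is greater than binop_level("+"), which matches the grammar parsing a + b * c as a + (b * c).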
The following is an update on the prototype vs. function parameter list. This should be LALR. The following rules would replace Girest, Pplist, Pplist1, Pplist2, Pplist2a, Pplist3, Pplist3a and Pplist4 from above (Fplist is still used in the Classbody).
Girest	:	Optarray Vardecl
	|	( Plist
	;

Plist	:	) Forp
	|	Type Optstar ) ;
	|	Type Optstar ident ) Forp
	|	Type Optstar , Type Optstar Pl1 ) ;
	|	Type Optstar , Type Optstar Pl1 ident Pl3 ) ;
	|	Type Optstar ident , Type Optstar Pl2 ident ) Forp
	|	Type Optstar ident , Type Optstar Pl2 Pl3 ) ;
	;

Pl1	:	/* empty */
	|	, Type Optstar Pl1
	;

Pl2	:	/* empty */
	|	ident , Type Optstar Pl2
	;

Pl3	:	/* empty */
	|	, Type Optstar Optident Pl3
	;

Optident	:	/* empty */
	|	ident
	;

Forp	:	;
	|	{ Body }
	;

Pplist	:	) ;
	|	Type Optstar Optident Pl3 ) ;
	;