Interactive Mode Bison

By Satya Kiran Popuri. Sun Sep 10 18:04:34 CDT 2006

Introduction

Interactive mode Bison is a yacc grammar debugger. It allows you to test parse a sample input file with your grammar using a gdb style debugging interface. This way you can:
Interactive mode Bison is based on Bison version 2.3 sources. There are minimal changes to the original Bison sources. I have added additional source files to handle user interaction and parsing test input files.

Note:
Interactive mode Bison is still an experimental system and is work in progress. If you want to contribute code, please send a patch to me.

Download

New - Browse the sources at Google Project hosting

Click here to download the source tar ball. To compile sources perform the usual dance:
$ tar -xvjf iBison.tar.bz2
$ cd iBison/
$ ./configure
$ make
$ sudo make install
It is up to you to install it or not. Its just the usual Bison 2.3 installation except that you get an extra command line option (-i) for interactive mode parsing. Interactive mode Bison can also be used by executing the generated binary from the src/ directory.

Using Interactive mode Bison

You can start interactive mode bison by supplying a valid bison grammar file as input on the command line.
$ bison -i parser.y

Welcome to Bison 2.3 interactive mode.

(interactive)_

This will present you with a prompt where you can enter commands. A simple < Enter > keystroke will repeat the last command. The following commands are currently available:

test < inputfile > - Start running a parse on the input file.
next - Single step through a parse.
rule < ruleno > - Display a rule.
state < stateno > - Display rules in a state along with position of the dot.
token < token_string > - Supply a token to the parser.
stack - Display the state and token stacks.
break < token-string > - break at first occurrence of token-string from this point.
lexer <lexer module> - Load a lexical analyzer dynamically.

Interactive mode depends on the user to supply tokens for parsing. There are two ways to do this:

(1) Load a lexical analyzer dynamically - If you have a flex specification for your grammar, you must compile the output of flex with gcc -shared switch to produce a shared object file. Your flex specification should NOT contain a main () function.

$ flex lexer.l
$ gcc -c lex.yy.c
$ gcc -shared -o lex.so lex.yy.o
Now you are ready to load lex.so inside bison interactive mode. Make sure that your environment variable LD_LIBRARY_PATH includes the directory in which lexer.so is present. Otherwise it is not possible to load the lexer. The 'lexer' command loads a dynamic lexer module:
(interactive) lexer lex.so

(2) In case you don't have a flex specification, you can supply tokens manually by using the token command. You must supply the string used in %token declaration inside the grammar file. For example if you have declared:

%token TOK_INT
then you have to use token command like this:
(interactive) token TOK_INT
This command can also be used to override the tokens supplied by a dynamically loaded lexer. The most recent token will always be used for the next parse step. The interactive parser breaks at every one of the following steps unless a break point is set:

1. A token is required by the parser.
2. A token is read from the (dynamically loaded) lexical analyzer.
3. The parser performed a SHIFT operation.
4. The parser performed a REDUCE operation.

The parser terminates when it has accepted the input or an error has occurred.

A sample session

Here is small demo of how you can use interactive mode bison. This is a very simple classic desk calculator example.

Step 1: write your lexer

Write the flex specification for your language. My scanner for the calculator (calclexer.l) looks like this:
%{
# include "calcparser.tab.h"
# undef yywrap
# define yywrap() 1
%}

%option noyywrap

int   [0-9]+
blank [ \t]

%%
{blank}+
[\n]+
\+         return OPERATOR_PLUS;
-          return OPERATOR_MINUS;
\*         return OPERATOR_MULT;
\/         return OPERATOR_DIV;
\(         return LPAREN;
\)         return RPAREN;
{int}      return NUMBER;
.          {fprintf(stderr,"Invalid token!\n");}
%%
Note that I have included calcparser.tab.h which I plan to generate using bison -d option. Download this file.

Step 2: write the grammar spec

Now prepare your grammar in bison format. The calculator grammar (calcparser.y) looks like this:
%{
/* A simple calculator */
void yyerror(const char *s);
%}
%token        NUMBER
%token        OPERATOR_PLUS
%token        OPERATOR_MINUS
%token        OPERATOR_MULT
%token        OPERATOR_DIV
%token        LPAREN
%token        RPAREN
%left OPERATOR_PLUS OPERATOR_MINUS
%left OPERATOR_MULT OPERATOR_DIV
%%
exp: exp OPERATOR_PLUS exp
   | exp OPERATOR_MINUS exp
   | exp OPERATOR_MULT exp
   | exp OPERATOR_DIV exp
   | LPAREN exp RPAREN
   | NUMBER
;
%%
void
yyerror(cosnt char *s)
{
  fprintf(stderr,"Syntax error: %s\n",s);
}
Download this file.

Step 3: Compilation

Now generate the calcparser.tab.h header file for use with the lexer, compile the lexer into a shared object file and you are set.
$ bison -d calcparser.y
$ flex calclexer.l
$ gcc -c lex.yy.c
$ gcc -shared -o calclexer.so lex.yy.o
$ export LD_LIBRARY_PATH="."

Step 4: Start using interactive mode Bison

Prepare an inputfile with a sample string you expect to be in your language. Here is a sample input I used. The input file had just one line in it:
$ cat >calcinput
(8+6)*9
^d
Now you can start bison with the -i switch like this:
$bison -i calcparser.y

Welcome to Bison 2.3 interactive mode. Type "help" for assistance.
Please report bugs to spopur2@uic.edu.

(interactive) lexer calclexer.so

Lexer loaded successfully.

(interactive) test calcinput

Opening file...ready to test.

(interactive) next

Reading a token...next token is: ( (LPAREN)

(interactive)

Token is shifted. Entering state 2

Stacks:(states, tokens)
0 2
(

(interactive)

Reading a token...next token is: 8 (NUMBER)

(interactive)

Token is shifted. Entering state 1

Stacks:(states, tokens)
0 2 1
( 8

(interactive)

Popping stack: 8

Reduced by rule #6
exp: NUMBER

Stacks:(states, tokens)
0 2 4
( exp

(interactive)

Reading a token...next token is: + (OPERATOR_PLUS)

(interactive)

Token is shifted. Entering state 6

Stacks:(states, tokens)
0 2 4 6
( exp +

(interactive)

Reading a token...next token is: 6 (NUMBER)

(interactive)

Token is shifted. Entering state 1

Stacks:(states, tokens)
0 2 4 6 1
( exp + 6

(interactive)

Popping stack: 6

Reduced by rule #6
exp: NUMBER

Stacks:(states, tokens)
0 2 4 6 11
( exp + exp

(interactive)

Reading a token...next token is: ) (RPAREN)

(interactive)

Popping stack: exp      +       exp

Reduced by rule #1
exp: exp OPERATOR_PLUS exp

Stacks:(states, tokens)
0 2 4
( exp

(interactive)

Token is shifted. Entering state 10

Stacks:(states, tokens)
0 2 4 10
( exp )

(interactive)

Popping stack: )        exp     (

Reduced by rule #5
exp: LPAREN exp RPAREN

Stacks:(states, tokens)
0 3
exp

(interactive)

Reading a token...next token is: * (OPERATOR_MULT)

(interactive)

Token is shifted. Entering state 8

Stacks:(states, tokens)
0 3 8
exp *

(interactive)

Reading a token...next token is: 9 (NUMBER)

(interactive)

Token is shifted. Entering state 1

Stacks:(states, tokens)
0 3 8 1
exp * 9

(interactive)

Popping stack: 9

Reduced by rule #6
exp: NUMBER

Stacks:(states, tokens)
0 3 8 13
exp * exp

(interactive)

Popping stack: exp      *       exp

Reduced by rule #3
exp: exp OPERATOR_MULT exp

Stacks:(states, tokens)
0 3
exp

(interactive)

Reading a token...next token is: <> ($end)

(interactive)

String is accepted.

$_
Note that once your enter the 'next' command, you just need to press the <Enter> key to repeat the command. Also, you can observe the states with the 'state' command whenever you want during the execution of the test parse.

Want to improve this?

The ultimate goal of this effort is to produce a neat IDE for Bison grammars (Visual Bison) with specs as listed here. If you want to add more features or fix bugs etc. to this program, please send a patch to spopuri [at] cs [dot] uic [dot] edu. Thanks!