In this lecture we will look at LR parsers and parser
generators.
We will also look a bit more at programming language
design.
The slides for this lecture can be found here in ppt and here in pdf.
Sebesta, section 4.5
The web article: Compiler Design -
SableCC tutorial
The CUP user’s manual, by Scott Hudson. The manual can
be downloaded from
http://www.cs.princeton.edu/~appel/modern/java/CUP/manual.html
The web article Growing a Language:
A Conversation with Guy Steele
(For those of you who want to read more, see Guy
Steele’s OOPSLA98
paper)
The web article Python &
Java: Side by Side Comparison
I recommend that you take the time to watch the MSDN
TV: Whiteboard with Anders Hejlsberg.
It is a good illustration of the kind of discussions
language designers may have.
Background material and further resources, including
SableCC itself, can be found at the following URL:
A version of SableCC for C++, C# and XML can be found
at this URL:
http://www.mare.ee/indrek/sablecc/
The paper: “SableCC, an Object-Oriented Compiler
Framework”, by Etienne M. Gagnon and Laurie J. Hendren, Sable Research Group,
The JLex and CUP systems can be found on the following
URLs:
http://www.cs.princeton.edu/~appel/modern/java/JLex/
(You may want to use JFlex
- The Fast Scanner Generator for Java instead of JLex)
http://www.cs.princeton.edu/~appel/modern/java/CUP/
(The newest versions have moved to the following URL: CUP)
A package of compiler tools for the .NET platform, all
written in C#, is available from Malcolm Crowe’s homepage. It includes Lex and
Yacc style tools called lg (LexerGenerator) and pg (ParserGenerator):
http://cis.paisley.ac.uk/crow-ci0/index.htm#Research
The tools themselves are at the URL:
http://cis.paisley.ac.uk/crow-ci0/CSTools45.zip
As background reading on LR parsers and the above
systems I can recommend chapters 2 and 3 of Andrew Appel’s book “Modern
Compiler Implementation in Java” (second edition), from Cambridge University
Press.
The definitions of LR and LALR are usually given operationally: if you can
construct an LR or LALR automaton that recognises a language, then that language
is LR or LALR, respectively. You can find a formal definition of LR and LALR in the
note
Quick Review of LR(k) and LALR(k)
Parsing Theory
A pragmatic way to check whether a language is LALR
(or LL, for that matter) is to use a tool like SableCC (or, in the case of LL,
JavaCC). If the tool accepts your grammar, the language is LALR (or LL). If the
tool does not accept your grammar, it usually gives error messages that
direct you towards the problem area of your grammar, but unfortunately not
always. Note that a rejection only tells you that this particular grammar is not
LALR (or LL); a different grammar for the same language may still be accepted.
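To make the operational view a little more concrete, here is a minimal sketch of a table-driven LR(0) parser, written by hand in Java for a toy grammar. The grammar, the class name LrSketch and the method name accepts are assumptions made for this illustration only; they do not come from SableCC or CUP, which generate the corresponding tables automatically from a grammar file. The states were derived by hand from the LR(0) item sets of the grammar shown in the comment.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy grammar, chosen only for illustration:
//   0: S' -> S
//   1: S  -> ( S )
//   2: S  -> x
//
// Hand-derived LR(0) states:
//   0: start            1: S' -> S .       2: S -> ( . S )
//   3: S -> x .         4: S -> ( S . )    5: S -> ( S ) .
final class LrSketch {

    /** Returns true if input is a sentence of the toy grammar. */
    static boolean accepts(String input) {
        String tokens = input + "$";          // '$' marks the end of the input
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);                        // start in state 0
        int pos = 0;
        while (true) {
            int state = stack.peek();
            char look = tokens.charAt(pos);
            switch (state) {
                case 0:
                case 2:                       // shift '(' or 'x'
                    if (look == '(') { stack.push(2); pos++; }
                    else if (look == 'x') { stack.push(3); pos++; }
                    else return false;
                    break;
                case 1:                       // accept only at the real end marker
                    return look == '$' && pos == tokens.length() - 1;
                case 3:                       // reduce by S -> x (pop 1 state)
                    stack.pop();
                    stack.push(gotoS(stack.peek()));
                    break;
                case 4:                       // shift ')'
                    if (look == ')') { stack.push(5); pos++; }
                    else return false;
                    break;
                case 5:                       // reduce by S -> ( S ) (pop 3 states)
                    stack.pop(); stack.pop(); stack.pop();
                    stack.push(gotoS(stack.peek()));
                    break;
                default:
                    throw new IllegalStateException("unknown state " + state);
            }
        }
    }

    // Goto table for the nonterminal S: state 0 goes to 1, state 2 goes to 4.
    private static int gotoS(int state) {
        return state == 0 ? 1 : 4;
    }
}
```

Calling LrSketch.accepts("((x))") returns true, while LrSketch.accepts("(x") returns false. Because no state in this automaton has to choose between a shift and a reduce, or between two reductions, the grammar is LR(0); a grammar where such a choice arises is what a generator reports as a conflict.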
Exercises for lecture 5 will be done from 12.30 till
14.15 before Lecture 6 on Monday the 6th of March.
You may prefer to do the following exercises on your
own, e.g. just after you have read the literature, and then discuss the outcome with
your group:
Try CUP (and JLex) on the MinimalExample http://www2.cs.tum.edu/projects/cup/minimal.tar.gz
Try SableCC on the postfix
grammar and on the SmallLang grammar.
What happens if you add an if-then-else production to
the SmallLang grammar?
Try SableCC on a bigger language, either your own
language, MiniTriangle (see MiniTriangle.htm) or
MiniJava (see BNF
for MiniJava).
The following exercises are best done as group
discussions: