Programming Languages and Compilers

 

Lecture 5

 

In this lecture we will look at LR parsers and parser generators.

We will also look a bit more at programming language syntax issues and programming language design in general.

 

The slides for this lecture can be found here.

 

Literature

 

Sebesta, section 4.5 and chapter 5

 

The web article: “Writing a compiler with SableCC”, by Fidel Viegas:

http://www24.brinkster.com/araujofh/compiler/sablecc.asp

 

The CUP user’s manual, by Scott Hudson. The manual can be downloaded from

http://www.cs.princeton.edu/~appel/modern/java/CUP/manual.html

 

Background material and further resources, including SableCC itself, can be found at the following URL:

http://www.sablecc.org/

 

A version of SableCC for C++, C# and XML can be found at this URL:

http://www.mare.ee/indrek/sablecc/

 

The paper: “SableCC, an Object-Oriented Compiler Framework”, by Etienne M. Gagnon and Laurie J. Hendren, Sable Research Group, McGill University, Canada. The paper can be downloaded from http://www.sablecc.org/downloads/tools-98.pdf

 

The JLex and CUP systems can be found on the following URLs:

 

http://www.cs.princeton.edu/~appel/modern/java/JLex/

 

http://www.cs.princeton.edu/~appel/modern/java/CUP/

 

A package of compiler tools for the .NET platform, all written in C#, is available from Malcolm Crowe’s homepage. It includes Lex and Yacc equivalents called lg (LexerGenerator) and pg (ParserGenerator):

 

http://cis.paisley.ac.uk/crow-ci0/index.htm#Research

 

The tools can be downloaded from:

 

http://cis.paisley.ac.uk/crow-ci0/CSTools45.zip

 

As background reading on LR parsers and the above systems I can recommend chapters 2 and 3 of Andrew Appel’s book “Modern Compiler Implementation in Java” (second edition), Cambridge University Press.

 

LR and LALR are usually defined operationally: if you can construct an LALR or LR automaton that recognises a language, then that language is LALR or LR, respectively. You can find a formal definition of LR and LALR in the note

 

Quick Review of LR(k) and LALR(k) Parsing Theory

 

A pragmatic way to check whether a grammar is LALR (or LL, for that matter) is to run it through a tool such as SableCC (or JavaCC in the LL case). If the tool accepts your grammar, the grammar, and hence the language it describes, is LALR (or LL). If the tool rejects it, the tool usually gives error messages that point you towards the problem area of your grammar, though unfortunately not always.
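
 

To make the operational view concrete, here is a minimal sketch of a hand-coded LR automaton for a tiny grammar (E -> E + n | n). The grammar, the class name TinyLrParser and the single-character token encoding are assumptions made for this illustration only; they are not taken from Sebesta, Appel, SableCC or CUP. A parser generator builds tables of this kind automatically and reports a conflict when no such table exists for your grammar.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyLrParser {
    // Illustrative grammar (an assumption for this sketch):
    //   0: S' -> E
    //   1: E  -> E + n
    //   2: E  -> n
    // Tokens are single characters: 'n' (a number), '+', and '$' (end of input).

    // ACTION table: "sK" = shift and go to state K,
    // "rK" = reduce by production K, "acc" = accept, null = syntax error.
    static String action(int state, char tok) {
        switch (state) {
            case 0: return tok == 'n' ? "s1" : null;
            case 1: return (tok == '+' || tok == '$') ? "r2" : null;
            case 2: return tok == '+' ? "s3" : tok == '$' ? "acc" : null;
            case 3: return tok == 'n' ? "s4" : null;
            case 4: return (tok == '+' || tok == '$') ? "r1" : null;
            default: return null;
        }
    }

    // GOTO table for the only nonterminal, E: defined in state 0 only.
    static int gotoE(int state) {
        return state == 0 ? 2 : -1;
    }

    // Number of right-hand-side symbols of each production.
    static final int[] RHS_LENGTH = {1, 3, 1};

    // Returns true if the token string is a sentence of the grammar.
    static boolean parse(String input) {
        String tokens = input + "$";
        Deque<Integer> stack = new ArrayDeque<>(); // stack of automaton states
        stack.push(0);
        int pos = 0;
        while (true) {
            String act = action(stack.peek(), tokens.charAt(pos));
            if (act == null) return false;
            // Accept only if the whole input has been consumed.
            if (act.equals("acc")) return pos == tokens.length() - 1;
            int k = Integer.parseInt(act.substring(1));
            if (act.charAt(0) == 's') {
                stack.push(k);   // shift: consume the token, enter state k
                pos++;
            } else {
                // reduce: pop one state per right-hand-side symbol,
                // then follow the GOTO entry for E from the exposed state
                for (int i = 0; i < RHS_LENGTH[k]; i++) stack.pop();
                int next = gotoE(stack.peek());
                if (next < 0) return false;
                stack.push(next);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("n+n+n")); // a well-formed sum
        System.out.println(parse("n+"));    // missing right operand
    }
}
```

Every (state, token) pair in this table has at most one action, which is what makes the grammar acceptable to an LR-style tool. A grammar for which some table entry would need two actions (a shift/reduce or reduce/reduce conflict) is the kind that a tool like CUP or SableCC rejects.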

 

Exercises

 

Exercises for lecture 5 will be done from 12.30 till 14.14 before Lecture 6 on Tuesday the 2nd of March.

  1. Download and install CUP.

Try CUP (and JLex) on the MinimalExample http://www.cs.princeton.edu/~appel/modern/java/CUP/minimal.zip

  2. Download and install SableCC.

Try SableCC on the postfix grammar and on the SmallLang grammar.

What happens if you add an if-then-else production to the SmallLang grammar?

Try SableCC on a bigger language, either your own language, MiniTriangle (see MiniTriangle.htm) or MiniJava (see http://geezy.cs.purdue.edu/~samanta/MCIIJ2E/grammar.html).

  3. Try CUP and JLex on a bigger example, either your own language or http://www.cs.princeton.edu/~appel/modern/java/CUP/javagrm.zip

 

  4. Give a short description of the computers you know, including hardware, operating systems and runtime-system virtual machines, such as Intel-based PCs, Unix and the JVM. How have these machines influenced programming language design? To solve this exercise you may want to consult the web site http://cne.gmu.edu/itcore/virtualmachine/index.htm

 

  5. Do exercise 4 in Chapter 5 of Sebesta on page 226.

You may consider the following Java code:

 

public static int increment(int x) {

    return x + 1;

}

 

What is the binding time of

        Value of argument x?

        Set of values of argument x?

        Type of argument x?

        Set of types of argument x?

        Properties of operator +?

 

  6. What is the scope of the various x’s in the following C program?

You may want to compile and execute the program.

 

//StaticScopingExample;

#include <stdio.h>

 

int x;

 

void Proc1(void)

{

  int x;

   

  x = 3;

  printf("In Proc1: x = %d\n", x);

}

 

void Proc2(void)

{

  x = 9;

  printf("In Proc2, x = %d\n", x);

}

 

int main(void)

{

  x = 1;

  printf("Before Proc1, x = %d\n", x);

  Proc1();

  printf("After Proc1, x = %d\n", x);

  Proc2();

  printf("After Proc2, x = %d\n", x);

  return 0;

}

 

  7. What is the scope of the various x’s in the following C program?

You may want to compile and execute the program.

 

//DynamicScopingExample;

 

#include <stdio.h>

 

int x;

 

void Proc1(void)

{

  x = 1;

}

 

void Proc2(void)

{

  int x;

  x = 2;

  printf("In Proc2 before Proc1, x = %d\n", x);

  Proc1();

  printf("In Proc2 after Proc1, x = %d\n", x);

}

 

int main(void)

{

  x = 3;

  printf("Before Proc2, x = %d\n", x);

  Proc2();

  printf("After Proc2, x = %d\n", x);

  Proc1();

  printf("After Proc1, x = %d\n", x);

  return 0;

}