COMP 3002 Assignment 2

For this assignment you will be modifying and extending the cmm programming language that can be downloaded here. Remember that the token specifications are stored in cmm.t and the grammar is stored in cmm.g. Remember, also, that you should not edit any of the files generated by the parser generator SiCC.

To compile and run this code use commands like the following:

morin@miniscule: java -cp SiCC.jar SiCC --prefix CMM cmm.t cmm.g 
SiCC finished with no errors.
morin@miniscule: javac *.java
morin@miniscule: java CMM simple.cmm 
Program parsed successfully - attempting to run
Program output:
24.0
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
Program value: true
morin@miniscule: 
  1. [5 marks] Add arbitrary-base numeric constants like those in Smalltalk. These use the notation <base>r<number>, where <base> denotes the base to use (expressed in decimal) and <number> is the actual number (expressed in that base). For example, 16rFF denotes the number 255 and 2r101 denotes the number 5. The <base> should be between 1 and 29, and the characters a-z or A-Z can be used to denote digit values, with A=10, B=11, and so on.
  2. [5 marks] Currently, the interpreter implements while loops, but not do loops. Implement these (this only requires changes to the interpreter, not the grammar).
  3. [5 marks] Currently, the interpreter does not implement if-then-else branching. Implement it (this only requires changes to the interpreter, not the grammar).
  4. [5 marks] Add the unary logical negation operator !, just like in C.
  5. [5 marks] Add the ternary choice ?: operator as you did in Assignment 1. It should have the lowest level of precedence out of all operators.
  6. [5 marks] Add a binary . operator that takes two string arguments and evaluates to their concatenation. For example,
    string x, y, z;
    x = "Hello";
    y = "World!";
    z = x . " " . y;   // z = "Hello World!"
    
  7. [5 marks] Modify the definition of string constants to allow implicit concatenation by simply writing one string constant after the other. For example
    "This is my"   " compound string"
    
    would be equivalent to
    "This is my compound string"
    
    Before you begin, be sure to decide whether you will do this simply by modifying the definitions of <STRING> tokens or whether it will take something more.
  8. [5 marks] Add the ability to assign constants to variables when they are declared, as in C. This is to allow declarations like
    number x = 59, y = 43;
    
    or
    string s = "hello", y = "goodbye";
    
  9. [5 marks] Add the subscript operator (surprise, like in C) that works on string variables or constants so that we can have expressions like
    n = s[52];
    n = "my long string"[0];
    
    The operator takes a number as an argument and returns a number equal to the ASCII value of the character at that position in the string. The argument is converted to an integer by rounding to the nearest integer. Indexing is 0-based, so the first letter of a string x is x[0].
  10. [5 marks] Add C-like for loops of the form for (init; cont; incr) { body } where init, cont, and incr are simple statements and cont must evaluate to a boolean value.
  11. [5 marks] Add a unary $ operator that has the highest level of precedence. The operator evaluates to its argument and has the side effect of printing its argument on System.out followed by a newline. It could be used as follows, for example:
    $"Hello world";      // prints "Hello world\n"
    $($5 + $10);         // prints "5\n10\n15\n"
    $x;                  // prints the value of variable x
    
    This operator will be extremely useful in debugging the next question.
  12. [5 marks] Currently, a program is just a sequence of functions, and a function is a sequence of statements. Implement changes to allow nested functions, so that functions can be defined inside of other functions. The scoping rules are just like those for local variables. A function defined this way is only accessible until its enclosing stack frame is popped. Here is an example:
    number f(number a, number b) {
      string f2(string a, string b) {
        return "my string f2";
      }
      while (1 < 2) {
        string f3() {
          return "Hello world!";
        }
        f3();  // ok, f3 is in scope
      }
      f3(); // not ok, f3 is out of scope
    }
    
  13. [15 marks] This part of the assignment is setting us up for Assignment 3.

    Write a CMMTypeCheckVisitor class that statically checks the types of everything. For example, it should check that the arguments to numerical operators +-*/%<> are numbers. It should check that the arguments to . are strings. It should check that the arguments to &| are booleans. It should check the arguments to functions are of the correct type. It should check that the return statement in any function is of the correct type. It should do any other typechecking that is possible without actually running the program.

    To do this part of the assignment, you need to devise a type system. For each type of node in the AST you need to decide what types are required for its children and what type (if any) the node itself evaluates to. Once you've done this, it is simply a matter of traversing the parse tree.
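
For question 1, the value of a <base>r<number> constant can be computed one digit at a time. The following is only a minimal sketch of that computation in Java, assuming the text of the token is available as a plain string. The class name RadixConstant and the method parse are hypothetical and are not part of the cmm code. It returns a double on the assumption, suggested by the sample output (24.0, 0.0, ...), that cmm numbers are doubles.

```java
public class RadixConstant {
    // Hypothetical helper (not part of cmm): converts text such as "16rFF"
    // to its numeric value.
    public static double parse(String text) {
        // The base is written in decimal, so the first 'r' is the separator.
        int r = text.indexOf('r');
        if (r <= 0 || r == text.length() - 1) {
            throw new NumberFormatException("expected <base>r<number>: " + text);
        }
        int base = Integer.parseInt(text.substring(0, r));
        if (base < 1 || base > 29) { // range taken from the question
            throw new NumberFormatException("base out of range: " + base);
        }
        double value = 0.0;
        for (int i = r + 1; i < text.length(); i++) {
            char c = text.charAt(i);
            // With radix 36, Character.digit maps 0-9 to 0..9 and
            // a-z or A-Z to 10..35, and returns -1 for anything else.
            int digit = Character.digit(c, Character.MAX_RADIX);
            if (digit < 0 || digit >= base) {
                throw new NumberFormatException(
                    "bad digit '" + c + "' for base " + base);
            }
            value = value * base + digit;
        }
        return value;
    }
}
```

Whether this conversion belongs in the token handling or in the interpreter is a design decision left to you; the sketch only shows the arithmetic.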
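
For question 9, the rounding and 0-based indexing rules can be written down directly. The sketch below is one possible reading of those rules in Java. The class Subscript and the method charCodeAt are hypothetical names, and throwing an exception for an out-of-range index is an assumption, since the question does not say what should happen in that case.

```java
public class Subscript {
    // Hypothetical helper (not part of cmm): returns the character code
    // found at the given 0-based position of s.
    public static double charCodeAt(String s, double index) {
        // Round the numeric argument to the nearest integer.
        int i = (int) Math.round(index);
        if (i < 0 || i >= s.length()) {
            // Assumption: the question leaves this case unspecified.
            throw new IndexOutOfBoundsException(
                "index " + i + " out of range for string of length " + s.length());
        }
        return s.charAt(i); // the char is widened to its numeric code
    }
}
```

With this sketch, charCodeAt("my long string", 0) evaluates to 109.0, the ASCII value of 'm'.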
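
To illustrate what devising a type system means in question 13, here is a small self-contained sketch in Java. It does not use the classes generated by SiCC: the Node, Const, and BinOp types are hypothetical stand-ins for AST nodes, and check plays the role that a visit method would play in CMMTypeCheckVisitor. It also assumes that < and > evaluate to a boolean, which the question does not state explicitly.

```java
public class TypeCheckSketch {
    public enum Type { NUMBER, STRING, BOOLEAN }

    // Hypothetical AST: a node is either a constant or a binary operator.
    public interface Node {}

    public static final class Const implements Node {
        final Type type;
        public Const(Type type) { this.type = type; }
    }

    public static final class BinOp implements Node {
        final String op;
        final Node left, right;
        public BinOp(String op, Node left, Node right) {
            this.op = op; this.left = left; this.right = right;
        }
    }

    // Returns the type a node evaluates to, or throws if the types of
    // its children do not match what the operator requires.
    public static Type check(Node n) {
        if (n instanceof Const) {
            return ((Const) n).type;
        }
        BinOp b = (BinOp) n;
        Type l = check(b.left);
        Type r = check(b.right);
        switch (b.op) {
            case "+": case "-": case "*": case "/": case "%":
                require(b.op, Type.NUMBER, l, r);
                return Type.NUMBER;
            case "<": case ">":
                require(b.op, Type.NUMBER, l, r);
                return Type.BOOLEAN; // assumption: comparisons yield booleans
            case ".":
                require(b.op, Type.STRING, l, r);
                return Type.STRING;
            case "&": case "|":
                require(b.op, Type.BOOLEAN, l, r);
                return Type.BOOLEAN;
            default:
                throw new IllegalArgumentException("unknown operator " + b.op);
        }
    }

    // Both operands of op must have the expected type.
    static void require(String op, Type expected, Type l, Type r) {
        if (l != expected || r != expected) {
            throw new IllegalStateException("operator " + op + " expects "
                + expected + " operands but got " + l + " and " + r);
        }
    }
}
```

The same pattern extends to the other node types: each case states what it requires of its children and what type it produces, and statements that produce no value can simply return nothing.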