Language Syntax

Lexical description

Token types / Lexemes

The lexical units of the language are integers, special notation, identifiers, keywords, and whitespace. Any input string that contains only those components is lexically valid.

Note that because whitespace doesn't belong to any token, a whitespace character always separates the tokens on either side of it: whatever non-whitespace text is to its left must belong to a different token than whatever is to its right. However, two distinct tokens are not always separated by whitespace. For example, in fibo(10) (from Example #2 below) the tokens fibo, (, 10 and ) are adjacent with nothing between them.

Similarly, because the operator and delimiter characters cannot be a part of any identifier or integer literal, they also serve to separate such tokens from one another.

Disambiguation

The rules above are ambiguous. To disambiguate, use the following two policies.

For example, the string deff should be lexed into a single identifier (and not, e.g., the token def followed by the identifier f). Similarly, === must be lexed into a token representing == followed by a token representing =, and not the other way round, or as three occurrences of =.
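
The two examples above can be reproduced with a small longest-match lexer. The following Python sketch is illustrative only, and several things in it are assumptions rather than part of this description: the keyword set is read off the grammar's terminals, identifiers are taken to be a letter followed by letters, digits, or underscores, integers are taken to be runs of decimal digits, and the symbol set is the grammar's operators and delimiters plus the single = mentioned above. The names KEYWORDS, TOKEN_RE, and tokenize are hypothetical.

```python
import re

# Keywords taken from the grammar's terminals; treat this set as an assumption.
KEYWORDS = {"int", "bool", "unit", "if", "then", "else",
            "while", "do", "repeat", "until", "skip"}

# One alternative per token class. Two-character symbols are listed before
# their one-character prefixes so the regex engine prefers the longer lexeme.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<INTLIT>[0-9]+)                    # assumed form of integer literals
  | (?P<WORD>[A-Za-z][A-Za-z0-9_]*)       # assumed form of identifiers and keywords
  | (?P<SYM>:=|==|<=|>=|&&|\|\||\^\^|[<>+\-*/=(){},;])
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, always taking the longest lexeme."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "WS":
            continue  # whitespace separates tokens but is not itself a token
        if kind == "WORD":
            # A whole word is a keyword only if it matches one exactly.
            kind = "KEYWORD" if lexeme in KEYWORDS else "IDFR"
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("deff"))   # [('IDFR', 'deff')]
print(tokenize("==="))    # [('SYM', '=='), ('SYM', '=')]
```

Because a word is matched greedily before it is compared against the keyword set, a keyword can never be split off the front of a longer identifier.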

Syntax description

The syntax of the language is given by the following context-free grammar with initial non-terminal PROG, where ε stands for the empty production.

PROG     → DEC | DEC PROG
DEC      → TYPE IDFR (VARDEC) BLOCK
VARDEC   → ε | VARDECNE
VARDECNE → TYPE IDFR
         | VARDECNE, TYPE IDFR
BLOCK    → { ENE }
ENE      → EXP | EXP; ENE
EXP      → IDFR
         | INTLIT
         | IDFR := EXP
         | (EXP BINOP EXP)
         | IDFR (ARGS)
         | BLOCK
         | if EXP then BLOCK else BLOCK
         | while EXP do BLOCK
         | repeat BLOCK until EXP
         | skip
ARGS     → ε | ARGSNE
ARGSNE   → EXP | ARGSNE, EXP
BINOP    → == | < | > | <= | >=
         | + | - | * | / | && | || | ^^
TYPE     → int | bool | unit
IDFR     → (an identifier)
INTLIT   → (an integer)
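
Because every binary operation is fully parenthesised and every compound form starts with its own keyword or delimiter, the grammar can be parsed by recursive descent with one token of lookahead after an identifier. The Python sketch below is one possible way to do that, not a reference implementation. The tuple shapes it returns, the names Parser, tokenize, is_idfr, and seq, and the lexical forms of identifiers and integers are all assumptions made for illustration.

```python
import re

TYPES = {"int", "bool", "unit"}
KEYWORDS = TYPES | {"if", "then", "else", "while", "do", "repeat", "until", "skip"}
BINOPS = {"==", "<", ">", "<=", ">=", "+", "-", "*", "/", "&&", "||", "^^"}
# Assumed lexical forms: decimal integer literals and letter-led identifiers.
TOKEN_RE = re.compile(
    r"[0-9]+|[A-Za-z][A-Za-z0-9_]*|:=|==|<=|>=|&&|\|\||\^\^|[<>+\-*/(){},;]")

def tokenize(text):
    tokens = TOKEN_RE.findall(text)
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("input contains characters outside the token set")
    return tokens

def is_idfr(tok):
    return tok[0].isalpha() and tok not in KEYWORDS

class Parser:
    def __init__(self, text):
        self.toks, self.i = tokenize(text), 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def eat(self, *expected):
        tok = self.peek()
        if tok is None or (expected and tok not in expected):
            raise SyntaxError(f"unexpected token {tok!r}")
        self.i += 1
        return tok

    def idfr(self):
        tok = self.eat()
        if not is_idfr(tok):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        return tok

    def seq(self, opener, sep, closer, item, may_be_empty):
        # Shared shape of (VARDEC), (ARGS) and { ENE }: delimited, separated items.
        self.eat(opener)
        items = [] if may_be_empty and self.peek() == closer else [item()]
        while self.peek() == sep:
            self.eat(sep)
            items.append(item())
        self.eat(closer)
        return items

    def prog(self):  # PROG -> DEC | DEC PROG
        decs = [self.dec()]
        while self.peek() is not None:
            decs.append(self.dec())
        return decs

    def dec(self):  # DEC -> TYPE IDFR (VARDEC) BLOCK
        ret, name = self.eat(*TYPES), self.idfr()
        params = self.seq("(", ",", ")", lambda: (self.eat(*TYPES), self.idfr()), True)
        return ("dec", ret, name, params, self.block())

    def block(self):  # BLOCK -> { ENE }
        return ("block", self.seq("{", ";", "}", self.exp, False))

    def exp(self):
        if self.peek() == "{":
            return self.block()
        tok = self.eat()
        if tok == "(":  # (EXP BINOP EXP)
            left, op, right = self.exp(), self.eat(*BINOPS), self.exp()
            self.eat(")")
            return ("binop", op, left, right)
        if tok == "if":
            cond = self.exp()
            self.eat("then")
            then = self.block()
            self.eat("else")
            return ("if", cond, then, self.block())
        if tok == "while":
            cond = self.exp()
            self.eat("do")
            return ("while", cond, self.block())
        if tok == "repeat":
            body = self.block()
            self.eat("until")
            return ("repeat", body, self.exp())
        if tok == "skip":
            return ("skip",)
        if tok[0].isdigit():
            return ("int", int(tok))
        if not is_idfr(tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        if self.peek() == ":=":
            self.eat(":=")
            return ("assign", tok, self.exp())
        if self.peek() == "(":
            return ("call", tok, self.seq("(", ",", ")", self.exp, True))
        return ("var", tok)

ast = Parser("int main() { fun(1, 2, 3) }").prog()
```

The only place the parser needs to look past the current token is after an identifier, where := signals an assignment, ( signals a call, and anything else leaves a plain variable reference. An empty block such as { } is rejected, since ENE requires at least one expression.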

Execution model

The grammar above describes a programming language in which all arguments to functions are passed by value, and the only identifiers defined in any function scope are the names of other functions (which may be defined before or afterwards) and the parameters taken by the current function. In particular, no side effects are allowed, and no variables can be declared apart from the parameters of the current function.

The output of a function is the value of the code block that forms its body, that is, the value of the final expression in the function body.

The main program is described by a function with name and signature int main(), which may occur anywhere in the function declarations. This function may perform a simple computation with constants, or call another function on some constants, to evaluate some integer value.
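
The execution model can be made concrete with a small tree-walking evaluator. The Python sketch below is a minimal illustration under stated assumptions, not a definition of the semantics: programs are written by hand as nested tuples (a hypothetical encoding), only a handful of operators are given meanings, and assignments and while loops are assumed to yield the unit value, represented here as None. The names eval_exp, run, OPS, FACT, and LOOP are invented for this example. FACT and LOOP encode Example #4 and Example #3 from the examples section.

```python
import operator

# Assumed meanings for a few BINOP symbols; the source does not define them.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "==": operator.eq, "<": operator.lt, "<=": operator.le}


def eval_exp(exp, env, funcs):
    """Evaluate one expression tuple in the parameter environment env."""
    kind = exp[0]
    if kind == "int":
        return exp[1]
    if kind == "var":
        return env[exp[1]]
    if kind == "binop":
        _, op, left, right = exp
        return OPS[op](eval_exp(left, env, funcs), eval_exp(right, env, funcs))
    if kind == "assign":
        env[exp[1]] = eval_exp(exp[2], env, funcs)
        return None  # assumption: an assignment has the unit value
    if kind == "block":
        value = None
        for sub in exp[1]:
            value = eval_exp(sub, env, funcs)
        return value  # a block's value is that of its final expression
    if kind == "if":
        _, cond, then, other = exp
        return eval_exp(then if eval_exp(cond, env, funcs) else other, env, funcs)
    if kind == "while":
        while eval_exp(exp[1], env, funcs):
            eval_exp(exp[2], env, funcs)
        return None  # assumption: a loop has the unit value
    if kind == "call":
        params, body = funcs[exp[1]]
        args = [eval_exp(arg, env, funcs) for arg in exp[2]]
        # Pass by value: the callee gets a fresh environment of argument values.
        return eval_exp(body, dict(zip(params, args)), funcs)
    raise ValueError(f"unsupported expression kind: {kind}")


def run(funcs):
    """Evaluate the body of main, wherever it appears among the declarations."""
    params, body = funcs["main"]
    return eval_exp(body, {}, funcs)


# Example #4: main is declared before the function it calls.
n_minus_1 = ("binop", "-", ("var", "n"), ("int", 1))
fact_body = ("block", [
    ("if", ("binop", "==", ("var", "n"), ("int", 0)),
     ("block", [("int", 1)]),
     ("block", [("binop", "*", ("var", "n"), ("call", "fact", [n_minus_1]))])),
])
FACT = {"main": ([], ("block", [("call", "fact", [("int", 10)])])),
        "fact": (["n"], fact_body)}

# Example #3: doLoop only updates its own parameters, so main still yields 1337.
loop_body = ("block", [
    ("assign", "a", ("binop", "+", ("var", "a"), ("var", "i"))),
    ("assign", "i", ("binop", "+", ("var", "i"), ("int", 1))),
])
LOOP = {
    "doLoop": (["i", "a"], ("block", [
        ("while", ("binop", "<=", ("var", "i"), ("int", 100)), loop_body)])),
    "main": ([], ("block", [("call", "doLoop", [("int", 0), ("int", 5)]),
                            ("int", 1337)])),
}

print(run(FACT))  # 3628800
print(run(LOOP))  # 1337
```

Functions are looked up by name in a single table, which is why declaration order does not matter, and each call builds a new environment holding only the callee's parameters, which is why assignments inside doLoop are invisible to main.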

Example programs

Here are some example programs in the language:

Example #1

  int fun(int x, int y, int z) {
    if (x == y) then { z } else { 0 } }

  int main() { fun(1, 2, 3) }

Example #2

  int main() { fibo(10) }
  int fibo(int n) {
    if (n < 2)
    then { n } 
    else { (fibo((n - 1)) + fibo((n - 2))) } }

Example #3

  unit doLoop (int i, int a) {
    while (i <= 100) do {
      a := (a + i);
      i := (i + 1) } }

  int main() {
    doLoop(0, 5);
    1337 }

Example #4

  int main() { fact(10) }

  int fact(int n) {
    if (n == 0)
    then { 1 } 
    else { (n * fact((n - 1))) } }