Introduction. Here is the
second (of two) assessed exercises. It consists of three tasks.
- Task 1 (worth 40%) is here.
- Task 2 (worth 45%) is here.
- Task 3 (worth 15%) is here.
The necessary Java files for all three tasks can be found for
convenient download here:
The webpages describing the three tasks also feature download links to
individual files. It is probably easier to download them in bulk
When and how to submit. The second assessed exercise is due
Fri, 20 Dec 2019, 18:00. Note that this is
after term has finished (you are to submit earlier). Submission is
electronic, so no need to be on campus. To assure anonymity, please
submit via the course's Canvas page. For submission guidelines
see here.
Your task is to implement, in Java, code generators. As source
language we take subsets of (a minor variant of) the simple language
that we generated code for in the lectures (see
the lecture
notes, Page 5). As usual, the language is represented by
ASTs. The target language is RISC-V machine code.
Background. It might be
interesting to see the language you will have to compile in it's
conventional (non-AST) form. Here it is as a context free grammar.
PROG → DEC | DEC; PROG
DEC → def ID (VARDEC) = E
VARDEC → ε | VARDECNE
VARDECNE → ID | VARDECNE, ID
ID → ... (identifiers)
INT → ... (Integers)
E → INT
| ID
| if E COMP E then E else E endif
| (E BINOP E)
| (E)
| skip
| (E; E)
| while E COMP E do E endwhile
| repeat E until E COMP E endrepeat
| ID := E
| ID(ARGS)
| break
| continue
ARGS → ε | ARGSNE
ARGSNE → E | ARGSNE, E
COMP → == | < | > | <= | >=
BINOP → + | - | * | /
The ASTs for this language can be
found here (and in the Zip file /
Github above).
Do note change these definitions at all!
(except by adding new methods). If you change them in a way that my
test suite will not compile, you will get 0 points.
An important difference from the
pseudo-code used in the lectures is that in declarations we use
numbers as variables (instead of strings). Variable 1 refers to the
first (leftmost) variable in the declaration, variable 2 to the second
from the left and so on. For example the procedure
def f(x,y,z) = { if x == y then z else 0 }
becomes
Declaration ( "f",
3,
new If ( new Variable ( 1 ),
new Equals (),
new Variable( 2 ),
new Variable( 3 ),
new IntLiteral ( 0 ) ) );
The meaning of break and continue will be explained in
Task 3. Note that you don't have to
write a lexer or parser for this language, only a code generator. All
your code generators will use the following given classes.
Notes on the implementations. It is your task to implement
the code generators in Java. You will have to use the following interface and
exception to implement the code generators.
class CodegenException extends Exception {
public String msg;
public CodegenException ( String _msg ) { msg = _msg; } }
interface Codegen {
public String codegen ( Program p ) throws CodegenException; }
The RISC-V code to be produced is returned by all these methods as a
string. Don't print out the generated code. Just return it as a
string.
You can download
these here (and in the Zip file /
Github above). Do note
change these definitions at all! If you change them, my test
suite will not compile and you will get 0 points. I will instantiate
your code generators with the following classes, one for each
task.
class Task1 {
public static Codegen create () throws CodegenException { ... } }
class Task2 {
public static Codegen create () throws CodegenException { ... } }
class Task3 {
public static Codegen create () throws CodegenException { ... } }
(You find these three code fragments
here,
here and
here, and in the Zip file /
Github above.) To implement e.g. Task 1, replace
the ... with appropriate code that instantiates
the Codegen interface
(see here for a
convenient suggestion how to implement
the create method -- there are other ways). If you
don't want to do a task, just throw a CodegenException
exceptions in create instead.
The
codegen method is intended to generate code for a whole
p of type Program . The code generated for
a whole program p includes preamble, data layout
directives etc. The generated code should be valid and executable
RISC-V code. You need to add alignment instructions and ensure that
upon start of your program control is handed to the initial program,
and upon termination control is handed back.
The code generators for Task 1 and Task 2 work only on subsets of the
language. The test suite I will use for Task 1 will only use ASTs
that are valid programs in the subset of Task 1. The test suite I
will use for Task 2 will only use ASTs that are valid programs in the
subset of Task 2. So it doesn't matter what your code generators for
these tasks do when they encounter ASTs that are not valid in the
relevant subsets. For example if you do Task 1 and 2, it makes sense
to write a code generator for Task 2, and then
let Task1.create as well as Task2.create
return the same code generator.
The generated code must do the following.
- Be syntactically correct, meaning is must compiler.
- Generate syntactically correct code, meaning generated code must
run in the RARS emulator.
- Implement the accumulator machine described in the lectures. You should use
a0 as accumulator. The final result of the program
should also be in a0 . If you don't want to implement
an accumulator machine, then please contact me beforehand and
discuss this matter with me.
- Implement the meaning of the program correctly, i.e. 4+4 must evaluate
to 8 and so on.
- Should not lead to an error when being executed, e.g. due to memory
misalignment.
- Integers should be mapped to RISC-V 32 bit integers, as described in the
lectures. Note that you don't have to care about integer
overflow. All integers occurring in test-data will be sufficiently
small to avoid overflow.
- In particular, as an elaboration of the previous point, integer
multiplication and integer division on RISC-V can conceivably overflow
the target register. All tests cases regarding integer multiplication
and integer division will be small enough to fit into a 32 bit
register (e.g. 4*2 or 7/-3).
You can assume that each program that I use to test your submission
definition defines at least one procedure. You can also assume that
all declarations in a program have distinct names. All programs are
valid, e.g. contain no type errors. Moreover, the first declaration
in the declaration list of a program takes 0 arguments. This first
declaration is to be executed first, i.e. it is the procedure that
should be executed when control is handed over to the code your code
generator produces. Note that the first procedure does not have
to be called main .
Note that the test suite will call the codegen
multiple times on the same object of type Codegen .
Use the
provided CodegenException to signal errors or problems in
your code generator. Do not use other means of signalling errors. In
particular, please don't abort the program if you encounter a
problem.
How to avoid common mistakes.
Here is a list of suggestions that help you avoid common mistakes, all
of which I've seen in plenty of submission last year:
- You'll need a function that generates fresh labels. Don't hardcode
labels, ensure that the function generates a fresh label every time
it is called. Don't use a random generator for this, use a
global/static variable for this purpose. This is one of the few
legitimate uses of global/static variables.
- If your sumission for the first assessed CW suffered from problems
related to using instance variables where you should have used
method-specific variables, ensure that you don't make this mistake
again.
- The RISC-V assumbly your submission needs to generate a string, which
each new assembly command being on a new line. It's easy to forget a
"\n" here or there, if you add them 'by hand'. Hint: don't put them
in by hand. Have your code generator generate an auxiliary list (or
similar) of RISC-V commands and then transform this into a string
where the "\n" are automatically added.
- Check that your submission is not miscompiling conditionals like
if /then /else or loops in the
sense that if the condition evaluates to true then
the else -branch is executed and vice versa.
- Repeated implementations of the translation of conditionals: The
source language has several places where conditionals need to be
translated:
if , while ,
and repeat . I recommend not to repeat the code
generator for conditional three times, but have an auxiliary
function generating the code, and calling that auxiliary method
for if , while
and repeat . Note that the conditional for repeat is the
'other way round' in comparison with while so a joint
translation method would a way of handling the two different ways of
translating conditionals.
- I suggest to be careful about the evaluation order in the
translation of procedure invocation: should it be left-to-right or
right-to-left? What is the evaluation order in the lecture notes?
- I strongly suggest to write and submit only one code generator
that coveres all the features you want to implement. It's a bad idea
to submit e.g. a different code generator for Task 1 and for Task 2
(e.g. commenting out Task 2 features in the code generator for Task
1).
- I suggest that you split your code generator into at least the
following three
methods:
codegenProg , codegenDecl and
codegenExp that translagte programs, declarations and
expressions, respectively. You are welcome to have more (e.g. for
comparison operations and arithmetic expressions).
- I suggest to ensure that your submission isn't 'noisy', meaning what
you submit does not emit debug output: the problem with debug output
is that it might be interpreted as RISC-V assembly which leads to
errors.
-
I strongly recommend testing your code before submission. Note that
cutting/pasting code into RARS is insufficient as test, because
that doesn't test the termination of the code generated by your
code generator (think: returning a result (in the accumulator) to
the OS).
Test suite. A small test suite
that you can use is available
here.
|