Introduction. Compilers are programs that translate programs from a source language to a target language. Typically, the target language is low level, for example x86 or RISC-V machine code, and directly executable on a processor or a virtual machine. In contrast, the source language is usually high-level (e.g. Java, Haskell, Rust, Swift, Scala, or C), not directly executable on machines, and designed to be usable by humans. In other words, compilers bridge the abstraction gap between high-level languages that are convenient for us to use, and low-level languages that can be executed efficiently by machines. Compilers are arguably the most important tool for programmers, and consequently much effort has been expended since the dawn of the age of computing to produce high-quality compilers. Modern compilers are marvels of technology and typically thousands of person-years are invested in the production of a state-of-the-art compiler.

This course will explain how compilers work. The benefits of understanding how compilers work are multifarious.

  • A better understanding of programming languages in general. In particular, a clear understanding of the costs and benefits of using this or that programming construct.
  • An exposure to good software engineering practices, as modern compilers exhibit the benefits of clear modularisation and abstraction from low-level details in an exceptionally beautiful form.
  • Increasing familiarity with modern software engineering tools such as parser generators which is highly useful for building domain-specific languages.
  • Acquiring the ability to design and implement programming languages, a task that is frequently required in large software projects.
  • A better understanding of the underlying hardware.
  • All the above can be summarised by saying that a solid understanding of how compilers work helps us to become much better programmers.

This is a one-term level 3 course, taught in the autumn term. There are two lectures per week together with exercise classes and programming work for the practical exercises. I have a weekly office hours after the lectures – feel free to come along to ask questions about this course. If you want to meet met at other times, best to send me an email first. Also, feel free to email me if you want to ask questions. The course is assessed by coursework (50%) and by unseen examination (50%). The coursework consists of two parts. The first is a programming exercise to write part of a compiler and the second is another programming exercise (an extension of the first, to complete the compiler project). Due dates TBC.

Topics covered: This course will discuss the key ideas in implementation of compilers for modern programming languages such as Java, including lexical analysis, parsing, semantic analysis based on types and type checking, syntax-directed translation using abstract syntax trees, intermediate languages, and code generation. We will also discuss run-time issues such as garbage collection. Along the way we will look at key ideas in processor architecture such as stack machines and register machines, RISC vs CISC processors and processor caching.

Module learning outcomes. After taking the course students will be able to understand how modern programming languages are translated to machine code, and will be able to write their own compilers.

Times and locations. The lectures take place on Wednesdays 11:00-12:00 in PEV1-1A7, and Fridays 17:00-18:00 in RICH-AS3.

Pre-requisites: A solid grasp of Java programming is important for this course. This course is not teaching you how to program.

Important links for the course.

Online Material and Books: The course does not follow a set textbook. Instead, suitable lecture notes will be provided. Below is a small selection of supplementary reading material. Most go well beyond what we will cover in the course.

  • Modern Compiler Implementation in Java (second edition) by Andrew Appel and Jens Palsberg. This is my favourite.
  • Alex Aiken's online course on compilers from Stanford University. It goes considerably beyond our course, but you will find quite a bit of overlap between Aiken's and our course, as the former was brand new when I started teaching compilers here, and I've taken a lot of inspiration from Alex.
  • Compilers - Principles, Techniques and Tools (second edition) by Alfred V. Aho, Monica Lam, Ravi Sethi, and Jeffrey D. Ullman. The first edition of this book is is the classic text on compilers, known as the "Dragon Book", but its first edtiion is a bit obsolete. The second edition is substantially expanded and goes well beyond the scope of our course. For my liking this book is a bit on the long side.
  • Engineering a Compiler, by Keith Cooper, Linda Torczon.
  • Basics of Compiler Design by Torben Mogensen. It's free.
  • Computer Architecture - A Quantitative Approach (sixth edition) by John Hennessey and David Patterson. This is the 'bible' for computer architecture. It goes way beyond what is required for our course, but very well written by some of the world's leading experts on computer architecture. Well worth studying. This book contains a great deal of information about the MIPS architecture used in this course.
  • The RARS RISC-V emulator.