Below you'll find the transcript of an interview with Robin Milner, held in Cambridge on the 3. September 2003. The interview was conducted by Martin Berger. Thanks to Kohei Honda, Keith Clarke, Carla Benjamin and Tony Stockman for their help and suggestions.
When and where were you born and into what kind of family background?
My father was an army officer. I was born in 1934 in Plymouth on the south coast and because we were an army family we moved about quite a lot. I remember we ended up living in Scotland during the war, in Edinburgh. I think we moved to Edinburgh when I was about five. I lived in Edinburgh for three years, then I went to school in Wales. This was a private school which was evacuated to Wales because of the war and eventually it moved back after the war and so I went back with it to Kent. It was a school where normally people would pay a lot of money but we didn't have a lot of money. But I had a kind of scholarship, because I was being trained so that I should get a scholarship to one of the so-called public schools after that.
So your parents put emphasis on your academic achievements?
Well, there was something like that. When I was thirteen, I took a scholarship exam to one of the public schools, Eton College, that most people know about. I got in, and that was the beginning of what you might call a fast-track education for me.
Do you have brothers or sisters?
I have one sister, yes, she's older than I am. She lives in Wales, she is retired now. She has a family, one son. He has children. She is a widow, unfortunately. She looks after a herb garden in Wales.
You went to Eton College: what favourite subjects did you have there?
I was always more interested in maths than anything else, but as you probably know in those days classical education was essential in that sort of school.
Did you learn Greek or Latin?
I learned Greek from the age of ten and Latin from the age of eight and I regarded them both as a kind of defective form of mathematics because they were taught with such accuracy. But as far as classical culture is concerned we weren't taught as much as we might have been. In any case, I wasn't particularly interested in history. But I could write Latin verse, because Latin verse was very accurate and there is no doubt what syllables should be long, what syllables should be short: as I say, a defective form of mathematics.
Of course I was doing real mathematics too, and I was allowed to do it full time, as it were, from the age of sixteen. There was a good teacher at my prep-school, that is, when I was eight to thirteen, whom I learned a lot from, simply because he enjoyed it so much. There was at least one other at Eton College, I remember he was in charge of all the rowing as well, but I remember he was inspiring, he taught me projective geometry. And he found it so beautiful that we found it beautiful, too. That was a good experience. That took me up to the age of eighteen. Then I took a scholarship to Cambridge. I went to King's.
What year did you go to Cambridge?
I took my scholarship to Cambridge in 1952 but then I did my national military service. That took me to '54.
You went to the army before Cambridge?
I'm terrible at history: did you get involved with the Korean war?
It was around that time and two of the people in my group went to Korea, but I didn't. I went to the Suez Canal. At the time, the British were still in charge.
Must have been warm there.
It's quite warm and very dry. It doesn't rain because it is so dry. I didn't enjoy the army particularly, but my father was in the army, so I took it seriously.
Did your parents understand your fascination with mathematics or did they just look bemused?
Oh no, they didn't understand mathematics or scholarship in general, but certainly they supported me.
You could not go to your mother or father and ask "please explain", say, "probability distributions to me"?
Oh no! No one in the family at all. But that was fine, because I went to a boarding school. I lived there for eight months of the year. So I had plenty of people to talk to. Because the education system is very strange in this country -- we have these private schools -- they are a hothouse, but one of the things that you learn is to work extremely hard. The standard of how much work you do, particularly if you are supposed to be doing that sort of thing, as opposed to just being a sportsman, is very high. You learn that it's natural to spend all night thinking about a mathematical problem, rather than going to sleep. That sort of thing came naturally there.
At Eton College you didn't feel you stood out as being particularly studious?
Oh no, because there was a group of about 70 scholars, who would be separate from the rest of the school and were regarded as collectively mad by the rest of the school. So that was alright.
Is any of these people, your friends at Eton College, famous today?
In the army, did you do normal grunt-work or did you get involved with things like cryptography?
Oh no! I was an officer in the Royal Engineers, I looked after the heavy machinery ... bulldozers ... I was in charge of training people to drive bulldozers and big mobile buckets.
Was this a childhood dream of yours?
No it wasn't a childhood dream of mine. Absolutely not! In particular, I didn't have a childhood dream to move sand. What we were doing was digging holes in the sand.
Then you went to King's in Cambridge?
Yes, in '54.
How did you find the transition? You studied math?
Yes, I studied maths for two years. Because of the exam I had taken, I did leave out the first year and I started at the second year, which was probably a mistake because I had lost touch after two years in the army. But I took the degree essentially after two years, the main degree paper, although I had to be resident at Cambridge for three years. In 1956 I went on a course on the EDSAC machine here. I regarded programming as really rather inelegant. You'd write one instruction after the other, and it was all rather arbitrary. It seemed to me that programming was not a very beautiful thing. So I resolved that I would never go near a computer in my life!
This programming course was part of your university studies?
No, it was a ten day course in the summer. Of course I had not realised what a tremendous achievement the digital computer was. I didn't know about Turing at the time. So I didn't know about the logical work. It was because of my own ignorance that I didn't appreciate what was going on. From October '56 I changed to philosophy. I took History of Philosophy. I also did a certain amount of analytical philosophy at that time. I got interested in and puzzled by Russell's theory of descriptions and names. It's relevant to what we think about these days. But when I left Cambridge, although people suggested I might stay, I was tired of academic things.
Yes it does look strange, doesn't it, but this groundbreaking work was very much in logic. Interestingly enough, the computing people in Cambridge were not particularly interested in the theoretical side of it.
They were not mathematicians?
No. The engineering achievement was the main thing about the EDSAC computer, which was built in '49. That was a great achievement, but the link between that work and Turing's work wasn't really made very obvious. Turing had been a fellow at King's College, but he died, I discovered later, a few months before I arrived. The man who taught me maths at King's was a friend of Turing's. But I didn't realise that because we didn't talk about Turing. So I didn't know about him at all.
Your mathematical education was not really on the logical side?
No, logic came much later, when I was about 29, 30. '63, '64. In '64 I began to learn about logic, about recursion theory and all those things, while I was at City.
What did you do after Cambridge, before you went to City?
I went to London and did some part-time jobs. I became a school teacher for one year in '59.
Primary or high-school?
This was in a secondary school. A so-called grammar school. Then I left that, in 1960, after one year and went into Ferranti, where I did programming for three years.
Despite your objections?
Yes. I didn't move into it with great enthusiasm, but I realised that I should get a job.
It sounds like you were quite unsure at the time.
I was. Completely unsure. At Ferranti I looked after a program library.
That was in London?
This was in London, yes. I made sure that all the new machines they were selling were actually working. I had a first experience of writing a compiler or part of a compiler. Then I thought, perhaps I should get an academic job, which I did: I went to City University in 1963. And there I had a kind of double life: I taught maths to engineers, which was very routine, but I also began to get interested in AI. I was particularly interested in CPL, a programming language inspired by Christopher Strachey, which eventually led via BCPL to C which everyone knows.
You didn't have a PhD at the time?
No, I never had a PhD.
It was no problem getting teaching positions without?
No, no problem. You must realise that City University had until recently been a College of Advanced Technology and it had just been made into a University. But in any case, at that time -- and even now -- you don't have to have a PhD to get a job in this country. We are fairly unique in that respect, but I was still in the minority. I never had the experience of a PhD, and in some ways perhaps what I did was rather more independent than if I had done a PhD. I was not guided. I simply guided myself.
I got interested in relational algebra, particularly because of databases. I wondered what the status of this was -- Tarski's relational algebra. Of course this related to Artificial Intelligence; I wasn't interested in intelligence per se, but I was interested in automated question-answering, perhaps question-answering in relation to a database. That was an example of using algebra or logic -- an algebraic logic really -- to formulate questions and answers. I didn't get very far with that. What really sparked me off was getting interested in program verification and what semantics might mean.
When I went to Swansea in 1968 I took a research job, I gave up teaching and became a research assistant with David Cooper who was head of the department in Swansea. He had a small group there, working on program verification and automatic theorem-proving and semantics. That was at the time when Dana Scott produced his famous domain theory. He gave a series of talks then, in '69, and I went over to Oxford and heard him. That was very exciting.
So, in some sense, it began to move very fast. The idea of a machine proving theorems in logic, and the idea of using logic to understand what a machine was doing ... this double relationship began to inspire me because it was clearly not very simple. The relationship wasn't just an inverse relationship "I bite you and you bite me": the computer says: "I will automate your logic" and the logic says "I will tell you what your programs mean". These are two different relationships.
Of course you already have this in classical mathematics where you formalise logic, but you also give meaning to and reason about logic using logic. They claim there are these hierarchies of (meta-)languages, but actually it's the same thing really.
Yes, I suppose it's the same thing. I wasn't aware of that of course. In particular, it comes to life with computing.
Your interest in relational algebra, databases, program verification, was it triggered by some specific problems you needed to solve or was it general intellectual curiosity? Were these questions in the air at the time?
Oh yes, these were things one could do to get into the academic side of the subject. I wrote an automatic theorem prover in Swansea for myself and became shattered with the difficulty of doing anything interesting in that direction and I still am. I greatly admired Robinson's resolution principle, a wonderful breakthrough; but in fact the amount of stuff you can prove with fully automatic theorem proving is still very small. So I was always more interested in amplifying human intelligence than I am in artificial intelligence.
That means I began to be interested in how one could verify programs. I knew about Floyd's work. Tony Hoare was working on that sort of thing, too. I got to know him during that time. Of course I applied my theorem-prover to Floyd assertions. I generated verification conditions from programs. But then it took ages to prove them. My theorem-prover couldn't do it. I then thought "shall I prove a real program"? So I got a chemist, I forget his name, he had written a Fortran program in connection with his research, and I thought, let's prove some things about this. I was fascinated by the fact that the program was quite difficult; he was using Fortran arrays to encode his own matrices, but he was embedding them all in one big Fortran array for efficiency reasons. So you had this interesting representational problem. The problem of verifying was a problem of simulation or data representation, and I realised how big a problem that was going to be.
In fact, out of that I got interested in simulation, which I did a bit of work on. But only simulation between programs. It wasn't a very mathematically profound piece of work, though Tony Hoare was very nice about it. But the idea of something concrete representing something more abstract was obviously important. So, what the human does is to abstract from concrete representation, no matter what that representation is. That's the essence of the relationship between algorithm and program.
What was the next step for you?
Then I visited the United States during that period once with David Cooper. We went to Carnegie-Mellon, and there was Zohar Manna who had been working with McCarthy at Stanford. I spoke to Zohar Manna about what I had been doing with program schemata and verification. As a result I got a job with McCarthy from 1970, from the beginning of '71, at the AI lab in Stanford.
The interesting thing there was, they were just looking for somebody who would do something practical. Scott had recently produced a kind of logic of domains, a logic of the hierarchy of domains, a type-theoretic hierarchy, This to me seemed to be something which was practical. That is what LCF came out of. Since I was so interested in human-assisted -- or I would say machine-assisted-- proof, this seemed to be a wonderful opportunity for a program-oriented logic. It really took off for me when I realised that I could write down the syntax of a programming language in this logic and I could write the semantics in the logic. I could also write down the semantic function in the logic.
That was a domain-theoretic semantics?
Yes. I could write the translator from some trivial high-level language to some trivial assembly code. I could write down all four sides of the commuting diagram which says: you could do compilation and then take the semantics of the target code, or take the semantics of the source language and encode it in the target semantics, and you get the same answer. One of the first things we did in LCF was to verify -- in principle verify -- that commuting diagram. It was quite a big step. But it was much more easy to verify the compiler than it was to verify a single computer program!
In what sense do you understand "verifying a compiler"? Does it assume that you have the semantics of assembly language and the semantics of the high-level language?
What did you use as semantics of your assembly language? The state transitions of some microprocessor?
So you went to a very considerable amount of detail?
There was a certain encoding of the semantic objects of the source language. It wasn't as though we were the first people to think about it. The advice was "use algebra"! Essentially, we were using algebra, the representation of one algebra in another, a lot of the time in Stanford.
But meanwhile I got somehow interested, and I don't know how, in concurrency. I remember that, without linking it to verification particularly, I wondered about interacting automata. I had an idea that automata theory was inadequate, because it hadn't said what it was for two automata to interact with each other. Except for the Krohn-Rhodes Representation Theorem, which said something about feeding the output of one automata into another. But there wasn't interaction between the automata.
At the time the internet was developed in southern California you were in Stanford. And what about the work on object-orientation? They were thinking about sending messages between objects. Did that influence you?
I don't know. Maybe. I did know simulation languages. I'd done some work, not published work, but I did understand simulation languages back in '63, '65 because Simula was invented in '63 or '64 or something like that. We had a lot of simulation languages then. Simulation intrigued me because it was about how you represent the real world inside a computer. I remember being puzzled by how to define the agents in a simulated process. For example if you have humans passing buckets down the chain, then you simulate the humans, but shouldn't you simulate the buckets in exactly the same way? From the bucket's point of view, the humans are moving past the bucket, from the human's point of view, the buckets are moving past the humans. So there was some extraordinary looseness about what could be meant by representing a real-world process.
If you approach things from the simulation angle, then many models of computation must be inadequate: Turing-Machines, Lambda-Calculus, Shannon's theorem ... They all are extremely informal about what it means to get information from one entity to another.
You are probably right, but knowing about simulation languages must have been one of the reasons that I though automata ought to interact with one another. Of course I didn't know about Petri's work, which again began in '63. I didn't know that at all. But what struck me later was that the great thing about Petri was that he had actually worried about automata theory and what interaction between automata might mean. Here is one transition diagram and here is another transition diagram, but this transition in the left diagram must coincide with that transition in the right diagram. And that sharing-of-a-transition is how Petri represented communication.
The intriguing thing about Petri's work is that he was talking about how two automata could interact, and he then put the whole into one Petri-net and he didn't do it in a modular way. But the fact that he used this to represent office systems and real-world information systems showed that he had set his sights really quite high.
Wasn't he interested in formal models of physical information exchange?
I think so, but he also applied his model to office systems. I don't honestly remember when I got aware of his work. Certainly, when I worked on CCS much later, I got aware of it. In the late '70s, I visited him, but by that time he was much more interested in modular semantics, because the great thing about denotational semantics was that it was compositional in the most accurate sense of the word "compositional".
The concurrency theory at the time was more of a side-issue?
It was, but secretly I realised then that working in verification and automatic theorem-proving was to me very exciting but it wasn't getting to the heart of computation theory. What it was doing was providing assistance to people who had real concerns about correctness, but it was Dana Scott's work that was getting to the heart of computation and the meaning of computation. I wanted to do something which would carry that understanding into the wider field. I was concerned not to only work with computer-assisted proof, though that was going to be very important part of it, but also to ask what the limits of computation may be. This interaction business began to seem to me to be breaking the mould, breaking the limits what people have been trying to understand in a mathematical way up to that point.
Was it easy for you to communicate these ideas to the community at the time? Something like the Church-Turing Thesis must have been very powerful. I can imagine people saying "everything is a Turing-Machine. What's the need for other models"? Did you encounter this attitude or did people react with an open mind to your suggestions?
Some people did. I went to a conference in '73 in Pisa. This was a working conference. Italians were also very interested at that time. So my first paper on processes, which I was trying to do in a domain-theoretic way, was about that time. Semanticists were trying to apply semantic ideas of a familiar kind to concurrency, which was an unfamiliar thing for it.
Did you get a temporary or permanent position in Edinburgh?
I got a permanent position in 1973. That was coming back from the States, which was a good thing because we wanted our family to be educated in Europe, not in America.
Before we go to Edinburgh, can you say a little bit more about your own family? When did you get married?
Yes. I got married in '63. We had our children while I was living in London, working in City University, between '63 and '68.
How many children have you got?
I had three, but one of my children died. I have two now, a son and a daughter, and each of them has a daughter. My daughter lives in Scotland now. She teaches yoga and previously ran a cafe which trains handicapped people. Anyway, we go up there quite a lot, seeing that family. And we also see my son's daughter when we go to Australia, which we don't very often.
Your son lives in Australia?
No, he lives here, you see. But he goes to Australia to see her.
What does your son do?
He's a teacher, in Reading.
You didn't inspire your children to be Pi-calculists?
I didn't try. I probably avoided trying in the early days; on the whole I'm quite happy to leave it to people to discover it themselves if they have any particular inspiration in that direction.
What did you do at Cambridge when you were not studying?
I spent most of my time in university doing music; as a student in Cambridge I didn't meet most mathematicians. Most of my social life was to do with playing chamber music. I'm an oboist, or I was then. I was wondering if I should be a musician. I was wondering if I should train properly, late, and be a professional musician.
You played to quite a high standard.
Yes, I was quite good.
What kind of music does interest you?
Mostly classical music.
I mean within classical music: 20th century? Bach? ...
Everything. Everything. I think there are wonderful pieces of music from every style and there are awful pieces of music. There are awful pieces of Beethoven and there are wonderful pieces of Beethoven. I was listening to Dido and Aeneas the other night; Purcell is brilliant, but not always. Bach is occasionally boring and some 20th century music is wonderful, some Brahms is awful, some Brahms is actually wonderful. I think you must recognise what is really exciting despite -- or independently of -- the idiom.
Why did you not become a musician in the end?
I don't know. I think because I don't compose music, I wasn't very good at that, although I understand about harmony very well, but I wasn't very creative. Other things? I realised that I was perhaps training a bit late. But I never wanted to do mathematics as a hobby, so I decided to keep music as a hobby! I played in good amateur orchestras.
You don't play anymore?
No I don't play anymore.
Do you get asked if there's a connection between your formal work and your musical ability?
My answer is: yes, why not? Certainly mathematicians are interested in music. Perhaps they are interested in the structure of music. Whether mathematics has any relationship to what you would regard as musical phrasing, or tone-colour or passion in music, I doubt.
Or the other way round: is your mathematical creativity linked to your musical ability?
I don't know. I think one has a certain idea about what's elegant. And that's close to what you do though not exactly the same thing, but I certainly recognise a lot of inelegant work, both in computing and in music, including some of my own.
Semantics to me is about aesthetics to a large degree.
Yes, although there are criteria, stern criteria, for saying what is correct, what is mathematically viable. For example, compositionality, in some rigorous sense, you do need that. But beyond that ...
The Petri-net community says compositionality is uninteresting.
They have in the past, but they were wrong of course. They are changing now. But they were right about plenty of other things, so I'm happy to say they were wrong here, without being too critical. Petri nets were actually an extraordinarily important development in understanding concurrency.
I seem to remember that Don Knuth once said his early students were all musical while his later ones were into extreme sports. Did you notice similar shifts?
I couldn't make a link with that.
Then you went to Edinburgh to take up a professorship?
I took up a lectureship.
Why did you want you children to be educated in Europe?
On the whole, we tended to find that American children grow up too fast, particularly in California. And we had no intention of living ourselves in America forever, so if we stayed there, it would be living for ourselves. In any case, how we wanted to live and how we wanted to bring our children up were connected strongly, so we wanted to come back. Edinburgh was a good place to go to. It just happened to have a job at the time.
What was your goal in Edinburgh?
I think I was doing these two things: I went on with the verification work, but I also wanted to make concurrency the main intellectual challenge, although the two were very close to each other. Some first-class people came to work with me on on theorem proving and the LCF stuff: Malcolm Newey, Lockwood Morris, Mike Gordon, Chris Wadsworth. We did devise a system which itself didn't immediately get applied a great deal, but Mike Gordon brought it to Cambridge -- this was the LCF system. He started doing hardware verification. And then one or two other people began to design verifications systems, or rather systems to perform computer-assisted proof, on the model of our system, particularly Constable at Cornell with his NuPrl.
The idea was that you had a metalanguage -- which we called ML -- in which you were expressing your proof strategies, and that there was a rigorous discipline in the way that you could steer your theorem prover so that it could not break the logical rules. That was the pillar of our design. So much so that we would refuse to be efficient if we would be breaking the rules. We wouldn't encode some proof procedure in assembly code or directly in Lisp. It would all have to go via the execution of the logical rules of inference. So if you stayed late at night working out a proof with the computer, you wouldn't have to think afterwards "gosh, did I cut a few corners?", because the LCF system just wouldn't have allowed you to do so! Of course, it had some features which would help you to express your thoughts on how a proof might go.
Including proof tactics and things like that?
Yes, The intellectual challenge there was: how do you have a believable proof assistant, how do you structure the ways in which we go about finding proofs, defining bigger strategies in terms of smaller strategies and so on. That developed, but mostly through the medium of other systems, like Mike Gordon's system here in Cambridge. He designed his HOL Higher Order Logic system along the principles of LCF. That was very nice. I was busy with that in the second half of the '70s very much, but at the same time developing my ideas on concurrency, and CCS came up about 1980, 1979.
Before we talk more about CCS, can you say a few words about ML, one of the big consequences of your work on proof assistants. When did you realise that you had a general purpose language at your hands that might be useful in all sorts of other contexts?
That was very interesting because we didn't really try to do that. That was not entirely due to our insights. It was partly due to Luca Cardelli who now works at Microsoft Labs here. He was a student at that time of Gordon Plotkin's, and he took our implementation of ML and implemented it on a faster machine, because he wanted to process some formulae for his thesis, as far as I remember, which was on concurrency. In any case, being a fast implementer, within a few weeks he implemented ML on a different machine and that happened to be the machine on which we could teach the students. And somebody else -- I can't remember who it was -- decided to that they would teach ML to the second year undergraduates on that machine. Through this sequence of events, we began to realise that we did have something that was ready to be used generally. In fact people began to produce different dialects of ML -- I forget what the exact sequence of events was; that was all through the '80s, for different machine-assisted proof systems. Including the French. CAML came about at that time. The French, led by Gérard Huet, were using it for machine-assisted proof as well.
Eventually, about 1983, we thought we would pull together the different dialects and make one language out of them. That's how Standard ML evolved. It took about 7 years, from '83 to '90. In the end it looks like a lot of hard work! What we got in the end was a formally defined programming language, which is used a reasonable amount; and it stands up as a paradigm of how you might define a language rigorously. At the same time it is a working language. It stands well in that way. But it's not used worldwide as much one might hope or expect. I think that's because it is very difficult to evolve a language at the same time as evolving its semantics. Languages do evolve, in ways which don't sit easily with formal semantics. We haven't discovered how to evolve semantics.
What do you mean by evolving programming languages or semantics? Can you illustrate this with an example?
Yes. I'm not sure this is the best example, but think of the type system of ML: there is no particular respect paid to record handling. There have been a lot of theories about types for record handling, very important type systems for record handling, but it's not in there, in ML. The reason it's not in there is because we didn't understand how to prove the theorems and how to get it right. It would have been nice if ML could have evolved in the direction of a type system for records; but making sure that these theorems hold, making sure that "well-typed programs don't go wrong", was difficult to do. It is a tremendous amount of work getting the semantics and proving theorems about it anyway. It is rather non-modular: if you change one bit, you tend to change other bits elsewhere. Adapting a theorem, or checking that it still holds, is a big task. That's not a very good example, but I'm sure there are others I could think of if I had enough time.
If you look at something like Ocaml, which I think is currently the most advanced dialect of ML, it is the first such language that is pragmatically viable. You can use it do do most things other than low-level programming. Are you surprised at what it looks like, what's in it, what's omitted? Is it how you expected it to be? For example you have modules and objects, although these two concepts overlap significantly.
I think that's a wonderful story of how a language can evolve and still remain very robust. But carrying rigorous semantics through that evolution would be very difficult and that is something we will have to solve for future languages. I think Caml is a terrific success because it shows that a language which began as a theoretical undertaking can nevertheless achieve what any engineering language can also achieve.
But the most successful language of all, C, came about because someone wanted to write computer games on some spare machine, so it's not always through engineering efforts that languages succeed.
That's true. But it's also not through semantic efforts that good languages come about. We have to resolve something there and I think that's very relevant to concurrency languages.
When did you stop being involved with the ML effort?
I went on being involved. We produced the formal semantics in 1990. I was very much involved up to that. We then revised it in '97. That was a lot of work, but certainly not full-time, modifying the semantics.
By that time my main effort was definitely in concurrency, trying to understand concurrency intellectually. I'm beginning to regard it as more of a modelling exercise than a language exercise. We are modelling what happens in the world, like Petri modelling office processes, if you like. We are not looking for the smallest set of primitives that can make sense of computation. In fact, we are in a state of tension: we are looking for a small set of primitives, but they have to fit well with what goes on, not only microscopically, but also macroscopically, with what goes on in the world.
That was to me the challenge: picking communicational primitives which could be understood in systems at a reasonably high level as well as in the way these systems are implemented at a very low level. To some extent I think we have succeeded. In CCS and CSP the communicational primitives are robust with change of level of abstraction to some extent. But still, the emphasis ought to be on modelling what happens in real systems, whether they are human-made systems like operating systems, or whether they exist already. There's a subtle change from the Turing-like question of what are the fundamental, smallest sets of primitives that you can find to understand computation. I think some of that applies in concurrency, like naming: what is the smallest set of primitives for naming? So some of that applies. But as we move towards mobility, understanding systems that move about globally, you need to commit yourself to a richer set of semantic primitives. I think we are in a terrific tension between (a) finding a small set of primitives and (b) modelling the real world accurately.
Can you describe the development from CCS to Pi in its historic timeline? What was your first concurrency formalism and how did you change it? What didn't work? How was it perceived?
I think it's an interesting story. I was working with what I called Behaviour Algebra in 1978. The set of primitives included parallel composition, prefixing and those things. I wanted to make this small set of primitives do everything or do as much as possible, and to understand them semantically, preferably in Domain Theory; but that didn't work. The idea of labelled transition systems became very important because it was a replacement for what didn't work, which was encoding things into Domain Theory. Because encoding things into Domain Theory, the equivalences became either too rich or too poor, I mean too big or too small. They didn't hit what I regarded as correct notion of behavioural equivalence. There seemed to be something inescapably missing in Domain Theory in respect to this. So labelled transition systems -- following on from Gordon Plotkin's work on operational semantics and bringing the idea of labels in -- became central.
But automata already had labels!
Yes, that's true, they did, and Keller had labelled transition systems. The idea of a label being, as it were, a vehicle of interaction became totally important. So it's hard to know where it came from: Keller with his labelled transition systems, or automata theory? Making it a shared action was already present in Petri to some extent. But doing it algebraically and compositionally, that was an important thing. When I gave lectures in Aarhus in '79, that was essentially how the book on CCS evolved. What you are asking about is how it developed. What people perhaps don't know is that I was talking to Mogens Nielsen there and we were really trying to make the Pi-calculus work at that time. I remember vividly discussions in his office in Aarhus, and we couldn't make it work. So CCS became the Pi-calculus without communication of channels.
You were aware that CCS had shortcomings? You didn't think you had hit on the right formalism and later changed your mind.
Absolutely not. But it seemed to me -- and later all the more -- that it was good not to go the whole way, because there was a certain nice, manageable modesty about concurrency in CCS -- same thing with CSP, really. You have two levels: you have data and then you have that data being moved around, but you don't have the one feeding on the other. You like to move the values around, but not to move the means of movement. It was very important, I think, to see concurrent calculi without movement. Because then you could see how much could be done without movement. And the answer is: a tremendous amount could be done. The CSP people have shown that. A tremendous number of systems can be handled. And then you could begin see just where the barrier lay, what things you couldn't model. Anyway, whatever justification you could have for inventing something without mobility, Mogens Nielsen and I tried to get it to work for the label passing, but we didn't succeed. So then CCS became what it was.
Was there a theorem that you proved that made you think "OK, CCS is definitely interesting enough"? I admire that you said "CCS is not quite right but it is still already very interesting and we can use it to understand a lot of things". I would have been unhappy with it and thrown it away. Any particular key event, that made you think "wow, this CCS is jolly interesting, let's run with it"?
Two key events: one is being able to prove that behavioural equivalences were actually congruences. That wasn't easy. I had been working on that previously. Getting the idea that they were congruences made you feel that you are getting at some new kind of essence, because then you could think of the congruence classes as the "real things" that you were talking about. These were the denotees of your semantics. That it was a congruence was important because the primitives seemed to be themselves, rather just an elegant set. If they had resisted this congruence proof that would have been awful. That would have been the end of it.
The other thing was -- and I think probably this happened not immediately, probably roundabout 1982 -- when we discovered this logic ...
Yes. You could represent bisimilarity, which of course came about from talking to David Park, and previously was called "observational equivalence". Of course it was the same relation, almost the same relation, but of course the bisimilarity technology was terribly important. So, whichever one of them it was, it turned out to be captured by the Hennessy-Milner Logic. The fact that you could write specifications or you could have a very very simple logic which captures that relation exactly ...
It is an infinitary logic, so in some sense it is not simple.
Yes, it's an infinitary logic, it has summation, infinite sums, but if you cut down, you get quite nice a finitary logic, if you do certain things to CCS. It is not entirely tidy, I agree, but it was close enough. I remember saying to some of the Petri-Net people "look, because we now have a logic that matches the behavioural equivalence, isn't that interesting"? I found they didn't respond very well to that. In any case, to me it was one of the events, which said --- bar getting a few things right and wondering about this infinitary notion --- that we were on the right track.
One thing that strikes me about process-calculists is their obsession with equivalences and congruency. Other communities don't care about these things and they don't understand why we do and we don't understand why they don't. If you agree with my assessment, where does your obsession with these things, your aesthetic feeling that they are important, stem from? A working programmer never deals with truly compositional systems, they always have the operating system in the background. Do you think it's to do with your mathematical education? Because you have been exposed to algebra?
Maybe. But also, in denotational semantics you have the idea that you understand the meaning of a program and that the meaning is a function of some kind.
That appears to be a mathematician's way of thinking.
Oh sure, absolutely. But you also believe that by doing it mathematically, you provide a firm foundation which will enable people to go further, though they don't have any wish to know it.
Could it not be the case that the world of computation is not compositional?
Yes, but you have to push these things very hard. Somebody will say to you: "of course it's not compositional, look at the operating system" and then you can say, OK, wait a minute, we have to make the operating system one of the participants, one of the agents in this population of agents that are interacting. We won't achieve compositionality until we've done that, until we make explicit that agency, until we recognise all the agents that are there. We must ask ourselves, how are they interacting? Is there a sense in which a single program interacts with the operating system? All of those things need to be tackled. It's almost as though we have to prove that we can be compositional. Nobody will pay attention until we have. Eventually people want, or I want them to want, to be able to talk about a process in the same way that they talk about a function. That's why I'm not interested in short-term gains!
About equivalences: you were originally thinking in terms of weak traces? And through David Park you were lead to bisimulation?
Oh no, definitely not. The original book on CCS, in 1980, has something called observation equivalence and it has something called strong equivalence. Strong equivalence, although it was not defined in terms of maximal fixpoints, coinductively, turns out to coincide with bisimilarity. That's because of the fact of image-finite transitions. So strong equivalence coincides exactly, as it turns out, with strong bisimilarity. But we missed the coinductive proof technique.
Weak observation equivalence turned out not to coincide quite, because it was defined as the limit of an omega-chain, each member slightly finer than the previous. It turns out that we were wrong to think the maximal fixpoint would be reached as the limit of a decreasing omega-chain. It has to go to a higher ordinal. Apart from that difference, and that difference shows up only in quite a sophisticated way, we already had weak bisimilarity, but not of course the bisimulation proof technique, which is so important.
It was because David Park was visiting Edinburgh, I think in 1981. It was his sabbatical and he was living in my house, reading my book and he came down at breakfast time when I was washing the dishes and said "there is something wrong with this". And then I said "oh god, what's wrong"? "Well, this isn't a maximal fixpoint"! And I said "should it be?" or something and then we went for a walk and the answer was: yes, of course! Not only do we have a coinductive proof technique with wonderful gains, but we also have coincidence very nearly with what's already there.
So when we went for the walk; the main topic was: what should we call this thing? David wanted to call it "mimicry" and I said "that's difficult to say, let's call it bisimulation". He said "that's got five syllables" and I said, "it doesn't matter, people will be happy with it". I named it, but he brought the idea. In fact it was so close to what I had done. But the proof technique, and the maths behind it -- it seems to me -- are very much better than what I had in that first book, where I was proving inductively that things were ultimately equivalent by proving that they were n-equivalent for all n; that was an inductive not a coinductive proof.
Who is David Park?
He did work originally on program schemata with David Luckham and Michael Paterson. They were famous for their work on program schemata. This was before Scott, so they were looking at the semantics of imperative programs, and looking at the decidability of the equivalence of these things, under all interpretations of the function symbols, and finding some very beautiful results. He got his PhD at MIT, but he was English. Anyway, he was a great friend of mine. We knew each other when I was in Swansea. Before we came together again on this concurrency stuff, he had worked with mu-calculus and maximal fixpoints. So he came prepared with the maximal fixpoint view.
People knew and cared about maximal fixpoints before bisimulations?
Oh yes, very definitely. Maximal and minimal fixpoints. David was a great expert in the extraordinary richness that you get when you have a maximal and a minimal fixpoint operator in something like the mu-calculus. So he brought all that knowledge to bear. To him it stuck out very strongly what I was doing wrong in CCS. Essentially he came along with this thing and it fitted in. When I wrote the book in 1990, I tried to tell the story about how this fitted in and how important it was.
Of course he died quite early, around 1990. It was important for me that he would get just recognition for this. CCS was already designed, but this particular bisimilarity technique seemed to be very important, at least for those of us who believe in equivalence and congruence. The reason that that's so important is that by doing that we are getting at some kind of denotation of processes.
Even now I am talking with Tony Hoare, who is much more interested in the idea of what it means for a program to meet its specification. We are now trying to reconcile the CCS approach that regards denotations as congruence classes, and the CSP which talks about set-theoretic denotations such as failures and traces and so on, and has very nice ordering relations, so that if the specification is larger than the implementation it means that the implementation is correct. The ordering notion between processes is of course the other main important thing in process calculi. I think we want both; we want the notion of denotations, and perhaps they are congruence classes, and we want the orderings or preorderings representing improvement of, or refinement from, a specification.
What is the inverse of these refinement relations? I suspect it is related to transactions?
I suppose there is not a symmetry here. If somebody states a specification, then what they are saying is: I want a program to satisfy this and there would be many different programs that satisfy it. To produce any one program that satisfies this is satisfactory. Given that program, you could refine it further and it would of course still satisfy the original specification because that is the refinement ordering. But coarsening it might mean that it then would not satisfy that specification that it originally did, because you would coarsen it in a different direction. So coarsening doesn't have the same pragmatic interest. I think that's a fair way of putting it.
Of course one is often not interested in the execution details. One sometimes wants to see a large chunk of commands as an atomic blob, one wants to forget most of the details.
Maybe you are talking about a different abstract machine, modelling at a different level of abstraction.
Yes, but isn't refinement the inverse to that? You have a command that says "send message from x to y" and when you refine that, it becomes "send message from x to a, from a to b, from b to c and from c to y" ...
I think that's a different kind of refinement. Say you state a specification in logic. You say that all communication must satisfy the following formulae. That allows a whole space of programs to satisfy that. But they will all be talking about communications at a particular level of abstraction. You've got the idea of different levels of abstraction as orthogonal to the question of refinement. Maybe some people will call abstraction refinement, maybe there's confusion ...
Probably me who confuses things!
But I think everybody is confused about that. Different levels of abstraction is not the same thing as refining from a specification to an implementation.
You talked to Mogens Nielsen in Aarhus and you published your CCS book. What came next? You were still pushing towards mobility I guess? How did the transition to Pi come about? It seems to me that the key steps were collapsing everything into names and finding the labelled transitions that make this work. Is that correct?
Yes. You know the paper by Nielsen and Engberg?
I have seen it being cited. I have not personally read it.
It is in "Proof, Language and interaction" (MIT Press), a collection of essays written for me in 2000, edited by Gordon Plotkin, Colin Stirling and Mads Tofte. I can show it to you. Nielsen and Engberg, in '86, wrote a paper called "A calculus of communicating systems with label passing". The point is that they got over one of the difficulties that Mogens and I had found. So they contributed a substantial step towards the Pi-calculus in that paper. They never published it then. Now it was published in that book of essays. We (Joachim Parrow, David Walker and I) cited their original report in our first paper on the Pi-Calculus. And we put some kind of summary of what we thought they had contributed and what we had added to it to make the Pi-Calculus. The interesting thing is, as I remember talking to Mogens back in 1980, one of the things that didn't fit was the CCS renaming operation, which is subtly different from substitution of new names for existing names.
Because renaming can be infinitary?
No. It's the fact that applying a renaming operator at the outside of a process is different from doing a syntactic substitution of names throughout that process. As I remember, Mogens and I didn't succeed in making the label passing work, and one of the reasons was because of that renaming operator. Now if you look at Engberg and Nielsen's ECCS, Extended CCS they call it, that operator is no longer present; they explicitly omit it..
There may have been other reasons; I don't remember why label-passing didn't work out in my talks with Mogens. In any case we didn't make this big step towards the Pi-Calculus that he and Uffe made later. Then, knowing that, Joachim Parrow, David Walker and I worked very hard to try and get only one kind of name. Mogens and Uffe had various kinds of name. We did quite a lot to simplify it down. We experimented with the idea of only bound names in messages. We tried all sorts of different things to make sure that we weren't missing a trick. It took us about three years from about '86, '87, to '88, '89 to get it straight. It was a matter of not only cutting things out, but making sure that you couldn't cut any more out, since we wanted it to be as close to definitive as we could. That was an interesting process and I have kept a heap of memos that we all wrote. It always takes experimenting with different possibilities and there do seem to be quite a number of possibilities.
Did you have, as one of the possibilities, what we now call the asynchronous Pi-Calculus, where you don't have output prefixing and sums?
I think we wanted the sum because it gives you normal forms, and it gave us the algebra for CCS. We were reluctant to do without that. It seemed to me that keeping the sum, although perhaps not utterly necessary, gave you simpler applications, simpler illustrations. And it was in the tradition of CCS anyway. It didn't seem to be safe to leave it out. I'm glad that people have done all sorts of things since, indicating exactly when we need it and what it does to axioms.
I find the story of the sum quite fascinating because although these calculi appear to just have one computational operation, data exchange (names for Pi), if you have unrestricted sums, there's a second kind of silent communication that communicates which summand is chosen. That is also what ultimately leads to unrestricted sum being computationally more expressive. I always wondered if it had occurred to you at the time that this additional communication was happening.
The way I thought about it in CCS (never mind the Pi-Calculus, because the problem arises just as much in CCS, particularly the fact that weak bisimulation is not a congruence until you take care where summation is allowed) is this: It appears to me that summation is like the superposition of states and that observation causes the resolution of a sum into one or another of its states. For that reason it is a much more esoteric combinator than parallel composition.
Of course input is a form of sum, just slightly more well-behaved.
Yes, slightly more well-behaved. Yes, I think that's right. I agree that there's some overlap here between summation, parallel composition and input and so on. I don't know whether that's fully resolved yet. There are problems still around, but people are able to discuss them now in the context of the full Pi-Calculus. I think it's good to go on doing that. And I must admit that I gain more insight now that I look at graphical models, because the way summation works in bigraphs is quite unexpected. It can mimic the CCS form of sum really quite closely. The Pi-Calculus is a step towards more spatially conscious models with regions and something more, almost geometric, where we might get more insight into what summation does. I think the Pi-Calculus benefits from some kind of graphical story being told. Maybe we shouldn't go too far into that just now.
Who managed to find the labelled transitions that pass scope and when? It seems to me that that must have been a breakthrough.
I suppose so. The point is that it was always going to do that, if it worked. Even when Mogens and I discussed it in the first instance, that possibility was around. I don't think that was a late discovery. I think that was an inevitable consequence of passing labels and doing it in the freest possible way.
But in CCS labels don't have any internal structure. That is very different in the Pi-Calculus. I don't know if the Pi-Calculus was the first calculus to have a rich structure in the labels ....
Do not confuse transition-labels, which have structure, with names which don't have structure. I think the Pi-Calculus was the first calculus whose labels have an almost embarrassing structure. The fact that you had to have restriction as part of the labels, that was very worrying. In fact that's what led me later to look at these labels as contexts, because it seems to me that there's got to be a story about when you need more structure in the labels. It's almost as though we were very lucky in CCS that we didn't need any extra structure. We had a little bit of structure: we had the tau operation. That was all. The advent of the Pi-Calculus indicated that more work had to be done on transition systems in general to see exactly when labels should have structure. I guess we still haven't got the answer.
Another thing that strikes me is that most formalisms have reductions rather than labelled transitions. Now, with chemical semantics, the presentation of the Pi-Calculus is much much simpler than with labelled semantics. Yet the labelled transitions came first. Why?
Oh, that's easy: you can't do behavioural analysis with the chemical semantics. By the way, chemical semantics is terrific: it doesn't supply labels, but it supplies structural congruence -- in the original form in fact. But I think the strength of the labels is that you get the chance of congruential behaviour, because the labels encapsulate what it is that an agent contributes to a reaction; not just whether it can react, but what reactions it could conceivably take part in, if only somebody else would do something.
It is a minimal representation of that!
A minimal representation of all the interactions it could conceivably take part in and what it could contribute to them. That's the intuition of why they supply congruences. That to me is very simple. You need this notion of what an agent contributes.
Labelled semantics does two things: it gives the semantics of the raw processes, the computational steps and it helps you reason about the congruences. Reduction semantics just does the first thing and then you have some horrid rules about congruences and you need some nice tools to reason about this.
The concept of names and naming is very important in your work. When did it occur to you that this is a fundamental notion in computing? That you have these points where you can interact, that you can hide. Did you have a specific moment of revelation or is it the result of years of research?
I don't want to be wise after the event, but I think it was when we found out that you could encode data as processes with name passing.
That was when the Pi-Calculus already existed?
I'm not sure, you see. I think we probably felt that we were going to be able to do this. That made us confident that you can get all the data, not by means of other kinds of objects, but as processes. And the way you access the data is via interaction and the interaction is via names. I don't know at which point it would have happened, possibly in our memos we wrote (and kept) over a period of about two or three years, we may have something that suggests that it's going to be OK just to cut down to names because we going to be able to get data from processes. I think it was quite an early thing, because I don't think we we would wanted to put the Pi-Calculus forward without knowing that there was going to be a story about data. So, although we did the story about data later, probably not first paper, we said something about it in our first paper.
Did it ever worry you that your names were pure, had no internal structure, whereas all the names in computing applications heavily rely on internal structure?
No that didn't worry me because all practical computing has to build towers of structure in order to get something useful. What is interesting is the role of matching and mismatching. I just wrote a little paper called "What's in a name?" in honour of Roger Needham, who died earlier this year; around 1990 he wrote a paper on pure names, from the point of view of operating systems. Do you know that paper? ("Naming", in Distributed Systems (ed. Sape Mullender), Addison-Wesley 1993).
That was written more or less at the time of the Pi-Calculus. So I thought I would write a paper to see what you could do with pure names. And I conjectured that the Pi-Calculus is doing something like all of the things you can do with pure names. You can create them, you can use them to call, you can use them in what I call a co-call, so communicating is two things going on: you can call on a name and you can co-call on a name. Co-call is the negative, call is a positive, if you like. And you can test names. And synchronised action is the coming together of a calling and a co-calling. You can tell a story which says: this is perhaps all you can do with pure names. Is there anything else -- that's a challenge -- you can do with pure names? The Pi-Calculus could be regarded as just supplying a minimal environment to allow you to do all the things with pure names, that you could imagine doing with them. This paper is about six pages long and I'm telling a story which other people would recognise; it's nothing particularly new.
I think that pure names are now seen to occupy a terribly important place in the foundations of computing. And that's not just because of the Pi-Calculus, the operating systems people know about it as well. In that paper I include an example of Roger Needham's about using composite names in, say, directories. It shows that composite names, at least in simple cases, can be represented by a mixture of use of pure names together with matching and mismatching. So I think that the control structure of the Pi-Calculus together with pure names actually gives you what you need for composite names, but it would be hard to prove that.
One of the main uses for composite names is efficiency, because when you pass a composite name, you communicate not only the point of interaction but also, in some sense, how to get there. This brings me to the great schism in the theory of computing between semantics and complexity. There's currently virtually no connection between the two. Did you ever try to combine the two? Do you think that in the future, when the two will have been reconciled, all the nice mathematical structure that you and your colleagues have developed will survive in a recognisable form?
I'm beginning to think that it's perfectly OK for complexity theory and modelling, or semantics, to be two aspects of computation theory that are not necessarily mutually explicable. In other words, they're independent. I don't think that we will make a breakthrough which consists of uniting them, by changing some of the primitives from one side or the other. Shannon had a quantitative theory of information, and that does not seem to tell us about structure at all, about structure of communications. It simply tells us how few or how short messages can be and that's very exciting, but I don't see any reason why we should expect these things to come any closer to each other.
I disagree. For a start there are typing systems that control very tightly what kind of recursion you can have and that has ramifications for complexity. And intuitively, as a programmer, complexity is part of the semantics of a program: if popping an element from the stack takes n^3 steps where n is the number of elements on the stack, it's not a stack, it's nonsense. If I'm a customer, if I pay a million pounds for a program that flies my airplane, I want a proof: not only that it works correctly but also that it computes all its results in time. I certainly hope that the two will eventually be merged, it may just be very hard to get there.
Maybe I was too strong. Being able to create types which have implications for complexity is important, but that doesn't mean that complexity theory is then taken over by semantics. It's more that we can fit ourselves into certain categories or complexity classes by various means. I don't know enough about logics for complexity; descriptive complexity --involving specialised logics-- is well advanced. Maybe the logicians will refute what I was saying. I don't feel that it's high on the priority list for us to try to do the union. The reason is that we have so much structure we're trying to model, particularly with the internet and the populations of machines -- even nano-machines, if you admit nanotechnology -- that we have to try to grasp. It's like algebra has not much to do with complexity. Algebra is about structure. We have to grapple with structure, perhaps as a higher priority. Perhaps I should retreat into that position.
You mentioned Shannon's theorem and you said that the key primitive is calling and co-calling. To me all this is about communication. Communication is the key thing that happens. What strikes me about these process-theoretic models is that communication isn't directly captured as an entity with structure. It is in the formalisms, but there is no algebra of communications. We can say a name is this or that, a process is this, but there doesn't seem to be a corresponding account of communication. Have you ever wondered about this?
My first reaction is: I don't see why there should be an algebra of something in order for it to be a primary entity. The clue may be that, since communication is about interaction, in order to make a full story about communication you need to have some notion of the structure of actions which sits alongside the names that represent the contiguities of agents. What process calculi are trying to do is to say something about the structure of actions as well as about the elementary interactions. And you can't say much interesting about the elements of interaction which are just the names, unless you have structure of actions as well.
I'm agreeing with you, you can't use names without also having something else. If you're going to have a nice theory, you got to have something else around. So what process calculi try to do is have the minimal amount of extra around to get a good story. I agree with you, it's puzzling. But that's rather beautiful. You can't get at the names. They are a little bit like quarks. You can't separate out the names from the things which are naming.
I'm happy to press ahead with what you may call phenomenological theory. To look at the phenomenon of names and then see if we can tell nice stories about names in various different calculi. And then perhaps we get to know more as we look at the different calculi. For example it comes up with all the geometrical calculi, spatial calculi. Names seem to represent the contiguity of these things. So we may get a better feeling for names from that slightly different geometrical presentation of actions.
All your published calculi feature point-to-point communication. Have you thought about other forms like broadcasting or the more wave-like communications that we seem to be seeing in physics?
I always though broadcast would be a derived operation.
I'm not sure. There are separation results that seem to suggest otherwise.
Maybe. KVS Prasad has done a nice theory of CBS (Calculus of Broadcasting Systems).
If I remember their work correctly, Ene and Muntean have shown that broadcasting and point-to-point are separated by some reasonably natural conditions.
That's interesting. It always seemed to me that point-to-point was a more modular notion, a more controllable thing. Broadcast seems to require some medium within which your message is floating even though it isn't accessed. It always seemed to me important that there should be be no place where a message is floating, unless that place is itself modelled as an agent. So it seemed to me that point-to-point captured that particular attitude more directly. But, point-to-point being just two taking part, I did work with synchronous CCS which had prime names; each agent could be synchronising several prime names, essentially requiring several things to occur in a single interaction. That seemed to me to be rather nice. But somehow synchronous CCS doesn't seem nearly as useful as CCS.
The other thing is that CSP talks about engaging in an action as many agents as possible. Any number of agents can engage in the same action. CSP has had lots and lots of applications, but I'm not quite sure how much they have taken advantage of this possibility of many agents engaging part in the same action. I'm really rather puzzled by all that and I seem to have taken the path of two participants. Others may take other paths.
Some recent developments take Pi-Calculi into maybe expected, maybe unexpected directions: I mean modelling biological interaction on the DNA level in terms of Pi-Calculi. Did that surprise you or did you always think that your models are more general, that they don't just talk about conventional computation?
It took me by surprise when I first heard Ehud Shapiro at the Weizmann Institute. He came and gave us a talk about it. Two things struck me. First of all he really needs something more spatial. So I suggested to him that he used Ambients. Which he is now doing. In fact, Cardelli is now working with Shapiro. Cardelli has invented the term "Biograph" which represents the application of Ambients to biological phenomena. What my Bigraphs are trying to do is to combine Ambients and Pi-Calculus. There is a coming together of these things. Certainly I was initially surprised and then I realised that it was perhaps the geometrical thing, the spatial thing, together with some mobility that really is what the biological people need. I don't know what combination of these things is best for them. But certainly I think there's a lot to be gained by looking for it.
You think it will not just be the biologists who will get a better tool to work with but that we will also gain novel insights from their modelling efforts?
We must look at what the biologists need and say: they are using some of what we do perhaps, but is there something that they want which we are not giving them? Cardelli thinks so. He talked to me that other day about a new class of calculi where action takes place on the membrane boundaries themselves, rather than within or outside the cell. So it may well be that we get very strong insights from them. What we should be doing is trying to join it all up all the time; they may get insights for our possible or actual applications, and we get them from theirs as well. We must try to keep the whole thing under control to make sure that we do something which is as relevant as possible to both without being prolix, in other words, without creating something which has a lot of bells and whistles. We are trying to find fundaments. We should always be open to different applications, in case they can help us focus on a better fundamental notion.
Something that has always puzzled me is how domain theory became so dominant as a paradigm. I suspect that it's because of Scott's amazing result that tempted a lot of people to work in the domain theoretic way, partly because he reuses a lot of conventional mathematics that people might have been familiar with. I suspect that in the late '60s there was a lot of work on formalising semantics and people were going in different directions and then Scott came along ... Do you agree or disagree with this?
Yes, I think domains, which Scott also called "Information Systems", are to do with how much information you need in order to do such and such. Somehow this was very satisfactory, because it gave you a nice theory of function calls and interaction and higher-order functions. All of those things could fit together in the notion of finite quantities of information.
What I didn't understand is why we couldn't immediately transfer that domain theory to work properly for concurrency. The first thing that I did in '73, '74 was to try to represent non-deterministic processes by powerdomains. In fact, I had talks with Gordon Plotkin about powerdomains and Gordon eventually produced the right definition of them. By itself, this gave no account of concurrency. But nonetheless, one could then say "OK, if I have powerdomains I can represent non-deterministic processes, and perhaps I can also represent concurrent processes by particular kinds of domains". So I looked at domains of processes and looked at the Plotkin powerdomain and it produced an equivalence which was too strong a congruence. And I looked at the weaker powerdomains, the Hoare powerdomain and the Smyth powerdomain, and they produced a congruence that was too weak; I'm certain that these were accurate statements. What I couldn't do was to get something that was operationally both sound and complete, that captured what I regarded as operational equivalence.
So, without understanding why domains weren't working I said "well, let's look for something else". That's where the notion of observational equivalence, or what subsequently turned into bisimilarity, came from. And I still don't know why domain theory didn't supply what was needed. Possibly a clue is that domain theory failed in one other way: it failed to solve the Full Abstraction Problem for PCF. In fact that seems to have been left to Game Semantics to do ...
And now Pi!
Yes, but it's the Game Semantics that does it. Pi can then tell the story of the solution, it's the notion of interaction, game-theoretic notion of interaction -- am I not right? -- that gives the key to the full abstraction theorem.
Our Full Abstraction theorem is entirely in Pi, no reference to games whatsoever. You take the Pi-Calculus, the conventional encodings of Lambda-Calculi into Pi and apply a typing system to root out exactly those processes that could break Full Abstraction.
The initial papers by Abramsky, Jagadeesan and Malacaria and by Hyland and Ong, I think, were talking about games. More recently this was connected to Pi.
Of course. The games people were definitely the first.
I must say I don't know the story terribly well since then. What they indicated was something that domain theory didn't have, which was to do with patterns of interaction. Now that's precisely what we're into in a big way with concurrency. Maybe for the full answer we will have to look for this development of game theory and connect it to the Pi-Calculus. But I'm not sorry to have left domains behind or at least adopted other ideas for a time, because we were able to look at the operational phenomenon more directly. I think that's been more important for a while.
If you look at programming languages that are really used, C, C++, Java. They don't contain much that wasn't around in the late '60s. Does that surprise you? Does it disappoint you? People who buy programming languages seem to be happy to ignore all the lessons the should have learned from semantics. Or hasn't semantics produced anything really worthwhile? Or is it something you don't care about?
I do care that languages should be informed by theories. But they have been partially informed; Ada had something like the CSP or CCS version of communication, the "alt" command representing waiting for one of several different alternatives to happen, so you could see parallel composition sitting in there behind the scene if you look at it with a benevolent eye. So there was an influence, particularly of CSP on Ada. And they both had an influence on the Lotos specification language.
I actually think the best way forward for us now is to look at concurrent calculi as modelling theories for modelling interactions, whether they occur in programs or in outside programs. We should try to widen the scene. I'm hoping that things like ubiquitous computing and nano-technology will make it absolutely necessary for us to have theories for modelling populations of interacting agents with different mobile structure, because then we'll be able to see that modelling is necessary in order to understand those populations.
Languages should emerge from that. They should be treated as a part of a modelling theory. Up to now I don't think we had sufficient incentive to make sure that our languages are close to scientific models. It's only with the onset of computation as a global phenomenon that modelling those interactions becomes so scientifically important that it is bound to have its effect on programming languages.
I see it as unfortunate that languages haven't been influenced. For example, type structures could have avoided a lot of the Year 2000 problem. We know that nothing serious went wrong, but in the two or three years before that people didn't know that it wasn't going to go wrong. If they had used types, which had been around for two decades, they would have been able to know that certain things couldn't go wrong. The theories were around, which would help people to know that certain things couldn't happen, barring physical accidents or bad implementations of course. We are used to the idea that theories can sit around and not be used. We just have to accept it.
Any particular reason, anything about the semantics community that prevents dissemination of our results? Or does it just take 20 years to go mainstream?
I think it's only when we move to concurrency that we have enough to claim that we have a theory of computation which is independent of mathematical logic or goes beyond what logicians have studied, what algorithmists have studied. We're only two decades on from that. We're gaining confidence, so we can say that this could possibly be a model suitable for the world. We must push that. We must get into nano-technology, into ubiquity and say: not only must we try model these, we must be prepared to develop our models further, so that they do fit those things. In other words become actual scientific models in the sense of what really happens, rather than modelling what happens in our clean, abstract world.
We were lacking in confidence, I think we are gaining in confidence. But we have to push it outwards. That can only happen through modelling.
Can you tell me who your PhD students were and what they are doing now.
There are twentyone. Nineteen have completed and twentyone altogether. Who are they all? George Milne worked on processes before we ever got around to process calculi, and is a Professor in Australia. Avra Cohn and Raymond Aubin worked with me on LCF.
That was in Edinburgh?
That was in Edinburgh. Avra married Mike Gordon and is busy with her family; I last heard of Raymond at Concordia University, but he may have moved on long ago. Mike Sanderson worked on recursion in CCS, and now teaches at Essex. Alan Mycroft worked on ML but especially on abstract interpretation, and is here in Cambridge.
Luís Damas did polymorphic type inference in ML, and I believe is in academic computing still in Portugal. Brian Monahan also worked with LCF; he's now a senior researcher with HP Labs in Bristol, working in the Trusted Systems Lab. Kim Larsen worked in concurrency, on a species of higher-order transition system; he now leads a group in model-checking (UppAal) at Aalborg. Kevin Mitchell implemented CCS on an abstract machine with a network of processors; that's interesting, because he had trouble in expressing just how the implementation matched the original. (The answer was invented later by Joachim Parrow: coupled simulation!)
KVS Prasad is at Gothenburg where he designed a successful calculus of broadcasting systems (CBS). Mads Tofte did semantics of Standard ML and also provided the theoretical basis for polymorphic types for references; he is now President of the IT University in Copenhagen. Faron Moller worked on axioms for concurrency, and the unique decomposition of processes; he's now a Professor in Swansea. Dave Berry worked on a generic form of operational semantics, and now works in the National e-Science Centre in Edinburgh.
Is that the Berry of the Berry function?
No, that was Gérard Berry at INRIA. Chris Tofts worked on proof rules for concurrent programs; he later applied CCS to ant colonies and continues to apply process calculi and logics to biology. Davide Sangiorgi designed higher-order Pi-Calculus and laid its theoretical foundation; he's now a professor at Bologna. Peter Sewell proved that bisimilarity of finite state processes isn't finitely axiomatisable (these negative results are hard!); he now has a Royal Society Research Fellowship in Cambridge, running a group on validating heavy-duty protocols. Dave Turner worked on sorting in the Pi-Calculus, and designed the abstract machine for Pi which later (with Benjamin Pierce) led to their definition of Pict.
Alex Mifsud worked on Action Calculi and control structures, went back to Malta, and recently came to UK on a short-term work assignment but I've temporarily lost track of him. Jamey Leifer worked with me on the maths of contextual labelled transition systems, which now underpins Bigraphs; he's currently at INRIA in Paris. Those are all the people who have finished. Two people are coming up to finishing; Alistair Turnbull whose topic is MIN, a linear graphical abstract machine-cum-calculus which can underpin threaded languages like Java, and Ole-Hoegh Jensen (collaborating with me by phone from Aalborg) on Bigraphs, especially how they model both Pi-Calculus and Ambients.
That's all. Most of them are still in computing. Some of them are very well known; I'm proud of all of them. I was already 40 at the time I started, didn't have ten years of supervising that some people have. Also I never found it very easy to supervise more then two people or three people at once.
What is your style of supervision? Is it laissez-faire or do you give them theorems to prove?
I need to be closely involved, but I don't give them theorems to prove. I work with people. I don't feel comfortable unless I know pretty much what's going on. I don't like it if somebody went too far away from what we talked about, because then I don't know how to judge or whether his theorems work or not. That's why I don't want to supervise more than two or three people at once. But of course I had some splendid interactions with those people and it is still continuing.
What were the major institutional obstacles or helper in your career? Did you find it easy to get funding? Or was it easy getting funding twenty years ago?
It was always OK. I never had a grant application turned down.
I can't say that for myself.
I've almost always had two research associates working at some grant or other. Not now any longer. I'm dating from 1973 when I started with the LCF work in Edinburgh. I always had perhaps a couple of students and a couple of research people with PhDs already. So I always had a good five or six people working on what I wanted. I have been very lucky in that way. It makes me sympathise with people who sometimes get grant applications rejected. There is a lot of competition. I think it is true that perhaps if they get to know you they will go on funding you. I started at a time when the competition perhaps wasn't so great. I didn't have any difficulties. All my difficulties have been created by myself.
Presumably being a fellow of the Royal Society and a Turing Award winner might have helped?
I haven't benefited any more since that. I've always had the funding I wanted. Certainly people accept that I must have done something interesting. But that doesn't help particularly, because usually it's people who don't know what it is anyway. I never seem to have benefited with respect to my immediate colleagues since that -- they're as critical as they always were!
The other day a physicist described computer science as an auxiliary science, not a real science to me. When you started computing really was a new field. Did you find that other more established fields looked down on your discipline or didn't take it seriously?
It's not been a problem, except sometimes with perception within the university. But then Edinburgh was always very good in this way. It was seen that computing was very important. Cambridge also. That wasn't too much of a difficulty. I think the view of other scientists sometimes takes one's breath away. Physicists clearly don't think of anything we're doing as science. Probably what they mean is: it can't be a science because we are always making things. What are we doing science of? We are doing the science of our own constructions. But I think the boundary between understanding one's own constructions and understanding the world is breaking down. You've only got to look at bioinfomatics. In chemistry, in chemical engineering, there are structures which we are making which deserve to be understood by the same models as we understand the natural world.
Ultimately physical theories must be able to describe physicists doing physics, i.e. itself, because physicists and physical theories are part of the natural world.
That's an interesting thought.
But they are very far away from being able to do that.
I do think we have some problems. I was influenced by Herbert Simon. He talked about the "The Science of the Artificial". He does point out the difficulty that our science changes as we do it. We could not conceivably be talking about populations of interacting agents unless we had the internet and nano-technology suggesting them to us. We don't have big enough imaginations, or it wouldn't have been worth starting. It becomes something worth starting. Once it's worth starting we can see that it might relate to the natural world, so we can actually use this blurred boundary to be able to model both, what goes on in the real world and what goes on with what we make. The internet is both a natural phenomenon just as much as it was built. Nobody built the internet, in some sense. In some sense it happened by means of a lot of contributory activity.
But compared with physicists we have a problem with verifying our ideas. Of course we can determine whether something is a correct derivation in ZFC, but are our definitions right? It is a difficult problem.
Yes, it is.
Is it just aesthetics? What have your scientific controversies been about?
Say we worry about compositionality. We talk to the Petri-net people. In the past they didn't represent it. They said "we can model the whole of a system". Then we said "can you model the system that I'm building now and then model how that would be synthesised with some other system"? They said "we come along afterwards and model the whole thing by a big net". Then the controversy gets resolved by what works well in practice. Our models seem to have to be justified by pragmatic considerations. Take types. They get justified by certain errors that don't happen in programs.
There does seem to be quite a lot of mileage in justifying a theory because of how it helps build things. I wish I could say this has the same purity as verifying a models against the natural world. What we don't have is repeatable experiments. That's the real weakness. We have to be able to validate our theories even though we don't have the luxury of repeating our experiments.
This predicament is shared with many other sciences, economics, sociology ...
And they are all regarded as second rate equally by the physicists. I don't know how we can get out of that. Unless again we do model something in the real world. Possibly in Biology we may be able to. At the moment Shapiro at the Weizmann Institute is verifying his models against what is happening in practise. If you equip Bioambients or the Pi-Calculus or whatever, with probabilistic transitions, then you can actually run the abstract model and compare the physical happening with it and see whether that really agrees. I don't know enough to how much credit to give to it, but Shapiro says he does get agreement. So insofar as we are able to model real live things we perhaps can validate against the real world.
It is also the case that programming languages are getting better and programming productivity is above what you would get in C.
Yes, and also a chap called Howard Smith who wrote a book called "Business Process Management -- The Third Wave". It's the philosophy behind the language BPML (Business Process Management Language), designed and implemented by Assaf Arkin, who works for a for a small company called Intalio, in association with BPML.org, a public domain institution which tries to understand business process modelling. He based it on the Pi-Calculus. Assaf Arkin knows a lot about the Pi-Calculus. Howard Smith and Peter Fingar, who wrote the book together, say that as a result they get a better model of business processes than before. He claims, because of Arkin's work, that it does take its ideas from the Pi-Calculus, and that they have been able to do things they could not do before. It seems very promising.
I don't know very much about business processes. Greg Meredith at Microsoft was also in on the beginning of this. There again you can get validation but not in the pure, repeatable experiment form. No sooner we create these things and put them into business process modelling languages, than people will redesign their business processes, so we are no longer able to be talking about the same thing. We have to find some way of telling a story about this as a real validation even though it isn't one of the pure kind found in physics. I don't know how to do that.
You worked for a computer company and you probably got involved with Microsoft through the new Cambridge computer science building. What are your views on collaboration with industry? Good thing? Bad thing?
I think it's necessary. I don't want to collaborate on products, but we are quite lucky to have something like Microsoft Lab with ex-colleagues and people like Tony Hoare and Luca Cardelli and Cédric Fournet. These people are clearly working on abstract topics which we share, but they are also working with the product division, probably doing things that we don't know as well.
There is some sense of ideas flowing along that chain, from the product people in Seattle, say, through these researchers here, to me and to our students and going back again. We don't know what goes on further down the chain always, but we are able to talk to people here. Fournet is doing some very interesting things, only half of which I know, but we are still able to talk about the ideas behind them.
Similarly with Cardelli, I don't know how much they are applying his work on Biographs, but he certainly has relationships with the program division that I don't know about. On the whole I'm happy with that. I don't think I would be happy to be closer to industry. I think there are long-term ideas about what is technically viable that you can only work out if you have time to spend, if you haven't got industrial pressure on you. I would rather be an academic so that there is no question of having to survive that kind of pressure.
What do you make of the increasing patenting of ideas in computing?
It's terrifying. It's ridiculous and terrifying at the same time.
I agree. I wouldn't be surprised if there was some patent application of some minor variant of the Pi-Calculus or some typing system. Is that something you discuss with your peers when you have your government advisory meetings?
I haven't done recently. I am aware that people ought to be doing this. I signed a petition the other day which went on behalf of several European academics to the European Parliament. But I must admit that we are interested in ideas that can be patented and that enormous damage could be done by an invention, for example in nano-technology, that could conceal some kind of misunderstanding. I think we should fight, but can't claim to have done much fighting myself.
Why is the Pi-Calculus called Pi-Calculus?
Very good question. You start from Lambda, Mu-Calculus already exists, you go through the Greek alphabet. Nu was not very easy to pronounce and anyway it has another perfectly good use now, so it was good to avoid that. Xi, I think comes next: I wouldn't want to call something the Xi-Calculus because its difficult to say. Omicron is the next one and that's no good, because that's just the composition symbol. And Pi comes next.
But it already has another meaning.
It does, that's true. But it also stands for permutations in another part of mathematics. I think we are sufficiently distant.
It's not "pi" for process?
It is "pi" for process and "pi" for pointer and parallel and a number of things. The point is that there's no really good reason. It did occur to me that it is something very easy to say. Turns out that people do find it easy to say. I didn't want to use an acronym, like CCS. I wanted to use something that is a pure name. It is a pure name. It even sounds pure.
Does it lead to confusion sometimes?
No. It's only things which are trying to mean something, but mean different things to different people, that cause confusion.
Grand Challenges, that is my last question: why and what is it about?
You're talking about the Grand Challenges Exercise initiated by the UK Computing Research Committee (UKCRC). It seemed that we ought to express the scientific aspiration of the computer science community. We should make clear that we aspire not only to make things work but to have a science, an academic body of knowledge. The first thing about a Grand Challenge is that it is supposed to represent a bold aspiration that is purely scientific; not on the back of some new funding initiative, or thought up because it would be easy to find practical applications within three or four years. It is trying to look beyond all that and represent our academic interest.
You can ask how to translate these nice notions, of being independent of funding and representing our academic interests, into something more specific. And then you can say we don't know what our aspirations are. So why don't we ask people to say what their aspirations are? That's what we are doing, but in such a way that, when they have said it and the community gets behind them, you create something which has enough credibility that people are prepared to follow it through. For example the community will create workshops in order to discuss how to address those problems. In other words, the community will feel encouraged to plan its future towards those objectives. And, incidentally, the Research Council might fund workshops that try to design projects to approach the objectives.
That's what we are trying to do. Of course these aspirations won't stand still. Our steady state should be a slowly rolling menu of aspirations, each of which is the focus of a community. Of course, this is not just a UK community; the Grand Challenge Exercise is a way to maximise the part that UK communities play in truly international research.
$Id: index.html,v 1.49 2011/09/09 14:46:43 mfb21 Exp $