Review of Visual Motion of Curves and Surfaces

by Roberto Cipolla and Peter Giblin

Cambridge University Press, 1999
ISBN: 0 521 63251 X (hardback)
Price GBP 30 (US$ 49.95)
192 pp.

Reviewed by David Young for AISB Quarterly, Spring/Summer 2000

Not so long ago, artificial intelligences inhabited "blocks worlds". In these simulated environments, objects could be manipulated without regard to physical niceties such as friction and gravity. When visual perception entered, the blocks took on simple polyhedral shapes. Their images consisted of straight lines meeting in neat junctions, and geometrical relationships between the blocks' faces and edges could be inferred using search and logic, classical AI's core tools.

The contrast between these perfect shapes and almost any real object was stark, and generalisation followed swiftly. Cylindrical and spherical surfaces together with more exotic varieties such as generalised cylinders and superquadrics were introduced; all of these are defined by mathematical formulae whose parameters can be adjusted to fit image data. All, however, are limited in scope: most real objects are just too complex to fit into a tidy compartment of shape-space. One way forward is to develop theory which transcends particular classes of model by working with generic descriptions of shape, and which can be specialised as necessary to the objects in hand. It is at this level that Cipolla and Giblin have written Visual Motion of Curves and Surfaces, a valuable monograph crammed with theoretical detail.

The book is motivated by one question: how much can we learn about the shape of a smooth object if all we have to work with is one or more "apparent contours"? An apparent contour is the outline or profile of the object as seen in a two-dimensional image; that is, the boundary of the silhouette, plus the boundaries of any lobes which hide another part of the object's surface from the viewer. Drawing on the groundbreaking work of J.J. Koenderink and O.D. Faugeras, Cipolla and Giblin show that a great deal can be inferred, but that the limits of "shape from contour" have yet to be established.

After a brief introduction, the book turns to the differential geometry of curves and surfaces. Here, the essential mathematical tools needed to describe the shape of smooth surfaces are set out systematically and carefully. The presentation is concise, and sometimes careful rereading is needed (at least by this reader) to properly appreciate a point: one or two particularly powerful techniques would have benefited from some more discussion. Nonetheless, time spent on this chapter is well rewarded, as the key descriptors of surface patches, the first and second fundamental forms, emerge in their full mathematical elegance.

The third chapter lays further foundations, introducing the formalism needed to describe projections from three-dimensional space into two-dimensional images. This material is at the heart of the approach. In an image of a smooth, unmarked object (a white pottery vase, say), the only sharp edges will be at the apparent contours. These are the projections of points where the line of sight just skims over the surface of the object, and this straightforward geometrical fact provides the basis for a rich theory of how the shape of the surface governs the characteristics of the image curve. Clearly this is a one-way street, since many different objects could produce the same image boundary. An immediate extension to multiple images from different viewpoints is therefore needed, and this is provided in the following chapter. The central point of that chapter is that the visible boundary (the "contour generator") of a smooth object is not a physical aspect of it, as a surface marking is, but rather slips over its surface as the camera is moved. This leads to a special way of describing surfaces which in effect uses lines of sight and contour generators as its latitude and longitude. With this in hand, the chapter goes on to look at "visual events": the qualitative changes in the image contour that occur at special points of the camera's motion, generally as regions of the surface come into or drop out of sight.

Armed with these powerful analytical tools, Cipolla and Giblin move on in their final two chapters to deal with the reconstruction of object shape from sets of image profiles, first with knowledge of the camera motion, then without. This brings us to considerations of practical computer vision, and it would be possible, even conventional, to assume a low-level black box to extract the image contours, avoiding descending below the abstract level of the previous chapters. Rather generously, however, Cipolla and Giblin get stuck in properly with a highly condensed but careful account of modern edge detection, contour tracking, camera models and camera calibration. It is refreshing to see these practicalities handled deftly. The underlying mathematics of the geometry of camera motion, an area that has matured greatly in the last decade or so, is also well described in these two chapters.

This at last puts them in a position to demonstrate the state of the art by displaying reconstructions of real objects. When the camera motion is known, the distorting profiles project back to an envelope of the shape. For the much harder problem of unknown camera motion, it turns out that the special points where the image motion is tangential to the apparent contour are crucial, and once found, allow strong inferences about both the camera motion and object shape. A special case, motivated by practical applications, is that of an object on a turntable, and this is given detailed attention. By the end of the book, both the possibilities and the limitations of current techniques are clear.

So to the book's general qualities. The writing is straightforward and direct. The figures are excellent; there are plenty of them, they are clear and very much to the point (and it would be almost impossible to communicate most of the material without good figures). I found no errors worth mentioning. The index is helpful and the bibliography full. The book achieves its stated goal of being largely self-contained.

There are few pages without equations, and a reasonable knowledge of vector geometry, matrices and differential calculus is expected from the reader, who must keep track of a fair number of symbols. It is perhaps a shame that the idiosyncratic use of the roman numerals I and II for the fundamental forms is retained here: they do not make good algebraic symbols. Indeed, the term "form" is rather mystifying until one discovers that forms are simply a class of polynomial expressions (and so for once, "form" and "function" mean the same). On the other hand, there are many enjoyable pieces of terminology: you can impress your friends with flecnodal and hyperbolic points, epipolar tangencies, beaks, lips, swallowtails, crosscaps, and the immodestly named essential and fundamental matrices.
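For readers who have not met them, the two forms can be written out as the quadratic polynomials they are. The block below uses the classical textbook notation for a surface patch r(u, v) with unit normal n; the coefficient names E, F, G, L, M, N are the conventional ones and are an assumption here, since the book's own symbols may differ.

```latex
\[
\mathrm{I}(du, dv) = E\,du^2 + 2F\,du\,dv + G\,dv^2,
\qquad
\mathrm{II}(du, dv) = L\,du^2 + 2M\,du\,dv + N\,dv^2,
\]
\[
E = \mathbf{r}_u \cdot \mathbf{r}_u,\quad
F = \mathbf{r}_u \cdot \mathbf{r}_v,\quad
G = \mathbf{r}_v \cdot \mathbf{r}_v,\qquad
L = \mathbf{r}_{uu} \cdot \mathbf{n},\quad
M = \mathbf{r}_{uv} \cdot \mathbf{n},\quad
N = \mathbf{r}_{vv} \cdot \mathbf{n}.
\]
```

The first form measures lengths and angles within the surface; the second measures how the surface bends away from its tangent plane.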

My one serious quarrel is with the opening sentence: "Computer Vision is the automatic analysis of sequences of images for the purpose of recovering three-dimensional surface shape." Is this really an up-to-date view? There is surely now a recognition that computer vision may encompass a wide variety of tasks, which may or may not entail recovering explicit surface shape. In fact, this disagreement hardly colours my view of the remainder of the book, partly because shape recovery will always be one of computer vision's more important applications, and more fundamentally because the geometry of shapes and their appearance in images is central to a broad understanding of vision, regardless of one's view of its goals.

What then is the significance of the work? There are clear practical applications to model acquisition, but Cipolla and Giblin make no claims about its relevance to biological vision (beyond a passing reference to a supportive psychophysical result); nor do they claim any special role for the approach within machine vision. It is clear, however, that understanding how to represent and manipulate shape is one of the central issues of research in vision, and that understanding how to combine different projections of an object is another. The theoretical material of the book is therefore important, and this accessible presentation of it is of great value to anyone seriously interested in developing ideas about the visual perception of surfaces.

Nonetheless, the theory is essentially restricted to smooth curves and surfaces. The shapes it deals with are the sculptures of Henry Moore (effectively exploited in the final chapter), not those of Giacometti; vases and jugs, not trees or hedgehogs. Does this mean that we are, after all, just dealing with a more sophisticated blocks world? No: the degree of generality and applicability to the real world go well beyond what that implies. What this book does show is how much scope there is for further imaginative and innovative contributions to the study of shape and its visual perception.