Citation
Andrew, A.M. (2000), "Object Recognition in Man, Monkey, and Machine", Kybernetes, Vol. 29 No. 3, pp. 392-398. https://doi.org/10.1108/k.2000.29.3.392.1
Publisher
Emerald Group Publishing Limited
The cover of this book shows a reproduction of the Warhol picture of a Campbell’s soup tin, presumably indicating that important questions about visual perception can be illustrated in everyday contexts. The slightly gaudy outer appearance may disguise the fact that the book contains valuable papers covering the latest developments and thoughts about visual object recognition. The editors are well qualified to deal with the topic: one belongs to a relevant department at Brown University, Providence, Rhode Island, and the other is Director of the Max Planck Institute for Biological Cybernetics in Tübingen, Germany.
In the first of the seven papers the two editors refer to a 1984 special issue of Cognition on “visual cognition”, in which object recognition was one of the themes. They observe that in the intervening years there has been much progress in understanding, along with a change of viewpoint. The central problem of object recognition arises from the fact that three‐dimensional objects are recognised from two‐dimensional retinal inputs, which vary with the orientation of the object and other factors.
In 1984 the approach was strongly influenced by the work of David Marr, who suggested a rather complete three‐dimensional reconstruction, so that recognition might be said to be “object‐centred”. Subsequent studies have favoured a more “viewer‐centred” or “image‐based” approach in which recognition comes more directly from detection of local features of the two‐dimensional scene. It is emphasised that the earlier view is not overthrown, since certain observations require it. Both types of recognition mechanism appear to be used in the brain, but a large amount of evidence indicates that the “image‐based” one plays a major part and merits a corresponding place in artificial recognition systems. The inadequacy of a strictly “object‐centred” approach is confirmed by the limited success of artificial schemes based on it.
Support for the altered viewpoint comes from work under three main headings, whose evidence is described as converging: human psychophysics, neurophysiology, and machine vision. The second paper in the book, by Shimon Ullman, is mainly concerned with machine vision and is entitled “Three‐dimensional object recognition based on combination of views”. It is illustrated by processing views of a model Volkswagen Beetle car from two different angles so as to synthesise a view from an intermediate angle, and similar transformations are performed on images of faces. The relevance to human and other primate perception is discussed, with the conclusion that object recognition may involve many quite diverse interacting processes.
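As a rough illustration of the idea of combining views (not Ullman’s actual scheme, which is not detailed here), the following Python sketch makes several simplifying assumptions: orthographic projection, rotation about the vertical axis only, known point correspondences between the views, and known viewing angles. Under those assumptions the horizontal image coordinates at an intermediate angle are an exact weighted sum of the coordinates in the two stored views. The function names and sample points are hypothetical.

```python
import math

def project(points, angle):
    """Orthographic image of 3D points after rotating by `angle` (radians)
    about the vertical (y) axis. Returns a list of (x, y) image points."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c + z * s, y) for x, y, z in points]

def combine_views(view_a, view_b, angle_a, angle_b, angle_new):
    """Synthesise the view at `angle_new` as a linear combination of two
    stored 2D views, without ever reconstructing depth.
    Assumes the i-th point in view_a corresponds to the i-th in view_b."""
    denom = math.sin(angle_b - angle_a)
    if abs(denom) < 1e-12:
        raise ValueError("the two stored views must differ in angle")
    weight_a = math.sin(angle_b - angle_new) / denom
    weight_b = math.sin(angle_new - angle_a) / denom
    # Vertical coordinates are unchanged by rotation about the vertical axis.
    return [(weight_a * xa + weight_b * xb, ya)
            for (xa, ya), (xb, _yb) in zip(view_a, view_b)]

# Hypothetical object: a handful of 3D feature points.
points = [(1.0, 0.0, 0.5), (-0.5, 1.0, 2.0), (0.3, -0.7, -1.2), (0.0, 0.4, 0.9)]
a, b, mid = math.radians(10), math.radians(50), math.radians(30)

synth = combine_views(project(points, a), project(points, b), a, b, mid)
truth = project(points, mid)  # used only to check the synthesised view
max_error = max(abs(sx - tx) + abs(sy - ty)
                for (sx, sy), (tx, ty) in zip(synth, truth))
```

In this restricted setting `max_error` is zero up to floating-point rounding, which is the sense in which an intermediate view can be obtained from stored two‐dimensional views alone, with no explicit three‐dimensional model.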
The next paper is on “Recovery of 3D volume from two‐tone images of novel objects” and is primarily concerned with explaining certain aspects of human visual perception. Pictures of familiar objects using only two tones (i.e. black and white) are readily recognisable, but this is shown to involve top‐down processing that depends on memory of the objects, since unfamiliar objects are much less readily ascribed three‐dimensional shapes. The following paper, with the first‐named editor as joint author, treats a related issue, namely how to reconcile the image‐based approach with the fact that people can generalise over object classes and can recognise objects from unfamiliar viewpoints. Techniques of computer graphics are used in the experiments.
The following paper has the title “Evidence accumulation in cell populations responsive to faces: an account of generalisation of recognition without mental transformations”. As indicated by the explicit reference to cells, this reports neurophysiological experiments on monkeys, with recording from single neurons. A large and complex mass of data is presented in support of the view that generalisation is achieved without the reconstruction and mental rotation that are required according to Marr’s view, and there are many references to earlier studies by the same group. The detailed results here are a major piece of evidence supporting the new viewpoint on object recognition, and account for the reference to “monkey” in the book’s title.
The remaining two papers treat particular aspects of the relationship of object recognition to other functions. In the first of these, it is pointed out that object recognition tends to be considered separately from categorisation, although the two must interact or even be aspects of one process. A new principle of “diagnostic recognition” is introduced.
The final paper notes that vision serves not only object recognition but also the determination of action. The latter requires, at the very least, estimation of the position and orientation of the object, and of their rates of change. The discussion is related to brain function and to observations on brain‐damaged subjects.
There is clearly a great deal of valuable material in this book, which brings readers up to date on current thinking in this important area, with implications for psychology and neurophysiology as well as for AI and robotics.