Stereopsis

Once we've obtained the primal sketch, the next stage is to try to work out which image regions correspond to real objects (or bits of objects), and some properties of these regions which will later enable us to determine the 3-D shape of the object. One important property is the distance from the viewer, and one important process that allows us to determine this is stereopsis.

To get the general idea, consider the figure below, where someone is looking at the corner of a box, using both eyes:

If we know the angles α and β (the angles between each eye's line of sight and the baseline joining the eyes), and the spacing between the eyes (b), then with a little elementary geometry we can find out how far the box is away from the viewer. We can use the sine law (remember that?) to conclude that:

  d / sin(α) = b / sin(π − α − β)

where d is the distance from the right eye to the corner. Now, as sin(π − α − β) = sin(α + β), we can conclude that:

  d = b sin(α) / sin(α + β)

So, from knowing the directions in which an image point is perceived in each eye we can work out how far that image point is away.
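This triangulation can be sketched in a few lines of Python. It is a minimal illustration only: the function name and the example numbers are ours, and the angles are assumed to be measured from the baseline between the eyes.

```python
import math

def stereo_distance(alpha, beta, b):
    """Distance from the right eye to the viewed point.

    alpha, beta: viewing angles at the left and right eye (radians,
                 measured from the baseline joining the eyes).
    b:           spacing between the eyes.

    Applies the sine law: d / sin(alpha) = b / sin(alpha + beta),
    using sin(pi - alpha - beta) = sin(alpha + beta).
    """
    return b * math.sin(alpha) / math.sin(alpha + beta)

# Example: eyes 6.5 cm apart, both lines of sight at 85 degrees
# to the baseline (a point roughly straight ahead).
d = stereo_distance(math.radians(85), math.radians(85), 6.5)
print(round(d, 1))  # distance in cm
```

Note that as the point gets further away, α + β approaches 180°, sin(α + β) approaches zero, and small errors in the measured angles produce large errors in d; this is why stereopsis is most accurate for nearby objects.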

Now, this is all very well, but there is one tricky problem. How do we know that a particular image point in one eye corresponds to another particular image point in another? Or to put it another way, how do we find the correspondences between the two images? This is vital for stereopsis, as the whole process relies on using the viewing angles from each eye to a common point in space.

So, the hard bit of stereopsis is developing good algorithms for doing this feature matching. One issue is whether to try to do this at an early stage of visual processing, matching up primitive features like lines and blobs, or whether to wait until we have some recognised objects. The latter would involve, say, recognising (in both images) that a particular point in the image corresponds to the corner of a widget, checking that there are no other possible widget matches, and using the corner of the widget as the basis of the stereopsis calculation, thus working out how far the widget is away (and hence its size). However, it turns out that it is better to work with more primitive features. These may be harder to match, but doing stereopsis at this stage makes object recognition easier.

Now, a given image might have hundreds of lines and blobs. If we tried each line in turn, independently of the others, to find a possible match in the other image, then this wouldn't work too well. However, we can develop effective algorithms by exploiting a simple property of the physical world: the depth of a particular object from the viewer changes continuously. As a result, the stereo disparity (i.e., the difference in position between the two images) of nearby image features will be similar. So, once we've worked out that edge23 in one eye corresponds to edge38 in the other, and that they are some distance d apart in the images (if we superimposed them), we can guess that a nearby feature blob23 in the first eye is likely to match a blob at a disparity of about d in the other image. Efficient algorithms have been developed that use this fact (that stereo disparity generally changes continuously across the image), but we don't have time to go into them.
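The disparity-continuity idea above can be sketched as a toy matcher. This is only an illustration under our own assumptions (the function name, coordinates, and tolerances are invented, and real stereo algorithms are considerably more sophisticated): given one already-matched pair with disparity d, prefer matches for nearby features whose disparity is close to d.

```python
def match_features(left, right, seed_disparity, tolerance=2.0):
    """Match features between two images, assuming nearby features
    have nearly the same stereo disparity as an already-matched pair.

    left, right:    lists of (x, y) feature positions in each image.
    seed_disparity: disparity d of a feature already matched nearby.
    Returns a list of (left_index, right_index) pairs.
    """
    matches = []
    used = set()
    for i, (lx, ly) in enumerate(left):
        best, best_err = None, tolerance
        for j, (rx, ry) in enumerate(right):
            if j in used or abs(ry - ly) > 1.0:
                # candidates must lie on (roughly) the same scan line
                continue
            err = abs((lx - rx) - seed_disparity)  # deviation from expected disparity
            if err < best_err:
                best, best_err = j, err
        if best is not None:
            used.add(best)
            matches.append((i, best))
    return matches

# Two features in the left image, two candidates in the right image;
# a neighbouring feature was already matched with disparity 3.
print(match_features([(10, 5), (20, 5)], [(7, 5), (17, 5)], 3.0))
```

The disparity prior prunes the search: instead of comparing every left feature against every right feature across the whole image, each feature only needs to be checked against candidates near its expected position.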

alison@
Fri Aug 19 10:42:17 BST 1994