Using stereopsis, motion, texture and (possibly) line labelling allows us to partition a scene into objects, and to obtain depth and orientation information for the surfaces of those objects. We now have to obtain a 3-D model from that, and then try to work out which known object(s) we are looking at. I won't attempt to describe this process in detail, just mention the sorts of 3-D object representations used, and some of the problems with matching these to models of actual known objects.
OK, so our language for representing 3-D objects will be in terms of shape primitives such as cones, cylinders, blocks etc. A typical shape (a banana?) might be:
shape55:
    shape:     cylinder
    end1:      shape23
    end2:      shape22
    length:    20cm
    width:     4cm
    curvature: 0.1
    colour:    yellow
    texture:   smooth

shape23:
    shape: cone
    etc
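A frame-style structure like this is easy to sketch in code. Here is a minimal Python version using nested dictionaries; the attribute names and values are just the illustrative ones above, and the helper function is my own assumption about how sub-shapes might be traversed:

```python
# A sketch of the frame-style shape representation above. Shapes live
# in a dictionary keyed by name; sub-shapes (the banana's conical
# ends) are referenced by name, which gives the hierarchical structure.

shapes = {
    "shape55": {
        "shape": "cylinder",
        "end1": "shape23",   # reference to a sub-shape
        "end2": "shape22",
        "length_cm": 20,
        "width_cm": 4,
        "curvature": 0.1,
        "colour": "yellow",
        "texture": "smooth",
    },
    "shape23": {"shape": "cone"},
    "shape22": {"shape": "cone"},
}

def sub_shapes(name):
    """Return the names of the shapes this shape refers to."""
    return [v for k, v in shapes[name].items()
            if k.startswith("end") and v in shapes]

print(sub_shapes("shape55"))  # → ['shape23', 'shape22']
```

Because sub-shapes are plain references, the same description language works for a single primitive or a whole assembly of them.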
To recognise the object as a banana we need a library of 3-D models. Bananas obviously come in a whole range of shapes and sizes, within certain limits, so our model should specify the range of possible values (e.g., 10cm-30cm), not a precise value. To recognise an object would then involve checking through the library of models to find one that fits. If there are only a few objects in your world (widgets and wodgets) this may be fairly straightforward. If there are lots, then we will need to worry about clever indexing schemes, and take advantage of the hierarchical structure of the models (e.g., checking the banana's basic (bent) cylindrical shape before worrying about its conical ends).
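The range-based matching just described can be sketched as follows. The model format (ranges for numeric attributes, exact values otherwise) follows the 10cm-30cm example above; the specific attribute ranges and the second "cucumber" model are invented purely for illustration:

```python
# A sketch of matching an observed shape against a library of models.
# Numeric attributes are given as (min, max) ranges; other attributes
# must match exactly. Attributes the model doesn't mention are ignored.

models = {
    "banana": {
        "shape": "cylinder",
        "length_cm": (10, 30),    # range of permissible lengths
        "width_cm": (2, 6),
        "curvature": (0.05, 0.2),
        "colour": "yellow",
    },
    "cucumber": {                  # hypothetical second model
        "shape": "cylinder",
        "length_cm": (15, 40),
        "width_cm": (2, 5),
        "curvature": (0.0, 0.05),
        "colour": "green",
    },
}

def matches(observed, model):
    """True if every attribute the model specifies is satisfied."""
    for attr, expected in model.items():
        value = observed.get(attr)
        if isinstance(expected, tuple):            # numeric range
            if value is None or not expected[0] <= value <= expected[1]:
                return False
        elif value != expected:                    # exact match
            return False
    return True

def recognise(observed):
    """Check through the library for models that fit."""
    return [name for name, m in models.items() if matches(observed, m)]

observed = {"shape": "cylinder", "length_cm": 20, "width_cm": 4,
            "curvature": 0.1, "colour": "yellow", "texture": "smooth"}
print(recognise(observed))  # → ['banana']
```

With a large library, one would test cheap, coarse attributes first (the basic cylindrical shape) and only descend to sub-shapes like the conical ends for the models that survive that filter.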