Recognizing Objects

June 4, 2009

It’s a lot easier for us than it is for computers:

[W]hat we regard as the simple process of “recognition” would leave many computers stumped. Even something as apparently simple as recognising a birthday cake would normally require computers to be fed with information on what a cake generally looks like, the various shapes and sizes it comes in, the different forms and numbers of candles and other decorations you are likely to find adorning it, etc.

In brief, computers might be able to calculate pi to hundreds of decimal points and model complex weather patterns, but they may find it impossible, without complex and painstaking programming, to recognise a human who’s grown their hair or realise that Chihuahuas and Dobermans belong to the same species.

One of the most intriguing scenarios for how robots will become a part of our everyday reality is their introduction as household assistants or servants, particularly for the elderly. But object recognition is crucial to performing household functions in a meaningful way. This little fellow can tell you that it isn’t as easy as it sounds:

Here are some things we would get awfully tired of saying to the house robot:

“No, the polo shirt — not the sweater vest.”

“Um, looks like some of the tomatoes you put in the spaghetti sauce were actually apples.”

“Thanks for folding and ironing the laundry, but it looks like we’re going to need a new cat.”

Luckily, some top people are on the case:

[Belgian researcher Luc] Van Gool is involved in a project, Cognitive-Level Annotation Using Latent Statistical Structure (CLASS — http://class.inrialpes.fr/), which is developing technologies to recognise visually specific objects, such as your car, or classes of object, such as a random car on the street.

“The recognition of an object as belonging to a particular group is a harder problem for a computer than the recognition of a specific object. The reason is that object classes show large variability among their members,” Van Gool points out.

The 3.5-year, EU-funded project managed to achieve technological improvements compared with previous efforts. It developed a system in which the description of the objects is based on the appearance of many separate, small patches. Such localised features give the necessary robustness to deal with the massive variations mentioned earlier. In addition, CLASS created special mechanisms – known as efficient approximate neighbourhood searches – for the comparison of an image or an object with huge numbers of reference images.
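For the curious, here is a toy sketch of those two ideas: describing an image by many small patches, and looking those patches up in a reference set with an approximate neighbour search. To be clear, this is not the CLASS system. Every name in it (`extract_patches`, `HyperplaneIndex`, `classify`, `stripes`) is made up for illustration, the "images" are tiny striped grids, and I've assumed random-hyperplane hashing as the approximate search because it's simple; the project's actual mechanisms aren't described in the excerpt.

```python
import random
from collections import Counter, defaultdict

PATCH = 4  # patch side length in pixels (an arbitrary choice for this sketch)


def extract_patches(image, size=PATCH):
    """Cut a grayscale image (a list of rows) into non-overlapping patches.

    Each patch is flattened to a vector and has its mean subtracted, so a
    uniform change in brightness does not change the description.
    """
    patches = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            vec = [image[r + i][c + j] for i in range(size) for j in range(size)]
            mean = sum(vec) / len(vec)
            patches.append([v - mean for v in vec])
    return patches


class HyperplaneIndex:
    """Approximate neighbour search via random hyperplanes (hypothetical helper).

    Vectors falling on the same side of every hyperplane share a bucket, and
    only that bucket is searched, instead of the whole reference set.
    """

    def __init__(self, dim, bits=8, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
        self.buckets = defaultdict(list)

    def _key(self, vec):
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, vec, label):
        self.buckets[self._key(vec)].append((vec, label))

    def query(self, vec):
        """Return the label of the closest stored vector in vec's bucket, or None."""
        candidates = self.buckets.get(self._key(vec), [])
        if not candidates:
            return None
        nearest = min(
            candidates,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], vec)),
        )
        return nearest[1]


def classify(image, index):
    """Let every patch vote for the label of its approximate nearest neighbour."""
    votes = Counter()
    for patch in extract_patches(image):
        label = index.query(patch)
        if label is not None:
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


def stripes(horizontal, dark, bright, side=8):
    """Build a toy striped image; stands in for a real photo in this sketch."""
    return [
        [bright if (r if horizontal else c) % 2 == 0 else dark for c in range(side)]
        for r in range(side)
    ]


# Reference set: one horizontally striped and one vertically striped image.
index = HyperplaneIndex(dim=PATCH * PATCH)
for label, horizontal in (("horizontal", True), ("vertical", False)):
    for patch in extract_patches(stripes(horizontal, 0, 200)):
        index.add(patch, label)

# Query with a brighter version of the horizontal pattern.
print(classify(stripes(True, 50, 250), index))
```

The point of the voting step is the robustness the article mentions: no single patch has to match perfectly, as long as most of them land near the right reference patches. A real system would use far richer patch descriptors and a much larger reference set, which is exactly where an approximate search starts to pay off.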

Sounds like an excellent start. But I think we’ll still have to be awfully careful if we don’t want our cats to end up ironed and folded.