A valentine, things, and stuff
Rick Nelson, Editor in Chief -- Test & Measurement World, 3/1/2009 2:00:00 AM
I received a Valentine’s Day card last month whose cover showed a photograph of a couple dancing on a bridge near a lamppost in what appeared to be a European city. Your obvious question might be, why would someone send me a valentine? But my question was, which city?
Google Images lets you attempt to answer such questions iteratively by searching, for example, for “Budapest bridge lamppost” or “Dresden bridge lamppost.” (My card turned out to depict the Charles Bridge in Prague.)
It seems there should be a more straightforward approach—instead of sorting through images returned by Google in response to text-based queries, one should be able to submit an image and get text back. Such capabilities do appear in limited contexts. Fans of mysteries will know that police officers can submit fingerprints to the FBI’s IAFIS and get in return the name and criminal history of a suspect. Also, police and casinos use face-recognition systems to search for criminals or card counters. But generalized image recognition remains elusive.
That may change, based on work presented by Geremy Heitz at the Automated Imaging Association’s 17th annual business conference, held February 4–6 in San Diego. Heitz, a PhD candidate at Stanford, described work he is doing on high-level scene interpretation. Heitz noted that humans can readily analyze scenes—they can identify a cityscape, for example, and detect objects and establish relationships between them: “The car passes a bus on the road, while people walk past a building.”
That’s a tough task for computers. As an example, Heitz presented a blurred rectangle of pixels, which neither man nor machine could identify. But when he presented that rectangle in the context of a country road, it became clear to people that the rectangle represented a moving vehicle. Computers, however, have trouble dealing with context and are likely to mistake a cow standing in a meadow for a motorcycle.
To help solve that problem, Heitz is working with “things” and “stuff.” A thing, he said, is an object with a specific size and shape, while stuff is a material defined by a homogeneous or repetitive pattern. Given a things-and-stuff (TAS) model, a computer can make an educated guess. For instance, given “stuff” equals grass, “thing” is more likely to equal cow than motorcycle. Heitz has concluded that spatial context gained through a TAS approach can improve any machine-vision sliding-window-detector technique.
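The cow-versus-motorcycle intuition can be put in numbers. The short Python sketch below is not Heitz's TAS model, whose details the talk summary above does not give; it only applies Bayes' rule to combine a detector's score for each candidate "thing" with the likelihood of seeing the surrounding "stuff" given that thing. The function name `rescore`, the variable names, and every probability are invented for illustration.

```python
# Toy illustration only: this is NOT the TAS model itself, and every
# number below is an invented assumption, not measured data.

def rescore(detector_scores, context_likelihood):
    """Combine detector scores with P(stuff | thing) using Bayes' rule.

    detector_scores:    the detector's belief in each label, context ignored
    context_likelihood: assumed P(observed stuff | label) for each label
    Returns normalized posterior beliefs that sum to 1.
    """
    unnormalized = {
        label: detector_scores[label] * context_likelihood[label]
        for label in detector_scores
    }
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


# A context-blind detector that slightly prefers "motorcycle" (assumed scores).
detector_scores = {"cow": 0.45, "motorcycle": 0.55}

# Assumed likelihood of finding grass around each kind of thing.
grass_given_thing = {"cow": 0.8, "motorcycle": 0.1}

posterior = rescore(detector_scores, grass_given_thing)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

With these made-up numbers, the grass context overturns the detector's slight preference for "motorcycle," and "cow" becomes the more probable label.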
Several attendees at the AIA meeting seemed eager to adapt the TAS model to industrial machine-vision applications, but that might be premature. If, for example, I am expecting to see a diode (thing) at a specific place on a printed-circuit board (stuff), I don’t want my automated-optical-inspection system to infer “diode” if a resistor has mistakenly been placed in the diode’s spot. Indeed, Heitz said typical applications for his work lie in the surveillance and security fields. Nevertheless, his work sheds light on how people analyze images, and that knowledge will likely yield techniques applicable across a range of machine-vision specialties (Ref. 1).
| REFERENCE |























