Object recognition – technology in the field of computer vision for finding and identifying objects in an image or video sequence. Humans recognize a multitude of objects in images with little effort, even though the appearance of an object may vary with viewpoint, size, and scale, or when the object is translated or rotated. Objects can even be recognized when they are partially obstructed from view. This task remains a challenge for computer vision systems, and many approaches have been implemented over multiple decades.
Changes in lighting and color usually don't have much effect on image edges
Strategy:
Detect edges in template and image
Compare the edge images to find the template
Must consider range of possible template positions
Measurements:
Good – count the number of overlapping edges. Not robust to changes in shape
Better – count the number of template edge pixels within some distance of an edge in the search image
Best – determine probability distribution of distance to nearest edge in search image (if template at correct position). Estimate likelihood of each template position generating image
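The "better" measurement above is the idea behind chamfer-style matching. A minimal sketch, with edge maps represented as sets of pixel coordinates and a brute-force nearest-edge distance (a real system would precompute a distance transform); the function names and toy data are illustrative, not from the source:

```python
# Chamfer-style edge matching sketch: score a template position by the
# mean distance from each shifted template edge pixel to the nearest
# edge pixel in the search image. Lower scores indicate better matches.

def nearest_edge_distance(pixel, edge_pixels):
    """Brute-force Euclidean distance to the nearest edge pixel."""
    r, c = pixel
    return min(((r - er) ** 2 + (c - ec) ** 2) ** 0.5
               for er, ec in edge_pixels)

def chamfer_score(template_edges, image_edges, offset):
    """Mean nearest-edge distance for the template shifted by offset."""
    dr, dc = offset
    shifted = [(r + dr, c + dc) for r, c in template_edges]
    return sum(nearest_edge_distance(p, image_edges)
               for p in shifted) / len(shifted)

def best_offset(template_edges, image_edges, offsets):
    """Scan the range of candidate template positions; keep the best."""
    return min(offsets,
               key=lambda o: chamfer_score(template_edges, image_edges, o))
```

When the template sits exactly on the image edges, every distance is zero, so the score at the correct offset is minimal.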
Also called Alignment, since the object is being aligned to the image
Correspondences between image features and model features are not independent – Geometric constraints
A small number of correspondences yields the object position – the others must be consistent with this
General Idea:
If we hypothesize a match between a sufficiently large group of image features and a sufficiently large group of object features, then we can recover the missing camera parameters from this hypothesis (and so render the rest of the object)
Strategy:
Generate hypotheses using small number of correspondences (e.g. triples of points for 3D recognition)
Project other model features into image (backproject) and verify additional correspondences
Use the smallest number of correspondences necessary to achieve discrete object poses
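The hypothesize-and-verify strategy can be sketched in a deliberately simplified setting: a 2-D translation-only pose, so a single model/image correspondence fixes the pose and the remaining model points can be backprojected and verified. This is an illustrative toy, not the full 3-D alignment method (which would need, e.g., triples of points):

```python
# Alignment sketch: generate a pose hypothesis from one correspondence,
# backproject the other model points, and count verified correspondences.

def hypothesize_and_verify(model_pts, image_pts, tol=1.0):
    """Try each model/image pair as a translation hypothesis and keep
    the hypothesis supported by the most additional correspondences."""
    best = (0, None)
    for mx, my in model_pts:
        for ix, iy in image_pts:
            dx, dy = ix - mx, iy - my  # pose hypothesis (translation)
            votes = sum(
                1 for px, py in model_pts
                if any(abs(px + dx - qx) <= tol and abs(py + dy - qy) <= tol
                       for qx, qy in image_pts))
            if votes > best[0]:
                best = (votes, (dx, dy))
    return best  # (number of verified correspondences, translation)
```

Even with clutter points in the image, the correct pose accumulates the most verified correspondences, which is why a small number of hypothesized correspondences suffices.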
For each object, set up an accumulator array that represents pose space – each element in the accumulator array corresponds to a “bucket” in pose space.
Then take each image frame group, and hypothesize a correspondence between it and every frame group on every object
For each of these correspondences, determine pose parameters and make an entry in the accumulator array for the current object at the pose value.
If there are large numbers of votes in any object's accumulator array, this can be interpreted as evidence for the presence of that object at that pose.
The evidence can be checked using a verification method
Note that this method uses sets of correspondences, rather than individual correspondences
Implementation is easier, since each set yields a small number of possible object poses.
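The accumulator-array voting above can be sketched with pose space restricted to 2-D translations, quantized into buckets (the bucket size and toy data are assumptions for illustration):

```python
# Pose clustering sketch: each model/image correspondence votes for the
# bucket of pose space (here, quantized translation) it implies; a large
# vote count in a bucket is evidence for the object at that pose.
from collections import Counter

def pose_clustering(model_pts, image_pts, bucket=2.0):
    """Accumulate votes over quantized translation poses."""
    acc = Counter()
    for mx, my in model_pts:
        for ix, iy in image_pts:
            key = (round((ix - mx) / bucket), round((iy - my) / bucket))
            acc[key] += 1
    return acc
```

Spurious correspondences scatter their votes across many buckets, while correct ones concentrate in the bucket of the true pose, so the peak of the accumulator identifies the pose to pass on to verification.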
Improvement
The noise resistance of this method can be improved by not counting votes for objects at poses where the vote is obviously unreliable
For example, in cases where, if the object were at that pose, the object frame group would be invisible.
These improvements are sufficient to yield working systems
Keypoints of objects are first extracted from a set of reference images and stored in a database
An object is recognized in a new image by individually comparing each feature from the new image to this database and finding candidate matching features based on Euclidean distance of their feature vectors.
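A minimal sketch of this matching step, using brute-force Euclidean nearest neighbors plus the commonly used ratio test (rejecting a match when the nearest database descriptor is not sufficiently closer than the second nearest); the descriptor values and the 0.8 threshold are illustrative assumptions:

```python
# Keypoint matching sketch: compare each query descriptor to a database
# of reference descriptors by Euclidean distance; the ratio test filters
# out ambiguous matches.

def match_descriptors(query, database, ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    matches = []
    for qi, q in enumerate(query):
        ranked = sorted(range(len(database)), key=lambda di: dist(q, database[di]))
        d1 = dist(q, database[ranked[0]])   # nearest neighbor
        d2 = dist(q, database[ranked[1]])   # second-nearest neighbor
        if d1 < ratio * d2:                 # unambiguous match only
            matches.append((qi, ranked[0]))
    return matches
```

In practice the database search is accelerated with approximate nearest-neighbor structures such as k-d trees rather than a full sort.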
Genetic algorithms can operate without prior knowledge of a given dataset and can develop recognition procedures without human intervention. A recent project achieved 100 percent accuracy on the benchmark motorbike, face, airplane and car image datasets from Caltech and 99.4 percent accuracy on fish species image datasets.[9][10]
^Worthington, Philip L., and Edwin R. Hancock. "Object recognition using shape-from-shading." IEEE Transactions on Pattern Analysis and Machine Intelligence 23.5 (2001): 535-542.
^Brown, M., and Lowe, D.G., "Recognising Panoramas," ICCV, p. 1218, Ninth IEEE International Conference on Computer Vision (ICCV'03) – Volume 2, Nice, France, 2003
^Thomas Serre, Maximillian Riesenhuber, Jennifer Louie, Tomaso Poggio, "On the Role of Object-Specific Features for Real World Object Recognition in Biological Vision." Artificial Intelligence Lab, and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Center for Biological and Computational Learning, McGovern Institute for Brain Research, Cambridge, MA, USA
^Christian Demant, Bernd Streicher-Abel, Peter Waszkewitz, "Industrial image processing: visual quality control in manufacturing" – via Google Books
^Heikkilä, Janne; Silvén, Olli (2004). "A real-time system for monitoring of cyclists and pedestrians". Image and Vision Computing. 22 (7): 563–570. doi:10.1016/j.imavis.2003.09.010.
^Jung, Ho Gi; Kim, Dong Suk; Yoon, Pal Joo; Kim, Jaihie (2006). "Structure Analysis Based Parking Slot Marking Recognition for Semi-automatic Parking System". In Yeung, Dit-Yan; Kwok, James T.; Fred, Ana; Roli, Fabio; de Ridder, Dick (eds.). Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science. Vol. 4109. Berlin, Heidelberg: Springer. pp. 384–393. doi:10.1007/11815921_42. ISBN 978-3-540-37241-7.