
AN EXPERIMENT WITH IMAGE RECOGNITION
The Problem
Given a photograph (which may be damaged), we wish to train the computer to associate to it one of the descriptions in a given database. The method should perform such associations in a simple and efficient manner.
The Procedure
We work with images (digitized colour photographs) that have 14,400 pixels. An image is stored as a 14,400 × 1 matrix (called I) with entries from 0 to 255. Each entry represents a colour number for a particular pixel. A text description, similarly, has 1,800 pixels and is stored as a 1,800 × 1 matrix (called T).
We have a database of n initial images, called the initial training set (each member of which is denoted Ij, 1 ≤ j ≤ n), and a target set of their respective text descriptions, each of which is denoted Tj, 1 ≤ j ≤ n.
We associate each Ij with its description Tj by finding a 1,800 × 14,400 matrix P such that for all j:
P*Ij=Tj
where * is matrix multiplication.
That is, we premultiply any image I in the training set by P, and we get its description T.
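The text does not say how P is computed. One possible way, shown here only as a sketch, is to stack the training images and their descriptions as columns of two matrices and use the Moore-Penrose pseudoinverse, which gives an exact solution when the training images are linearly independent. The sizes are shrunk so the example runs quickly, and the data and the names images, texts and n_train are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in sizes (assumption): the article uses 14,400 image pixels
# and 1,800 text pixels; smaller values keep this sketch fast.
image_pixels, text_pixels, n_train = 144, 18, 5

# Made-up training data: column j holds image I_j or description T_j,
# with entries between 0 and 255.
images = rng.integers(0, 256, size=(image_pixels, n_train)).astype(float)
texts = rng.integers(0, 256, size=(text_pixels, n_train)).astype(float)

# Assumption: take P as the minimum-norm solution of P @ images = texts.
P = texts @ np.linalg.pinv(images)

# Premultiplying each training image by P should reproduce its description,
# up to floating-point rounding.
max_error = np.abs(P @ images - texts).max()
print(P.shape, max_error)
```

With the pseudoinverse, P is exact on the training set only while the images are linearly independent, which requires n to be no larger than the number of image pixels.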
Now suppose we have a photograph (which might have some errors, and may or may not be in the training set). We want to find its text description. We first convert the picture into a 14,400 × 1 image matrix and then premultiply it by P. The resulting 1,800 × 1 text matrix is compared with the target set descriptions already in the database. The text description from the target set that is "closest" to the text matrix is displayed provided the text matrix is not "too far" from the target set. If it is "too far", an error message saying "no certain match found" is displayed.
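The matching step can be sketched as follows. The text does not define "closest" or "too far", so this example assumes Euclidean distance and a hypothetical threshold max_distance. The function name describe, the small matrix sizes and the random data are likewise invented for illustration, and P is built with a pseudoinverse purely so that the sketch is self-contained.

```python
import numpy as np

def describe(photo, P, targets, max_distance):
    """Return the index of the closest target description, or None.

    Assumptions: "closest" means smallest Euclidean distance, and
    "too far" means a distance greater than max_distance.
    """
    text = P @ photo  # predicted text matrix for this photograph
    # Distance from the prediction to every target description (column).
    distances = np.linalg.norm(targets - text[:, None], axis=0)
    best = int(np.argmin(distances))
    if distances[best] > max_distance:
        return None  # corresponds to "no certain match found"
    return best

rng = np.random.default_rng(0)
image_pixels, text_pixels, n_train = 144, 18, 5  # stand-in sizes
images = rng.integers(0, 256, size=(image_pixels, n_train)).astype(float)
texts = rng.integers(0, 256, size=(text_pixels, n_train)).astype(float)
P = texts @ np.linalg.pinv(images)  # one possible choice of P (assumption)

# A slightly damaged copy of training image 2.
damaged = images[:, 2] + rng.normal(0.0, 1.0, size=image_pixels)
match = describe(damaged, P, texts, max_distance=50.0)

# An all-black photograph unrelated to the training set.
no_match = describe(np.zeros(image_pixels), P, texts, max_distance=50.0)
print(match, no_match)
```

In practice the threshold would have to be tuned to the data: too small and damaged photographs are rejected, too large and unrelated photographs are matched.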
This very simple method can also be used for voice recognition. It is useful and fairly efficient as long as the training set is not too large. Have a look at our computer display.
Copyright © 2022 ICICI Centre for Mathematical Sciences
All rights reserved. Send us your suggestions at