Image analysis

Written by

Vladimir Slamecka

Emeritus Professor of Information and Computer Science, Georgia Institute of Technology, Atlanta. Coauthor of National Information Systems.

Fact-checked by

Britannica Editors

The content analysis of images is accomplished by two primary methods: image processing and pattern recognition. Image processing is a set of computational techniques for analyzing, enhancing, compressing, and reconstructing images. Pattern recognition is an information-reduction process: the assignment of visual or logical patterns to classes based on the features of these patterns and their relationships. The stages in pattern recognition involve measurement of the object to identify distinguishing attributes, extraction of features for the defining attributes, and assignment of the object to a class based on these features. Both image processing and pattern recognition have extensive applications in various areas, including astronomy, medicine, industrial robotics, and remote sensing by satellites.
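The three stages of pattern recognition can be illustrated with a minimal sketch. It assumes a tiny grayscale image given as rows of 0-255 values, two hand-picked features, and two hypothetical class prototypes; the names `extract_features`, `PROTOTYPES`, and `classify` are illustrative only, and nearest-prototype assignment is just one simple way to carry out the final stage.

```python
from math import dist

def extract_features(pixels):
    """Measurement and feature extraction: reduce an image (rows of 0-255
    values) to two features, mean brightness and fraction of bright pixels."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bright = sum(1 for p in flat if p > 127) / len(flat)
    return (mean / 255, bright)

# Hypothetical class prototypes in feature space (assumed for illustration).
PROTOTYPES = {"dark": (0.1, 0.0), "light": (0.9, 1.0)}

def classify(pixels):
    """Assignment: pick the class whose prototype is nearest to the features."""
    features = extract_features(pixels)
    return min(PROTOTYPES, key=lambda label: dist(features, PROTOTYPES[label]))

print(classify([[10, 20], [30, 40]]))      # a mostly dark image
print(classify([[200, 250], [220, 240]]))  # a mostly bright image
```

The information reduction is visible in the sketch: the whole image is reduced first to two numbers and then to a single class label.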

Speech analysis

The immediate objective of content analysis of digital speech is the conversion of discrete sound elements into their alphanumeric equivalents. Once so represented, speech can be subjected to the same techniques of content analysis as natural-language text (i.e., indexing and linguistic analysis). Converting speech elements into their alphanumeric counterparts is an intriguing problem because the “shape” of speech sounds embodies a wide range of acoustic characteristics and because the linguistic elements of speech are not clearly distinguishable from one another. The technique used in speech processing is to classify the spectral representations of sound and to match the resulting digital spectrographs against prestored “templates” so as to identify the alphanumeric equivalent of the sound. (The obverse of this technique, the digital-to-analog conversion of such templates into sound, is a relatively straightforward approach to generating synthetic speech.)
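The template-matching idea can be sketched in a few lines. The example below is a toy under stated assumptions: pure sine tones stand in for speech sounds, a naive discrete Fourier transform provides the spectral representation, and the symbols and templates in `TEMPLATES` are hypothetical. Real speech recognition is far more involved.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive discrete Fourier transform; returns magnitudes of the first N/2 bins."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples))) / n
        for k in range(n // 2)
    ]

def tone(freq_bin, n=32):
    """Synthetic stand-in for a sound: a sine with freq_bin cycles per frame."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]

# Hypothetical prestored templates: alphanumeric symbol -> reference spectrum.
TEMPLATES = {"a": magnitude_spectrum(tone(3)), "e": magnitude_spectrum(tone(7))}

def recognize(samples):
    """Return the symbol whose template spectrum is closest to the input's."""
    spectrum = magnitude_spectrum(samples)
    return min(TEMPLATES, key=lambda sym: math.dist(spectrum, TEMPLATES[sym]))

quieter_e = [0.9 * x for x in tone(7)]  # same sound at a lower volume
print(recognize(quieter_e))
```

Matching on the nearest template, rather than on exact equality, is what lets the sketch tolerate small variations such as a change in loudness.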

Speech processing is complex as well as expensive in terms of storage capacity and computational requirements. State-of-the-art speech recognition systems can identify limited vocabularies and parts of distinctly spoken speech and can be programmed to recognize tonal idiosyncrasies of individual speakers. When more robust and reliable techniques become available and the process is made computationally tractable (as is expected with parallel computers), humans will be able to interact with computers via spoken commands and queries on a routine basis. In many situations this may make the keyboard obsolete as a data-entry device.

Storage structures for digital-form information

Digital information is stored in complex patterns that make it feasible to address and operate on even the smallest element of symbolic expression, as well as on larger strings such as words or sentences and on images and sound.

From the viewpoint of digital information storage, it is useful to distinguish between “structured” data, such as inventories of objects that can be represented by short symbol strings and numbers, and “unstructured” data, such as the natural-language text of documents or pictorial images. The principal objective of all storage structures is to facilitate the processing of data elements on the basis of their relationships; the structures thus vary with the type of relationship they represent. The choice of a particular storage structure is governed by the relevance of the relationships it allows to be represented to the information-processing requirements of the task or system at hand.

In information systems whose store consists of unstructured databases of natural-language records, the objective is to retrieve records (or portions thereof) on the basis of the presence in the records of words or short phrases that constitute the query. Since there exists an index as a separate file that provides information about the locations of words and phrases in the database records, the relationships that are of interest (e.g., word adjacency) can be calculated from the index. Consequently, the database text itself can be stored as a simple ordered sequential file of records. The majority of the computations use the index, and they access the text file only to pull out the records or those portions that satisfy the result of the computations. The sequential file structure remains popular with document-retrieval software intended for use with personal computers and CD-ROM databases.
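The division of labor between the index and the sequential text file can be sketched as follows. The sample records and the function names `build_index` and `adjacent` are assumptions made for illustration; the index here simply records, for each word, the record number and word position of every occurrence.

```python
from collections import defaultdict

# The "text file": a simple ordered sequence of records (sample data).
RECORDS = [
    "digital information is stored in complex patterns",
    "information retrieval uses an index of words",
    "the index stores word locations",
]

def build_index(records):
    """Map each word to a list of (record number, word position) pairs."""
    index = defaultdict(list)
    for rec_no, text in enumerate(records):
        for pos, word in enumerate(text.split()):
            index[word].append((rec_no, pos))
    return index

def adjacent(index, first, second):
    """Record numbers in which `second` immediately follows `first`,
    calculated from the index alone, without reading the text."""
    following = set(index.get(second, []))
    return sorted({rec for rec, pos in index.get(first, [])
                   if (rec, pos + 1) in following})

INDEX = build_index(RECORDS)
hits = adjacent(INDEX, "information", "retrieval")
matches = [RECORDS[r] for r in hits]  # the text file is touched only here
print(matches)
```

Only the last step reads the stored text; the adjacency test itself runs entirely on the index.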

When relationships between data elements need to be represented as part of the records so as to make more efficient the desired operations on these records, two types of “chained” structures are commonly used: hierarchical and network. In the hierarchical file structure, records are arranged in a scheme resembling a family tree, with records related to one another from top to bottom. In the network file structure, records are arranged in groupings known as sets; these can be connected in any number of ways, giving rise to considerable flexibility. In both hierarchical and network structures, the relationships are shown by means of “pointers” (i.e., identifiers such as addresses or keys) that become part of the records.
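A minimal sketch of the two chained structures, assuming records held in dictionaries and keys used as pointers. All record names and fields are hypothetical, and for brevity the network sets are kept in a separate table rather than threaded through the member records as a real network database would do.

```python
# Hierarchical structure: each record carries pointers (keys) to its
# parent and children, forming a tree.
hierarchy = {
    "dept1": {"name": "Research", "parent": None, "children": ["emp1", "emp2"]},
    "emp1": {"name": "Ada", "parent": "dept1", "children": []},
    "emp2": {"name": "Grace", "parent": "dept1", "children": []},
}

def descendants(store, key):
    """Follow child pointers from top to bottom."""
    out = []
    for child in store[key]["children"]:
        out.append(child)
        out.extend(descendants(store, child))
    return out

# Network structure: records are grouped into sets, each with an owner;
# a record may be a member of several sets at once.
network_sets = {
    "works_in": {"owner": "dept1", "members": ["emp1", "emp2"]},
    "assigned_to": {"owner": "proj1", "members": ["emp2"]},
}

def owners_of(sets, member):
    """All owner records linked to `member` through any set."""
    return sorted(s["owner"] for s in sets.values() if member in s["members"])

print(descendants(hierarchy, "dept1"))
print(owners_of(network_sets, "emp2"))
```

In the tree, every record has at most one parent; in the network, the record `emp2` is reachable from two different owners, which is the added flexibility the text describes.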

Another type of database storage structure, the relational structure, has become increasingly popular since the late 1970s. Its major advantage over the hierarchical and network structures is the ability to handle unanticipated data relationships without pointers. Relational storage structures are two-dimensional tables consisting of rows and columns, much like the conceptual library catalog mentioned above. The elegance of the relational model lies in its conceptual simplicity, the availability of theoretical underpinnings (relational algebra), and the ability of its associated software to handle data relationships without the use of pointers. The relational model was initially used for databases containing highly structured information. In the 1990s it largely replaced the hierarchical and network models, and it also became the model of choice for large-scale information-management applications, both textual and multimedia.
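How a relational store relates rows by matching values rather than by following pointers can be sketched with Python's built-in `sqlite3` module. The table names, columns, and sample rows below are hypothetical; the relationship between the two tables is not stored anywhere and is stated only at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (title TEXT, author TEXT, year INTEGER);
    CREATE TABLE authors (author TEXT, country TEXT);
    INSERT INTO books VALUES ('Title A', 'Author X', 1975),
                             ('Title B', 'Author Y', 1982);
    INSERT INTO authors VALUES ('Author X', 'US'), ('Author Y', 'UK');
""")

# The two tables are related only by equal values in their author columns;
# no pointer was stored in either table in anticipation of this query.
rows = conn.execute("""
    SELECT books.title, authors.country
    FROM books JOIN authors ON books.author = authors.author
    WHERE books.year > 1980
""").fetchall()
print(rows)
```

Because the join condition is supplied by the query, a different relationship (for example, books grouped by year) could be explored later without restructuring the stored tables.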

The feasibility of storing large volumes of full text on an economical medium (the digital optical disc) has renewed interest in the study of storage structures that permit more powerful retrieval and processing techniques to operate on cognitive entities other than words, to facilitate more extensive semantic content and context analysis, and to organize text conceptually into logical units rather than those dictated by printing conventions.

Query languages

The uses of databases are manifold. They provide a means of retrieving records or parts of records and performing various calculations before displaying the results. The interface by which such manipulations are specified is called the query language. Whereas early query languages were so complex that interacting with electronic databases could be done only by specially trained individuals, recent interfaces are more user-friendly, allowing casual users to access database information.

The main types of popular query modes are the menu, the “fill-in-the-blank” technique, and the structured query. Particularly suited for novices, the menu requires a person to choose from several alternatives displayed on the video terminal screen. The fill-in-the-blank technique is one in which the user is prompted to enter key words as search statements. The structured query approach is effective with relational databases. It has a formal, powerful syntax that is in fact a programming language, and it is able to accommodate logical operators. One implementation of this approach, the Structured Query Language (SQL), has the form

select [field Fa, Fb, . . ., Fn]

from [database Da, Db, . . ., Dn]

where [field Fa = abc] and [field Fb = def].

Structured query languages support database searching and other operations by using commands such as “find,” “delete,” “print,” “sum,” and so forth. The sentencelike structure of an SQL query resembles natural language except that its syntax is limited and fixed. Instead of using an SQL statement, it is possible to represent queries in tabular form. The technique, referred to as query-by-example (or QBE), displays an empty tabular form and expects the searcher to enter the search specifications into appropriate columns. The program then constructs an SQL-type query from the table and executes it.
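The translation from a filled-in table to a query of the select/from/where form can be sketched as follows. The helper `qbe_to_sql`, the table `books`, and the sample rows are hypothetical, and the convention that an entry of `P.` marks a column to be displayed is assumed here for illustration. The sketch runs the generated query with Python's built-in `sqlite3` module.

```python
import sqlite3

def qbe_to_sql(table, form):
    """Build a parameterized SELECT from a filled-in form.

    `form` maps column name -> entry. An entry of 'P.' marks a column to
    display; any other non-empty entry becomes an equality condition.
    Table and column names are inserted as-is, so they must come from a
    trusted source; only the search values are passed as parameters.
    """
    shown = [col for col, entry in form.items() if entry == "P."]
    conds = [(col, entry) for col, entry in form.items()
             if entry not in ("", "P.")]
    sql = f"SELECT {', '.join(shown) or '*'} FROM {table}"
    if conds:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in conds)
    return sql, [value for _, value in conds]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("Title A", "Author X", 1975), ("Title B", "Author Y", 1982)],
)

# The searcher fills in one row of the empty form.
form = {"title": "P.", "author": "Author Y", "year": ""}
sql, params = qbe_to_sql("books", form)
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

Several filled-in condition columns are joined with `AND`, mirroring the two-condition `where` clause of the form shown above.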

The most flexible query language is of course natural language. The use of natural-language sentences in a constrained form to search databases is allowed by some commercial database management software. These programs parse the syntax of the query; recognize its action words and their synonyms; identify the names of files, records, and fields; and perform the logical operations required. Experimental systems that accept such natural-language queries in spoken voice have been developed; however, the ability to employ unrestricted natural language to query unstructured information will require further advances in machine understanding of natural language, particularly in techniques of representing the semantic and pragmatic context of ideas. The prospect of an intelligent conversation between humans and a large store of digitally encoded knowledge is not imminent.
