CROSS REFERENCE TO RELATED APPLICATIONS

This is a divisional of application Ser. No. 09/375,951, filed Aug. 17, 1999.[0001]
TECHNICAL FIELD OF THE INVENTION

The present invention is directed, in general, to image retrieval systems and, more specifically, to an image retrieval system using color-based segmentation to retrieve region-based images.[0002]
BACKGROUND OF THE INVENTION

The advent of digital television (DTV), the increasing popularity of the Internet, and the introduction of consumer multimedia electronics, such as compact disc (CD) and digital video disc (DVD) players, have made tremendous amounts of multimedia information available to consumers. As video and animated graphics content becomes readily available and products for accessing it reach the consumer market, searching, indexing and identifying large volumes of multimedia data becomes even more challenging and important.[0003]
The term “visual animated data” herein refers to natural video, as well as to synthetic 2D or 3D worlds, or to a mixture of both video and graphics. Different criteria are used to search and index the content of visual animated data, such as a video clip. Video processing devices operating as image retrieval systems have been developed for searching frames of visual animated data to detect, identify and label objects of a particular shape or color, or to detect text in the frames, such as subtitles, advertisement text, or background image text, such as a street sign or a “HOTEL” sign.[0004]
Many of the existing image retrieval systems require a template image in order to search for all the images that resemble the template. For many applications, sub-image matching or object shape-based matching might be more desirable than full-image matching. For instance, a user may wish to retrieve images of red cars from an archive of images, but may not want to retrieve the remaining portion of the original image. Alternatively, a user may have a particular interest in retrieving all images that include a particular shape or a combination of shapes. This type of image retrieval is known as “region-based image retrieval.”[0005]
The extraction of image regions in an automatic and robust fashion is an extremely difficult task. Although image segmentation techniques have been studied for more than thirty years, segmentation of color images in real-world scenes is still particularly challenging for computer vision applications. This is primarily due to illumination changes in images, such as shade, highlights, and sharp contrast. For example, nonuniform illumination produces nonuniformity in the values of image pixels in RGB and YUV color spaces in conventional image segmentation techniques.[0006]
There is, therefore, a need in the art for improved video processing devices capable of performing region-based image retrieval. In particular, there is a need for improved region-based image retrieval systems capable of performing color-based segmentation that are less sensitive to illumination conditions.[0007]
SUMMARY OF THE INVENTION

To address the above-discussed deficiencies of the prior art, it is a primary object of the present invention to provide, for use in an image retrieval system capable of analyzing an image comprising a plurality of pixels in a first color model format, an image processing device capable of detecting and retrieving from the image a selected image portion. The image processing device comprises an image processor capable of converting the plurality of pixels in the image from the first color model format to a (Y,r,θ) color model format, wherein for each pixel in the plurality of pixels, Y is an intensity component indicating a total amount of light, r is a saturation component indicating an amount of white light mixed with a color of the pixel, and θ is a hue component indicating the color of the pixel. The image processor is capable of grouping spatially adjacent ones of the plurality of pixels into a plurality of image regions according to hue components of the adjacent pixels and performing a merging process wherein a first image region and a second image region proximate the first image region are merged into a composite region if a hue difference between the first and second image regions is less than a predetermined hue difference threshold.[0008]
According to an exemplary embodiment of the present invention, the image processor is capable of determining a histogram of hue components of the pixels in the image, the histogram indicating a number of pixels of similar hue in the image.[0009]
According to one embodiment of the present invention, the image processor is capable of determining a dominant hue in the image using a peak detection algorithm on the histogram.[0010]
According to another embodiment of the present invention, the image processor is capable of determining and marking ones of the plurality of image regions having less than a predetermined minimum number of pixels and disregarding the marked image regions during the merging process.[0011]
According to still another embodiment of the present invention, the image processor is capable of determining and marking achromatic ones of the plurality of image regions having less than a predetermined minimum number of pixels and disregarding the marked achromatic image regions during the merging process.[0012]
According to yet another embodiment of the present invention, the first and second image regions are merged if a number of pixels in the first image region and a number of pixels in the second image region are greater than a predetermined image region size threshold.[0013]
According to a further embodiment of the present invention, the image processor is capable of determining a plurality of adjacent regions to the first image region and calculating merit values for the plurality of adjacent regions, wherein a merit value of a first selected adjacent region is equal to a ratio of a common perimeter of the first image region and the first selected adjacent region to a total perimeter of the first selected adjacent region.[0014]
According to a still further embodiment of the present invention, the image processor selects the second image region to be merged with the first image region according to a merit value of the second image region.[0015]
The foregoing has outlined rather broadly the features and technical advantages of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features and advantages of the invention will be described hereinafter that form the subject of the claims of the invention. Those skilled in the art should appreciate that they may readily use the conception and the specific embodiment disclosed as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form.[0016]
Before undertaking the DETAILED DESCRIPTION, it may be advantageous to set forth definitions of certain words and phrases used throughout this patent document: the terms “include” and “comprise,” as well as derivatives thereof, mean inclusion without limitation; the term “or” is inclusive, meaning and/or; the phrases “associated with” and “associated therewith,” as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, or the like; and the term “processor” or “controller” means any device, system or part thereof that controls at least one operation; such a device may be implemented in hardware, firmware or software, or some combination of at least two of the same. It should be noted that the functionality associated with any particular controller may be centralized or distributed, whether locally or remotely. Definitions for certain words and phrases are provided throughout this patent document; those of ordinary skill in the art should understand that in many, if not most, instances such definitions apply to prior, as well as future, uses of such defined words and phrases.[0017]
BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, wherein like numbers designate like objects, and in which:[0018]
FIG. 1 illustrates an exemplary image retrieval system in accordance with one embodiment of the present invention;[0019]
FIG. 2 illustrates an exemplary original image file and a converted image file in the segmentation work space of the image retrieval system in FIG. 1;[0020]
FIG. 3 illustrates an exemplary color space for converting image files in accordance with one embodiment of the present invention; and[0021]
FIG. 4 is a flow diagram which illustrates the operation of an image retrieval system in accordance with one embodiment of the present invention.[0022]
DETAILED DESCRIPTION

FIGS. 1 through 4, discussed below, and the various embodiments used to describe the principles of the present invention in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the invention. Those skilled in the art will understand that the principles of the present invention may be implemented in any suitably arranged image retrieval system.[0023]
FIG. 1 illustrates exemplary image retrieval system 100 in accordance with one embodiment of the present invention. Image retrieval system 100 comprises image processing system 110, external databases 180, monitor 185, and user devices 190. Image processing system 110 provides the means for retrieving region-based images from within selected image files.[0024]
External databases 180 provide a source for retrieval of a digitized visual image or images, as well as other information for use by the system, as required. These databases may be accessed through a local area network (LAN), a wide area network (WAN), the Internet, and/or other sources, such as direct access to data through external devices such as tape, disk, or other storage devices.[0025]
Monitor 185 provides means for visual display of the retrieved images. User device(s) 190 represents one or more peripheral devices that may be manipulated by the user of image retrieval system 100 to provide user inputs for the system. Typical peripheral user input devices include a computer mouse, a keyboard, a light pen, a joystick, a touch-table and associated stylus, or any other device that may selectively be used to enter, select, and manipulate data, including all or portions of the retrieved image(s). User device(s) 190 may also include output devices, such as a color printer, which can be utilized to capture a particular retrieved or modified image.[0026]
Image processing system 110 comprises image processor 120, random access memory (RAM) 130, disk storage 140, user input/output (I/O) card 150, video card 160, I/O interface 170, and processor bus 175. RAM 130 further comprises segmentation work space 132 and image retrieval controller 134. Processor bus 175 transfers data between all of the components of image processing system 110. Image processor 120 provides overall control for image processing system 110 and performs the image processing needed to implement image segmentation of the present invention, as well as other requirements for image retrieval and editing systems. This includes processing of color images in accordance with the principles of the present invention, processing of image editing functions, processing of digitized video images for transfer to monitor 185 or for storage in disk storage 140, and control of data transfer between the various elements of the image processing system. The requirements and capabilities of image processor 120 are well known in the art and need not be described in greater detail other than as required for the present invention.[0027]
RAM 130 provides random access memory for temporary storage of data produced by image processing system 110 that is not otherwise provided by components within the system. RAM 130 includes memory for segmentation work space 132 and image retrieval controller 134, as well as other memory required by image processor 120 and associated devices. Segmentation work space 132 represents the portion of RAM 130 in which the initial video image and any modified region-based images are temporarily stored during the color segmentation process. Segmentation work space 132 provides means for defining image region(s) and segmenting image(s), shapes, and areas of the same color from an externally or internally supplied original visual image without impacting the original data, so that the original data and image can be recovered as required. Image retrieval controller 134 represents a portion of RAM 130 that is dedicated to storage of an application program executed by image processor 120 to perform region-based image retrieval using color-based segmentation of the present invention. Image retrieval controller 134 may execute well-known editing techniques, such as smoothing or boundary detection between images, as well as the novel techniques for image separation associated with the present invention. Image retrieval controller 134 may also be embodied as a program on a CD-ROM, computer diskette, or other storage media that may be loaded into a removable disk port in disk storage 140 or elsewhere, such as in external databases 180.[0028]
Disk storage 140 comprises one or more disk systems, including a removable disk, for “permanent” storage of programs and other data, including required visual data and the program instructions of image retrieval controller 134. Depending upon system requirements, disk storage 140 may be configured to interface with one or more bidirectional buses for the transfer of visual data to and from external databases 180, as well as the rest of the system. Depending upon specific applications and the capability of image processor 120, disk storage 140 can be configured to provide capability for storage of a large number of color images.[0029]
User I/O card 150 provides means for interfacing user device(s) 190 to the rest of image processing system 110. User I/O card 150 converts data received from user devices 190 to the format of processor bus 175 for transfer to image processor 120 or to RAM 130 for subsequent access by image processor 120. User I/O card 150 also transfers data to user output devices, such as printers. Video card 160 provides the interface between monitor 185 and the rest of image processing system 110 through processor bus 175.[0030]
I/O interface 170 provides an interface between external databases 180 and the rest of image processing system 110 through processor bus 175. As previously discussed, external databases 180 have at least one bidirectional bus for interfacing with I/O interface 170. Internal to image processing system 110, I/O interface 170 transfers data received from external databases 180 to disk storage 140 for more permanent storage, to image processor 120, and to RAM 130 to provide temporary storage for segmentation and monitor display purposes.[0031]
FIG. 2 illustrates exemplary original image file 210 and converted image file 220 in segmentation work space 132 of the image retrieval system in FIG. 1. Original image file 210 provides storage for each pixel (labeled 1 through n) associated with the original image received from external databases 180 in, for example, RGB format. The storage space for each pixel is sized for the maximum number of color value bits required for the particular implementation, as well as any other bits of information typically available for a color image system. Conventional RGB-based color image systems cover a range from 8 bits/pixel to 24 bits/pixel, though larger systems can be accommodated with appropriate memory increases. Converted image file 220 provides n storage locations for the pixels in the (Y,r,θ) format of the present invention.[0032]
FIG. 3 illustrates exemplary color space 300 for use in converting image files in (RGB) format or (YUV) format to (Y,r,θ) format in accordance with one embodiment of the present invention. Color space 300 represents color in terms of intensity (Y), which indicates the total amount of light; saturation (r), which indicates the amount of white light mixed with the color; and hue (θ), which represents the type of color that is present. Image processor 120 converts pixels from, for example, (RGB) format or (YUV) format in the original image file to (Y,r,θ) format using one or more of the following formulae:[0033]
V = R − Y

U = B − Y

θ = arctan(V/U)

r = (U² + V²)^1/2

Y = Y
In a similar manner, image processor 120 may convert pixels in other color space formats to (Y,r,θ) format.[0034]
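The conversion described above can be sketched in Python. This is a minimal illustration, not the patent's implementation: it assumes 8-bit RGB input and uses the standard luma weights to compute Y (the patent does not fix particular weights), then forms (r, θ) as the polar representation of (U, V).

```python
import math

def rgb_to_yrtheta(red, green, blue):
    """Convert an 8-bit (R, G, B) pixel to (Y, r, theta).

    Y: intensity (total amount of light), computed with the standard
       luma weights (an assumption; any luma definition could be used).
    r: saturation, the magnitude of the (U, V) color-difference vector.
    theta: hue in degrees, the angle of the (U, V) vector.
    """
    y = 0.299 * red + 0.587 * green + 0.114 * blue
    v = red - y                                  # V = R - Y
    u = blue - y                                 # U = B - Y
    r = math.hypot(u, v)                         # r = (U^2 + V^2)^1/2
    theta = math.degrees(math.atan2(v, u)) % 360 # theta = arctan(V/U), full circle
    return y, r, theta
```

Note that `atan2` is used instead of a bare `arctan(V/U)` so that the hue is well defined over the full 0°–360° range and when U = 0.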
FIG. 4 depicts flow diagram 400, which illustrates the operation of image retrieval system 100 in accordance with one aspect of the present invention. Initially, the stored RGB-formatted image file received from external databases 180 is converted to (Y,r,θ) format using the conversion equations and is stored in converted image file 220 (process step 405). Next, image processor 120 uses the n pixels in (Y,r,θ) format to develop a one-dimensional (1-D) histogram of hue (θ) over the converted pixels (process step 410). The histogram is restricted to pixels for which r > 5 and Y > 40. This is because at small values of r, θ is unstable, and when Y is low, θ is meaningless (a low level of light causes colors to merge toward black, or achromatic).[0035]
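The restricted hue histogram of process step 410 might look like the following NumPy sketch. The r > 5 and Y > 40 cutoffs come from the text; the one-degree bin width is an assumption for illustration.

```python
import numpy as np

def hue_histogram(y, r, theta, bins=360):
    """Build the 1-D hue histogram of process step 410.

    Pixels with low saturation (r <= 5) have an unstable hue, and dark
    pixels (Y <= 40) are effectively achromatic, so both are excluded
    before counting. Inputs are NumPy arrays of per-pixel Y, r, theta.
    """
    reliable = (r > 5) & (y > 40)       # keep only chromatic, bright pixels
    hist, _ = np.histogram(theta[reliable], bins=bins, range=(0.0, 360.0))
    return hist
```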
The dominant color or colors, d(θ), is/are then determined from the histogram using a peak detection algorithm that identifies the color or colors having the highest proportions of pixels (process step 415). The histogram is examined and pixels are identified as having color (chromatic) or no color (achromatic). The dominant color(s) and the chromatic or achromatic information are stored in segmentation work space 132 of RAM 130 for later use.[0036]
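One simple realization of the peak detection in process step 415 follows. The patent does not name a specific algorithm, so the local-maximum test and the minimum-fraction cutoff here are assumptions for illustration.

```python
import numpy as np

def dominant_hues(hist, min_fraction=0.05):
    """Find dominant hue bins as local maxima of the hue histogram.

    A bin is reported as a peak if it strictly exceeds both neighbours
    (with wrap-around, since hue is circular) and holds at least
    min_fraction of all counted pixels. Both criteria are assumptions.
    """
    total = hist.sum()
    peaks = []
    for i in range(len(hist)):
        left = hist[i - 1]                  # index -1 wraps to the last bin
        right = hist[(i + 1) % len(hist)]
        if hist[i] > left and hist[i] > right and hist[i] >= min_fraction * total:
            peaks.append(i)
    return peaks
```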
Next, image processor 120 examines the converted image pixels and groups them according to color and location. Pixels with the same color label ((Y,r,θ) description) are examined to determine their proximity to others within the color group. Spatially adjacent pixels with the same color label are grouped together as image regions (process step 420).[0037]
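The spatial grouping of process step 420 amounts to connected-component labeling. A breadth-first flood-fill sketch follows; 4-connectivity is an assumption, as the patent does not specify the pixel neighborhood.

```python
from collections import deque

def label_regions(labels):
    """Group spatially adjacent pixels sharing a color label into regions.

    `labels` is a 2-D list of per-pixel color labels. Each 4-connected
    run of identical labels receives its own region id; returns the
    region-id grid and the number of regions found.
    """
    h, w = len(labels), len(labels[0])
    region = [[-1] * w for _ in range(h)]
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if region[sy][sx] != -1:
                continue                       # pixel already assigned
            queue = deque([(sy, sx)])
            region[sy][sx] = next_id
            while queue:                       # BFS flood fill from the seed
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and region[ny][nx] == -1
                            and labels[ny][nx] == labels[sy][sx]):
                        region[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return region, next_id
```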
Chromatic image regions with fewer than a predefined minimum threshold number of pixels (e.g., 10 pixels) and achromatic regions with fewer than a predefined minimum threshold number of pixels (e.g., 20 pixels) are marked off for post-processing. Achromatic regions with more than a predefined maximum threshold number of pixels (e.g., 20 pixels) are also marked off to prevent them from being merged with other regions. In addition, the remaining chromatic image regions are grouped by size and chromaticity as a basis for initial merging. One embodiment of the present invention identifies comparatively large image regions with greater than, for example, 200 pixels as potential merger candidates (process step 425).[0038]
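The size-based bookkeeping of process step 425 can be sketched as follows. The region record layout is an assumption for illustration; the thresholds (10, 20, and 200 pixels) mirror the examples in the text.

```python
def mark_regions(regions, chrom_min=10, achrom_min=20, large_min=200):
    """Partition labeled regions by size and chromaticity (step 425).

    Each region is a dict with 'pixels' (count) and 'chromatic' (bool);
    this layout is hypothetical. Regions below their minimum size are
    held back for post-processing; only comparatively large chromatic
    regions become initial merge candidates.
    """
    post_process, merge_candidates = [], []
    for reg in regions:
        min_size = chrom_min if reg['chromatic'] else achrom_min
        if reg['pixels'] < min_size:
            post_process.append(reg)        # marked off for post-processing
        elif reg['chromatic'] and reg['pixels'] >= large_min:
            merge_candidates.append(reg)    # comparatively large: may merge
    return post_process, merge_candidates
```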
Next, image processor 120 examines the comparatively large image regions to determine color (θ) similarity and the amount of mutual border space (shared perimeter) with other suitable regions. One embodiment of the present invention uses a merit function that determines the percentage of shared border or perimeter space compared to the sum of the individual region perimeters:[0039]
merit = shared perimeter / (perimeter1 + perimeter2)
Using this merit function, two neighboring regions are selected as initial candidates for image merging (process step 430).[0040]
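The merit function and candidate selection of process step 430 can be expressed directly in code; the candidate-tuple layout below is an assumption for illustration.

```python
def merit(shared_perimeter, perimeter1, perimeter2):
    """Merit function of step 430: fraction of the two regions' combined
    perimeter that they share with each other."""
    return shared_perimeter / (perimeter1 + perimeter2)

def best_merge_pair(candidates):
    """Pick the neighbouring pair with the highest merit value.

    `candidates` is a list of (region_a, region_b, shared, perim_a, perim_b)
    tuples; this record structure is hypothetical.
    """
    best = max(candidates, key=lambda c: merit(c[2], c[3], c[4]))
    return best[0], best[1]
```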
The colors of the selected image regions are examined to determine their degree of similarity. If the difference between the colors is less than a predefined threshold difference (for example, 10°), the regions are merged, the combined region replaces the merged regions in the large-region segmentation work space, and the process continues. If the color difference between neighboring regions is greater than the threshold, the regions are not merged, and the process continues until no more mergers are possible (process step 435).[0041]
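The hue comparison and merge of process step 435 might be sketched as follows. The region record layout and the pixel-weighted mean used for the composite hue are assumptions; the patent does not say how the merged region's color is computed.

```python
def hue_difference(theta1, theta2):
    """Angular hue difference in degrees, accounting for wrap-around at 360."""
    d = abs(theta1 - theta2) % 360
    return min(d, 360 - d)

def try_merge(region_a, region_b, threshold_deg=10.0):
    """Merge two regions if their mean hues differ by less than the
    threshold (10 degrees in the example of step 435).

    Regions are dicts with 'hue' (mean theta) and 'pixels' (count); this
    layout is hypothetical. Returns the composite region, or None when
    the hues are too far apart. The pixel-weighted mean hue below
    assumes the two hues are not near the 0/360 wrap point.
    """
    if hue_difference(region_a['hue'], region_b['hue']) >= threshold_deg:
        return None
    n_a, n_b = region_a['pixels'], region_b['pixels']
    merged_hue = (region_a['hue'] * n_a + region_b['hue'] * n_b) / (n_a + n_b)
    return {'hue': merged_hue, 'pixels': n_a + n_b}
```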
Once all large regions are merged with those of like or similar color, image processor 120 examines the smaller regions previously identified for post-processing to determine their shared perimeters and color similarity with the merged regions. The smaller regions are then merged with larger regions having shared perimeters and similar θ, and the result is stored in the segmentation work space (process step 440). At this point, the merged image regions are stored as a segmented image file that is then available for use by image processor 120 under control of user manipulation of user devices 190. The segmented image files may then be stored in disk storage 140 for later retrieval and use.[0042]
Although the present invention has been described in detail, those skilled in the art should understand that they can make various changes, substitutions and alterations herein without departing from the spirit and scope of the invention in its broadest form.[0043]