US8457416B2 - Estimating word correlations from images

Info

Publication number: US8457416B2
Authority: US (United States)
Prior art keywords: word, image, images, image representation, correlation
Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: US11/956,333
Other versions: US20090074306A1 (en)
Inventors: Jing Liu, Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma
Current Assignee: Microsoft Technology Licensing LLC (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Microsoft Corp
Priority date: 2007-09-13 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2007-12-13
Publication date: 2013-06-04

Legal events:
- Application filed by Microsoft Corp
- Priority to US11/956,333
- Assigned to Microsoft Corporation (assignors: Li, Mingjing; Li, Zhiwei; Liu, Jing; Ma, Wei-Ying; Wang, Bin)
- Priority to PCT/US2008/075525
- Publication of US20090074306A1
- Application granted
- Publication of US8457416B2
- Assigned to Microsoft Technology Licensing, LLC (assignor: Microsoft Corporation)
- Legal status: Expired - Fee Related
- Adjusted expiration


Abstract

Word correlations are estimated using a content-based method, which uses visual features of image representations of the words. The image representations of the subject words may be generated by retrieving images from data sources (such as the Internet) using image search with the subject words as query words. One aspect of the techniques is based on calculating the visual distance or visual similarity between the sets of retrieved images corresponding to each query word. The other is based on calculating the visual consistence among the set of retrieved images corresponding to a conjunctive query word. The combination of the content-based method and a text-based method may produce even better results.

Description

PRIORITY AND RELATED APPLICATIONS
This patent application claims priority to U.S. provisional patent application Ser. No. 60/972,167, entitled “Estimating Word Correlations from Image Search” filed on Sep. 13, 2007. This patent application is also related to the commonly-owned U.S. patent application Ser. No. 11/956,331, entitled “DUAL CROSS-MEDIA RELEVANCE MODEL FOR IMAGE ANNOTATION”, filed on even date with the present patent application, which U.S. patent application is hereby incorporated by reference.
BACKGROUND
Word correlations broadly refer to semantic relationships existing among words. For example, a measure of semantic similarity may be defined between two words to reflect how likely they are to be semantically related to each other. Two words might be closely related to each other as synonyms or antonyms, but may also be related to each other in less direct semantic ways. Word correlations may be used to expand queries in general web search or to refine the keyword annotations in image search.
The current methods for defining and finding word correlations are text-based. Text-based methods typically estimate the correlation between two words via statistical or lexical analysis. For example, corpus-based co-occurrence statistics may provide useful solutions. Lexical analysis based on WORDNET, a semantic lexicon for the English language, may also provide useful solutions. WORDNET is considered to be better suited than conventional dictionaries for computerized analysis of semantic relationships because WORDNET groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets. These text-based methods all have limitations. The corpus-based method only utilizes the textual information. Semantic lexicons such as WORDNET can only handle a limited number of words.
Although corpus-based co-occurrence statistics and semantic lexicons help to produce a combination of dictionary and thesaurus that is more intuitively usable, and to support automatic text analysis and artificial intelligence applications, further improvement on methods for defining and finding word correlations is desirable, particularly in the search-related context.
SUMMARY
In this disclosure, content-based techniques are proposed to estimate word correlations. The techniques provide an image representation of each subject word that is being analyzed, and estimate word correlations using visual features of image representations of the words. The image representations of the subject words may be generated by retrieving images from data sources (such as the Internet) using image search with the subject words as query words.
One aspect of the techniques is based on calculating a visual distance or visual similarity between the sets of retrieved images corresponding to each query word. Another aspect is based on calculating a visual consistence among the set of retrieved images corresponding to a conjunctive query word. In one embodiment, a dynamic partial variance (DPV) is used for analyzing visual consistence based on multiple image features. The method selects a subset of low-variance image features from multiple image features extracted from the image set, and calculates a partial variance of the subset of image features. The subset of image features may be selected dynamically as the conjunctive image representation varies. In another embodiment, the content-based method is combined with a text-based method to produce even better results.
The obtained word correlations represent semantic relationships existing among the subject words, and may be used in a variety of applications, such as expanding queries in general web search or refining keyword annotations in image search.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE FIGURES
The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.
FIG. 1 is a block illustration of an exemplary process for estimating word correlation based on visual distance or visual similarity of an image search result.
FIG. 2 is a flowchart of the exemplary process of FIG. 1.
FIG. 3 is a block illustration of an exemplary process for estimating word correlation based on visual consistence of a conjunctive image representation of the two words submitted with the conjunctive operator (AND).
FIG. 4 is a flowchart of the exemplary process of FIG. 3.
FIG. 5 shows an exemplary environment for implementing the method of the present disclosure.
DETAILED DESCRIPTION
Overview
The present disclosure describes techniques for estimating word correlations from image representations of the subject words. In contrast to conventional methods, which estimate word correlations directly based on textual information of the words, the techniques described herein first obtain image representations of the subject words and then indirectly derive the word correlation by analyzing the image features of the image representations. It is believed that, with properly chosen image representations of the subject words, the relationship among the image features of the image representations of the subject words constitutes an alternative, and potentially more effective, source of information from which the semantic relationship of the subject words may be derived. The techniques described herein open a new domain for word correlation estimation.
One aspect of the techniques is based on the visual distance between the sets of retrieved images corresponding to each query word respectively. The other is based on the visual consistence among the search result, taking into consideration the distribution of visual features among the resulting images of the search. A combination of text-based and content-based methods may produce even better results.
In one embodiment, the image representations are obtained using image search, such as Web image search. Commercial image search is becoming increasingly more popular and powerful. When a keyword is submitted as a query to the image search engine, a set of images is returned as the search result, among which the top-ranked images are deemed more relevant to the query. The techniques disclosed herein treat a search result as a visual representation of the query. The visual features of the retrieved images are explored to provide additional information to estimate the word correlation.
Because the Web represents the largest publicly available corpus with aggregate statistical and indexing information, the Web search-based method may provide a comprehensive solution for estimating word correlations. Furthermore, a rapidly growing Web bodes well for the presently described techniques, as the techniques may become even more effective as the image content of the Web becomes richer.
One aspect of the present disclosure is a method for estimating word correlations. The method first provides an image representation of a first word and an image representation of a second word, and then estimates a correlation between the first word and the second word at least partially based on visual features of the image representations.
In one embodiment, estimating the correlation between the first word and second word is done by calculating a visual distance or a visual similarity between the image representation of the first word and the image representation of the second word. As will be illustrated in the exemplary embodiments described herein, the image representation of each subject word may be an image set including a plurality of images selected from an image database such as a Web image database. For example, the image representations of the first word and the second word may be provided by conducting an image search using the respective word (the first word or the second word) as the query word, and selecting a plurality of images from the search results. Preferably, the images are selected from the top returns of the search results. The method calculates the visual distance or visual similarity between the two image sets to estimate the correlation between the two subject words.
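By way of illustration, the retrieval step may be sketched as follows. This is a minimal sketch, not the patented method itself: the image_search helper is a hypothetical stand-in for a real Web image search API, and here it merely fabricates deterministic images so the snippet runs end to end.

```python
# Minimal sketch: the image representation of a word is the set of top-n
# images returned for that word. image_search is a hypothetical placeholder
# for a real Web image search call.
from typing import List

import numpy as np


def image_search(query: str, n: int) -> List[np.ndarray]:
    """Placeholder: return n RGB images (H x W x 3, values in [0, 1])."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return [rng.random((64, 64, 3)) for _ in range(n)]


def image_representation(word: str, n: int = 20) -> List[np.ndarray]:
    """Treat the top-n search results as the visual representation of `word`."""
    return image_search(word, n)
```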
In an alternative embodiment, the method provides a conjunctive image representation of the first word and the second word, and estimates the correlation between the first word and second word further based on visual features of the conjunctive image representation of the first and the second words. The conjunctive image representation may be an image set having a plurality of images, and in this case the method estimates the correlation between the first word and second word by calculating a visual consistence of the images in the image set. Similarly, the conjunctive image representation of the first word and the second word may be provided by first performing an image search using the first word and the second word as a conjunctive query word, and selecting a plurality of images from search results.
In one exemplary embodiment, a partial variance scheme is introduced to estimate the correlation between the first word and the second word. The method selects a subset of image features from a plurality of image features extracted from the image set, and calculates a partial variance of the subset of image features. As described in exemplary embodiments below, the subset of image features may include image features having relatively low variance. The subset of image features may be selected dynamically as the conjunctive image representation varies.
The techniques are described in further detail below. Exemplary processes are described using block illustrations and flowcharts. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks may be combined in any order to implement the method, or an alternate method.
Exemplary Embodiments
A. Word Correlation Estimation Based on Visual Distance of Image Search Result
FIG. 1 is a block illustration of an exemplary process for estimating word correlation based on visual distance or visual similarity of an image search result.
Block 102 represents a first subject word x being submitted as a search query word. Block 104 represents a second subject word y being submitted as a search query word. The search query words x and y (102 and 104) are submitted separately to a search engine which conducts an image search through a Web database 110. The first image search using the first query word x results in an image set Ix 122, while the second image search using the second query word y results in an image set Iy 124.
Visual distance or visual similarity 130 between the image set Ix and the image set Iy is then calculated using an algorithm described in a later section of this description. Content-based word correlation 140 is finally calculated based on the visual similarity 130.
FIG. 2 is a flowchart of the exemplary process of FIG. 1. At block 210, the method submits word x and word y as separate query words to an image search engine, and obtains two sets of images, denoted as Sx and Sy, which are the top n results in searched pages of the respective searches.
At block 220, the method extracts, from each image of the image sets Sx and Sy, multi-modal visual features, and normalizes the extracted multi-modal visual features to characterize various visual properties of the images.
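As an illustrative sketch of this step, the snippet below extracts a simple multi-modal feature vector (color histograms plus crude texture statistics) and normalizes it. The particular descriptors are assumptions; the disclosure requires only that multi-modal visual features be extracted and normalized.

```python
# Sketch of block 220: simple multi-modal features, normalized. The choice
# of descriptors (8-bin color histograms and mean gradient magnitudes) is
# illustrative only.
import numpy as np


def extract_features(image: np.ndarray) -> np.ndarray:
    """Return an L2-normalized feature vector for an RGB image in [0, 1]."""
    # Color modality: an 8-bin histogram per channel (24 dimensions).
    color = np.concatenate(
        [np.histogram(image[..., c], bins=8, range=(0, 1))[0] for c in range(3)]
    ).astype(float)
    color /= color.sum() + 1e-12
    # Texture modality: mean absolute vertical/horizontal gradients (2 dims).
    gray = image.mean(axis=2)
    texture = np.array(
        [np.abs(np.diff(gray, axis=0)).mean(), np.abs(np.diff(gray, axis=1)).mean()]
    )
    feat = np.concatenate([color, texture])
    return feat / (np.linalg.norm(feat) + 1e-12)
```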
At block 230, the method calculates a visual distance or visual similarity between the two image sets Sx and Sy using an algorithm described in a later section of the present description.
At block 240, the method calculates a word correlation between word x and word y based on the visual distance or visual similarity between the two image sets Sx and Sy calculated above.
Given a keyword query, image search engines such as LIVE SEARCH, GOOGLE, or YAHOO! usually return good search results, especially for the images on the first page. Thus, the top-ranked images, taken as collective image sets, can be roughly treated as a visual representation of the respective query word. The visual similarity between two resulting image sets can then be used to represent the relation between the corresponding query words. The visual similarity can be quantitatively calculated based on the visual features of the image sets. In one embodiment, each image has multiple image features represented by a vector, and each image set forms a vector set. The disclosed techniques adopt a simple strategy to calculate the visual similarity between the two vector sets. The visual-based word correlation is given as follows:
$$K_{VC}(x,y) = S_I\big(I(x), I(y)\big) = \sum_{m,n} S\big(I_m(x), I_n(y)\big) \tag{1}$$

where K_VC(x, y) indicates the visual-based word correlation between words x and y; I(x) and I(y) indicate the image sets returned for words x and y, respectively; I_m(x) is the m-th image in the image set I(x); S_I(·) is the similarity between the two sets; S(·, ·) is the similarity between two images; and m, n = 1, 2, . . . , M, where M is the number of top images taken from the resulting images.
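As a concrete reading of Eq. (1), the sketch below assumes each image has already been reduced to an L2-normalized feature vector (e.g., by the extract_features helper above), so that a dot product serves as the image similarity S(·, ·). Dividing by the number of pairs, which keeps scores comparable across set sizes, is an added assumption rather than part of the equation.

```python
# Sketch of Eq. (1): aggregate pairwise similarity between two image sets.
from typing import List

import numpy as np


def visual_correlation(feats_x: List[np.ndarray], feats_y: List[np.ndarray]) -> float:
    """K_VC(x, y) over the top-M feature vectors of each result set."""
    total = sum(float(np.dot(fx, fy)) for fx in feats_x for fy in feats_y)
    return total / (len(feats_x) * len(feats_y))  # averaged for comparability
```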
B. Word Correlation Estimation Based on Visual Consistence of Image Search Result
As discussed herein, top-ranked images in a search result can be roughly treated as the visual representation of the search concept. From another point of view, the visual consistence among these resulting images reflects, to some extent, the semantic uniqueness of the query keyword. For example, the word 'jaguar' may represent an animal, a car or a plane. That is, the word has more than one specific meaning. When the word is used as a query, the search results from an image search engine may include images of different concepts. Those images have varied colors, shapes or textures, and thus are not visually consistent.
When two words with the conjunctive operator (AND) are submitted as one query, the returned images are those indexed by both words. The returned images may be considered as a conjunctive image representation of the two words submitted with the conjunctive operator (AND). If the words are not semantically related to each other, they are unlikely to be associated with the same images or with visually similar images. Accordingly, the search results may be very noisy and not visually consistent. Based on this consideration, the visual consistence of the search results is believed to be a good indicator of the correlation between words.
FIG. 3 is a block illustration of an exemplary process for estimating word correlation based on visual consistence of the conjunctive image representation of the two words submitted with the conjunctive operator (AND).
Block 302 represents a first subject word x being submitted as a search query word. Block 304 represents a second subject word y being submitted as a search query word. Block 306 represents a conjunctive word x AND y being submitted as a search query word. The search query words x, y and x AND y (302, 304 and 306) are submitted separately to a search engine which conducts an image search through a Web database 310. The first image search using the first query word x results in an image set Ix 322, the second image search using the second query word y results in an image set Iy 324, while the third image search using the conjunctive query word x AND y results in an image set Ixy 326.
Visual consistence 330 in the image set Ixy is then calculated using an algorithm described in a later section of the present description. Content-based word correlation 340 is finally calculated based on the visual consistence 330.
FIG. 4 is a flowchart of the exemplary process of FIG. 3. At block 410, the method submits word x, word y and the conjunctive word x AND y as separate query words to an image search engine, and obtains three sets of images, denoted as Sx, Sy, and Sxy, which are the top n results in searched pages of the respective searches.
At block 420, the method extracts, from each image of the image sets Sx, Sy, and Sxy, multi-modal visual features, and normalizes the extracted multi-modal visual features to characterize various visual properties of the images.
At block 430, the method calculates a visual consistence of the image set Sxy using an algorithm described herein below.
At block 440, the method calculates a word correlation between word x and word y based on the visual consistence of the image set Sxy.
In one embodiment, the disclosed techniques use the variances of visual features to describe the visual consistence. Generally, less variance of visual features corresponds to greater consistence in the visual appearance, and vice versa.
With multimodal image features extracted, it is preferred that some image features be given more weight than others in calculating the visual variance, in order to better imitate actual human visual cognition. Studies of cognitive psychology have shown that humans infer overall similarity based on the aspects that are similar among the compared objects, rather than based on the dissimilar ones. Accordingly, in one embodiment, a new measure called Dynamic Partial Variance (DPV), which focuses on the features with low variances and activates different features for different image sets, is introduced as described below.
Assuming the variances of each dimensional feature among images in set S are ordered as var_1(S) ≤ var_2(S) ≤ . . . ≤ var_d(S), the DPV is defined as:

$$DPV(S) = \frac{1}{l} \sum_{i=1}^{l} \mathrm{var}_i(S), \qquad l < d \tag{2}$$
where d is the total dimension of the various visual features, and l is the number of similar aspects activated in the measure. This allows a subset of image features to be selected from all image features extracted from the image set, and a selective partial variance of the subset of image features to be calculated. The subset of image features includes image features having relatively low variance. A threshold may be set for the purpose of making the cut among the image features. The subset of image features may be selected dynamically as the conjunctive image representation varies. That is, the image features that make the cut to be selected into the subset for calculating DPV may be different for different image representations. Furthermore, l, the number of similar aspects activated in the measure, might also vary from one image representation to another.
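A minimal sketch of Eq. (2) follows, stacking one feature vector per image into a matrix; choosing l by a fixed count rather than a variance threshold is one of the options the disclosure allows.

```python
# Sketch of Eq. (2): mean of the l smallest per-dimension feature variances.
import numpy as np


def dynamic_partial_variance(features: np.ndarray, l: int) -> float:
    """DPV(S) for an (n_images, d) feature matrix of image set S, with l < d."""
    variances = np.sort(features.var(axis=0))  # var_1(S) <= ... <= var_d(S)
    return float(variances[:l].mean())
```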
To ensure that the DPVs of the resulting images given a word-pair query are comparable to each other, these values are normalized according to the semantic uniqueness of the single word, i.e., the DPV of the resulting images given a single-word query. The normalized correlation between word x and word y is given as:
$$K_{CC}(x,y) = \exp\!\left(-\sigma^2 \cdot \frac{DPV(S_{xy})}{\min\{DPV(S_x),\, DPV(S_y)\}}\right) \tag{3}$$

where σ² > 0 is a smoothing parameter.
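In code, Eq. (3) might look like the sketch below, reusing the dynamic_partial_variance helper above; the default values of l and σ² are assumptions.

```python
# Sketch of Eq. (3): the conjunctive set's DPV, normalized by the
# single-word DPVs and mapped through an exponential.
import numpy as np


def consistence_correlation(
    feats_x: np.ndarray,
    feats_y: np.ndarray,
    feats_xy: np.ndarray,
    l: int = 10,
    sigma2: float = 1.0,  # smoothing parameter sigma^2 > 0 (assumed value)
) -> float:
    """K_CC(x, y) from the feature matrices of S_x, S_y and S_xy."""
    ratio = dynamic_partial_variance(feats_xy, l) / min(
        dynamic_partial_variance(feats_x, l),
        dynamic_partial_variance(feats_y, l),
    )
    return float(np.exp(-sigma2 * ratio))
```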
C. Combined Word Correlation
The above-described content-based methods for estimating word correlations may be used either separately or in combination. For example, the two types of word correlations described above with different characteristics may be combined to calculate a combined word correlation.
To make different correlations complement each other, an exemplary embodiment proposes to unify all word relations in a linear form after they are normalized into [0, 1]. Better performance is expected when more sophisticated combinations are used. An exemplary linear combination is given as:
$$S_{WWR} = \varepsilon\, K_{SCS} + (1 - \varepsilon)\, K_{CCS} \tag{4}$$

where 0 < ε < 1, and empirically ε = 0.5 in an exemplary implementation.
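Eq. (4) reduces to a one-line convex combination; the sketch below assumes both inputs have already been normalized into [0, 1].

```python
# Sketch of Eq. (4): linear fusion of the search-based correlation K_SCS
# and the consistence-based correlation K_CCS.
def combined_correlation(k_scs: float, k_ccs: float, eps: float = 0.5) -> float:
    """S_WWR with the exemplary setting eps = 0.5."""
    assert 0.0 < eps < 1.0
    return eps * k_scs + (1.0 - eps) * k_ccs
```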
Furthermore, either or both of the above content-based methods may be further combined with an alternative method, such as a text-based method for estimating word correlations, to estimate a more balanced word correlation using a formula similar to the above Eq. (4).
One example of a text-based method for estimating word correlations is statistical correlation by search. This method is based on the following hypothesis: the relative frequency with which two words appear on the web within the same documents is a measure of their semantic distance. Based on this hypothesis, an embodiment of the disclosed method defines a word correlation using image search (e.g., the GOOGLE image searcher) to obtain a general correlation measure for any word pair.
For example, a word correlation using image search may be defined according to the following equation, borrowing the concept of Normalized GOOGLE Distance (NGD) from text retrieval:
$$NGD(x,y) = \frac{\max\{\log f(x),\, \log f(y)\} - \log f(x,y)}{\log G - \min\{\log f(x),\, \log f(y)\}} \tag{5}$$
where x and y are two words in the lexicon; G is the total number of images indexed by the search engine (e.g., GOOGLE search engine); f(x) is the count of images which are indexed by the word x in the image searcher; f(y) is the count of images which are indexed by the word y in the image searcher; and f(x, y) is the count of images which are indexed by both x and y. A bounded measure which increases according to the degree of semantic correlation is also used. As a result, a general transform F(.) to obtain the NGD-based word correlation may be defined as:
$$K_{SCS}(x,y) = F\big(NGD(x,y)\big) = \exp\!\big[-\gamma \cdot NGD(x,y)\big] \tag{6}$$

where γ is an adjustable factor.
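The two equations translate directly into code. In the sketch below the index counts and the value of γ are illustrative assumptions; a real system would obtain f(x), f(y), f(x, y) and G from the image search engine.

```python
# Sketch of Eqs. (5) and (6): NGD from index counts, then a bounded
# correlation via the transform F.
import math


def ngd(f_x: int, f_y: int, f_xy: int, g: int) -> float:
    """Normalized GOOGLE Distance, Eq. (5)."""
    lx, ly, lxy, lg = math.log(f_x), math.log(f_y), math.log(f_xy), math.log(g)
    return (max(lx, ly) - lxy) / (lg - min(lx, ly))


def k_scs(f_x: int, f_y: int, f_xy: int, g: int, gamma: float = 1.0) -> float:
    """K_SCS(x, y) = exp(-gamma * NGD(x, y)), Eq. (6); gamma is assumed."""
    return math.exp(-gamma * ngd(f_x, f_y, f_xy, g))


# Example with made-up counts and G of about 4.5 billion:
print(k_scs(f_x=1_200_000, f_y=800_000, f_xy=150_000, g=4_500_000_000))
```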
Although the above Eq. (5) is similar to an existing method defining a Normalized GOOGLE Distance (NGD) for text retrieval, the meaning is quite different. In the textual NGD, x and y are two query words for text search, G is the total number of web pages indexed by GOOGLE, f(x) is the count of pages where word x appears, f(y) is the count of pages where word y appears, and f(x, y) is the count of pages where both x and y appear.
In general, the smaller the value of NGD is, the more semantically relevant the two words are. Though the NGD may be less effective at evaluating the triangular inequality property of distance, it provides an effective measure of how closely two terms are related semantically. Furthermore, experimental results demonstrate that the NGD stabilizes as the GOOGLE dataset grows. That is, the method may become scale invariant as the scale increases.
Since the range of NGD values is from 0 to ∞, S_WWR(x, y) ranges from 0 to 1. The value of G (i.e., the total number of indexed images) has not been published officially, but it may be inferred from available reported results or adjusted in experiments. One embodiment refers to a report from August 2005 in which the total number of indexed images in the GOOGLE image searcher is reported to be 2,187,212,422 (http://blog.searchenginewatch.com/blog/050809-200323). With the fast development of various web techniques, one may conservatively double that count two years after the report. An exemplary G is then set to about 4.5 billion in experiments used for testing the disclosed method.
Word correlations obtained using the techniques described herein may have a great variety of uses. One example is image annotation in which word correlation is used to refine and expand candidate annotations. A well-defined word correlation enhances the performance of image annotation.
Implementation Environment
The above-described method may be implemented with the help of a computing device, such as a server, a personal computer (PC) or a portable device having a computing unit.
FIG. 5 shows an exemplary environment for implementing the method of the present disclosure. Computing system 501 is implemented with computing device 502, which includes processor(s) 510, I/O devices 520, computer readable media (e.g., memory) 530, and a network interface (not shown). The computer readable media 530 stores application program modules 532 and data 534 (such as word correlation data). Application program modules 532 contain instructions which, when executed by processor(s) 510, cause the processor(s) 510 to perform actions of a process described herein (e.g., the processes of FIGS. 1-4).
For example, in one embodiment, computer readable medium 530 has stored thereupon a plurality of instructions that, when executed by one or more processors 510, cause the processor(s) 510 to:
(a) provide an image representation of a first word and an image representation of a second word; and
(b) estimate a correlation between the first word and the second word at least partially based on visual features of the image representations of the first word and the second word.
In one embodiment, the processor(s) 510 provides image representations of words by conducting image searches via network(s) 590 to retrieve image data from multiple data sources such as servers 541, 542 and 543.
It is appreciated that the computer readable media may be any of the suitable memory devices for storing computer data. Such memory devices include, but are not limited to, hard disks, flash memory devices, optical data storages, and floppy disks. Furthermore, the computer readable media containing the computer-executable instructions may consist of component(s) in a local system or components distributed over a network of multiple remote systems. The data of the computer-executable instructions may either be delivered in a tangible physical memory device or transmitted electronically.
It is also appreciated that a computing device may be any device that has a processor, an I/O device and a memory (either an internal memory or an external memory), and is not limited to a personal computer. For example, a computing device may be, without limitation, a server, a PC, a game console, a set-top box, or a computing unit built into another electronic device such as a television, a display, a printer or a digital camera.
CONCLUSION
In contrast to conventional methods, the techniques described herein do not estimate word correlations directly based on textual information of the words. Instead, the techniques first obtain image representations of the subject words and then indirectly derive the word correlation by analyzing the image features of the image representations. With properly chosen image representations of the subject words, especially with the help of modern image search engines and the growing image data sources on the Internet, the relationships among the image features of the image representations constitute an alternative and/or additional, and potentially more effective, source of information for deriving the semantic relationship of words (e.g., word correlations). The techniques described herein thus open a new field for word correlation estimation.
It is appreciated that the potential benefits and advantages discussed herein are not to be construed as a limitation or restriction to the scope of the appended claims.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claims.

Claims (21)

What is claimed is:
1. A system for estimating word correlations, comprising:
a processor; and
a computer readable storage media having stored thereupon a plurality of instructions that, when executed by the processor, cause the processor to perform acts comprising:
providing a first image representation set having a first plurality of images representing a first word in response to a first search based at least in part on the first word and a second image representation set having a second plurality of images representing a second word in response to a second search based at least in part on the second word;
extracting, from at least one image of the first image representation set and from at least one image of the second image representation set, multi-modal visual features; and
estimating a correlation between the first word and the second word at least partially based on a calculation of a mathematical function having at least the extracted multi-modal visual features corresponding to the first image representation set and at least the extracted multi-modal visual features corresponding to the second image representation set as variables thereto, wherein estimating the correlation includes:
selecting one of a first correlation function for a calculation of an inter-set visual aspect correlation based on multiple images of the first image representation set and multiple images of the second image representation set and a second correlation function for a calculation of an intra-set visual aspect correlation based on multiple images within one of the first image representation set and the second image representation set; and
calculating the selected correlation function.
2. The system as recited in claim 1, wherein the selected correlation function is the first correlation function, and wherein calculating the selected correlation function comprises:
calculating a visual distance or a visual similarity between the first image representation set and the second image representation set.
3. The system as recited in claim 1, wherein providing the first and the second image representation sets representing the first word and the second word comprises:
conducting an image search using the respective first word or second word as query word; and
selecting a third plurality of images from search results.
4. The system as recited in claim 3, wherein the image search is conducted on the Internet using an image search engine.
5. The system as recited in claim 3, wherein the third plurality of images are selected from top returns of the search results.
6. The system as recited in claim 1, further comprising:
providing a set of images representing conjunction of the first word and the second word, the set of images comprising a third plurality of images,
wherein estimating the correlation between the first word and second word is further based on visual features of the set of images representing conjunction of the first word and the second word.
7. The system as recited in claim 6, wherein estimating the correlation between the first word and second word comprises calculating a visual consistence of the set of images representing conjunction of the first word and the second word.
8. The system as recited in claim 6, wherein providing the set of images representing conjunction of the first word and the second word comprises:
performing an image search using the first word and the second word as a conjunctive query word; and
selecting a fourth plurality of images from search results.
9. The system as recited in claim 6, wherein the selected correlation function is the second correlation function, and wherein calculating the selected correlation function comprises:
calculating a first visual consistence of the first image representation set representing the first word;
calculating a second visual consistence of the second image representation set representing the second word; and
calculating a third visual consistence of the set of images representing conjunction of the first word and the second word; and wherein the estimating the correlation between the first word and the second word comprises evaluating a function having the first visual consistence, the second visual consistence and the third visual consistence as terms thereof.
10. A system for estimating word correlations, comprising:
a processor; and
a computer readable storage media having stored thereupon a plurality of instructions that, when executed by the processor, cause the processor to perform acts comprising:
providing a first image representation set having a first plurality of images representing a first word in response to a first search based at least in part on the first word and a second image representation set having a second plurality of images representing a second word in response to a second search based at least in part on the second word;
extracting, from at least one image of the first image representation set and from at least one image of the second image representation set, multi-modal visual features; and
estimating a correlation between the first word and the second word at least partially based on a calculation of a mathematical function having at least the extracted multi-modal visual features corresponding to the first image representation set and at least the extracted multi-modal visual features corresponding to the second image representation set as variables thereto, wherein estimating the correlation between the first word and the second word includes:
calculating a content correlation between the multi-modal visual features of the image representations of the first word and the second word;
calculating a text correlation between the first word and the second word using a text-based method; and
estimating the correlation between the first word and the second word by combining the content correlation and the text correlation.
11. The system as recited in claim 10, wherein providing the first and the second image representation sets representing the first word and the second word comprises:
conducting an image search using the respective first word or second word as query word; and
selecting a third plurality of images from search results.
12. The system as recited in claim 11, wherein the image search is conducted on the Internet using an image search engine.
13. The system as recited in claim 11, wherein the third plurality of images are selected from top returns of the search results.
14. A method for estimating word correlations, the method comprising:
providing a computing device with a conjunctive image representation of a first word and a second word, the conjunctive image representation comprising a first set of multiple images;
providing the computing device with a first image representation of the first word and a second image representation of the second word, the first image representation of the first word comprising a second set of multiple images, and the second image representation of the second word comprising a third set of multiple images; and
estimating, by the computing device, a correlation between the first word and second word at least partially based on a calculation of visual features calculated from the first set of multiple images comprising the conjunctive image representation of the first word and the second word, the second set of multiple images comprising the first image representation of the first word and the third set of multiple images comprising the second image representation of the second word.
15. The method as recited in claim 14, wherein estimating, by the computing device, the correlation between the first word and second word comprises calculating, by the computing device, a visual consistence of the first set of multiple images.
16. The method as recited in claim 14, wherein estimating, by the computing device, the correlation between the first word and the second word comprises:
selecting a subset of image features from a plurality of image features extracted from the first set of multiple images, wherein the subset of image features includes at least one image feature having a low variance according to a preset variance threshold; and
calculating a partial variance of the subset of image features.
17. The method as recited in claim 16, wherein selecting the subset of image features is performed dynamically as the conjunctive image representation varies.
18. The method as recited in claim 14, wherein providing the conjunctive image representation of the first word and the second word comprises:
conducting an image search using the first word and the second word as a conjunctive query word; and
selecting a plurality of images from search results.
19. One or more computer readable storage media device having stored thereupon a plurality of instructions that, when executed by a processor, causes the processor to:
provide a first image representation of a first word in response to a first search based at least in part on the first word and a second image representation of a second word in response to a second search based at least in part on the second word, the first image representation comprising a first set of images having a first plurality of images, the second image representation comprising a second set of images having a second plurality of images;
select one of a first correlation function for a calculation of an inter-set visual aspect correlation based on multiple images of the first set of images and multiple images of the second set of images or a second correlation function for a calculation of an intra-set visual aspect correlation based on multiple images within one of the first set of images and the second set of images;
calculate the selected correlation function; and
estimate a correlation between the first word and the second word at least partially based on the calculation of the selected correlation function.
20. The computer readable storage media device as recited in claim 19, wherein the processor estimates the correlation between the first word and second word by calculating a visual distance or a visual similarity between the first image representation of the first word and the second image representation of the second word.
21. The computer readable storage media device as recited in claim 19, wherein the plurality of instructions, when executed by a processor, further causes the processor to:
provide a conjunctive image representation of the first word and the second word, wherein the processor estimates the correlation between the first word and second word further based on visual features of the conjunctive image representation of the first word and the second word.
US11/956,333, filed 2007-12-13 with priority date 2007-09-13: Estimating word correlations from images. Status: Expired - Fee Related. Granted as US8457416B2 (en).

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
US11/956,333 (US8457416B2) | 2007-09-13 | 2007-12-13 | Estimating word correlations from images
PCT/US2008/075525 (WO2009035930A1) | 2007-09-13 | 2008-09-06 | Estimating word correlations from images

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
US97216707P | 2007-09-13 | 2007-09-13 | -
US11/956,333 (US8457416B2) | 2007-09-13 | 2007-12-13 | Estimating word correlations from images

Publications (2)

Publication Number | Publication Date
US20090074306A1 (en) | 2009-03-19
US8457416B2 (en) | 2013-06-04

Family

ID: 40452415

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US11/956,333 (US8457416B2, Expired - Fee Related) | Estimating word correlations from images | 2007-09-13 | 2007-12-13

Country Status (2)

Country | Link
US (1) | US8457416B2 (en)
WO (1) | WO2009035930A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US8538957B1 (en)*2009-06-032013-09-17Google Inc.Validating translations using visual similarity between visual media search results
US8572109B1 (en)2009-05-152013-10-29Google Inc.Query translation quality confidence
US8577909B1 (en)2009-05-152013-11-05Google Inc.Query translation using bilingual search refinements
US8577910B1 (en)2009-05-152013-11-05Google Inc.Selecting relevant languages for query translation
US20150121290A1 (en)*2012-06-292015-04-30Microsoft CorporationSemantic Lexicon-Based Input Method Editor
US20150154232A1 (en)*2012-01-172015-06-04Google Inc.System and method for associating images with semantic entities
US9990433B2 (en)2014-05-232018-06-05Samsung Electronics Co., Ltd.Method for searching and device thereof
US10007679B2 (en)2008-08-082018-06-26The Research Foundation For The State University Of New YorkEnhanced max margin learning on multimodal data mining in a multimedia database
US10162865B2 (en)2015-10-082018-12-25Microsoft Technology Licensing, LlcGenerating image tags
US10319035B2 (en)2013-10-112019-06-11Ccc Information ServicesImage capturing and automatic labeling system
US11314826B2 (en)2014-05-232022-04-26Samsung Electronics Co., Ltd.Method for searching and device thereof

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US8086620B2 (en)2007-09-122011-12-27Ebay Inc.Inference of query relationships
US8171043B2 (en)*2008-10-242012-05-01Yahoo! Inc.Methods for improving the diversity of image search results
US9262509B2 (en)*2008-11-122016-02-16Collective, Inc.Method and system for semantic distance measurement
US10210179B2 (en)*2008-11-182019-02-19Excalibur Ip, LlcDynamic feature weighting
US8489589B2 (en)*2010-02-052013-07-16Microsoft CorporationVisual search reranking
US20110242108A1 (en)*2010-03-312011-10-06Microsoft CorporationVisualization of complexly related data
US9082040B2 (en)*2011-05-132015-07-14Microsoft Technology Licensing, LlcIdentifying visual contextual synonyms
US20130086024A1 (en)*2011-09-292013-04-04Microsoft CorporationQuery Reformulation Using Post-Execution Results Analysis
US8886657B2 (en)2011-09-302014-11-11The Boeing CompanyAssociative memory visual evaluation tool
US20140250115A1 (en)*2011-11-212014-09-04Microsoft CorporationPrototype-Based Re-Ranking of Search Results
CN105045781B (en)*2015-08-272020-06-23广州神马移动信息科技有限公司Query term similarity calculation method and device and query term search method and device
EP3188041B1 (en)*2015-12-312021-05-05Dassault SystèmesUpdate of a machine learning system
US10331684B2 (en)*2016-06-032019-06-25International Business Machines CorporationGenerating answer variants based on tables of a corpus
US20210141716A1 (en)*2017-03-172021-05-13Uilicious Private LimitedSystems, methods and computer readable media for ambiguity resolution in instruction statement interpretation
US10771573B2 (en)*2018-06-082020-09-08International Business Machines CorporationAutomatic modifications to a user image based on cognitive analysis of social media activity
JP2022112541A (en)*2021-01-222022-08-03株式会社リコー Information processing device, question answering system, information processing method and program

Citations (52)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US5630121A (en)*1993-02-021997-05-13International Business Machines CorporationArchiving and retrieving multimedia objects using structured indexes
US5680636A (en)1988-05-271997-10-21Eastman Kodak CompanyDocument annotation and manipulation in a data processing system
US5852823A (en)1996-10-161998-12-22MicrosoftImage classification and retrieval system using a query-by-example paradigm
US5987457A (en)*1997-11-251999-11-16Acceleration Software International CorporationQuery refinement method for searching documents
US6272484B1 (en)1998-05-272001-08-07Scansoft, Inc.Electronic document manager
US6397181B1 (en)1999-01-272002-05-28Kent Ridge Digital LabsMethod and apparatus for voice annotation and retrieval of multimedia data
US6415282B1 (en)1998-04-222002-07-02Nec Usa, Inc.Method and apparatus for query refinement
WO2002099697A1 (en)2001-04-122002-12-12Hong-Kyu LeeMethod and system for displaying the searching result with specified image
US20030110181A1 (en)1999-01-262003-06-12Hinrich SchuetzeSystem and method for clustering data objects in a collection
US20030123737A1 (en)2001-12-272003-07-03Aleksandra MojsilovicPerceptual method for browsing, searching, querying and visualizing collections of digital images
US6687416B2 (en)1998-10-192004-02-03Sony CorporationMethod for determining a correlation between images using multi-element image descriptors
US20040049734A1 (en)2002-09-102004-03-11Simske Steven J.System for and method of generating image annotation information
US20040205545A1 (en)2002-04-102004-10-14Bargeron David M.Common annotation framework
US20040225686A1 (en)2003-04-082004-11-11Jia LiSystem and method for automatic linguistic indexing of images by a statistical modeling approach
US20040267733A1 (en)1999-12-172004-12-30Kim Si HanInformation coding and retrieval system and method thereof
US20050004910A1 (en)2003-07-022005-01-06Trepess David WilliamInformation retrieval
US6850644B1 (en)1999-10-012005-02-01Samsung Electronics Co., Ltd.Method for analyzing texture of digital image
US20050055344A1 (en)2000-10-302005-03-10Microsoft CorporationImage retrieval systems and methods with semantic and feature based relevance feedback
US20050071365A1 (en)2003-09-262005-03-31Jiang-Liang HouMethod for keyword correlation analysis
US20050131951A1 (en)2001-03-302005-06-16Microsoft CorporationRelevance maximizing, iteration minimizing, relevance-feedback, content-based image retrieval (CBIR)
US20050165763A1 (en)2002-02-112005-07-28Microsoft CorporationStatistical bigram correlation model for image retrieval
US20050228825A1 (en)2004-04-062005-10-13Tsun-Yi YangMethod for managing knowledge from the toolbar of a browser
US20050235272A1 (en)2004-04-202005-10-20General Electric CompanySystems, methods and apparatus for image annotation
US6970860B1 (en)2000-10-302005-11-29Microsoft CorporationSemi-automatic annotation of multimedia objects
US20060020597A1 (en)2003-11-262006-01-26Yesvideo, Inc.Use of image similarity in summarizing a collection of visual images
US20060041564A1 (en)2004-08-202006-02-23Innovative Decision Technologies, Inc.Graphical Annotations and Domain Objects to Create Feature Level Metadata of Images
US20060061595A1 (en)2002-05-312006-03-23Goede Patricia ASystem and method for visual annotation and knowledge representation
US7028253B1 (en)2000-10-102006-04-11Eastman Kodak CompanyAgent for integrated annotation and retrieval of images
US7043094B2 (en)2001-06-072006-05-09Commissariat A L'energie AtomiqueProcess for the automatic creation of a database of images accessible by semantic features
US20060155684A1 (en)2005-01-122006-07-13Microsoft CorporationSystems and methods to present web image search results for effective image browsing
US20060161520A1 (en)2005-01-142006-07-20Microsoft CorporationSystem and method for generating alternative search terms
US20060173909A1 (en)2005-01-312006-08-03Carlson Gerard JAutomated image annotation
US20060195858A1 (en)2004-04-152006-08-31Yusuke TakahashiVideo object recognition device and recognition method, video annotation giving device and giving method, and program
US20060206475A1 (en)2005-03-142006-09-14Microsoft CorporationSystem and method for generating attribute-based selectable search extension
US20070005571A1 (en)2005-06-292007-01-04Microsoft CorporationQuery-by-image search and retrieval system
US7162691B1 (en)2000-02-012007-01-09Oracle International Corp.Methods and apparatus for indexing and searching of multi-media web pages
US20070052734A1 (en)2005-09-062007-03-08General Electric CompanyMethod and apparatus for annotating images
US20070067345A1 (en)*2005-09-212007-03-22Microsoft CorporationGenerating search requests from multimodal queries
US7209815B2 (en)2004-12-282007-04-24Snap-On IncorporatedTest procedures using pictures
US7231381B2 (en)2001-03-132007-06-12Microsoft CorporationMedia content search engine incorporating text content and user log mining
US20070133947A1 (en)2005-10-282007-06-14William ArmitageSystems and methods for image search
US20070174320A1 (en)2006-01-252007-07-26Bridgewell IncorporatedMethod and system for generating a concept-based keyword function, search engine applying the same, and method for calculating keyword correlation values
US20070174269A1 (en)2006-01-232007-07-26Microsoft CorporationGenerating clusters of images for search results
US20070214114A1 (en)*2006-03-132007-09-13Microsoft CorporationProjecting queries and images into a similarity space
US20070239778A1 (en)*2006-04-072007-10-11Eastman Kodak CompanyForming connections between image collections
US7283992B2 (en)2001-11-302007-10-16Microsoft CorporationMedia agent to suggest contextually related media content
US20070276820A1 (en)*2006-05-252007-11-29International Business Machines CorporationSystem, method and program for key work searching
US20080071744A1 (en)2006-09-182008-03-20Elad Yom-TovMethod and System for Interactively Navigating Search Results
US20080267503A1 (en)2007-04-262008-10-30Fuji Xerox Co., Ltd.Increasing Retrieval Performance of Images by Providing Relevance Feedback on Word Images Contained in the Images
US20090094234A1 (en)2007-10-052009-04-09Fujitsu LimitedImplementing an expanded search and providing expanded search results
US20100114908A1 (en)2008-11-042010-05-06Microsoft CorporationRelevant navigation with deep links into query
US20100114888A1 (en)2008-10-242010-05-06Van Zwol RoelofDigital image retrieval by aggregating search results based on visual annotations

Patent Citations (54)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US5680636A (en)1988-05-271997-10-21Eastman Kodak CompanyDocument annotation and manipulation in a data processing system
US5630121A (en)*1993-02-021997-05-13International Business Machines CorporationArchiving and retrieving multimedia objects using structured indexes
US5852823A (en)1996-10-161998-12-22MicrosoftImage classification and retrieval system using a query-by-example paradigm
US5987457A (en)*1997-11-251999-11-16Acceleration Software International CorporationQuery refinement method for searching documents
US6415282B1 (en)1998-04-222002-07-02Nec Usa, Inc.Method and apparatus for query refinement
US6272484B1 (en)1998-05-272001-08-07Scansoft, Inc.Electronic document manager
US6687416B2 (en)1998-10-192004-02-03Sony CorporationMethod for determining a correlation between images using multi-element image descriptors
US20030110181A1 (en)1999-01-262003-06-12Hinrich SchuetzeSystem and method for clustering data objects in a collection
US6397181B1 (en)1999-01-272002-05-28Kent Ridge Digital LabsMethod and apparatus for voice annotation and retrieval of multimedia data
US6850644B1 (en)1999-10-012005-02-01Samsung Electronics Co., Ltd.Method for analyzing texture of digital image
US20040267733A1 (en)1999-12-172004-12-30Kim Si HanInformation coding and retrieval system and method thereof
US7162691B1 (en)2000-02-012007-01-09Oracle International Corp.Methods and apparatus for indexing and searching of multi-media web pages
US7028253B1 (en) | 2000-10-10 | 2006-04-11 | Eastman Kodak Company | Agent for integrated annotation and retrieval of images
US7099860B1 (en) * | 2000-10-30 | 2006-08-29 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback
US20050055344A1 (en) | 2000-10-30 | 2005-03-10 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback
US6970860B1 (en) | 2000-10-30 | 2005-11-29 | Microsoft Corporation | Semi-automatic annotation of multimedia objects
US7231381B2 (en) | 2001-03-13 | 2007-06-12 | Microsoft Corporation | Media content search engine incorporating text content and user log mining
US20050131951A1 (en) | 2001-03-30 | 2005-06-16 | Microsoft Corporation | Relevance maximizing, iteration minimizing, relevance-feedback, content-based image retrieval (CBIR)
WO2002099697A1 (en) | 2001-04-12 | 2002-12-12 | Hong-Kyu Lee | Method and system for displaying the searching result with specified image
US7043094B2 (en) | 2001-06-07 | 2006-05-09 | Commissariat A L'energie Atomique | Process for the automatic creation of a database of images accessible by semantic features
US7283992B2 (en) | 2001-11-30 | 2007-10-16 | Microsoft Corporation | Media agent to suggest contextually related media content
US20030123737A1 (en) | 2001-12-27 | 2003-07-03 | Aleksandra Mojsilovic | Perceptual method for browsing, searching, querying and visualizing collections of digital images
US20050165763A1 (en) | 2002-02-11 | 2005-07-28 | Microsoft Corporation | Statistical bigram correlation model for image retrieval
US20040205545A1 (en) | 2002-04-10 | 2004-10-14 | Bargeron David M. | Common annotation framework
US20060061595A1 (en) | 2002-05-31 | 2006-03-23 | Goede Patricia A | System and method for visual annotation and knowledge representation
US20040049734A1 (en) | 2002-09-10 | 2004-03-11 | Simske Steven J. | System for and method of generating image annotation information
US7234106B2 (en) | 2002-09-10 | 2007-06-19 | Simske Steven J. | System for and method of generating image annotation information
US20040225686A1 (en) | 2003-04-08 | 2004-11-11 | Jia Li | System and method for automatic linguistic indexing of images by a statistical modeling approach
US20050004910A1 (en) | 2003-07-02 | 2005-01-06 | Trepess David William | Information retrieval
US20050071365A1 (en) | 2003-09-26 | 2005-03-31 | Jiang-Liang Hou | Method for keyword correlation analysis
US20060020597A1 (en) | 2003-11-26 | 2006-01-26 | Yesvideo, Inc. | Use of image similarity in summarizing a collection of visual images
US20050228825A1 (en) | 2004-04-06 | 2005-10-13 | Tsun-Yi Yang | Method for managing knowledge from the toolbar of a browser
US20060195858A1 (en) | 2004-04-15 | 2006-08-31 | Yusuke Takahashi | Video object recognition device and recognition method, video annotation giving device and giving method, and program
US20050235272A1 (en) | 2004-04-20 | 2005-10-20 | General Electric Company | Systems, methods and apparatus for image annotation
US20060041564A1 (en) | 2004-08-20 | 2006-02-23 | Innovative Decision Technologies, Inc. | Graphical Annotations and Domain Objects to Create Feature Level Metadata of Images
US7209815B2 (en) | 2004-12-28 | 2007-04-24 | Snap-On Incorporated | Test procedures using pictures
US20060155684A1 (en) | 2005-01-12 | 2006-07-13 | Microsoft Corporation | Systems and methods to present web image search results for effective image browsing
US20060161520A1 (en) | 2005-01-14 | 2006-07-20 | Microsoft Corporation | System and method for generating alternative search terms
US20060173909A1 (en) | 2005-01-31 | 2006-08-03 | Carlson Gerard J | Automated image annotation
US20060206475A1 (en) | 2005-03-14 | 2006-09-14 | Microsoft Corporation | System and method for generating attribute-based selectable search extension
US20070005571A1 (en) | 2005-06-29 | 2007-01-04 | Microsoft Corporation | Query-by-image search and retrieval system
US20070052734A1 (en) | 2005-09-06 | 2007-03-08 | General Electric Company | Method and apparatus for annotating images
US20070067345A1 (en) * | 2005-09-21 | 2007-03-22 | Microsoft Corporation | Generating search requests from multimodal queries
US20070133947A1 (en) | 2005-10-28 | 2007-06-14 | William Armitage | Systems and methods for image search
US20070174269A1 (en) | 2006-01-23 | 2007-07-26 | Microsoft Corporation | Generating clusters of images for search results
US20070174320A1 (en) | 2006-01-25 | 2007-07-26 | Bridgewell Incorporated | Method and system for generating a concept-based keyword function, search engine applying the same, and method for calculating keyword correlation values
US20070214114A1 (en) * | 2006-03-13 | 2007-09-13 | Microsoft Corporation | Projecting queries and images into a similarity space
US20070239778A1 (en) * | 2006-04-07 | 2007-10-11 | Eastman Kodak Company | Forming connections between image collections
US20070276820A1 (en) * | 2006-05-25 | 2007-11-29 | International Business Machines Corporation | System, method and program for key work searching
US20080071744A1 (en) | 2006-09-18 | 2008-03-20 | Elad Yom-Tov | Method and System for Interactively Navigating Search Results
US20080267503A1 (en) | 2007-04-26 | 2008-10-30 | Fuji Xerox Co., Ltd. | Increasing Retrieval Performance of Images by Providing Relevance Feedback on Word Images Contained in the Images
US20090094234A1 (en) | 2007-10-05 | 2009-04-09 | Fujitsu Limited | Implementing an expanded search and providing expanded search results
US20100114888A1 (en) | 2008-10-24 | 2010-05-06 | Van Zwol Roelof | Digital image retrieval by aggregating search results based on visual annotations
US20100114908A1 (en) | 2008-11-04 | 2010-05-06 | Microsoft Corporation | Relevant navigation with deep links into query

Non-Patent Citations (56)

* Cited by examiner, † Cited by third party
Title
"Flickr", retrieved on Dec. 31, 2008 at <<http://www.flickr.com/>>, 1 pg.
"Flickr", retrieved on Dec. 31, 2008 at >, 1 pg.
"Google Image Search", retrieved on Dec. 31, 2008 at <<http://images.google.com/>>, Google, 2008, 1 pg.
"Google Image Search", retrieved on Dec. 31, 2008 at >, Google, 2008, 1 pg.
"How Do You Create Search Features for Internet Explorer 8 Beta 2?", retrieved on Dec. 31, 2008 at <<http://www.code-magazine.com/article.aspx?quickid=0811072&page=2>>, CoDe Magazine, 2008, vol. 5, Issue 3, IE8, 9 pages.
"How Do You Create Search Features for Internet Explorer 8 Beta 2?", retrieved on Dec. 31, 2008 at >, CoDe Magazine, 2008, vol. 5, Issue 3, IE8, 9 pages.
"Live Image Search", retrieved on Dec. 31, 2008 at <<http://image.live.com>>, Microsoft Corp., 2009, 1 pg.
"Live Image Search", retrieved on Dec. 31, 2008 at >, Microsoft Corp., 2009, 1 pg.
"XML Search Suggestions Format Specification", retrieved on Dec. 31, 2008 at <<http://msdn.microsoft.com/en-us/library/cc848863(VS.85).aspx>>, Microsoft Corporation 2009, 6 pages.
"XML Search Suggestions Format Specification", retrieved on Dec. 31, 2008 at >, Microsoft Corporation 2009, 6 pages.
"Yahoo! Image Search", retrieved on Dec. 31, 2008 at <<http://images.search.yahoo.com/>>, Yahoo! 2008, 1pg.
"Yahoo! Image Search", retrieved on Dec. 31, 2008 at >, Yahoo! 2008, 1pg.
Baeza-Yates et al., "Query Recommendation using Query Logs in Search Engines", retrieved on Dec. 31, 2008 at <<http://www.dcc.uchile.cl/˜churtado/clustwebLNCS.pdf>>, 10 pages.
Baeza-Yates et al., "Query Recommendation using Query Logs in Search Engines", retrieved on Dec. 31, 2008 at >, 10 pages.
Barnard, et al., "Matching Words and Pictures", retrieved at <<http://jmlr.csail.mit.edu/papers/volume3/barnard03a/barnard03a.pdf>>, Journal of Machine Learning Research, 2003, pp. 1107-1135.
Barnard, et al., "Matching Words and Pictures", retrieved at >, Journal of Machine Learning Research, 2003, pp. 1107-1135.
Beeferman, et al., "Agglomerative Clustering of a Search Engine Query Log", retrieved on Dec. 31, 2008 at <<http://www.dougb.com/papers/kdd.pdf>>, 10 pgs.
Beeferman, et al., "Agglomerative Clustering of a Search Engine Query Log", retrieved on Dec. 31, 2008 at >, 10 pgs.
Boyd, et al., "Convex Optimization", Book, Cambridge University Press, Mar. 2004, 730 pages.
Burstein, "Building an IE8 Visual Search Suggestion Provider for my Twitter Friends", retrieved on Dec. 31, 2008 at <<http://blogs.microsoft.co.il/blogs/bursteg/archive/2008/12/17/building-an-ie8-visual-search-suggestion-provider-for-my-twitter-friends.aspx>>, 13 pgs.
Carpineto, et al., "An Information Theoretic Approach to Automatic Query Expansion", retrieved on Dec. 31, 2008 at <<http://search.fub.it/claudio/pdf/TOIS2001.pdf>>, 35 pages.
Carpineto, et al., "An Information Theoretic Approach to Automatic Query Expansion", retrieved on Dec. 31, 2008 at >, 35 pages.
Cascia, et al., "Combining Textual and Visual Cues for Content-based Image Retrieval on the World Wide Web", retrieved at <<http://www.cs.bu.edu/techreports/pdf/1998-004-combining-text-and-vis-cues.pdf>>, IEEE, BU CS TR98-004, Jun. 1998, 5 pgs.
Cascia, et al., "Combining Textual and Visual Cues for Content-based Image Retrieval on the World Wide Web", retrieved at >, IEEE, BU CS TR98-004, Jun. 1998, 5 pgs.
Crum, "Yahoo Suggestions for Image Search", retrieved on Dec. 31, 2008 at <<http://www.webpronews.com/topnews/2008/12/03/yahoo-suggestions-for-image-search>>, Dec. 3, 2008, pp. 1-4.
Crum, "Yahoo Suggestions for Image Search", retrieved on Dec. 31, 2008 at >, Dec. 3, 2008, pp. 1-4.
Datta, et al., "Image Retrieval: Ideas, Influences, and Trends of the New Age", retrieved on Dec. 31, 2008 at <<http://infolab.stanford.edu/˜wangz/project/imsearch/review/JOUR/datta.pdf>>, ACM Computing Surveys, vol. 40, No. 2, Article 5, Publication date : Apr. 2008, 60 pages.
Datta, et al., "Image Retrieval: Ideas, Influences, and Trends of the New Age", retrieved on Dec. 31, 2008 at >, ACM Computing Surveys, vol. 40, No. 2, Article 5, Publication date : Apr. 2008, 60 pages.
Final Office Action for U.S. Appl. No. 11/956,331, mailed on Jul. 8, 2011, Mingjing Li, "Dual Cross-Media Relevance Model for Image Annotation", 21 pgs.
Final Office Action for U.S. Appl. No. 12/369,421, mailed on Sep. 15, 2011, Linjun Yang, "Visual and Textual Query Suggestion", 21 pages.
Flickner, et al., "Query by Image and Video Content: The QBIC System", at <<http://www.cs.virginia.edu/˜son/cs662.s06/QBIC.pdf>>, IEEE, Sep. 1995, pp. 23-32.
Flickner, et al., "Query by Image and Video Content: The QBIC System", at >, IEEE, Sep. 1995, pp. 23-32.
Frey, et al., "Clustering by Passing Messages Between Data Points", retrieved on Dec. 31, 2008 at <<http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf>>, Science Magazine, vol. 315, Feb. 16, 2007, 23 pages.
Frey, et al., "Clustering by Passing Messages Between Data Points", retrieved on Dec. 31, 2008 at >, Science Magazine, vol. 315, Feb. 16, 2007, 23 pages.
Guo, et al., "Enhanced Max Margin Learning on Multimodal Data Mining in a Multimedia Database", retrieved at <<http://delivery.acm.org/10.1145/1290000/1281231/p340-guo.pdf?key1=1281231&key2=0098944911&coll=GUIDE&dl=&CFID=15151515&CFTOKEN=6184618>>, ACM, KDD'07, Aug. 12-15, 2007, pp. 340-349.
He, et al., "Learning an Image Manifold for Retrieval", retrieved on Dec. 31, 2008 at <<http://delivery.acm.org/10.1145/1030000/1027532/p17-he.pdf?key1=1027532&key2=4414980321&coll=GUIDE&dl=GUIDE&CFID=16377253&CFTOKEN=92469850>>, MM 2004, Oct. 10-16, New York, USA, pp. 17-23.
Hua, et al., "Semantic Knowledge Extraction and Annotation for Web Images", In the Proceedings of the 13th Annual ACM International Conference on Multimedia, Nov. 2005, pp. 467-470.
Huang, et al., "Spatial Color Indexing and Applications", retrieved on Dec. 31, 2008 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=710779&isnumber=15374>>, IEEE Xplore, pp. 602-607.
Huang, et al., "Spatial Color Indexing and Applications", retrieved on Dec. 31, 2008 at >, IEEE Xplore, pp. 602-607.
Inoue, "On the Need for Annotation-Based Image Retrieval", available at least as early as Nov 7, 2007, at <<http://research.nii.ac.jp/˜m-inoue/paper/inoue04irix.pdf>>, National Institute of Informatics, Tokyo, Japan, 3 pgs.
Inoue, "On the Need for Annotation-Based Image Retrieval", available at least as early as Nov 7, 2007, at >, National Institute of Informatics, Tokyo, Japan, 3 pgs.
Jeon, et al., "Automatic Image Annotation and Retrieval using Cross-Media Relevance Models", at <<http://ciir.cs.umass.edu/pubfiles/mm-41.pdf>>, ACM, SIGIR'03, Jul. 28-Aug. 1, 2003, 8 pgs.
Jeon, et al., "Automatic Image Annotation and Retrieval using Cross-Media Relevance Models", at >, ACM, SIGIR'03, Jul. 28-Aug. 1, 2003, 8 pgs.
Jia, et al., "Finding Image Exemplars Using Fast Sparse Affinity Propogation", retrieved on Dec. 30, 2008 at <<http://delivery.acm.org/10.1145/1460000/1459448/p639-jia.pdf?key1=1459448&key2=6654980321&coll=GUIDE&dl=GUIDE&CFID=16934217&CFTOKEN=19327438>>, MM 2008, Oct. 26-31, Vancouver, BC, Canada, pp. 639-642.
Jin, et al., "Effective Automatic Image Annotation via a Coherent Language Model and Active Learning", available at least as early as Jun. 12, 2007, at <<http://delivery.acm.org/10.1145/1030000/1027732/p892-jin.pdf?key1=1027732&key2=2189361811&coll=GUIDE&dl=GUIDE&Cfid=21118987&CFTOKEN=73358540>>, ACM, MM'04, Oct. 10-16, 2004, pp. 892-899.
Jing et al, "An Effective Region-Based Image Retrieval Framework," ACM, Multimedia '02 Proceedings of the tenth ACM international conference on Multimedia, Dec. 2002, 10 pgs.
Kang, et al., "Regularizing Translation Models for Better Automatic Image Annotation", available at least as early as Jun. 12, 2007, at <<http://delivery.acm.org/10.1145/1040000/1031242/p350-kang.pdf?key1=1031242&key2=0070461811&coll=GUIDE&dl=GUIDE&CFID=21120579&CFTOKEN=59010486>>, ACM, CIKM'04, Nov. 8-13, 2004, pp. 350-359.
Lam-Adesina, et al., "Applying Summarization Techniques for Term Selection in Relevance Feedback", ACM, New York, New York, Annual ACM Conference on Research and Development in Information Retrieval, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, Sep. 2001, pp. 1-9.
Lew, et al., "Content-based Multimedia Information Retrieval: State of the Art and Challenges", retrieved on Dec. 31, 2008 at <<http://www.liacs.nl/home/mlew/mir.survey16b.pdf>>, ACM Transactions on Multimedia Computing, Communications, and Applications, Feb. 2006, 26 pgs.
Lew, et al., "Content-based Multimedia Information Retrieval: State of the Art and Challenges", retrieved on Dec. 31, 2008 at >, ACM Transactions on Multimedia Computing, Communications, and Applications, Feb. 2006, 26 pgs.
Li, et al., "Image Annotation by Large-Scale Content-based Image Retrieval", at <<http://delivery.acm.org/10.1145/1190000/1180764/p607-li.pdf?key1=1180764&key2=0704643811&coll=GUIDE&dl=GUIDE&CFID=22942426&CFTOKEN=26272205>>, ACM, MM'06, Oct. 23-27, 2006, pp. 607-610.
Li, et al., "Statistical Correlation Analysis in Image Retrieval", available at least as early as Jul. 4, 2007, at <<http://research.microsoft.com/˜zhengc/papers/PR02li.pdf>>, 12 pgs.
Li, et al., "Statistical Correlation Analysis in Image Retrieval", available at least as early as Jul. 4, 2007, at >, 12 pgs.
Liu, et al., "An Adaptive Graph Model for Automatic Image Annotation", available at least as early as Jun. 12, 2007, at <<http://delivery.acm.org/10.1145/1180000/1178689/p61-liu.pdf?key1=1178689&key2=6322461811&coll=GUIDE&dl=GUIDE&CFID=21123267&CFTOKEN=91967441>>, ACM, MIR'06, Oct. 26-27, 2006, pp. 61-69.
Non-Final Office Action for U.S. Appl. No. 12/369,421, mailed on Feb. 16, 2012, Linjun Yang et al., "Visual and Textual Query Suggestion", 23 pages.
Office action for U.S. Appl. No. 11/956,331, mailed on Mar. 18, 2013, Li et al., "Dual Cross-Media Relevance Model for Image Annotation", 22 pages.
PCT Search Report for PCT Application No. PCT/US2008/075525, mailed Jan. 9, 2009 (11 pages).
Riaz et al., "Efficient Image Retrieval Using Adaptive Segmentation of HSV Color Space" International conference on Computational Sciences and Its Applications (ICCSA), Jun. and Jul. 2008, 6 pages.
Rui, et al., "A Novel Approach to Auto Image Annotation Based on Pair-wise Constrained Clustering and Semi-naive Bayesian Model", IEEE, In the Proceedings of the 11th International Multimedia Modelling Conference, Jan. 2005, pp. 322-327.
Saber, et al., "Automatic Image Annotation Using Adaptive Color Classification", retrieved at <<http://www.rit.edu/˜esseee/docs/15.pdf>>, Graphical Models and Image Processing, vol. 58, No. 2, Mar. 1996, pp. 115-126.
Saber, et al., "Automatic Image Annotation Using Adaptive Color Classification", retrieved at >, Graphical Models and Image Processing, vol. 58, No. 2, Mar. 1996, pp. 115-126.
Saykol, et al., "A Semi Automatic Object Extraction Tool for Querying in Multimedia Databases", available at least as early as Nov. 14, 2007, at <<http://www.cs.bilkent.edu.tr/˜bilmdg/papers/mis01.pdf>>, Department of Computer Engineering, Bilkent University, Ankara, Turkey, 10 pgs.
Saykol, et al., "A Semi Automatic Object Extraction Tool for Querying in Multimedia Databases", available at least as early as Nov. 14, 2007, at >, Department of Computer Engineering, Bilkent University, Ankara, Turkey, 10 pgs.
Sigurbjornsson, et al., "Flickr Tag Recommendation based on Collective Knowledge", retrieved on Dec. 31, 2008 at <<http://www2008.org/papers/pdf/p327-sigurbjornssonA.pdf>>, WWW 2008, Apr. 21-25, Beijing, China, pp. 327-336.
Sigurbjornsson, et al., "Flickr Tag Recommendation based on Collective Knowledge", retrieved on Dec. 31, 2008 at >, WWW 2008, Apr. 21-25, Beijing, China, pp. 327-336.
Smeulders, et al., "Content-Based Image Retrieval at the End of the Early Years", retrieved on Dec. 31, 2008 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00895972>>, IEEE Transactions on Pattern Analysis and Machine Intellegence, vol. 22, No. 12, Dec. 2000, pp. 1349-1380.
Smeulders, et al., "Content-Based Image Retrieval at the End of the Early Years", retrieved on Dec. 31, 2008 at >, IEEE Transactions on Pattern Analysis and Machine Intellegence, vol. 22, No. 12, Dec. 2000, pp. 1349-1380.
Snoek, et al., "MediaMill: Video Search Using a Thesaurus of 500 Machine Learned Concepts", retrieved on Dec. 31, 2008 at <<http://staff.science.uva.nl/˜cgmsnoek/pub/snoek-demo-samt2006.pdf>>, Intelligent Systems Lab Amsterdam, Informatics Institute, University of Amsterdam, The Netherlands, pp. 1-2.
Snoek, et al., "MediaMill: Video Search Using a Thesaurus of 500 Machine Learned Concepts", retrieved on Dec. 31, 2008 at >, Intelligent Systems Lab Amsterdam, Informatics Institute, University of Amsterdam, The Netherlands, pp. 1-2.
Suh, et al., "Semi-Automatic Image Annotation Using Event and Torso Identification", available at least as early as Nov. 14, 2007, at <<http://hcil.cs.umd.edu/trs/2004-15/2004-15.pdf>>, 4 pgs.
Suh, et al., "Semi-Automatic Image Annotation Using Event and Torso Identification", available at least as early as Nov. 14, 2007, at >, 4 pgs.
Swets, et al., "Using Discriminant Eigenfeatures for Image Retrieval", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, 1996, pp. 831-836, 6 pgs.
Wang, et al., "Automatic Image Annotation and Retrieval Using Subspace Clustering Algorithm", available at least as early as Jun. 12, 2007, at <<http://delivery.acm.org/10.1145/1040000/1032621/p100-wang.pdf?key1=1032621&key2=2890461811&coll=GUIDE&dl=GUIDE&CFID=21121103&CFTOKEN=32821806>>, ACM, MMDB'04, Nov. 13, 2004, pp. 100-108.
Wang, et al., "Automatic Image Annotation and Retrieval Using Weighted Feature Selection", available at least as early as Jun. 12, 2007, at <<http://www.utdallas.edu/˜lkhan/papers/MTA1.pdf>>, 17 pgs.
Wang, et al., "Automatic Image Annotation and Retrieval Using Weighted Feature Selection", available at least as early as Jun. 12, 2007, at >, 17 pgs.
Weinberger, et al., "Resolving Tag Ambiguity", retrieved on Dec. 31, 2008 at <<http://research.yahoo.com/files/ctfp6043-weinberger.pdf>>, MM 2008, Oct. 26-31, Vancouver, BC, Canada, 9 pages.
Weinberger, et al., "Resolving Tag Ambiguity", retrieved on Dec. 31, 2008 at >, MM 2008, Oct. 26-31, Vancouver, BC, Canada, 9 pages.
Wen, et al., "Clustering User Queries of a Search Engine", retrieved on Dec. 31, 2008 at <<https://research.microsoft.com/en-us/um/people/jrwen/jrwen—files/publications/QC-WWW10.pdf>>, WWW10, May 1-5, 2001, Hong Kong, pp. 162-168.
Wen, et al., "Clustering User Queries of a Search Engine", retrieved on Dec. 31, 2008 at >, WWW10, May 1-5, 2001, Hong Kong, pp. 162-168.
Worring, et al., "The MediaMill Large-lexicon Concept Suggestion Engine", retrieved on Dec. 31, 2008 at <<http://staff.science.uva.nl/˜cgmsnoek/pub/worring-demo-acm2006.pdf>>, MM 2006, Oct. 23-27, Santa Barbara, CA., 2 pages.
Worring, et al., "The MediaMill Large-lexicon Concept Suggestion Engine", retrieved on Dec. 31, 2008 at >, MM 2006, Oct. 23-27, Santa Barbara, CA., 2 pages.
Xu, et al., "Query Expansion Using Local and Global Document Analysis", retrieved on Dec. 31, 2008 at <<http://citeseerx.ist.psu.edu/viewdock/download?doi=10.1.1.49.3174&rep=rep1&type=pdf>>, Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts, MA., 8 pages.
Xu, et al., "Query Expansion Using Local and Global Document Analysis", retrieved on Dec. 31, 2008 at >, Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts, MA., 8 pages.
Xue, et al., "Improving Web Search Using Image Snippets", retrieved on Dec. 31, 2008 at <<http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/toit08.pdf>>, ACM 2008, 27 pages.
Xue, et al., "Improving Web Search Using Image Snippets", retrieved on Dec. 31, 2008 at >, ACM 2008, 27 pages.
Yu, et al., "Improving Pseudo-Relevance Feedback in Web Information Retrieval Using Web Page Segmentation", retrieved on Dec. 31, 2008 at <<http://www2003.org/cdrom/papers/refereed/p300/p300-Yu.html>>, WWW2003, May 20-24, 2003, Budapest, Hungary, 13 pages.
Yu, et al., "Improving Pseudo-Relevance Feedback in Web Information Retrieval Using Web Page Segmentation", retrieved on Dec. 31, 2008 at >, WWW2003, May 20-24, 2003, Budapest, Hungary, 13 pages.
Zhou, et al., "Ranking on Data Manifolds", retrieved on Dec. 31, 2008 at <<http://www.kyb.mpg.de/publications/pdfs/pdf2334.pdf>>, Max Planck Institute for Biological Cybernetics, Germany, 8 pages.
Zhou, et al., "Ranking on Data Manifolds", retrieved on Dec. 31, 2008 at >, Max Planck Institute for Biological Cybernetics, Germany, 8 pages.
Zhu et al., "New Query Refinement and Semantics Integrated Image Retrieval System with Semiautomatic Annotation Scheme", Journal of Electronic Imaging, vol. 10(4), Oct. 2001, 11 pages.

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10007679B2 (en) | 2008-08-08 | 2018-06-26 | The Research Foundation For The State University Of New York | Enhanced max margin learning on multimodal data mining in a multimedia database
US8572109B1 (en) | 2009-05-15 | 2013-10-29 | Google Inc. | Query translation quality confidence
US8577909B1 (en) | 2009-05-15 | 2013-11-05 | Google Inc. | Query translation using bilingual search refinements
US8577910B1 (en) | 2009-05-15 | 2013-11-05 | Google Inc. | Selecting relevant languages for query translation
US8538957B1 (en) * | 2009-06-03 | 2013-09-17 | Google Inc. | Validating translations using visual similarity between visual media search results
US10268703B1 (en) * | 2012-01-17 | 2019-04-23 | Google LLC | System and method for associating images with semantic entities
US20150154232A1 (en) * | 2012-01-17 | 2015-06-04 | Google Inc. | System and method for associating images with semantic entities
US9171018B2 (en) * | 2012-01-17 | 2015-10-27 | Google Inc. | System and method for associating images with semantic entities
US9600496B1 (en) * | 2012-01-17 | 2017-03-21 | Google Inc. | System and method for associating images with semantic entities
US9959340B2 (en) * | 2012-06-29 | 2018-05-01 | Microsoft Technology Licensing, LLC | Semantic lexicon-based input method editor
US20150121290A1 (en) * | 2012-06-29 | 2015-04-30 | Microsoft Corporation | Semantic Lexicon-Based Input Method Editor
US10319035B2 (en) | 2013-10-11 | 2019-06-11 | CCC Information Services | Image capturing and automatic labeling system
US9990433B2 (en) | 2014-05-23 | 2018-06-05 | Samsung Electronics Co., Ltd. | Method for searching and device thereof
US10223466B2 (en) | 2014-05-23 | 2019-03-05 | Samsung Electronics Co., Ltd. | Method for searching and device thereof
US11080350B2 (en) | 2014-05-23 | 2021-08-03 | Samsung Electronics Co., Ltd. | Method for searching and device thereof
US11157577B2 (en) | 2014-05-23 | 2021-10-26 | Samsung Electronics Co., Ltd. | Method for searching and device thereof
US11314826B2 (en) | 2014-05-23 | 2022-04-26 | Samsung Electronics Co., Ltd. | Method for searching and device thereof
US11734370B2 (en) | 2014-05-23 | 2023-08-22 | Samsung Electronics Co., Ltd. | Method for searching and device thereof
US10162865B2 (en) | 2015-10-08 | 2018-12-25 | Microsoft Technology Licensing, LLC | Generating image tags

Also Published As

Publication number | Publication date
US20090074306A1 (en) | 2009-03-19
WO2009035930A1 (en) | 2009-03-19

Similar Documents

Publication | Title
US8457416B2 (en) | Estimating word correlations from images
US8571850B2 (en) | Dual cross-media relevance model for image annotation
Wu et al. | Learning to tag
US9558263B2 (en) | Identifying and displaying relationships between candidate answers
Rinaldi | An ontology-driven approach for semantic information retrieval on the web
US8321424B2 (en) | Bipartite graph reinforcement modeling to annotate web images
US20130268519A1 (en) | Fact verification engine
CN101692223A (en) | Refining a search space in response to user input
WO2009059297A1 (en) | Method and apparatus for automated tag generation for digital content
He et al. | A framework of query expansion for image retrieval based on knowledge base and concept similarity
Sharma et al. | Finding similar patents through semantic query expansion
US20090177633A1 (en) | Query expansion of properties for video retrieval
Hu et al. | Intelligent information retrieval applying automatic constructed fuzzy ontology
Thuon | Khmer Semantic Search Engine (KSE): Digital Information Access and Document Retrieval
Chen et al. | A similarity-based method for retrieving documents from the SCI/SSCI database
Dahir et al. | Query expansion based on modified Concept2vec model using resource description framework knowledge graphs
Vijeta | A restricted domain medical question answering system
Chinpanthana et al. | Deep textual searching for visual semantics of personal photo collections with a hybrid similarity measure
Kotis et al. | Mining query-logs towards learning useful kick-off ontologies: an incentive to semantic web content creation
Awad et al. | Semantic Relevance Feedback Based on Local Context
Losee | A performance model of the length and number of subject headings and index phrases
Aygül | Searching documents with semantically related keyphrases
Fan et al. | KeyOnto: A Hybrid Knowledge Retrieval Model in Law Semantic Web
Chatterjee | Investigations on effective techniques for Bengali information retrieval and summarization of search results
Dejprapatsorn et al. | BERT-Based Semantic Retrieval for Academic Abstracts

Legal Events

AS: Assignment
Owner name: MICROSOFT CORPORATION, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LIU, JING;WANG, BIN;LI, ZHIWEI;AND OTHERS;REEL/FRAME:020246/0581
Effective date: 20071029

FEPP: Fee payment procedure
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF: Information on status: patent grant
Free format text: PATENTED CASE

AS: Assignment
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034542/0001
Effective date: 20141014

FPAY: Fee payment
Year of fee payment: 4

LAPS: Lapse for failure to pay maintenance fees
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FEPP: Fee payment procedure
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH: Information on status: patent discontinuation
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP: Lapsed due to failure to pay maintenance fee
Effective date: 20210604

