CROSS-REFERENCE TO RELATED APPLICATION
This application claims the benefit of U.S. Provisional Application 61/067,162, filed Feb. 25, 2008, entitled “Platforms, systems, and methods for data handling,” which application is hereby incorporated by reference in its entirety.
TECHNICAL FIELD
Embodiments of the invention relate to computer systems and software for advertising selection and display based on electronic profile information.
BACKGROUND
Advertising systems presenting advertisements to Internet browsers may choose advertisements to display in a variety of ways. A website may simply have sponsors, and sell advertisements in an analogous manner to the sale of advertising space in a newspaper or magazine.
However, some systems guess what may be appropriate or desirable for users based on limited available information. For example, contextual advertising systems may provide an advertisement for a web page based in part on a target word in the web page. These systems have no way of knowing if the advertisement is actually relevant to the user viewing the web page—the advertisement is chosen simply because it matches a target word on the web page. For example, Google may display advertisements based on words contained in a user's email message or search string. The advertisement is selected based on the content of the single email message being viewed. No other information about the user is available.
Some systems decide what products may be desirable for a user based on ratings of other similar products provided by the user. For example, some recommendation services receive limited user ratings, or implicit ratings based on views or purchases, of a certain kind of product—books or movies for example—and recommend other books or movies that the user may like based on similarity to items favorably rated, such as authors, themes, actors, directors, genres, and the like.
Similarly, other systems may select advertisements to display based on the content of stored cookies associated with the user browsing the website. This may be done in some cases without the user's informed consent, raising privacy concerns for the user.
These previous systems also suffer from being proprietary to the particular website or electronic service accessed. For example, web sites such as Facebook, Ticketmaster, and ESPN maintain some profile information associated with their users. However, the profile information stored by the user at one site is generally inaccessible to other sites, depriving the user of its benefit as they travel to other websites. Allowing one site to share information with others again raises privacy concerns. It may often be prohibitive for one system to obtain the necessary user consent to share profile information with another system.
Accordingly, current systems have a variety of drawbacks in how they select and display advertisements.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of a system according to an embodiment of the present invention.
FIG. 2 is a schematic illustration of a conceptual database schema for an electronic profile according to an embodiment of the present invention.
FIG. 3 is a schematic illustration of a profile management interface operating in a browser window of a display according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating operation of a disambiguation engine according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating operation of an indexing engine according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating operation of a disambiguation engine according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating operation of an analysis engine according to an embodiment of the present invention.
FIG. 8 is a schematic illustration of a web browser operating a plug-in according to an embodiment of the present invention.
FIG. 9 is a schematic illustration of a web browser operating a plug-in according to an embodiment of the present invention.
FIG. 10 is a flowchart illustrating operation of a system according to an embodiment of the present invention.
DETAILED DESCRIPTION
Certain details are set forth below to provide a sufficient understanding of embodiments of the invention. However, it will be clear to one skilled in the art that embodiments of the invention may be practiced without various of these particular details. In some instances, well-known computer system components, network architectures, control signals, and software operations have not been shown in detail in order to avoid unnecessarily obscuring the described embodiments of the invention.
Embodiments of the invention provide a system for selecting advertisements or other content for an entity accessing network accessible content. The selections are made by the system based on an electronic profile of the entity and the content accessed by the entity, such as, but not limited to, a web page, web site, email, messaging, message item, document, or image. A browser plug-in may render the selected advertisement, content, or both in a separate window or a portion of the browser window. In this manner, the selected content, advertisements, or both may remain as the entity browses to other sites or accesses other content. Although the same area may be used to display content and advertisements, the selected advertisements and content may change as the entity navigates to different websites or accesses different network accessible content.
The electronic profiles used to select the content, advertisements, or both for display may have been developed by a profiling system, embodiments of which were described in concurrently filed, co-owned U.S. application Ser. No. ______, entitled “Electronic profile development, storage, use, and systems therefor,” filed Dec. 12, 2008, which application is hereby incorporated herein by reference in its entirety. Electronic profiles described herein include data structures containing information about an entity, all or a portion of which may be used as input to an analysis engine that selects content, advertisements, or both, based in part on the electronic profile. As will be described below, an entity may control the use of all or portions of their electronic profile, allowing it to be used in part or completely to score and select content responsive to requests from particular entities. The analysis engine uses information from the electronic profile to select links to content, advertisements, or both, for the entity. The information contained in an electronic profile is generally information about an entity associated with the electronic profile, which may also be referred to as the entity owning the electronic profile. The entity may be a person or a group of people. The entity may also be a segment of people that share a common attribute. The entity may also be, but is not limited to, a product, place, business, or item of content. The entity may be a segment of things that share a common attribute.
An example of a system 100 according to an embodiment of the present invention is shown in FIG. 1. A profiling system 110 includes a profile management system 115, a disambiguation engine 120, and an analysis engine 125. These individual components will be discussed further below, and additional embodiments are described in concurrently filed, co-owned U.S. application Ser. No. ______, entitled “Electronic profile development, storage, use, and systems therefor,” filed Dec. 12, 2008, which application is hereby incorporated by reference in its entirety. The profiling system 110 generally includes a processor and memory to store computer readable instructions that may cause the processor to implement the functionalities of the profile management system 115, disambiguation engine 120, and analysis engine 125 described below. Although shown as a unitary system, the profiling system 110 may be implemented as distributed across a plurality of computing devices, with portions of the processing performed by each of the devices. In embodiments of the present invention, the profiling system 110 receives information about the web browsing activities of an entity, and enables the transmission of selected content to the entity, where the selected content is chosen based in part on an electronic profile associated with the entity and the information about the entity's web browsing activities.
A user device 130, which may be implemented as generally any network connected, digital media delivery system or device having suitable processing, memory, and communication capabilities to implement a content viewer 137, is in communication with the profiling system 110. The user device 130 may also have the capability to implement a profile management interface 135 in some embodiments, although in some embodiments the profile management interface 135 may not be included on the user device 130. The content viewer 137 may be implemented as an Internet browser plug-in or as a stand-alone application used to view content selected based on information regarding an entity's browsing activity, network accessing activity, or both, and their electronic profile, as described further below, or the content viewer 137 may be embedded in a different application. The user device 130 may accordingly be, but is not limited to, a personal computer, kiosk, cell phone, personal digital assistant, television set-top box, television, GPS system, projector, display, or music player. The user device 130 may be specific to a single user, or may be used by multiple users, such as in the case of a publicly accessible workstation or kiosk. A display used by the entity as the user device 130 may be co-located in a same physical device as a processor for performing functions of a user device described herein, or the display may be in a remote or different location than the processor. That is, an entity may view content items, advertisements, or both, selected by a system according to embodiments of the present invention on a stand-alone display device that may have limited processing capability. In some embodiments, the stand-alone display device may be coupled to or in communication with a computing device having processing capability to perform the user device functionality described herein.
In some embodiments, the user need not be a physical person, but may be a representative of a group of people, or may be another automated process or computer program performing a profile entry functionality. Communication between the profiling system 110 and the user device 130 may occur through any mechanism. In some embodiments, the profiling system 110 may be implemented completely or partially as a web service that may communicate with the user device 130 over the Internet using http, in either a secure or unsecured manner, as desired. The profile management interface 135 enables communication with the profile management system 115 to establish, augment, or otherwise manipulate profile information pertaining to an entity represented by a user using the user device 130. The disambiguation engine 120 may receive profile information supplied from the user device 130 and further process the information to reduce ambiguity in the information provided, as will be described further below. The processing to reduce ambiguity may occur dynamically through interaction with the user device. Any number of user devices may be in communication with the profiling system 110.
Profile information received from the user device 130 and other sources is processed by the profile management system 115 and disambiguation engine 120 to generate electronic profiles that are stored in the electronic profile storage 140. As will be described further below, the electronic profiles may be database structures and accordingly may be stored in a database as shown in FIG. 1. However, any type of electronic storage may be used to store electronic profiles and the profiles may be stored in any number of distinct storage locations, and individual profiles may be distributed across a plurality of storage locations. Electronic profiles will be discussed in greater detail below.
In embodiments of the present invention, the profiling system 110 may receive further information from the user device 130, such as a name or all or a portion of the content of a web page browsed by the entity operating the user device 130. This information may also be stored in the electronic profile storage 140 or other storage, although it may only be temporarily stored, or may not be stored at all in some embodiments.
The user device 130 further operates an Internet browser and a content viewer 137, which may be a browser plug-in. In some embodiments the browser plug-in runs on the same user device 130 as the profile management interface 135; however, in some embodiments the content viewer 137 operates on a user device having no profile management interface 135. That is, an entity need not enter or refine profile information using the same device on which they will view advertisements and links selected based on their stored profile information.
The user device 130 may be connected to a web server 139 or other sources of information over the Internet and an entity may use the user device 130 to browse the web using any Internet browser or other software.
Content sources 142 represent any source of content, including advertisements that may be images, text, video, or combinations thereof. Advertisements may be provided by any number of businesses or advertisers. The analysis engine 125, indexing engine, or combinations of engines, may analyze the content from the content sources 142 and store advertisements in the ad storage 144 and links to content in the link storage 146. In some embodiments, content sources 142, ad storage 144, link storage 146, or combinations thereof may include a set of content sources, advertisements, links, or combinations thereof that are designated as sponsored content sources, advertisements, or links. The sponsored content sources, advertisements, and links may be analyzed separately or differently from other content sources, advertisements, and links, and in some embodiments may be physically stored separately. Although shown as separate storage devices, in some embodiments the advertisements and content links may be stored on a same storage medium, and may be distributed across any number of physical storage locations. Further, in some embodiments the advertisements, links, or both may be stored on the same physical storage device as some or all of the electronic profiles in the electronic profile storage 140. As will be described further below, the advertisements and links may be stored along with an index indicating the relative frequency of terms in or associated with the advertisements and links. Although advertisements and links have been described, other content, rich media, or other application functionalities may be stored and accessed by the profiling system 110.
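By way of illustration only, an index of the relative frequency of terms in stored advertisements might be sketched in Python as follows. The function name relative_term_frequencies, the simple word-splitting rule, and the sample advertisement text are hypothetical and are not taken from the described embodiments; an actual indexing engine may tokenize and weight terms quite differently.

```python
import re
from collections import Counter


def relative_term_frequencies(text: str) -> dict[str, float]:
    """Map each lowercased word to its share of all words in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {term: n / total for term, n in counts.items()}


# Hypothetical contents of an ad storage: advertisement ID -> advertisement text.
ads = {
    "ad-1": "Mariners tickets on sale: baseball tickets for every home game",
    "ad-2": "Low mortgage rates from your local bank",
}

# The index is stored alongside the advertisements, keyed by advertisement ID.
index = {ad_id: relative_term_frequencies(text) for ad_id, text in ads.items()}
```

In this sketch, each advertisement is associated with a mapping from term to relative frequency, so that a later scoring step may compare terms in a profile against terms in an advertisement without re-reading the advertisement text.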
The analysis engine 125 may score advertisements, links, rich media, other application functionality, or combinations thereof based on one or more of the electronic profiles stored in electronic profile storage 140. The score may additionally be influenced by a website accessed by the user device 130. The output of this process may be provided to the content viewer 137 such that a number of relevant links, advertisements, or both are displayed in a browser window displayed on the user device 130. There may be a fixed number of respective links and advertisements displayed, or all links or advertisements having a score above a certain threshold may be displayed in some embodiments.
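The two display policies described above, a fixed number of items or all items scoring above a threshold, might be sketched as follows. This is a minimal illustration that assumes scores have already been computed; the function select_items, its parameters, and the sample scores are hypothetical.

```python
from typing import Optional


def select_items(
    scores: dict[str, float],
    max_items: Optional[int] = None,
    min_score: Optional[float] = None,
) -> list[str]:
    """Return item IDs ordered by descending score, optionally limited
    to a fixed count and/or to items scoring above a threshold."""
    ranked = sorted(scores, key=lambda item: scores[item], reverse=True)
    if min_score is not None:
        ranked = [item for item in ranked if scores[item] > min_score]
    if max_items is not None:
        ranked = ranked[:max_items]
    return ranked


# Hypothetical scores for two advertisements and one link.
scores = {"ad-1": 0.82, "ad-2": 0.15, "link-7": 0.64}

top_two = select_items(scores, max_items=2)
above_half = select_items(scores, min_score=0.5)
```

With these sample scores, both policies happen to select ad-1 and link-7; in general the fixed-count policy always fills the available display slots, while the threshold policy may return fewer items or none.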
Accordingly, an entity may communicate profile information to the profiling system 110 through the profile management interface 135 in communication with the profile management system 115. The profile management system 115 and the disambiguation engine 120 may refine and expand the profile information provided. An electronic profile of the entity is stored in electronic profile storage 140. While a single electronic profile storage 140 location is shown in FIG. 1, the electronic profile may in some embodiments be distributed across a plurality of storage locations, including across a plurality of storage locations associated with different physical electronic devices that may be used by an entity. Accordingly, in some embodiments, only a portion of the entity's profile may be located on the electronic profile storage 140. As the entity browses the web or other network available content (either on the user device 130 or another device), using a browser equipped with the content viewer 137, the content viewer 137 requests advertisements, links, or both from the analysis engine 125. The content viewer 137 may also transmit information about the network accessible content accessed to the profiling system 110 for use by the analysis engine 125. In embodiments where the network accessible content accessed includes a web page or web site, information about the webpage or site accessed may include but is not limited to URL, metadata, time and date visited, content of the website viewed, and website host. In embodiments where the network accessible content accessed is not a web page, the information transmitted may include metadata associated with the accessed content, terms or other features of the content, a location of the content, a file type, and one or more protocols associated with the content, or combinations thereof.
The analysis engine 125 accesses the entity's electronic profile stored in electronic profile storage 140 and, provided the entity has chosen to allow all or a portion of its profile information to be used responsive to a request from the content viewer 137, scores the advertisements in the ad storage 144, the links in the link storage 146, or both in accordance with the accessed electronic profile, information received about the website or page visited, or both. The resultant scores are used to select advertisements, links, or both for display by the content viewer 137 in a browser on the user device 130 along with the website content requested.
In this manner, the profiling system 110 may serve as a trusted intermediary between an entity and advertisement and content providers. A content provider who provides content to be indexed and stored in the ad storage 144, link storage 146, or both, will have that content communicated to users when the analysis engine 125 determines that the content would be relevant for them. The content provider does not actually receive the profile information itself. Because entities are able to control the accessibility of their profile information, and know that content providers may not obtain the information directly, entities may share a greater amount of information with the profiling system 110.
Further, through the profile management system 115 and disambiguation engine 120, the electronic profiles may be more structured than those created purely through freeform user input, while remaining easy to create. The disambiguation engine 120 may suggest related terms for addition to an entity's profile, which the entity may confirm or deny.
Having described an overview of an example of a system 100 according to the present invention, examples of electronic profiles will now be discussed. Electronic profiles described herein include data structures containing information about an entity, all or a portion of which may be used as input to an analysis engine that may take a predictive or deterministic action based in part on the electronic profile. For example, recall that electronic profiles may be stored in the electronic profile storage 140 and used by the analysis engine 125 to identify advertisements or links to content that may be relevant to the entity associated with the electronic profile.
Examples of electronic profiles accordingly include data structures. Any type of data structure may be used that may store the electronic profile information described below. In one embodiment, the electronic profile is stored in a relational database. FIG. 2 illustrates a portion of a conceptual database schema 200 for an electronic profile according to an embodiment of the present invention. The database schema 200 is organized as a star schema, but other organizations may be employed in other embodiments. The schema 200 includes several tables relating aspects of the electronic profile to one another that provide information about the entity owning the electronic profile. The database constructed according to the schema 200 may be stored on generally any suitable electronic storage medium. In some embodiments, portions of an electronic profile may be distributed amongst several electronic storage media, including among storage media associated with different electronic devices used by an entity.
Information stored in an electronic profile about an entity may include, but is not limited to, any combination of the following: data, preferences, possessions, social connections, images, permissions, recommendation preferences, location, role and context. These aspects of an entity may be used in any combination by an analysis engine to take predictive or deterministic action as generally described above. Examples of aspects of profile information included in the electronic profile 200 will now be described further.
The electronic profile represented by the schema 200 includes data about an entity in a user table 201. While the term ‘user’ is used in FIG. 2 to describe tables and other aspects of the profile, the term is not meant to restrict profiles to individuals or human representatives. The term ‘user’ in FIG. 2 simply refers to the entity associated with the profile.
Data 202 about the entity is stored in the user table 201. The table 201 may include a column for each type of data. For example, data associated with UserID1 includes name (‘Bob Smith’), address (555 Park Lane), age (35), and gender (Male) of the entity. Data associated with UserID2 includes height (5′10″), weight (180), and gender (Female). Data associated with UserID3 includes financial information and an address (329 Whistle Way). Data about an entity stored in the user table 201 may generally include factual or demographic information such as, but not limited to, height, address, clothing sizes, contact information, financial information, credit card number, ethnicity, weight, and gender. Any combination of data types may be stored. The user table 201 also includes a user ID 203. The user ID may be generated by a system generating or using the electronic profile, or may be associated with or identical to a user ID already owned by the profile owning entity, such as an email account or other existing account of the entity. Each entity having an electronic profile may have a corresponding user table, such as the user table 201, stored in the electronic profile storage 140 of FIG. 1.
Preferences of an entity may also be stored in the entity's electronic profile. Preferences generally refer to subjective associations between the entity and various words that may represent things, people, or groups. Each preference of an individual represents that association; “I like cats,” for example, may be one preference. Preferences may be stored in any suitable manner. In the schema of FIG. 2, preferences are stored by use of the user preferences table 210, the user preference terms table 220, the preference terms table 230, and the preference qualifiers table 240, which will be described further below. The four tables used to represent preferences in FIG. 2 are exemplary only, and preferences may be stored in other ways in other embodiments such that a profile owning entity is associated with their preferences.
Referring again to FIG. 2, the user table 201 of an entity is associated with a user preferences table 210. The user preferences table 210 includes user IDs 203 of entities having profiles in the electronic profile storage 140 and lists individual preference IDs 211 associated with each user ID. For example, the UserID1 is associated with SPORTS_PREFERENCE1 and SPORTS_TRAVEL_PREFERENCE1 in the example shown in FIG. 2. Although shown as including only a few user IDs 203, the user preferences table 210 may generally include a list of multiple user IDs known to the profiling system and a list of individual preference IDs associated with the user IDs. In this manner, an entity's preferences may be associated with the data related to the entity. Generally, any string may be used to represent a preference ID. Also included in the user preferences table 210 are qualifier IDs 212 that are used to record an association with terms contained in the preference. The qualifiers will be discussed further below.
Each preference ID has an associated entry in a user preference terms table 220. The user preference terms table 220 contains a list of term IDs associated with each user preference ID. In FIG. 2, for example, the preference ID SPORTS_PREFERENCE1 is shown associated with TermID1 and TermID2. Any string may generally be used to represent the term IDs. Each TermID in turn is associated with an entry in a preference terms table 230. The preference terms table 230 lists the actual terms represented by the TermID. A term may generally be any string and is generally a unit of meaning, which may be one or more words, or other representation. As shown in FIG. 2, the preference terms table 230 indicates the TermID1 is associated with the term Major League Baseball. Although only one term is shown associated with the TermID1, any number of terms may be so associated.
Accordingly, as described above, an entity may be associated with preferences that ultimately contain one or more terms. However, the relationship between the entity and the terms has not yet been described. An entity's preferences may include a scale of likes, dislikes, or both of the entity. Further, an entity's preferences may include information about what the entity is or is not, does or does not do in certain circumstances. In the schema 200 of FIG. 2, each preference may be associated with one or more qualifiers, as indicated by an association between the preference ID and a qualifier ID in the user preferences table 210. A term associated with each qualifier ID is then stored in a preference qualifiers table 240. Qualifiers describe the relationship of the preference terms to the profile owning entity. Examples of qualifiers include ‘like’ and ‘dislike’ to describe a positive or negative association with a preference, respectively. Other qualifiers may be used including ‘when’, ‘when not’, ‘never’, ‘always’, ‘does’, ‘does not’, ‘is’, and ‘is not’ to make more complex associations between preference words and the profile owning entity. As shown in FIG. 2, the qualifier QualID1 represents the association ‘like’, and QualID2 represents the association ‘dislike’.
Accordingly, the structure shown in FIG. 2 encodes two preferences for an entity represented by UserID1. SPORTS_PREFERENCE1 indicates UserID1 likes Major League Baseball and the Seattle Mariners. SPORTS_TRAVEL_PREFERENCE1 indicates UserID1 likes Fenway Park. Similarly, UserID2 has SPORTS_PREFERENCE2, which indicates UserID2 dislikes Major League Baseball and the New York Yankees. UserID3 has SPORTS_PREFERENCE3, which indicates UserID3 likes Derek Jeter.
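For illustration, the table relationships described for the schema 200 might be sketched with an in-memory SQLite database as follows. The table and column names, the term IDs other than TermID1 and TermID2, and the helper preferences_for are hypothetical; the sketch also simplifies by giving each preference exactly one qualifier, whereas the description permits one or more. UserID1 is given the two preference IDs listed for it in the user preferences table 210.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE preference_qualifiers (qualifier_id TEXT PRIMARY KEY, qualifier TEXT);
CREATE TABLE preference_terms (term_id TEXT PRIMARY KEY, term TEXT);
CREATE TABLE user_preferences (
    preference_id TEXT PRIMARY KEY,
    user_id TEXT REFERENCES users(user_id),
    qualifier_id TEXT REFERENCES preference_qualifiers(qualifier_id)
);
CREATE TABLE user_preference_terms (
    preference_id TEXT REFERENCES user_preferences(preference_id),
    term_id TEXT REFERENCES preference_terms(term_id)
);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("UserID1", "Bob Smith"), ("UserID2", None), ("UserID3", None)])
conn.executemany("INSERT INTO preference_qualifiers VALUES (?, ?)",
                 [("QualID1", "like"), ("QualID2", "dislike")])
# Each term is stored once; TermID3 to TermID5 are assumed IDs.
conn.executemany("INSERT INTO preference_terms VALUES (?, ?)", [
    ("TermID1", "Major League Baseball"),
    ("TermID2", "Seattle Mariners"),
    ("TermID3", "Fenway Park"),
    ("TermID4", "New York Yankees"),
    ("TermID5", "Derek Jeter"),
])
conn.executemany("INSERT INTO user_preferences VALUES (?, ?, ?)", [
    ("SPORTS_PREFERENCE1", "UserID1", "QualID1"),
    ("SPORTS_TRAVEL_PREFERENCE1", "UserID1", "QualID1"),
    ("SPORTS_PREFERENCE2", "UserID2", "QualID2"),
    ("SPORTS_PREFERENCE3", "UserID3", "QualID1"),
])
conn.executemany("INSERT INTO user_preference_terms VALUES (?, ?)", [
    ("SPORTS_PREFERENCE1", "TermID1"), ("SPORTS_PREFERENCE1", "TermID2"),
    ("SPORTS_TRAVEL_PREFERENCE1", "TermID3"),
    ("SPORTS_PREFERENCE2", "TermID1"), ("SPORTS_PREFERENCE2", "TermID4"),
    ("SPORTS_PREFERENCE3", "TermID5"),
])


def preferences_for(user_id: str) -> list[tuple[str, str]]:
    """Return (qualifier, term) pairs for one entity, sorted by term."""
    return conn.execute("""
        SELECT q.qualifier, t.term
        FROM user_preferences p
        JOIN preference_qualifiers q ON q.qualifier_id = p.qualifier_id
        JOIN user_preference_terms pt ON pt.preference_id = p.preference_id
        JOIN preference_terms t ON t.term_id = pt.term_id
        WHERE p.user_id = ?
        ORDER BY t.term
    """, (user_id,)).fetchall()
```

Calling preferences_for("UserID2") in this sketch yields the pairs ('dislike', 'Major League Baseball') and ('dislike', 'New York Yankees'), while the term Major League Baseball itself appears only once in the terms table even though two preferences refer to it.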
The manner of storing preferences using the tables described in FIG. 2 may aid in efficient storage and analysis by allowing, for example, multiple term IDs to be associated with multiple user preference IDs without requiring storing the individual terms multiple times in the profile storage 140 of FIG. 1. Instead, multiple associations may be made between the term ID and multiple user preferences. However, as discussed, generally, any data structure may be used to encode an electronic profile of an entity. In some embodiments, a profile may be represented and optionally stored as a vector or index. The vector may uniquely identify an entity associated with the profile. For example, the profile vector may represent a plurality of axes, each axis representing a term, word, or user device, and the vector may include bits associated with each term, word, and user device to be included in the profile.
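A vector representation of this kind might be sketched as follows, with one axis per term and one bit per axis. The axis list AXES and the function profile_vector are hypothetical, and this small sketch makes no attempt to guarantee that the resulting vector uniquely identifies an entity.

```python
# Hypothetical axes; each position in the vector corresponds to one term.
AXES = ["Major League Baseball", "Seattle Mariners", "Fenway Park", "New York Yankees"]


def profile_vector(terms: set[str], axes: list[str] = AXES) -> list[int]:
    """Return a bit per axis: 1 if the term is in the profile, else 0."""
    return [1 if axis in terms else 0 for axis in axes]


vec = profile_vector({"Major League Baseball", "Fenway Park"})
```

Here vec has a set bit in the first and third positions only. Axes for words or user devices could be appended to the same list in the same way.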
Further information regarding an entity may be stored in an entity's electronic profile including possessions, images, social connections, permissions, recommendation preferences, location, roles, context, and appearance settings for a content viewer. Although not shown in FIG. 2, these further aspects may be stored as additional star tables associated with the central user table 201. Possessions of the entity may include things the entity owns or has access to including, but not limited to, gaming systems, cell phones, computers, cars, clothes, bank or other accounts, subscriptions, and cable or other service providers.
Social connections of the entity may include, but are not limited to, connections to friends, family, neighbors, co-workers, organizations, membership programs, information about the entity's participation in social networks such as Facebook, Myspace, or LinkedIn, or businesses an entity is affiliated with.
Permissions for accessing all or a portion of the electronic profile are described further below but may include an indication of when an entity's profile information may be used. For example, an entity may authorize their profile information to be used by the profiling system responsive only to requests from certain entities, and not responsive to requests from other entities. The permissions may specify when, how, how often, or where the profiling system may access the entity's profile responsive to a request from a specific entity, or type of entity. For example, an entity may specify that sports websites may obtain information about content relevant to the entity's profile, but that banks may not. As generally described above, only the profiling system has direct access to the stored profile information, and the profile information is not generally shared with content providers that may request scoring of their content based on the entity's profile. However, the scoring may only be undertaken in some embodiments when the entity has granted permission for their profile to be used to provide information to the particular content provider or browser plug-in.
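The permission gating described above might be sketched as follows, using the example of sports websites being permitted and banks not. The Permissions class, the requester-type strings, and the choice to deny requester types that are not listed at all are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Permissions:
    """Hypothetical per-entity record of which requester types may trigger scoring."""
    allowed_requester_types: set[str] = field(default_factory=set)
    blocked_requester_types: set[str] = field(default_factory=set)

    def may_score_for(self, requester_type: str) -> bool:
        # An explicit block wins; unlisted requester types are denied (assumed default).
        if requester_type in self.blocked_requester_types:
            return False
        return requester_type in self.allowed_requester_types


def score_if_permitted(
    permissions: Permissions,
    requester_type: str,
    score_fn: Callable[[], float],
) -> Optional[float]:
    """Run the scoring step only when the entity has granted permission."""
    if not permissions.may_score_for(requester_type):
        return None
    return score_fn()


perms = Permissions(allowed_requester_types={"sports"}, blocked_requester_types={"bank"})
```

In this sketch the requester never receives the profile itself; it receives at most the result of score_fn, which runs inside the profiling system only after the permission check succeeds.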
Recommendation preferences may include whether the entity would like or accept recommendations for additional information to be added to their electronic profile, or for data or possessions. The recommendation preferences may specify which entities may make recommendations for the electronic profile owning entity and under what conditions.
Location information of the entity may include a current location determined in a variety of levels of granularity such as, but not limited to, GPS coordinate, country, state, city, region, store name, church, hotel, restaurant, airport, other venue, street address, or virtual location. In some embodiments location information may be obtained by analyzing an IP address associated with an entity.
Roles of the entity may include categorizations of the entity's relationships to others or things including, but not limited to, father, mother, daughter, son, friend, worker, brother, sister, sports fan, movie fan, wholesaler, distributor, retailer, and virtual persona (such as in a gaming environment or other site).
Context of the entity may include an indication of activities or modes of operation of the entity, including what the entity is doing in the past, present, or future, such as shopping, searching, working, driving, or processes the entity is engaged in such as purchasing a vacation.
Appearance settings for a content viewer may also be stored in the electronic profile of an entity, which may include electronic wallpaper information, skinning, or branding information, or combinations thereof. The appearance settings may be used to render selected content for an entity in a window having the wallpaper, skin, or other appearance indicated by the appearance settings in an entity's electronic profile.
As will be described further below, all or a portion of the electronic profile may be used as an input to an analysis engine. In some embodiments, there may be insufficient data about an individual to have a meaningful output of the analysis engine based on their electronic profile. Accordingly, in some embodiments the profile of a segment sharing one or more common attributes with the individual may be used as input to the analysis engine instead of or in addition to the individual's profile. The profile of a segment may also be used to select content that may be relevant for that segment of entities, and pass content to entities that share one or more attributes with the segment.
Having described exemplary mechanisms for storing profile information and the content of electronic profiles, exemplary methods and systems for obtaining profile information will now be discussed. Profile information may generally be obtained from any source, including from a representative of the profile owning entity, other individuals, or from collecting data about the profile owning entity as they interact with other electronic systems. In some embodiments, referring back to FIG. 1, profile information may be directly entered by a profile owning entity or their representative from the user device 130 using the profile management interface 135. Profile information may be obtained generally at any time. In one embodiment, when an entity installs the content viewer 137, they may be prompted to establish an electronic profile.
The profile management interface 135 may take any form suitable for receiving profile information from a profile owning entity or their representative. In one embodiment, the profile management interface 135 includes an application operating on the user device 130. The application on the user device 130 may communicate with the profiling system 110. In one embodiment, the disambiguation engine, analysis engine, or both may be implemented as an application programming interface (API), and the application operating on the user device 130 may call one or more APIs operated by the profiling system 110. In some embodiments, the application on the user device 130 that is in communication with the profiling system 110 operates in an Internet browser window, and one embodiment of the profile management interface 135 is shown in FIG. 3 operating in a browser window of a display 305 of the user device. In other embodiments, an application runs on the user device, which may be any network connected, digital media delivery system or device, including a phone, personal computer, kiosk, cell phone, personal digital assistant, television set-top box, television, GPS system, projector, display, or music player, to interface with the profiling system 110. A profile owning entity, or a representative of that entity, may enter profile information into the preference entry field 310. Prior to entering information, the entity may have identified themselves to the profiling system by, for example, entering a username, password, or both, or other methods of authentication may be used including identification of one or more user devices and their context associated with the entity. When entering profile information into the preference entry field 310, the entity may also select a qualifier associated with the profile information using a qualifier selector 308.
The qualifier selector 308, which may be unique for the entity in some embodiments, may include a drop-down menu, buttons depicting different qualifiers, or other mechanism. For example, the qualifier selector 308 may include a button for ‘Like’ and one for ‘Dislike’ so an entity could specify that they like or dislike the terms they provide in the preference entry field 310. The entity may submit the entered profile information to the profile management system 115 of the profiling system 110 in FIG. 1. Information may be submitted, for example, by pressing an enter key, or clicking on an enter button displayed in the browser window 302. The information may be communicated to the profile management system 115 using any suitable communication protocol, including HTTP.
Accordingly, profile owning entities may provide profile information to the profile management system 115. The profile information may be directly captured, for example “I like cats” in the case of a preference, or “I am a father” in the case of a role. However, in some instances, the provided profile information may be ambiguous, such as “I like the giants.” It may be unclear whether the profile owning entity intends to indicate a preference for the New York Giants, the San Francisco Giants, or large people.
The profile information submitted by an entity may accordingly be submitted to the disambiguation engine 120 of FIG. 1. As will be described further below, the disambiguation engine 120 may provide a list of relevant terms that may be displayed in the disambiguation selection area 320 of FIG. 3. An entity may then select the relevant terms from the disambiguation list for addition to the profile being managed. Alternatively or in addition, an entity may select or otherwise indicate, such as by right-clicking, that one or more terms displayed anywhere in the browser window, or more generally displayed by the user device, should be added to the entity's profile. Alternatively or in addition, embodiments of a profiling system may identify an action of the entity and automatically add a related term to the electronic profile of the entity. After processing by the analysis engine 125, which will be described further below, relevant advertisements, links to relevant content, or both, may be displayed in the content area 330. In some embodiments, the content area 330 may not be provided on the same screen, or indeed on the same device, as the profile management interface 135. That is, while profile information may be entered or revised on one device, content displayed or provided based on that profile information may be provided on a different device in some embodiments.
Accordingly, the disambiguation engine 120 functions to select terms, based on preference information input by an entity, that may also be relevant to the entity and may be considered for addition to the entity's electronic profile. In one embodiment, the disambiguation engine 120 may simply provide a list of all known terms containing the entity's input. For example, if the entity entered “giants,” a dictionary or sports listing of all phrases or teams containing the word “giants” may be provided. While this methodology may accurately capture additional profile information, it may be cumbersome to implement on a larger scale.
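The simple listing approach can be sketched as a case-insensitive substring match over a list of known terms. The function name `candidate_terms` and the sample list `KNOWN_TERMS` are illustrative assumptions only.

```python
def candidate_terms(user_input, known_terms):
    """Return every known term containing the entity's input (case-insensitive)."""
    needle = user_input.strip().lower()
    if not needle:
        return []
    return [term for term in known_terms if needle in term.lower()]


# Hypothetical dictionary or sports listing.
KNOWN_TERMS = [
    "New York Giants",
    "San Francisco Giants",
    "Giant pandas",
    "Seattle Mariners",
]

print(candidate_terms("giants", KNOWN_TERMS))
```

Every lookup scans the whole list, which illustrates why this approach may become cumbersome as the number of known terms grows.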
Accordingly, the disambiguation engine 120 may function along with an indexing engine 420 as shown in FIG. 4. Generally, the indexing engine 420 accesses one or more content sources 410 to analyze the content stored in the accessed content sources 410 and generate an indexed content store 430. The content sources may include the content sources 142 of FIG. 1 and in this manner the indexing engine 420 may generate the indexed ad storage 144 and link storage 146. Although shown as separate storage, the indexed content store 430 may include indexing information stored along with the content from the content sources 410, or may include only index records related to the content in the content sources 410. The index information generally includes information about the relative frequency of terms in the content from the content sources 410. In this manner, as will be described further below, terms may be identified that frequently appear along with a query term, or in a same pattern as a query term. The disambiguation engine 120 may then access the indexed content store 430 to more efficiently identify terms related to preferences expressed by an entity. The expressed preference may be stored in one storage location, or distributed across multiple storage locations.
The indexing engine 420 may generally use any methodology to index documents from the content sources 410. The indexing engine 420 generally includes a processor and memory encoded with computer readable instructions causing the processor to implement one or more of the functionalities described. The processor and memory may in some embodiments be shared with those used to implement the disambiguation engine, analysis engine, or combinations thereof. In one embodiment, a vector space representation of documents from the content sources 410 may be generated by the indexing engine 420. A vector representation of each document may be generated containing elements representing each term in the group of terms represented by all documents in the content sources 410 used. The vector may include a term frequency-inverse document frequency (TF-IDF) measurement for the term. An example of a method that may be executed by the indexing engine 420 is shown in FIG. 5. FIG. 5 further demonstrates an example in which an indexed content store 430 may be created specific to a particular category. In some embodiments, the indexed content store may instead be generalized to one or more categories. In embodiments where the indexed content store 430 is specific to a single category of information, it may be advantageous to provide several content stores (which may be physically stored in the same or different media), each containing indexed content for a specific category. In this manner, the indexing performed by the indexing engine 420 will be specific to the category of information, and may in some cases enable greater relevance matching than querying a general content store.
Proceeding with reference to FIG. 5, the indexing engine may receive a list of category-specific expert content 512. The expert content may, for example, include a group of content in a particular category that may be considered representative of content in the category (using, for example, the Wikipedia Commons data set, or any other collection of information regarding a particular category). The indexing engine locates the category-specific content in the list over the Internet or other digital source of category-specific content 510. The source of category-specific content 510 may be located in a single storage medium, or distributed among several storage media accessible to the indexing engine over the Internet or other communication mechanisms.
The indexing engine extracts the text 514 from the expert content and may perform a variety of filtering procedures such as word normalization, dictionary look-up, and common English term removal 516. During word normalization, tenses or variations of the same word are grouped together. During dictionary look-up, meanings of words can be extracted. During common English term removal, common words such as ‘and’ or ‘the’ may be removed and not further processed. Grammar, sentence structure, paragraph structure, and punctuation may also be discarded. The indexing engine may then perform vector space word-frequency decomposition 518 of the extracted text from each document. The use of the term document herein is not meant to limit the processing to actual text documents. Rather, the term document refers to each content unit accessed by the indexing engine, such as a computer file, and may have generally any length.
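A minimal sketch of the filtering step follows, assuming a small hand-written stop-word set and a lookup table for word normalization. Both `STOP_WORDS` and `NORMAL_FORMS` are illustrative stand-ins; a practical system would likely use a fuller dictionary or stemmer, and dictionary look-up of meanings is omitted here.

```python
import re

# Illustrative common English terms to remove; not an exhaustive list.
STOP_WORDS = {"and", "the", "a", "an", "of", "to", "in", "is"}

# Hypothetical normalization table grouping tenses and variants of a word.
NORMAL_FORMS = {
    "played": "play",
    "playing": "play",
    "plays": "play",
    "teams": "team",
}


def filter_text(text):
    """Tokenize, normalize word variants, and drop common English terms."""
    # Lowercasing and tokenizing discards punctuation and sentence structure.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    normalized = [NORMAL_FORMS.get(token, token) for token in tokens]
    return [token for token in normalized if token not in STOP_WORDS]


print(filter_text("The Giants played, and the teams are playing."))
```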
During the decomposition, each document may be rated based on the term frequency (TF) of the document. The term frequency describes the proportion of terms in the document that are unique. The term frequency may be calculated by the number of times the term appears in the document divided by the number of unique terms in the document. A vector of term frequencies may be generated by the indexing engine to describe each document, the vector having elements representing a term frequency for each term contained in the entire content store analyzed.
The vector representing each document may also contain an inverse document frequency (IDF) measure that reflects how often the term is used across all documents in the content store, and therefore provides a measure of how distinctive the term may be to specific documents. The IDF may be calculated as the log of the number of documents containing the term divided by the number of documents in the content store.
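The TF and IDF measures can be sketched as follows. The TF function follows the description above (occurrences of a term divided by the number of unique terms in the document). For the IDF, this sketch assumes the conventional form log(N / n_t), where N is the number of documents and n_t the number containing the term; this is the reciprocal of the ratio as worded above and is chosen so that rarer terms receive larger, non-negative weights. All function names are illustrative.

```python
import math
from collections import Counter


def term_frequencies(tokens):
    """TF per the description: term count divided by number of unique terms."""
    counts = Counter(tokens)
    unique = len(counts)
    return {term: n / unique for term, n in counts.items()}


def inverse_document_frequencies(documents):
    """Conventional IDF (an assumption): log(total docs / docs containing term)."""
    n_docs = len(documents)
    doc_counts = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_counts.items()}


def tf_idf_vectors(documents):
    """One vector per document, with an entry for every term in the store."""
    idf = inverse_document_frequencies(documents)
    vectors = []
    for doc in documents:
        tf = term_frequencies(doc)
        vectors.append({term: tf.get(term, 0.0) * idf[term] for term in idf})
    return vectors


docs = [["giants", "win", "giants"], ["mariners", "win"]]
vectors = tf_idf_vectors(docs)
print(vectors[0])
```

In this example "win" appears in every document, so its weight is zero, while "giants" is distinctive to the first document.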
In some embodiments, a Kullback-Leibler Divergence, D_KL, may also be included in a vector representation of a document. D_KL may provide a measure of how close a document is to a query, that is, generally, how much common information there is between the query and the document. D_KL is a measure of a distance between two different probability distributions: one representing the distribution of query terms, and the other representing the distribution of terms in the document. D_KL may be calculated as:
$$D_{KL}(p \,\|\, q) = \sum_i p(i) \log \frac{p(i)}{q(i)}$$
where p is the distribution of terms in the document, q is the distribution of query terms, and i represents each term. The distribution of terms in the document may be a vector with entries for each term in a content store, where the entries are weighted according to the frequency of each term in the document. The distribution of query terms may be a vector with entries for each term in a content store, where the entries are weighted according to the frequency of each term in the query.
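A minimal sketch of the divergence computation follows, assuming the standard Kullback-Leibler definition (the sum over terms i of p(i)·log(p(i)/q(i))). Because a query usually omits most terms in the content store, the sketch adds a small smoothing constant `epsilon` to every count so that no probability is exactly zero; that smoothing, and the function names, are assumptions of this example.

```python
import math
from collections import Counter


def distribution(tokens, vocabulary, epsilon=1e-9):
    """Frequency-weighted distribution over the vocabulary, lightly smoothed."""
    counts = Counter(tokens)
    weights = [counts[term] + epsilon for term in vocabulary]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """Standard D_KL(p || q) for two distributions over the same terms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


vocabulary = ["giants", "baseball", "bank"]
query = distribution(["giants"], vocabulary)
sports_doc = distribution(["giants", "baseball", "giants"], vocabulary)
bank_doc = distribution(["bank", "bank"], vocabulary)

# The sports document should be closer to the query than the bank document.
print(kl_divergence(sports_doc, query) < kl_divergence(bank_doc, query))
```

The absolute values depend strongly on the smoothing constant, so only the ordering of documents is meaningful in this sketch.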
Accordingly, using TF-IDF, Kullback-Leibler Divergence, other methods of document relevance measurements, or combinations thereof, the indexed content store 430 of FIG. 4 contains one or more content indexes representing a measure of the importance of various terms to each analyzed document.
Having described the indexing of documents, a process for disambiguating a preference by the disambiguation engine 120 using the indexed content store 430 is illustrated in FIG. 6. An entity declares 610 a preference, for example by entry into the preference entry field 310 of FIG. 3. The disambiguation engine 120 then selects an expert content store 612 to query using the declared preference. The selection may be made in a variety of ways. In some embodiments, a single content store is used and no selection need be made. In other embodiments, the disambiguation engine 120 receives contextual information about the entity entering preference information, and the contextual information is used to select the expert content store. For example, in one embodiment, the disambiguation engine receives information that the entity entering profile information is doing so from a sports-related website, and accordingly, an expert sports content store may be selected.
Documents in the expert content store are rated 614, as described above, based on their relevance to individual terms. In some embodiments, the rating is conducted once the preference is entered, while in others, the already stored vectors containing the measurements are accessed. A set of most relevant documents to the expressed preference may be identified. The most relevant documents may be identified by calculating a relevance number for each document based on the preference terms. A relevance number represents the relevancy of each document to the preference, using the entered preference terms. Embodiments of the relevance number use a 0-100 scale, and may accommodate a multi-term preference. The relevance number for a single term may generally be calculated as a normalized TF.IDF value. In one embodiment, the calculation may be made by subtracting a minimum TF.IDF value for all terms in the indexed content store from the TF.IDF value of the term and dividing the result by the difference between the maximum TF.IDF value for all terms in the indexed content store and the minimum TF.IDF value for all terms in the indexed content store. For multiple terms in a preference, the relevance number of each document may be given as:
$$\mathrm{RelevanceNumber} = \frac{1}{\mathrm{NTerms}} \sum_{t=1}^{\mathrm{NTerms}} \mathrm{RelevanceNumber}_t$$
NTerms is the number of terms in the query. The relevance number accordingly is a sum of the relevance numbers for each term in the query, divided by the number of terms. The Kullback-Leibler Divergence, D_KL, may also be used as a relevance number to score content items from a content store, or across multiple content stores. In the case of D_KL, a lower D_KL number indicates a more relevant content item (as it may indicate the information space between the item and the preference is small).
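The single-term normalization and the multi-term average can be sketched as below. The multiplication by 100 to reach the 0-100 scale, the treatment of terms absent from a document as having the minimum TF.IDF value, and the function names are assumptions of this example.

```python
def normalized_relevance(tfidf, min_tfidf, max_tfidf, scale=100.0):
    """Min-max normalize one term's TF.IDF value onto a 0-to-scale range."""
    if max_tfidf == min_tfidf:
        return 0.0  # degenerate store: every term has the same value
    return scale * (tfidf - min_tfidf) / (max_tfidf - min_tfidf)


def document_relevance(doc_vector, query_terms, min_tfidf, max_tfidf):
    """Average the per-term relevance numbers over the terms in the query."""
    if not query_terms:
        return 0.0
    per_term = [
        normalized_relevance(doc_vector.get(term, min_tfidf), min_tfidf, max_tfidf)
        for term in query_terms
    ]
    return sum(per_term) / len(per_term)


doc_vector = {"giants": 0.8, "baseball": 0.4}
print(document_relevance(doc_vector, ["giants", "baseball"], 0.0, 0.8))
```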
While in some embodiments the calculation of relevance numbers may not change over time as the profiling system operates, in other embodiments relevance numbers, or the method for calculating relevance numbers, may be modified in a variety of ways as the profiling system operates. The relevance numbers may be modified through entity feedback or other learning methodologies including neural networks. For example, relevance numbers as calculated above may be used to develop a set of neural network weights that may be used to initialize a neural network that may refine and learn techniques for generating or modifying relevance values. The neural network may be trained on a set of training cases that may be developed in any of a variety of ways, including by using entity selection of a document to set a target value of a resultant relevance number. During training, or during operation of the profiling system, error functions may be generated between a desired outcome (such as a training case where an entity or administrator specifies the relevance score, or a situation in operation where entity feedback indicates a particular relevance score) and a calculated relevance number. The error function may be used to modify the neural network or other system or method used to calculate the relevance number. In this manner, the computation of relevance numbers, and in some embodiments, the relevance numbers themselves, may change as the profiling system interacts with content items and entities. For example, a relevance value for a content item may be increased or decreased if entity feedback indicates the content item is of greater or lesser relevance, respectively.
The entity feedback may be explicit, such as indicating a degree of relevance the entity would assign to the content item, or implicit, such as by identifying that multiple entities have selected the content item or responded to the content item to a degree that indicates the relevance number should be higher, or lower, than that assigned by the profiling system. Entity feedback may also include feedback obtained by monitoring the activity, selections, or both of one or more entities without necessarily receiving intentional feedback from the entity. Examples of neural networks, entity feedback modification, and other computer learning techniques usable with embodiments of the present invention are described in co-pending U.S. Provisional Application ______, entitled “Determining relevant information for domains of interest,” filed Dec. 12, 2008, which application is hereby incorporated by reference in its entirety for any purpose.
Referring back to FIG. 6, the set of significantly relevant documents may be identified by setting a threshold relevance number, or by setting a fixed number of results, and selecting that number of results in relevance number order, regardless of the absolute value of the relevance number. In some embodiments, the most relevant documents are selected by identifying a place in a relevance-ranked list of documents where a significant change in relevance score occurs between consecutive results. So, if, for example, there are documents with relevance numbers of 90, 89, 87, 85, 82, 80, 60, 59, 58 . . . then a threshold relevance number of 80 may be selected because it occurs prior to the relatively larger twenty-point relevance drop to the next document.
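The drop-based selection can be sketched as follows, using the relevance numbers from the example above. This sketch cuts at the single largest drop between consecutive ranked results, which is one possible reading of "significant change"; the function name is illustrative.

```python
def select_before_largest_drop(scores):
    """Keep the ranked scores that precede the largest consecutive drop."""
    ranked = sorted(scores, reverse=True)
    if len(ranked) < 2:
        return ranked
    drops = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    cut = drops.index(max(drops))  # first position of the largest drop
    return ranked[:cut + 1]


scores = [90, 89, 87, 85, 82, 80, 60, 59, 58]
selected = select_before_largest_drop(scores)
print(selected)       # [90, 89, 87, 85, 82, 80]
print(selected[-1])   # 80, the implied threshold relevance number
```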
After the most relevant documents have been selected, the disambiguation engine may determine the most distinctive related keywords 616 in those documents. The most relevant keywords may be determined by weighting the highest TF.IDF terms in the documents by the relevance number of the document in which they appear, and taking a sum of that product over all the documents for each term. The terms having results over a threshold, or a fixed number of highest resulting terms, may be selected by the disambiguation engine as most distinctive related keywords 616. These selected keywords may be presented to the entity to determine if the keyword is useful 620. For example, the keywords may be listed in the disambiguation selection area 320 of FIG. 3. The preference entering entity may find that one or more of the identified keywords helps to refine the preference they have entered, or for other reasons should be included in their electronic profile, and may indicate the keyword should be added 622 to their preference. The disambiguation engine may further continue the disambiguation operation by repeating the process shown in FIG. 6 using the added preference terms. If keywords are not identified as belonging to an entity's preference, the declared preference is stored 624.
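The keyword weighting can be sketched as a sum, over the selected documents, of each term's TF.IDF value multiplied by the relevance number of the document containing it. The input layout (a list of relevance-number and term-weight pairs holding each document's highest TF.IDF terms), the `top_n` cutoff, and the sample values are assumptions of this example.

```python
from collections import defaultdict


def distinctive_keywords(documents, top_n=3):
    """Rank terms by the sum over documents of relevance number x TF.IDF.

    documents: list of (relevance_number, {term: tfidf}) pairs, where each
    mapping holds the highest TF.IDF terms of one selected document.
    """
    totals = defaultdict(float)
    for relevance, tfidf in documents:
        for term, weight in tfidf.items():
            totals[term] += relevance * weight
    # Sort by descending total, breaking ties alphabetically for stability.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


selected_documents = [
    (90, {"quarterback": 0.5, "stadium": 0.2}),
    (80, {"quarterback": 0.4, "baseball": 0.3}),
]
print(distinctive_keywords(selected_documents, top_n=2))
```

A term that appears in several highly relevant documents accumulates a larger total than one that appears in a single document.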
Accordingly, examples of the entry of profile information and refinement of entered profile information have been described above that may facilitate the creation and storage of electronic profiles. Referring back to FIG. 1, the information contained in an entity's electronic profile may be used by the analysis engine 125 to take a predictive or deterministic action. A variety of predictive or deterministic actions may be taken by the analysis engine 125 based in part on information contained in an entity's electronic profile. Products, things, locations, or services may be selected and suggested, described, or presented to an entity based on information contained in the entity's electronic profile. In other embodiments, other entities may be notified of a possible connection to or interest in an entity based on their electronic profile. Content on a website browsed by an entity may be modified in accordance with their profile in some embodiments. The profiling system 110 may also generate or assist in the provider device generating a notification, alert, email, message, or other correspondence for the entity based on its profile. Accordingly, the analysis engine may take action for the entity or for third parties based on the entity's profile information. In one embodiment, which will be described further below, the analysis engine 125 selects content for presentation to the entity based on their electronic profile.
An example of operation of the analysis engine 125 to select relevant advertisements, links, or both, for an entity is shown in FIG. 7. The analysis engine 125 receives information 711 about a network accessible content item, such as but not limited to, a website, web page, email, messaging, message item, document, or image, accessed by an entity, or simply receives a request for information from a browser plug-in being operated by the entity or on its behalf. The analysis engine 125 accesses 710 a stored preference in an entity's electronic profile. In some embodiments, a single stored preference is accessed, in some embodiments selected preferences may be accessed, and in some embodiments all stored preferences may be accessed. The selection of which preferences associated with an entity to access may in some embodiments be made according to the context of the request for analysis. For example, if the request comes from a sports content provider, one or more sports-related preferences may be accessed. In other embodiments, multiple preferences may be accessed and the context of the request or of the entity may alter the manner in which the relevance number is computed. For example, in some embodiments a total relevance number is calculated by summing individual relevance numbers calculated using a respective preference. A weighted sum may also be taken, with the weight accorded to each individual relevance number based on the preference with which it is associated. Accordingly, an entity's context, which may be stored in the entity's electronic profile, may determine the weighting of individual preferences in calculating a relevance number.
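The plain and weighted sums can be sketched as follows, assuming each preference already has an individual relevance number for a given content item and that context supplies a weight per preference. The function name, the preference labels, and the default weight of 1.0 for unlisted preferences are illustrative assumptions.

```python
def total_relevance(per_preference_scores, weights=None):
    """Sum individual relevance numbers, optionally weighted per preference.

    per_preference_scores: {preference: relevance number for one content item}
    weights: {preference: weight}; preferences not listed default to 1.0
    """
    if weights is None:
        weights = {}
    return sum(
        score * weights.get(preference, 1.0)
        for preference, score in per_preference_scores.items()
    )


scores = {"sports": 80.0, "cooking": 40.0}

# Plain sum of the individual relevance numbers.
print(total_relevance(scores))  # 120.0

# Hypothetical context: the request came from a sports content provider,
# so the cooking preference is down-weighted.
print(total_relevance(scores, {"sports": 1.0, "cooking": 0.25}))  # 90.0
```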
A specific request may not be required to begin the process shown in FIG. 7. The analysis engine 125 may select 712 one or more content indices for analysis based on a context in which the analysis engine 125 is operating. In some embodiments, the content index or indices to use may already be known, or there may only be one, in which case the selection 712 may not be necessary. For example, the analysis engine 125 may utilize the ad and link storage 144 and 146 shown in FIG. 1.
Referring back to FIG. 7, the analysis engine scores 714 content in the selected indices based on the accessed preferences and received information about the network accessible content item(s), such as website(s) or web pages, accessed. The scoring process may occur in any manner, including a manner that allows the analysis engine to evaluate content items based on terms in the stored preference. In one embodiment, the scoring process includes assigning a relevance number to content items based on terms in the preference and terms received about the website as described above with reference to FIG. 6 and the document rating 614 performed during preference disambiguation. However, in this case, the content items are simply scored and further analysis of relevant terms within the document may not be done, as was done during preference disambiguation.
Accordingly, content items in the selected indices are scored by calculating a relevance number using the term(s) in the accessed electronic profile preference and term(s) received about the network accessible content items, such as web site(s) or page(s) accessed. Relevant advertisements and content links may then be selected 716 in a similar manner to the selection of documents and terms for the disambiguation of preferences described above. That is, content may be selected having a relevance number over a threshold, or a fixed number of highest rated content items may be selected, or all content items preceding a sharp decline in relevance number may be selected. The selected links, advertisements, or both may then be displayed in the content area 330 of the user device display shown in FIG. 3.
Having described an overview of selecting relevant advertisements and links to relevant content using electronic profile information associated with an entity as well as information about one or more network accessible content items, such as websites or web pages, visited by the entity, an example of how the content viewer 137 may display those relevant advertisement(s), link(s), or both will now be described with reference to FIG. 8. Of course, the relevant content may be displayed differently in other embodiments.
A browser window 820 is shown in FIG. 8. The browser window may be generated by any Internet browser program including but not limited to Internet Explorer, Mozilla, Safari, and Firefox. Additionally, the Internet browser program may be operating on any type of user device as generally described above. The browser window 820 generally displays website content 802 of websites visited by an entity. As is generally understood, as the entity browses the web, and follows links or enters URLs, different website content will be displayed in the area 802. The content viewer 137 described above with reference to FIG. 1 may render a relevant content area 804. The relevant content area 804 may overlay portions of the website content 802, and may generally be positioned by a viewer to a suitable location in the browser window 820, and may be pinned down as known in the art in any desired location. However, in one embodiment, as shown in FIG. 8, the relevant content area 804 makes use of unused screen width 810 that may be present when a widescreen monitor is used. Large or widescreen displays, such as displays wider than about 1024 pixels, although in some embodiments larger than 800×600 pixels, and in some embodiments wider than 1000 pixels, may have unused screen width 810 when rendering a typical website. In the process of installing a viewer, such as the content viewer 137, the application may assess a screen resolution of the user device 130 and make a determination regarding where on the screen to position the viewer. The application may recommend that the entity utilize a different monitor if the experience would be non-optimal. The typical website may be designed to appear on a screen having a different aspect ratio or width, and the unused space 810 may be present when a screen of a widescreen aspect ratio is used. In some embodiments, the content viewer 137 is configured to render the links, advertisements, or both, selected by the analysis engine 125 in the unused space 810.
In this manner, the items displayed in the relevant content area 804 may not affect the display of webpage content 802. In other embodiments, the content viewer 137 may render the relevant content area 804 within the website content area 802. In some embodiments, an entity viewing the relevant content area 804 may select the position of the area 804 in the display of the user device by dragging the area 804 around and clicking to place it in a fixed location.
The content viewer 137 may also facilitate reporting to advertisers or other content providers, provided an entity has configured their electronic profile such that it may be used to provide such information. The profiling system 110 may track a number of advertisement impressions delivered over a specified time period, and the content viewer 137 may report click-throughs on advertisements or content links to the profiling system 110. In this manner, the profiling system can report ad impressions and click-through rates. The profiling system 110 may also aggregate consumer profile data based on the electronic profiles of entities that have viewed the advertisements, clicked on the advertisements, or both. In some embodiments, the profiling system 110 aggregates data only when an entity's electronic profile indicates it may be so used. Reporting of click-through or other data may be performed using standards employed by the Internet Advertising Bureau or other organizations. Click-throughs may be reported related to advertisements, content, or both. Further, reporting may include information regarding what other advertisements, content, or both were displayed to the entity. Still further, in some embodiments, reporting can include information provided to the profiling system 110 to make the selection of the advertisements and links provided to the entity.
The relevant content area 804 may include the relevant links, advertisements, rich media, applications, or combinations thereof, supplied by the analysis engine 125. In the embodiment of FIG. 8, five links and one advertisement are provided. The links are displayed above the advertisement in FIG. 8, although other configurations are possible. As the viewer browses the web and visits different web pages, the content displayed in the website content area 802 may change. As the new website information is transmitted to the analysis engine 125, the links and ads displayed in the relevant content area 804 may also change. Although shown in FIG. 8 as a web browser window, the relevant content area 804 may in other applications be a separate application or process and, instead of or in addition to displaying selections based on a web page or site viewed, display selections based on other network accessible content accessed by an entity, such as but not limited to documents, imagery, and correspondence such as emails. A viewer may click on the links or ads in the relevant content area 804, causing further information related to the selection to appear in the webpage content area 802. In some embodiments, an entity may add information to their electronic profile by selecting terms appearing in the webpage content area 802 or the relevant content area 804 and right-clicking or otherwise indicating that the selected term should be transmitted to the profiling system 110 for inclusion in the entity's electronic profile.
In this manner, an entity operating a user device may completely control information displayed in an application window. The content displayed is based on the entity's profile and network accessible content accessed by the entity. In this manner, advertisements, content, rich media, applications, and combinations thereof, may be more accurately targeted to the entity.
An example scenario for use of the content viewer 137 and analysis engine 125 will now be described with reference to FIG. 10. The content viewer 137 is initiated 1005. This may occur, for example, by starting up an Internet browser on the user device that is equipped with a browser plug-in including software to perform the user device functions described. In some embodiments, a separate application is started up on the user device that performs the functions of the content viewer 137. Once launched, the content viewer 137 may display an initial content set. The initial content may be a default selection of advertisements, links, rich media, or combinations thereof. In other embodiments, the initial content may be selected based on the electronic profile of the entity. In such an embodiment, the identity of the entity operating the content viewer 137 is transmitted 1010 to the analysis engine 125. The entity may be identified in substantially any manner, including by logging into the content viewer 137 with a username, password, or both, or by transmitting an identification of the user device 130 to the analysis engine 125. Having received an indication of the identity of the entity, the analysis engine 125 may access 1015 the stored electronic profile associated with the entity. The initial content may be selected 1020 based on the electronic profile of the entity, in some embodiments in combination with known past browsing history of the entity, which may also be stored in the entity's electronic profile. The initial content viewer display may be rendered 1025 using appearance settings stored in the entity's electronic profile, such as by displaying a wallpaper, skin, or brand stored in the entity's electronic profile. In this manner, the initial information displayed by the content viewer 137 may be a default setting, or selected based on the entity's profile, past browsing history, or both.
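As a non-limiting sketch of steps 1010 through 1025, the initial selection might be expressed as shown below. The profile fields, the catalog of content items tagged with topic terms, and the term-overlap scoring are hypothetical choices made for illustration; the description above does not prescribe any particular selection technique.

```python
from dataclasses import dataclass, field

# Hypothetical default content shown when no profile is available.
DEFAULT_CONTENT = ["default-ad", "default-link"]
DEFAULT_SKIN = "plain"


@dataclass
class ElectronicProfile:
    interests: list = field(default_factory=list)
    browsing_history: list = field(default_factory=list)
    skin: str = DEFAULT_SKIN  # appearance setting used when rendering


def select_initial_content(entity_id, profiles, catalog, limit=3):
    """Return (content ids, skin) for the initial content viewer display.

    profiles maps an entity identifier to an ElectronicProfile.
    catalog maps a content id to a set of topic terms (an assumption).
    """
    profile = profiles.get(entity_id)  # access the stored profile (1015)
    if profile is None:
        return list(DEFAULT_CONTENT), DEFAULT_SKIN

    # Combine profile interests with past browsing topics (1020).
    terms = set(profile.interests) | set(profile.browsing_history)
    scored = [(len(terms & topics), cid) for cid, topics in catalog.items()]
    ranked = [
        cid
        for score, cid in sorted(scored, key=lambda s: (-s[0], s[1]))
        if score > 0
    ]
    # Fall back to the default selection if nothing matches.
    content = ranked[:limit] or list(DEFAULT_CONTENT)
    return content, profile.skin  # skin drives the rendering step (1025)
```

An unrecognized entity identifier yields the default content set, which corresponds to the embodiment in which the initial display is a default selection.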
The entity then browses 1030 to a web page using an Internet browser or similar viewer, or in other embodiments the entity accesses any type of network accessible content in any manner. Information about the web page visited, or content accessed, by the entity is transmitted 1035 to the analysis engine 125. The information, as described above, may include metadata associated with the web page, a URL, content of the web page, or combinations thereof. In embodiments where the network accessible content accessed is not a web page, the information transmitted may include metadata associated with the accessed content, terms or other features of the content, a location of the content, a file type, one or more protocols associated with the content, or combinations thereof. The analysis engine 125 selects 1040 content based on the entity's electronic profile and the web page information received. The selected content is then displayed 1045 by the content viewer 137. In this manner, as an entity browses to different web pages, or accesses different network accessible content, the displayed content in the content viewer 137 may change accordingly.
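Steps 1035 through 1045 might be sketched, purely as an example, in the following manner. The `PageInfo` structure, the `keywords` metadata key, and the additive scoring of profile and page matches are assumptions introduced for this sketch rather than features of any described embodiment.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PageInfo:
    """Hypothetical payload transmitted to the analysis engine (1035)."""
    url: str
    metadata: dict = field(default_factory=dict)
    text: str = ""


def page_terms(page):
    # Collect lowercase words from the page text and keyword metadata.
    words = re.findall(r"[a-z]+", page.text.lower())
    words += re.findall(r"[a-z]+", page.metadata.get("keywords", "").lower())
    return set(words)


def select_content(profile_interests, page, catalog, limit=2):
    """Pick content matching both the profile and the page (1040).

    catalog maps a content id to a set of topic terms (an assumption).
    """
    interests = set(profile_interests)
    terms = page_terms(page)
    scored = []
    for cid, topics in catalog.items():
        # One point per topic matching the profile or the current page.
        score = len(topics & interests) + len(topics & terms)
        if score > 0:
            scored.append((score, cid))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [cid for _, cid in scored[:limit]]
```

Calling `select_content` again with a new `PageInfo` each time the entity navigates produces a different ranked list, mirroring the way the displayed content may change as different web pages or other content are accessed.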
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention.