HK1116557B - Search system and methods with integration of user annotations from a trust network


Publication number
HK1116557B
Authority
HK
Hong Kong
Prior art keywords
user
annotation
trust network
search
users
Prior art date
Application number
HK08106875.9A
Other languages
Chinese (zh)
Other versions
HK1116557A1 (en)
Inventor
卢齐
埃卡特.沃瑟尔
大卫.库
仲-满.谭
凯文.李
徐志辰
帕沃.伯克欣
阿姆.A.阿瓦达拉
阿利.迪比
肯尼思.诺顿
建常.毛
Original Assignee
Jollify Management Limited
Priority date
Filing date
Publication date
Application filed by Jollify Management Limited
Priority claimed from PCT/US2005/008487 (WO2005089291A2)
Publication of HK1116557A1
Publication of HK1116557B

Description

Search system and method integrated with user annotations from a trust network
Cross Reference to Related Applications
This application claims priority from the following two U.S. provisional patent applications:
Application No. 60/553,577, entitled "Search System and Methods with Integration of User segmentation in the Networks", filed March 15, 2004; and
Application No. 60/623,282, entitled "Search System and Methods with Integration of User segmentation Networks", filed October 28, 2004.
The contents of each of these two applications are incorporated herein by reference for all purposes.
This disclosure is related to commonly owned, pending U.S. patent application No. ____ (attorney docket No.017887 and 013720US) filed at ____ and entitled "Search Systems and Methods with integration of User indications," the disclosure of which is incorporated herein by reference for all purposes.
Technical Field
The present invention relates generally to searching a corpus of documents, and more particularly to search systems and methods that incorporate user annotations of documents, including annotations provided by the querying user and annotations provided by other users who have a trust relationship with the querying user.
Background
The World Wide Web (Web) provides a vast source of interlinked information, in a variety of formats including text, images, and media content, relating to almost every conceivable topic. As the Web has grown, the ability of users to search this collection and identify content relevant to a particular topic has become increasingly important, and multiple search service providers have emerged to meet this need. Typically, a search service provider publishes a web page through which users can submit queries indicating what they are interested in. In response to a query, the search service provider typically generates a list of links to web pages or sites deemed relevant to the query and sends the list to the user in the form of a "search results" page.
Responding to a query typically includes the following steps. First, a pre-created index or database of web pages or sites is searched using one or more search terms extracted from the query to generate a list of hits (typically target pages or sites, or references to target pages or sites, that contain the search terms or are otherwise identified as relevant to the query). Next, the hits are ranked according to predefined criteria, and the best results (according to these criteria) are given the most prominent placement, e.g., at the top of the list. The ranked list of hits is sent to the user, typically in the form of a "results" page (or a set of interconnected pages) containing a list of links to the hit pages or sites. Other features, such as sponsored links or advertisements, may also be included on the results page.
Hit ranking is often an important factor in whether a user's search succeeds or fails. Queries often return so many hits that the user cannot browse all of them in a reasonable time. If the first few links the user follows fail to lead to relevant content, the user often abandons the search, and may even abandon the search service provider, even though relevant content may be available further down the list.
To maximize the likelihood that relevant content is prominently placed, search service providers have developed increasingly sophisticated page ranking criteria and algorithms. In the early days of Web searching, rankings were typically based on the number of occurrences and/or proximity of search terms on a given page. This proved inadequate, and algorithms used today typically incorporate other information in addition to the presence of search terms on the hit page itself, such as the number of other sites on the Web that link to a given hit page (which reflects how useful other content providers consider the hit page to be). One algorithm allows querying users to provide feedback by rating returned hits. These ratings are stored in association with the query, and previous positive ratings are used as a factor in ranking hits the next time the same query is entered by any user.
However, existing algorithms typically do not take into account the preferences of individual users. For example, two users entering the same query may actually be interested in different things; pages or sites that are relevant to one user may not be relevant to another. In addition, different users may have different preferences in different areas, such as how content is organized and displayed, which content providers they trust, and so on, which may affect how they evaluate or rate a given site. Thus, a site that meets the needs of one user (or even many users) may not meet the needs of the next user to enter the same query, and that user's search may still fail.
Another tool for helping individual users find content of interest to them is "bookmarking". Traditionally, bookmarks have been implemented in Web browser programs: while viewing any page, a user may choose to save a bookmark for that page. A bookmark generally includes the URL (uniform resource locator) of the page, a title, and possibly other information, such as when the user visited the page or when the user created the bookmark. The Web browser program maintains a list of bookmarks, and the user can navigate to a bookmarked page by finding it in the bookmark list. To simplify the task of navigating a bookmark list, most bookmark tools allow users to organize their bookmarks into folders. More recently, some Internet-based information services have implemented bookmarking tools that allow registered users to create and access personal lists of bookmarks from any computer connected to the Internet.
While bookmarks may be useful, the tool also has its limitations. For example, even with folders, it is difficult for a user to remember which bookmarked page contains a particular item of information that the user may be looking for at a given time. In addition, existing bookmarking tools typically do not help a user determine whether he or she has already bookmarked a given page, nor do they provide any tools for searching bookmarked information. Furthermore, existing bookmarking techniques do not provide an easy way for users to share their bookmarks with other users.
Accordingly, it is desirable to provide improved tools for assisting individual users in collecting and searching for content of interest to them.
Disclosure of Invention
Embodiments of the present invention provide search systems and methods that incorporate users' judgments about various pages or sites. These may include judgments by the querying user, as well as judgments by other users whom the querying user has selected as members of his or her "trust network," or by some other group of users identified by the querying user. For example, in some embodiments, each user participating in a content annotation system may define a list of friends, where each friend is another user of the system with whom the first user wants to share annotations; a trust network is defined for each user based on the friends lists defined by the various participating users. In other embodiments, a user's trust network is defined to include the members of a well-defined community to which the user belongs. Regardless of how the trust network is defined, annotations made by any member of the querying user's trust network may be integrated into the results of subsequent searches of the corpus by the querying user, and may also be used in various ways to enhance the querying user's experience of browsing the corpus. In other embodiments, the querying user may specify a predefined community (of which the user may or may not be a member), and annotations made by any member of that community may be integrated into the querying user's search results and may also be used to enhance the querying user's browsing experience.
According to one aspect of the invention, a method for responding to a user query includes receiving a query submitted by a querying user of a plurality of users, and searching a corpus including a plurality of documents to identify one or more hits, wherein each hit is a document of the corpus determined to be relevant to the query. A trust network is constructed for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user. A store of annotations created by the plurality of users is accessed, wherein each annotation is associated with a subject document among the documents of the corpus and with a creating user of the plurality of users, and each annotation includes user-specific metadata related to the subject document. At least one "annotated hit" is identified, where an annotated hit is a hit that is also the subject document of at least one matching annotation, and the creating user of each matching annotation is one of the members of the trust network. A search report is generated that includes a list of hits. For each annotated hit, the search report also includes information about the matching annotation. The corpus may include, for example, a plurality of web pages, and the user may be a person or a computer (or a person operating a computer).
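The matching step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the tuple layout `(doc_id, creator, metadata)`, and the report format are all assumptions made for the example.

```python
def find_annotated_hits(hits, annotations, trust_network):
    """Return {doc_id: [(creator, metadata), ...]} for every annotated hit.

    hits          -- list of document ids returned by the corpus search
    annotations   -- iterable of (doc_id, creator, metadata) tuples (assumed layout)
    trust_network -- set of user ids who are members of the trust network
    """
    hit_set = set(hits)
    matching = {}
    for doc_id, creator, metadata in annotations:
        # A matching annotation has a hit as its subject document and
        # a trust network member as its creating user.
        if doc_id in hit_set and creator in trust_network:
            matching.setdefault(doc_id, []).append((creator, metadata))
    return matching


def build_search_report(hits, annotations, trust_network):
    """List every hit in order; annotated hits carry their matching annotations."""
    matching = find_annotated_hits(hits, annotations, trust_network)
    return [{"doc": doc_id, "annotations": matching.get(doc_id, [])}
            for doc_id in hits]
```

Hits with an empty `annotations` list are ordinary hits; the others are annotated hits for which the report can show information about the matching annotations.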
In some embodiments, the trust network members include at least one other user who is explicitly identified by the querying user as a friend. For example, a trust network interface may be provided that is operable by a user to identify other users as friends. Identifications of friends are received from a plurality of input users (including the querying user) via the trust network interface, and a list of identified friends is stored for each input user. Some embodiments also allow a trust weight to be assigned to each friend in the list of identified friends. The trust weights may be assigned, for example, based on user input received via the trust network interface.
In other embodiments, building a trust network for the querying user includes automatically populating a list of identified friends of the querying user from a list of users with which the querying user communicates. The user list may include, for example, a list of instant messaging contacts maintained by the querying user, an email address book maintained by the querying user, a list of members of a community to which the querying user belongs, and so forth. A trust network interface may also be provided that is operable by a querying user to edit an automatically populated friends list.
Where lists of identified friends are used, constructing a trust network for the querying user advantageously includes retrieving the querying user's list of identified friends and adding at least one of those friends as a member of the trust network. The list of identified friends of a first one of the trust network members may also be retrieved, and at least one of that member's identified friends may also be added as a member of the trust network. In some embodiments, identified friends of trust network members are added as members of the trust network if they are connected to the querying user with a degree of separation that does not exceed a maximum value. In other embodiments, the selection of users to add to the trust network as members is based at least in part on the trust weights. The querying user may also be added to the trust network as a member.
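One way to realize the degree-of-separation rule is a breadth-first traversal of the stored friends lists. The sketch below assumes the friends lists are available as a dictionary mapping each user to a list of identified friends; the function name, the parameters `max_degree` and `include_self`, and the returned mapping are illustrative choices, not details taken from the disclosure.

```python
from collections import deque


def build_trust_network(user, friends, max_degree=2, include_self=True):
    """Return {member: degree_of_separation} for the querying user.

    friends -- dict mapping each user id to that user's list of identified
               friends (assumed storage format)
    """
    degree = {user: 0}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        if degree[current] == max_degree:
            continue  # friends of this member would exceed the maximum
        for friend in friends.get(current, []):
            if friend not in degree:  # first visit gives the shortest path
                degree[friend] = degree[current] + 1
                queue.append(friend)
    if not include_self:
        del degree[user]
    return degree
```

With `max_degree=1` the network contains only the querying user's own friends; larger values also pull in friends of friends. A weight-based variant could skip any friend whose trust weight falls below a threshold inside the inner loop.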
There is no need to explicitly identify an individual user as a friend. For example, in some embodiments, the trust network members are members of a selected community of users, which is selected by the querying user. The querying user may or may not be a member of the selected community, and the querying user may or may not be a member of the trust network. In the case where a user selects a community to define a trust network, the user may or may not have access to information identifying individual members of the community.
The annotations may be used in various ways to generate search reports. In some embodiments, the search report generated in response to the querying user's query includes a visual highlighting element applied to each hit that is an annotated hit. Where the user-specific metadata included in the annotations includes a rating, then for each annotated hit, a rating is extracted from each matching annotation and an average rating is calculated; the visual highlighting element applied to each annotated hit depends on the average rating. In some embodiments, the order of the hit list is determined based at least in part on the average ratings of the annotated hits. In other embodiments, generating the search report further includes, for each annotated hit, providing a control element in the search report that is operable by the user to request display of the user-specific metadata of at least one matching annotation. In other embodiments, generating the search report further includes generating a separate list that includes only the annotated hits.
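The averaging, highlighting, and ordering steps can be sketched as below. The text does not specify how the average rating is combined with the base relevance score, so the additive formula and its `weight` parameter are assumptions for illustration, as are the function names and the positive/negative/neutral labels.

```python
from statistics import mean


def average_ratings(ratings_by_doc):
    """ratings_by_doc: {doc_id: [rating, ...]} taken from matching annotations."""
    return {doc: mean(ratings) for doc, ratings in ratings_by_doc.items() if ratings}


def highlight(average):
    """Choose a highlighting style from the sign of the average rating."""
    if average > 0:
        return "positive"
    if average < 0:
        return "negative"
    return "neutral"


def order_hits(scored_hits, averages, weight=1.0):
    """Reorder (doc_id, base_score) pairs, nudged by the trust network's rating.

    The additive combination is an assumption; hits without annotations
    keep their base score unchanged.
    """
    def adjusted(item):
        doc, base = item
        return base + weight * averages.get(doc, 0.0)

    return [doc for doc, _ in sorted(scored_hits, key=adjusted, reverse=True)]
```

Under this scheme a positive average tends to raise a hit in the list and a negative average tends to lower it, while the size of `weight` controls how strongly the trust network's opinion competes with the base ranking.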
In some embodiments, the method further includes searching the annotation store to identify one or more additional annotated hits, wherein each additional annotated hit corresponds to a document in the corpus for which the annotation store includes an associated annotation whose creating user is one of the trust network members and whose user-specific metadata is determined to be relevant to the query. The additional annotated hits may be incorporated into the hit list of the search results page. For example, where searching the corpus includes extracting search terms from the user query and identifying each document in the corpus that contains the search terms as a hit, searching the annotation store may include identifying as an additional annotated hit each document in the corpus whose associated user-specific metadata contains the search terms.
In some embodiments, the annotation store may include at least one annotation associated with a set of documents in the corpus, and any hit that is one of the documents in the set may be identified as an annotated hit.
The user-specific metadata advantageously comprises information items explicitly entered by the user, such as ratings of the associated documents, keywords describing the associated documents, tags selected from a predefined vocabulary, descriptions of the associated documents, and so on.
According to another aspect of the invention, a method for responding to user queries includes receiving a query submitted by a querying user of a plurality of users. A trust network is constructed for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user. A store of annotations created by the plurality of users is accessed, each annotation in the store being associated with a subject document of a plurality of documents belonging to a corpus and with a creating user of the plurality of users, and each annotation further including user-specific metadata relating to the subject document. One or more hits are identified, each hit being a document in the corpus determined to be relevant to the query, and each hit also being the subject document of at least one matching annotation, wherein the creating user of each matching annotation is one of the trust network members. A search report including a list of hits is generated and sent to the querying user. The corpus may be, for example, the World Wide Web, and the users may be humans or computers (or humans operating computers).
A trust network may be constructed in various ways. For example, the trust network members may include at least one other user explicitly identified by the querying user as a friend, and the trust network may be constructed from the lists of explicitly identified friends of various users, e.g., as described above. The trust network members may also be members of a selected community of users, the community being selected by the user; the user may or may not be a member of the selected community, and the user may or may not know the identities of the individual community members.
In some embodiments, identifying the one or more hits includes comparing the query to the content of the documents in the corpus.
In another embodiment, identifying one or more hits includes comparing the query to the user-specific metadata of the annotations in a search pool, the search pool consisting of the annotations whose creating user is one of the trust network members. For example, search terms may be extracted from the query, and for each annotation in the search pool, it may be determined whether the search terms are present in the user-specific metadata; if so, the associated document is identified as a hit. In some embodiments, the user-specific metadata includes a plurality of fields, and the query may specify which fields are to be considered during this determination. In addition, for each document that is the subject document of at least one annotation in the search pool, it may be determined whether the search terms are present in the document, and the document may also be identified as a hit if they are.
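A minimal sketch of this metadata search follows, assuming each annotation is a dictionary with `doc`, `creator`, and `metadata` keys and that matching is a simple case-insensitive whole-word test. The record layout, the function names, and the matching rule are all assumptions; the text leaves them open.

```python
def _words(value):
    """Lower-cased words of a metadata value (a string or a list of strings)."""
    if isinstance(value, (list, tuple, set)):
        value = " ".join(str(v) for v in value)
    return str(value).lower().split()


def search_annotations(query, annotations, trust_network, fields=None):
    """Return ids of documents whose annotation metadata contains every search term.

    fields -- optional list of metadata field names to consider;
              None means all fields of each annotation.
    """
    terms = query.lower().split()
    hits = []
    for ann in annotations:
        if ann["creator"] not in trust_network:
            continue  # annotation is outside the search pool
        selected = fields if fields is not None else list(ann["metadata"])
        words = set()
        for name in selected:
            words.update(_words(ann["metadata"].get(name, "")))
        if all(term in words for term in terms) and ann["doc"] not in hits:
            hits.append(ann["doc"])
    return hits
```

Passing `fields=["keywords"]`, for instance, restricts the comparison to the keyword field, which corresponds to a query that specifies which metadata fields are to be considered.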
In some embodiments, for each hit, the search report includes a control element operable by the user to request display of user-specific metadata for at least one matching annotation. In other embodiments, for each hit, the search report includes at least some user-specific metadata from at least one matching annotation. In other embodiments, where the user-specific metadata included in each matching annotation includes a rating of the subject document, the hits in the list are placed in an order determined based at least in part on the rating of the hits.
In some embodiments, the annotation store may include at least one annotation associated with a set of documents in the corpus, and any document that is one of the documents in the set may be identified as a hit.
According to yet another aspect of the present invention, a computer system for responding to user queries from a plurality of users includes an index data store, a personalization data store, and a search server communicatively coupled to the index data store and the personalization data store. The index data store is configured to store searchable representations of a plurality of documents belonging to a corpus. The personalization data store is configured to store annotations, each associated with a subject document in the corpus and with a creating user of the plurality of users, each annotation including user-specific metadata related to the subject document. The search server includes input control logic, search control logic, trust network control logic, personalization control logic, and report control logic. The input control logic is configured to receive a query from a querying user of the plurality of users. The search control logic is configured to search the index data store to identify one or more hits, where each hit is a document in the corpus that is determined to be relevant to the received query. The trust network control logic is configured to construct a trust network for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user. The personalization control logic is configured to identify as an annotated hit each hit that is the subject document of at least one matching annotation, wherein the creating user of each matching annotation is one of the trust network members. The report control logic is configured to generate a search report including a list of hits, the search report further including, for each annotated hit, information about at least one matching annotation; the report control logic is further configured to send the search report to the querying user.
According to yet another aspect of the present invention, a computer system for responding to user queries from a plurality of users includes an index data store, a personalization data store, and a search server communicatively coupled to the index data store and the personalization data store. The index data store is configured to store searchable representations of a plurality of documents belonging to a corpus. The personalization data store is configured to store annotations, each associated with a subject document in the corpus and with a creating user of the plurality of users, each annotation including user-specific metadata related to the subject document. The search server includes input control logic, trust network control logic, search control logic, and report control logic. The input control logic is configured to receive a query from a querying user of the plurality of users. The trust network control logic is configured to construct a trust network for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user. The search control logic is configured to identify one or more documents from the corpus as hits, wherein each hit is determined to be relevant to the query, and each hit is also the subject document of at least one matching annotation, wherein the creating user of each matching annotation is one of the trust network members. The report control logic is configured to generate a search report including the list of hits; the report control logic is further configured to send the search report to the querying user.
The following detailed description and the accompanying drawings will provide a better understanding of the nature and advantages of the present invention.
Drawings
FIG. 1 is a block diagram of an information retrieval and communication network according to an embodiment of the present invention.
FIG. 2 is a block diagram of an information retrieval and communication network according to another embodiment of the present invention.
FIG. 3 is an example of a content field for an annotation according to an embodiment of the invention.
FIG. 4 is an example of a folder entry for organizing annotations according to an embodiment of the present invention.
FIG. 5 is a network diagram of a trust network according to an embodiment of the present invention.
FIG. 6 is an example of a trust network interface page according to one embodiment of the present invention.
FIG. 7 is an example of a toolbar-based interface for annotating and/or viewing existing annotations for any page that the user happens to be viewing, according to an embodiment of the present invention.
FIG. 8 is an example of an overlay for displaying annotations according to an embodiment of the present invention.
FIGS. 9A and 9B are examples of search result pages enhanced with annotation information according to embodiments of the invention.
FIG. 10 is a flow diagram of a process for incorporating annotations of trust network members into a response to a current query from a querying user in accordance with an embodiment of the present invention.
FIG. 11 is an example of a personal Web search interface page according to an embodiment of the present invention.
FIG. 12 is a flow diagram of a process for responding to a query during a personal Web search in accordance with an embodiment of the present invention.
FIG. 13 is an example of folder privacy settings according to an embodiment of the present invention.
FIG. 14 is an example of a library interface page for interacting with a user's own annotations according to an embodiment of the present invention.
FIG. 15 is an example of an import interface page according to an embodiment of the present invention.
FIGS. 16A and 16B are examples of interface pages for searching the community Web according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention provide systems and methods that allow users to share their annotations in relation to various documents (or other content items) found in a corpus such as the world wide web. The term "annotation" as used herein generally refers to any descriptive and/or evaluative metadata related to documents from a corpus, where the metadata is collected from a user and then stored in association with an identifier of the user and an identifier of the subject document (i.e., the document to which the metadata is related). The annotations may include various fields of metadata, such as a rating of the page or site (which may be positive or negative), one or more keywords or tags identifying the topic(s) of the page or site, a free text description of the page or site, and/or other fields. Annotations are advantageously collected from users of the corpus and stored in association with identifiers of the users who created the annotations and identifiers of documents (or other content items) related thereto. Examples of annotations and processes for collecting annotations from users are described in application No. ____ (attorney docket No.017887 and 013720US) cited above. It should be understood that the present invention is not limited to particular metadata or to particular techniques for collecting metadata.
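As an illustration of the definition above, an annotation record and a simple in-memory store might look like the following sketch. The field names, the class names, and the rule that a user holds at most one annotation per document are assumptions made for the example; the disclosure is expressly not limited to particular metadata.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    user_id: str                      # identifier of the creating user
    doc_id: str                       # identifier (e.g., URL) of the subject document
    rating: Optional[int] = None      # positive or negative judgment, if any
    keywords: List[str] = field(default_factory=list)
    description: str = ""             # free-text description


class AnnotationStore:
    """In-memory store indexed by subject document, then by creating user."""

    def __init__(self):
        self._by_doc = defaultdict(dict)  # doc_id -> {user_id: Annotation}

    def save(self, annotation):
        # Assumption: a later annotation by the same user replaces the earlier one.
        self._by_doc[annotation.doc_id][annotation.user_id] = annotation

    def for_document(self, doc_id, users=None):
        """Annotations of doc_id, optionally limited to a set of creating users."""
        by_user = self._by_doc.get(doc_id, {})
        return [a for u, a in by_user.items() if users is None or u in users]
```

Passing a trust network as `users` returns only the annotations whose creating user is a member of that network.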
In an embodiment of the invention, each user participating in the content annotation system may define a list of friends, where each friend is another user of the system with whom the first user wants to share annotations. Based on the friends lists defined by the respective participating users, a trust network is defined for each user, and annotations made by the first user's trust network members may be integrated into the results of subsequent searches of the corpus made by the first user, and may also be used in various ways to enhance the first user's experience of browsing the corpus.
For example, when a first user searches a corpus, any hits corresponding to documents that have been annotated by the first user or by any other member of the first user's trust network (referred to herein as "annotated hits") may be highlighted, and links may be provided that allow the user to view the annotations. Where the annotations include judgment data, such as numerical ratings, the judgment data may be aggregated across the first user's trust network, and the annotated hits may be highlighted in a manner that indicates whether the aggregate judgment is positive or negative. Additionally, the aggregated numerical ratings from the first user's trust network may be used to rank search results responsive to the first user's query, where positive aggregate ratings tend to raise the rank of a given page or site and negative aggregate ratings tend to lower it.
In another embodiment, where the annotations include user-provided textual descriptions and/or descriptive keywords or tags, the first user has the option to search the content of the annotations created by members of his or her trust network (in addition to or in place of the page content). In other embodiments, whenever the first user visits a page that has been annotated by any member of his or her trust network, a control is provided that allows the user to view those annotations.
For illustrative purposes, the description and drawings of the present invention may make use of specific queries, search result pages, URLs, and/or web pages. Such use is not intended to imply any opinion, endorsement, or disparagement of any actual web page or site. In addition, it should be understood that the invention is not limited to the specific examples described herein.
I. Overview
A. Overview of network implementation
FIG. 1 illustrates an overview of an information retrieval and communication network 10 including a client system 20 according to an embodiment of the present invention. In computer network 10, client system 20 is coupled through the Internet 40, or other communication network (e.g., over any local area network (LAN) or wide area network (WAN) connection), to any number of server systems 50_1 to 50_N. As described herein, client system 20 is configured according to the present invention to communicate with any of server systems 50_1 to 50_N, e.g., to access, receive, retrieve, and display media content and other information such as web pages.
Several elements of the system shown in FIG. 1 are conventional, well-known elements that need not be described in detail here. For example, client system 20 may include a desktop personal computer, workstation, laptop, personal digital assistant (PDA), cell phone, any WAP-enabled device, or any other computing device capable of interfacing directly or indirectly with the Internet. Client system 20 typically runs a browsing program, such as Microsoft's Internet Explorer™ browser, the Netscape Navigator™ browser, the Mozilla™ browser, the Opera™ browser, or, in the case of a cell phone, PDA, or other wireless device, a WAP-enabled browser or the like, allowing a user of client system 20 to access, process, and view information and pages available to it from server systems 50_1 to 50_N over Internet 40. Client system 20 also typically includes one or more user interface devices 22, such as a keyboard, mouse, touch screen, pen, or the like, for interacting with a graphical user interface (GUI) provided by the browser on a display (e.g., monitor screen, LCD display, etc.) in conjunction with pages, forms, and other information provided by server systems 50_1 to 50_N or other servers. The invention is suitable for use with the Internet, which refers to a specific global internetwork of networks. However, it should be understood that other networks may be used instead of or in addition to the Internet, such as an intranet, an extranet, a virtual private network (VPN), a non-TCP/IP-based network, any LAN or WAN, or the like.
According to one embodiment, client system 20 and all of its components are operator-configurable using an application that includes computer code run using a central processing unit such as an Intel Pentium™ processor, an AMD Athlon™ processor, or the like, or multiple processors. Computer code for operating and configuring client system 20 to transmit, process, and display data and media content as described herein is preferably downloaded and stored on a hard disk, but the entire program code, or portions thereof, may also be stored in any other known volatile or non-volatile memory medium or device (e.g., ROM or RAM), or provided on any medium capable of storing program code, such as a compact disc (CD) medium, a digital versatile disc (DVD) medium, a floppy disk, and the like. In addition, all or part of the program code may be transmitted and downloaded from a software source, e.g., from one of server systems 50_1 to 50_N to client system 20 over the Internet, or transmitted over any other network connection (e.g., extranet, VPN, LAN, or other conventional network) using any communication medium and protocol (e.g., TCP/IP, HTTP, HTTPS, Ethernet, or other conventional media and protocols).
It should be appreciated that computer code for implementing aspects of the present invention may be written in C, C++, HTML, XML, Java, JavaScript, or the like, in any other suitable scripting language (e.g., VBScript), or in any other suitable programming language that can be executed on client system 20 or compiled to execute on client system 20. In some embodiments, no code is downloaded to client system 20; the required code is executed by a server, or code already present on client system 20 is executed.
B. Search and annotation system overview
Fig. 2 illustrates another information retrieval and communication network 110 for transmitting media content in accordance with an embodiment of the present invention. As shown, network 110 includes client system 120, one or more content server systems 150, and search server system 160. In network 110, client system 120 is communicatively coupled to server systems 150 and 160 through the Internet 140 or other communication network. As described above, client system 120 and its components are configured to communicate with server systems 150 and 160 and other server systems via Internet 140 or other communication networks.
According to one embodiment, a client application (represented as module 125) executing on client system 120 includes instructions for controlling client system 120 and its components to communicate with server systems 150 and 160 and to process and display data content received therefrom. Client application 125 is preferably transmitted and downloaded to client system 120 from a software source such as a remote server system (e.g., server system 150, server system 160, or another remote server system), but client application module 125 may also be provided on any software storage medium as described above, such as a floppy disk, CD, DVD, or the like. For example, in one aspect, the client application module 125 may be provided to the client system 120 via the Internet 140 in the form of an HTML wrapper that includes various controls, such as embedded JavaScript or ActiveX controls, for manipulating data and rendering the data within various objects, frames, and windows.
In addition, the client application modules 125 include various software modules for processing data and media content, such as a specialized search module 126 for processing search requests and search result data, a user interface module 127 for presenting data and media content within text and data frames and active windows (e.g., browser windows and dialog boxes), and an application interface module 128 for interfacing to and communicating with various applications executing on the client 120. Examples of applications executing on client system 120 with which application interface module 128 is preferably configured to interface include various email applications, Instant Messaging (IM) applications, browser applications, document management applications, and the like, in accordance with certain aspects of the present invention. Additionally, user interface module 127 may include a browser, such as a default browser configured on client system 120 or a different browser.
According to one embodiment, search server system 160 is configured to provide search result data sets and media content to client system 120, and content server system 150 is configured to provide data, such as web pages, and media content to client system 120, for example, in response to a link selected in a search result page provided by search server system 160. In some variations, search server system 160 returns content and/or links and/or other references to content. The search server system includes a query response module 162 configured to receive queries from users and generate search result data therefor, a user annotation module 164 configured to manage user interaction with annotation information provided by users, and a trust network module 165 configured to manage the trust networks of users. Search server system 160 is communicatively coupled to a personalization database 166, which stores data related to particular users of search server system 160, and to a page index 170, which provides an index to the corpus to be searched (in some cases, the World Wide Web). The personalization database 166 and page index 170 may be implemented using conventional database technology.
In one embodiment, trust network module 165 establishes a "friends" list for each registered user of search server 160 and stores the list in personalization database 166. The friends list may be automatically initialized by the trust network module 165 and edited by the user (as described below), or it may be created manually. Based on the friends lists established for the various users, the trust network module 165 defines a trust network for each user, the trust network including the friends of the user, and in some cases the friends of the user's friends, and so on, up to some limit (described below).
In some embodiments, the trust network module 165 dynamically builds a trust network for each user; this includes generating a list of trust network members and associated parameters (e.g., trust weights or confidence coefficients, as described below) for each member. The building of a trust network for a given user may occur in real-time as trust network information is needed (e.g., as the user submits a query). Alternatively, a given user's trust network may be built under predetermined conditions and stored for subsequent applications. Examples of conditions that may trigger the construction (or reconstruction) of trust network information include: each time a user initiates a new session with search server 160; each time the user updates his/her friends list, as described below; or at regularly scheduled intervals (e.g., daily).
In one embodiment, user annotation module 164 interacts with personalization database 166 to store and manage user annotation data for various users of search server system 160. For example, annotation data received from a user may be provided to user annotation module 164 for storage in personalization database 166, and user annotation module 164 may also respond to any request for annotation data, including requests from query response module 162, other components of search server 160, and/or client 120.
Various interfaces may be provided for a user to enter annotation data. Certain examples are described in the above-referenced application (attorney docket No. 017887-013720US); any of these interfaces or others may also be used. When the user elects to annotate a page or site, user annotation module 164 receives the new annotation data from the user (e.g., via client system 120) and updates personalization database 166.
In one embodiment, the query response module 162 references a page index 170 that is populated with, for example, pages, links to pages, data representing the content of indexed pages, and so forth. The page index may be generated by various collection techniques, including an automatic web crawler 172 and/or various spiders, as well as manual or semi-automatic classification algorithms and interfaces for classifying and ranking web pages within a hierarchical structure. These techniques may be implemented in search server system 160 or within a separate system (e.g., web crawler 172) that generates page index 170 and makes it available to search server system 160. Various page index implementations and formats are known in the art and may be used for page index 170.
The query response module 162 is configured to provide data responsive to various search requests (queries) received from the client system 120, and in particular from the search module 126. The term "query" as used herein encompasses any request from a user to search server 160 (e.g., via client 120) that can be satisfied by searching the Web (or other corpus) indexed by page index 170. In one embodiment, a search interface is presented to the user via search module 126. The interface may include a text box into which the user may enter a query (e.g., by typing), check boxes and/or radio buttons for selecting from predefined queries, a directory or other structure that enables the user to limit the search to a predefined subset of the full search corpus (e.g., to certain websites or to a category subsection within the page index 170), and so forth. Any search interface may be used.
The query response module 162 is advantageously configured with search-related algorithms for processing and ranking web pages relevant to a given query (e.g., based on a combination of logical relevance as measured by the occurrence pattern of search terms extracted from the query, contextual identifiers associated with the search terms and/or specific pages or sites, page sponsors, connectivity data collected from multiple pages, etc.). For example, the query response module 162 may analyze a received query to extract one or more search terms, and then use these search terms to access the page index 170 to generate a "hit" list, i.e., pages or sites (or references to pages or sites) that are determined to have at least some relevance to the query. The query response module 162 may then rank the hits using one or more ranking algorithms. The particular algorithm used to identify hits and rank the hits is not important to the present invention and conventional algorithms may be used.
In some embodiments of the present invention, query response module 162 is further configured to retrieve from personalization database 166 any annotation data associated with any user belonging to the querying user's trust network (including the querying user) and to incorporate such annotation data into the search results. Retrieval of annotation data may include interaction between the query response module 162 and the trust network module 165 (e.g., to obtain a list of trust network members) and/or interaction between the query response module 162 and the user annotation module 164 (e.g., to retrieve annotation data once the trust network members have been identified).
The incorporation of annotation data can be done in a number of ways. For example, where at least some annotations include ratings, hits may be identified and/or ranked based at least in part on rating information. The ratings given to a hit page or site by individual trust network members may be used directly, or an aggregate (e.g., average) rating across all trust network members who rated a particular page may be used. In one embodiment, the query response module 162 may generate a separate list of "positive" results based on positive ratings of a particular page or site; or the query response module 162 may incorporate the ratings of a particular page or site into the ranking of search results; or the query response module 162 may use negative ratings of a particular page or site by trust network members to determine whether to drop a hit from the list of hits included in the search results page. Where the annotations include textual descriptions, keywords, or labels, the occurrence of a search term in any of these elements may be considered during the identification and/or ranking of search hits.
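One way of folding trust network ratings into a ranking can be sketched as follows. This is a minimal illustration only, since the description leaves the ranking algorithm open: the `Hit` class, the `weight` and `drop_below` parameters, and the additive scoring formula are all assumptions introduced for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Hit:
    url: str
    base_score: float  # relevance score from any conventional ranking algorithm


def rerank_with_ratings(hits, ratings_by_url, weight=0.5, drop_below=None):
    """Re-rank hits using the average rating given by trust network members.

    ratings_by_url maps a URL to the list of numeric ratings that trust
    network members gave it. The additive formula is an assumption made
    for illustration. Rated hits whose average falls below drop_below
    are removed from the result list.
    """
    scored = []
    for hit in hits:
        ratings = ratings_by_url.get(hit.url, [])
        avg = mean(ratings) if ratings else 0.0  # unrated pages count as neutral
        if drop_below is not None and ratings and avg < drop_below:
            continue  # negatively rated by the trust network: drop the hit
        scored.append((hit.base_score + weight * avg, hit))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in scored]
```

Passing `drop_below=None` keeps every hit and uses the ratings only to reorder the list, which corresponds to the second of the three options described above.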
To enable search personalization features such as trust network annotations, search server 160 advantageously provides a user login feature, where "login" generally refers to any scheme for identifying and/or authenticating a user of a computer system. Numerous examples are known in the art and may be used in conjunction with embodiments of the present invention. For example, in one embodiment, each user has a unique user Identifier (ID) and password, and the search server 160 prompts the user to log in by passing a login page (via which the user can enter this information) to the client 120. In other embodiments, biometric, voice, or other recognition and authentication techniques may be used in addition to (or instead of) the user ID and password. Once the user has identified himself, for example, by logging in, the user may create and/or update annotations by interacting with the user annotation module 164, as described below. Additionally, each query entered by a logged-in user may be associated with a unique user ID for that user; based on the user ID, the query response module 162 may access the personalized database 166 to incorporate annotations from members of the querying user's trust network into responses to the user's query. The user login is advantageously persistent in the sense that once the user has logged in (e.g., via the client application 125), the user's identity may be transmitted to the search server 160 at any suitable time while the user is operating the client application 125. Thus, the personalization features described herein may be continuously accessible to the user.
In addition to using the annotations of trust network members in responding to queries, the query response module 162 may also use aggregated information collected from the annotations of other users. For example, in one embodiment, a global aggregate rating (e.g., average rating) for a page or site is computed from the ratings of every user who has provided an annotation rating the page or site (whether or not that user is a trust network member). The global aggregate rating may be used to select search hits and/or rank search hits. In another embodiment, global aggregate keywords or labels describing a page or site may be determined, for example, by identifying the keywords or labels that are most commonly applied to the page or site by users who have annotated it (whether or not they are trust network members). Such aggregated annotations for a given page may be stored, for example, in the page index 170 and used by the query response module 162 to rank hits in response to queries (whether or not the querying user is known to the search server 160).
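The global aggregation described above can be sketched in a few lines. The dictionary layout of each annotation (`url`, `rating`, `keywords`) and the `top_n` cutoff are assumptions made for this example; the description does not prescribe a storage format.

```python
from collections import Counter
from statistics import mean


def aggregate_annotations(annotations, top_n=3):
    """Compute a per-URL average rating and the most common keywords.

    annotations: iterable of dicts with a 'url' key and optional 'rating'
    and 'keywords' keys (a hypothetical layout for this sketch). All
    annotating users are included, whether or not they belong to any
    particular trust network.
    """
    ratings = {}
    keywords = {}
    for ann in annotations:
        url = ann["url"]
        if ann.get("rating") is not None:
            ratings.setdefault(url, []).append(ann["rating"])
        keywords.setdefault(url, Counter()).update(ann.get("keywords", []))
    return {
        url: {
            "avg_rating": mean(ratings[url]) if url in ratings else None,
            "top_keywords": [kw for kw, _ in keywords[url].most_common(top_n)],
        }
        for url in keywords
    }
```

The resulting per-URL summaries are the kind of aggregated data that could be stored alongside the page index and consulted at query time.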
In one embodiment, the user annotation module 164 forwards the new annotation data to an aggregator module (not shown in FIG. 2) that updates the aggregated annotation data stored in the page index 170 as it is received. The aggregated annotation data may be updated at regular intervals (e.g., daily or hourly), or approximately in real-time. The collection and use of global aggregated annotation data is described in application No. ____ (attorney docket No. 017887-.
In other embodiments, the query response module 162 may be configured to respond to a query by searching, or reporting hits from, a subset of the full corpus. For example, a user may be able to submit a query and request that only those documents that have been annotated by members of the user's trust network be reported as hits. As another example, a user may be able to request that only those documents that have been annotated by a particular trust network member be reported as search hits. Examples of these operations are described below.
It should be appreciated that the search system described herein is exemplary and that variations and modifications are possible. The content server and search server systems may be part of a single organization, such as a distributed server system provided to users by Yahoo! Inc., or they may be part of different organizations. Each server system typically includes at least one server and an associated database system, and may include multiple servers and associated database systems, and although shown as a single block, may be geographically distributed. For example, all servers of the search server system may be located adjacent to one another (e.g., in a server farm located within a single building or campus), or may be distributed at locations remote from one another (e.g., one or more servers located in city A and one or more servers located in city B). Thus, a "server system" as used herein generally includes one or more logically and/or physically connected servers distributed locally or across one or more geographic locations; the terms "server" and "server system" are used interchangeably. In addition, the query response module and the user annotation module described herein may be implemented on the same server or on different servers.
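Restricting hits to annotated documents amounts to a simple filter. The sketch below assumes, purely for illustration, that annotations are available as `(author_id, url)` pairs and that hits are plain URLs; the function name is hypothetical.

```python
def restrict_to_annotated(hit_urls, annotations, allowed_authors):
    """Keep only hits annotated by at least one of the allowed authors.

    hit_urls: ranked list of URLs returned by the ordinary search.
    annotations: iterable of (author_id, url) pairs (hypothetical layout).
    allowed_authors: set of user IDs, e.g. every member of the querying
    user's trust network, or a single member's ID.
    """
    annotated = {url for author, url in annotations if author in allowed_authors}
    # Preserve the original ranking order of the surviving hits.
    return [url for url in hit_urls if url in annotated]
```

Passing the whole trust network as `allowed_authors` covers the first case described above; passing a one-element set covers the second.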
The search server system may be configured with one or more page indexes and algorithms for accessing the page index(s) and providing search results to the user in response to a search query received from the client system. The search server system may generate the page index itself, receive the page index from another source (e.g., a separate server system), or receive the page index from another source and perform further processing thereon (e.g., addition or update of various page information). Additionally, although the search server system is described as including a particular combination of component modules, it should be understood that the division of the modules is merely for convenience of description; more, fewer, or different modules may be defined.
Additionally, in some embodiments, some of the modules and/or metadata maintained by search server 160 described herein may reside, in whole or in part, on a client system. For example, some or all of the user annotations may be stored locally at the client system 120 and managed by component modules of the client application 125. Other data (including some or all of page index 170) may be periodically downloaded from search server 160 and stored by client system 120 for subsequent use. In addition, the client application 125 may create and manage a content index stored locally at the client 120, and may also provide the ability to search for locally stored content, incorporate search results including locally stored content into Web search results, and so forth. Thus, the search operation may include any combination of operations performed by the search server system and/or the client system.
In embodiments of the present invention, annotations may be collected from a user in a variety of ways, including annotations entered from a search results page, annotations entered using a toolbar interface, and so forth. Examples of collecting annotation data are described below.
C. Annotation overview
Annotation data stored in personalized database 166 may be collected from registered users of search server 160 via a variety of suitable interfaces. Some examples of annotation formats and interfaces for collecting annotations are described in application No. ____ (attorney docket No. 017887-. However, it should be understood that the present invention is not limited to a particular annotation format or annotation collection technique.
1. Annotation content
As described above, the term "annotation" as used herein generally refers to any descriptive and/or evaluative metadata that pertains to a page or site (or other content item in a corpus) collected from a user and is thereafter stored in association with an identifier of the user and an identifier of the page or site. The annotations may include various metadata fields, such as a rating of the page or site (which may be any data indicating a positive or negative opinion), one or more keywords identifying the topic(s) of the page or site, a textual description of the page or site, and/or other fields. For illustrative purposes, specific annotation structures are now described; it should be understood that the particular annotation structure is not important to the invention.
As used herein, a "page" refers to a unit of content that can be identified by a unique locator (e.g., a URL) and displayed by a suitably configured browser program. "site" refers to a group of one or more pages related to a common topic and located on the same server. In some embodiments of the invention, the user creating the annotation may indicate whether the annotation should be applied to a single page or to a group of related pages (a site). In the latter case, the user may advantageously define the scope of the site. In some embodiments, there is no other distinction between page annotations and site annotations other than the number of pages to which the annotations may apply.
In one embodiment, each annotation is a structured entry in personalization database 166. FIG. 3 illustrates the content fields of an annotation 300. The fields in the left column 302 may be automatically generated and updated by the user annotation module 164; the fields in the right column 304 are preferably user-provided.
The automatically generated fields include an "Author ID" (Author ID) field 306 that stores the user ID of the user who created (or saved) the annotation, and a "URL" field 308 that identifies the page (or group of pages) that is the object of the annotation. In this embodiment, the annotation is associated with the user whose ID is represented in author ID field 306, and any document whose URL matches the URL stored in URL field 308. The "Host flag" (Host flag) field 310 indicates whether the annotation applies to a single page or to a group of pages. If the host flag is set to "page", the annotation applies only to pages whose URL exactly matches the string in field 308, whereas if the host flag is set to "site", the annotation applies to any page whose URL begins with the string shown in field 308. Thus, annotations with the host flag set to "site" may be applied to any number of pages (including only to one page). The host flag field 310 may be automatically set to a default value (e.g., "page") and the user may be given the option to change the value.
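The matching rule for the host flag can be expressed directly. This is a minimal sketch; the function name and the string values "page" and "site" follow the text above, and everything else is an assumption for illustration.

```python
def annotation_applies(annotation_url, host_flag, page_url):
    """Return True if an annotation applies to the page at page_url.

    With host_flag "page", the URL must match exactly; with "site",
    any page whose URL begins with the stored string matches.
    """
    if host_flag == "page":
        return page_url == annotation_url
    if host_flag == "site":
        return page_url.startswith(annotation_url)
    raise ValueError(f"unknown host flag: {host_flag!r}")
```

For example, an annotation stored with URL `http://example.com/docs/` and flag "site" would apply to `http://example.com/docs/intro.html`, while the same annotation with flag "page" would not.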
The "Title" field 312 stores the title of the subject page. This field is advantageously populated by default with the page title extracted from the source code of the annotated page; in some embodiments, the user is allowed to change the title. The "Abstract" field 314 stores a textual abstract of the subject page or site; the abstract may be generated automatically or provided by the user.
The remaining fields in column 302 provide historical information about the annotation. For example, a "Referral" field 316 provides contextual information about how the user arrived at the subject page. The referral field 316 may include, for example, a query (in response to which the user was led to the subject page, as shown in FIG. 3), historical information about the content viewed by the user prior to navigating to the annotated page, or an identifier of another user from whom the author obtained the annotation (as described below).
In the event that the user has annotated the page and subsequently modified the annotation, the referral field 316 is advantageously updated to identify the referral source that led to the modified annotation. The "Old referral" field 318 may be used to store contextual information relating to a previous version of the annotation; this information is similar to the information stored in the referral field 316. Any number of old referrals (including none) may be maintained.
The "Last updated" field 320 provides a timestamp indicating when the user last updated the annotation. The "Last visited" field 322 provides a timestamp indicating when the user last visited the annotated page. Although FIG. 3 shows these timestamps in a year-month-day-minute-second format, it should be understood that other formats and any desired precision may alternatively be used. This information may be used, for example, to identify older annotations that may be less reliable (especially if the annotated page has been updated more recently than the user last visited it).
The fields in column 304 are provided by the author and advantageously remain empty until and unless the user provides data. In the preferred embodiment, the user is not required to provide data for all of these fields, and any empty fields may be ignored when the annotation is used in the search process.
The "Keywords" field 324 stores one or more user-provided keywords or user-selected labels that describe the subject page. As used herein, "keyword" (sometimes also referred to in the art as a tag) refers to a word or phrase provided by a user who is free to choose any word or phrase, and "label" refers to a word or phrase selected by a user from a system-defined vocabulary, such as a hierarchical list of category identifiers.
The "Description" field 326 stores a textual Description of the subject page provided by the user. In populating the field, the user is not limited to words or phrases, nor to any particular length, and the text may be formatted or unformatted. In some embodiments, description field 326 may store a fairly long text string (e.g., up to 500 or 1000 words). The user may also be allowed to include links to other content as part of the description. Links may be included, for example, to identify other sites that provide more detailed information about the subject matter referred to by the annotation page.
The "Rating" field 328 stores a numerical value or other indicator that reflects the user's opinion or judgment of the subject page. The evaluations may be provided using various scales, which preferably allow for at least "positive" (stable), "negative" (stable), and "neutral" evaluations. For example, in one embodiment, the user is prompted to give a positive (e.g., thumb up) or negative (e.g., thumb down) rating to the subject page during annotation creation. The positive and negative evaluations are each assigned a numerical value (e.g., +2 and-2, respectively); pages that are not evaluated are given a default evaluation (e.g., 0) representing a "neutral" judgment. Other evaluation systems (e.g., 0-4 stars, 1-10 grades, etc.) may also be used. The rating indicator stored in field 328 need not match the rating scale used by the user (e.g., if the user rates the page on a scale of 1-10, this may be converted to a rating indicator ranging from-4 to 5). Any page that is annotated by the user but not rated is advantageously considered to have a neutral rating.
It should be understood that the annotation entry 300 is exemplary and that other annotation structures having different fields may be used. For example, in some embodiments, the annotations may include a representation of a portion or all of the content of the subject page in compressed or uncompressed form. In other embodiments, the user may connect the description to a particular portion of the content of the subject page, and the portion to which the description is connected may be stored in the annotation. In another embodiment, the search server 160 may also classify pages or sites according to some classification, and such classification data may be saved as part of the annotation.
Other metadata related to the subject page (or site) may also be collected in the annotation record and automatically updated as the user continues to browse. For example, a counter may be provided to count the number of times a user visits their annotated page or site. The counter and/or the timestamp of the last visit may be automatically updated each time the user visits a page or site. In some embodiments, only those accesses that occur while the user is logged into search server 160 result in an automatic update.
The annotation entries may take any format suitable for storage in personalization database 166 (e.g., relational database schema, XML records, etc.) and may be accessed by reference to various fields. In one embodiment, the annotation record is accessible by at least the author ID, URL, title, and keywords.
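As one possible in-memory representation, the annotation fields described above can be modeled as a record with lookup by author ID and URL. This is only a sketch: the class names, field names, and types are hypothetical, and the description explicitly allows any storage format (relational schema, XML records, etc.).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Annotation:
    # Automatically generated fields (column 302)
    author_id: str
    url: str
    host_flag: str = "page"             # "page" or "site"
    title: str = ""
    abstract: str = ""
    referral: Optional[str] = None      # field 316: how the user reached the page
    old_referrals: list = field(default_factory=list)
    last_updated: Optional[datetime] = None
    last_visited: Optional[datetime] = None
    # User-provided fields (column 304); empty until the user supplies data
    keywords: list = field(default_factory=list)
    description: str = ""
    rating: int = 0                     # 0 represents a neutral rating


class AnnotationStore:
    """Toy store indexed by author ID and by URL (illustrative only)."""

    def __init__(self):
        self._by_author = {}
        self._by_url = {}

    def add(self, ann):
        self._by_author.setdefault(ann.author_id, []).append(ann)
        self._by_url.setdefault(ann.url, []).append(ann)

    def by_author(self, author_id):
        return list(self._by_author.get(author_id, []))

    def by_url(self, url):
        return list(self._by_url.get(url, []))
```

Indexes by title and keyword could be added in the same way; they are omitted here to keep the sketch short.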
2. Collecting annotation data
Annotations may be collected from the user in a variety of ways, examples of which are described in application No. ____ (attorney docket No. 017887-013720US) cited above. As described therein, the user may elect to annotate any page displayed in a Web browser client equipped with a suitable toolbar, or the user may elect to annotate a page that appears in a list of search hits.
In embodiments of the present invention, any suitable technique may be used to collect descriptive and/or evaluative metadata from a user about a single page (or group of pages) and to associate that metadata with the user who provided it and with the subject page (or group of pages). As each user visits and annotates various pages or sites, that user builds a personal "library" of content of interest, and each user can view and edit his or her own library, as described, for example, in application No. ____ (attorney docket No. 017887-013720US) cited above.
3. Organization of annotations
In some embodiments, the user may organize their annotations using folders. For example, each user may have a "Main" (Main) folder into which the user's new annotations are placed by default. The user may create additional folders as desired. In some embodiments, the user may also define subfolders within the folder. The user interface for creating and managing folders may be of conventional design.
In one embodiment, each folder is defined in personalization database 166 by a folder entry. FIG. 4 illustrates a folder entry 400 according to an embodiment of the invention. The folder entry 400 includes a reference field 404 that provides references (e.g., persistent pointers) to the annotations and/or subfolders belonging to the folder 400; a linked list or other suitable data structure may be used to implement reference field 404.
The folder entry 400 also advantageously includes other fields that may be used for folder management. In one embodiment, these fields include an "Author ID" field 406 that stores the user ID of the user to whom the folder belongs and a "Name" field 408 that stores the folder name (e.g., up to 80 characters) provided by the user. The "Name" field 408 may default to "New Folder" or some other suitable string. A "Description" field 410 stores a free-text description of the folder's purpose or content that the user can edit; this field may default to an empty state. The "Active" field 412 stores a flag (e.g., a Boolean value) that indicates whether the annotations in the folder should be used in responding to queries.
The "Publish" flag field 414, the "Privacy level" field 416, and the "Access list" field 418 all relate to the sharing of annotations, which may be controlled on a per-folder basis in some embodiments. The publish flag in field 414 indicates whether the annotations in folder 400 should be automatically distributed to other users via a publication mechanism; publication is described below. The privacy level in field 416 and the access list in field 418 are used to control the extent to which the annotations in the folder are viewable by other users. Examples of privacy levels and their significance are described below.
It should be appreciated that the folder format may vary and may include other fields. With the exception of the "Main" folder, the user is free to create, rename, and delete folders. In some embodiments, multiple folders may store references to the same annotation; in other embodiments, each annotation is assigned to only one folder at a time, and the user can move the annotation from one folder to another, or create a copy of the annotation in a different folder. In some embodiments, each annotation entry may also include a "Folder ID" field that stores a reference back to the folder(s) to which the annotation is assigned.
Although folders are optional, providing folders gives a user greater control over the search experience. For example, the user may arrange their annotations in multiple folders and set the active flag (field 412) to true for one or more of the folders and to false for the other folders. When the user enters a query, only the annotations in the active folder(s) will affect the results. The user may also use folders to collect and organize annotated pages in a manner somewhat similar to bookmarks or other lists of personal sites supported by various Web browser programs or Internet portal services. In a preferred embodiment, the folder and annotation data described herein is maintained by the search server 160 for the user and is available to the user regardless of the location from which the user accesses the search server 160.
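The effect of the active flag can be illustrated with a short sketch. The `Folder` class below mirrors a few of the fields of folder entry 400, but its attribute names and the use of annotation IDs as plain strings are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    author_id: str
    name: str = "New Folder"
    description: str = ""
    active: bool = True                  # corresponds to the "Active" field
    annotation_ids: list = field(default_factory=list)


def active_annotation_ids(folders):
    """Collect annotation IDs from active folders only.

    Duplicates are removed (an annotation may be referenced by several
    folders in some embodiments) while first-seen order is preserved.
    """
    seen = set()
    result = []
    for folder in folders:
        if not folder.active:
            continue  # annotations in inactive folders do not affect results
        for ann_id in folder.annotation_ids:
            if ann_id not in seen:
                seen.add(ann_id)
                result.append(ann_id)
    return result
```

A query handler would then consult only the returned annotations when personalizing results for that user.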
In another embodiment, rather than using folders, the use of annotations is managed based on user-provided keywords or tags in the annotation record. For example, the active flag, the publish flag, and/or the privacy settings may be defined per keyword rather than per folder.
Sharing annotations via a trust network
As described in application No. ____ (attorney docket No.017887 and 013720US) cited above, the annotations collected by each user may be made available to the user when the user browses the Web. For example, while a user is viewing their annotated site, they may also be able to view and/or edit their annotations at the same time. As another example, the search results page may include visual or other highlighted elements to identify hit pages that the user has annotated, or may report metadata extracted from the user's annotations to various hit pages. As another example, user annotations may be used in addition to (or in place of) page content and other conventional factors to identify search hits and/or rank search hits.
In embodiments of the present invention, users may view annotations created by other users in addition to their own annotations. The set of users whose annotations are to be viewed by the first user is referred to herein as the first user's "trust network," and in a preferred embodiment, each user may have at least some degree of control over the membership of their trust network. An example of a technique for defining a user's trust network will be described below.
A. Creation of a trust network
1. Social network model
In some embodiments, the users' trust networks are defined based on a "social network" that is built from trust relationships between pairs of users. Each user may explicitly define trust relationships with one or more other users (referred to herein as "friends" of the first user). Based on the trust relationships of the individual users, a social network may be defined that connects the user with other users via trust relationships, and a portion of the social network originating from a given user may be defined as that user's trust network. In such embodiments, a given user's trust network typically includes (in addition to the user himself) the user's friends, and may also include friends of the user's friends, and so forth. In some embodiments, all trust relationships are mutual (i.e., users A and B are friends only if both agree to trust each other); in other embodiments, one-way trust relationships may also be defined (i.e., user A may regard user B as a friend regardless of whether user B regards user A as a friend). Any user may define as a friend any other user whose annotations the first user believes to be valuable to him.
From the trust relationships defined by the various users, a social network may be constructed, and all or a portion of the social network may be selected as the trust network for a given user. In general, a social network may be represented by a network diagram, for example, network diagram 500 shown in FIG. 5. Network diagram 500 includes nodes 501-509, each of which represents a different user (in this example, users are identified by the letters A-I). The edges (arrows) connecting pairs of nodes represent trust relationships between users; thus, user A trusts users B, C, D and I; user B trusts users C and E; and so on. In this example, each edge represents a one-way trust relationship; a two-way trust relationship (e.g., between users A and C) is represented by two edges. It should be understood that network diagram 500 is exemplary. The social network may include any number of users and any number of trust relationships, and a user may define trust relationships with any number of other users; trust relationships may be one-way or two-way.
In one embodiment of the invention, user A is able to view his own annotations as well as annotations created by any of his friends. In another embodiment, user A is also able to view annotations created by friends of his friends. For example, there is no direct trust relationship between user A and user E. However, user A trusts user B, which in turn trusts user E. Thus, it can be said that user A has an "indirect" trust relationship to user E, and that annotations from both users B and E are visible to user A.
More generally, the present description refers to trust relationships having N degrees of separation, where N is an integer equal to the minimum number of edges connecting the users in the social network. N = 1 corresponds to a direct trust relationship (e.g., the relationship between users A and B); N > 1 corresponds to an indirect trust relationship. For purposes of this description, user A may be considered a member of his own social network, with N = 0. In some embodiments of the invention, a user browsing the Web (e.g., user A) may view and edit their own annotations, and may also view (but not edit) annotations created by other users in their social network (up to some maximum degree of separation, e.g., N = 1, 2, 3, or greater).
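By way of illustration only, the degree of separation N can be computed with a breadth-first traversal of the trust edges. In the following sketch, the function name and the dictionary representation of the graph are assumptions, and the example edges beyond those stated for users A and B (in particular the direction of the edge from C to G) are hypothetical.

```python
from collections import deque


def degrees_of_separation(trust, start, max_degree):
    """Map each user reachable from `start` within `max_degree` edges to its N.

    `trust` maps a user to the users he trusts (directed edges).
    """
    degree = {start: 0}  # the user himself has N = 0
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if degree[user] == max_degree:
            continue  # do not expand beyond the maximum degree of separation
        for friend in trust.get(user, ()):
            if friend not in degree:  # first visit gives the minimum edge count
                degree[friend] = degree[user] + 1
                queue.append(friend)
    return degree


# Partial, partly assumed edge list in the spirit of FIG. 5.
example_trust = {
    "A": ["B", "C", "D", "I"],
    "B": ["C", "E"],
    "C": ["A", "G"],
}
example_degrees = degrees_of_separation(example_trust, "A", max_degree=2)
```

With these example edges, user E is reached only through user B, so E has N = 2 relative to user A and is excluded when the maximum degree of separation is 1.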
In some embodiments, user A may assign a different "trust weight" to each of his trust relationships. The trust weights may be defined on various scales, e.g., an integer from 1 to 10. The trust weight advantageously reflects the degree of confidence that user A has in each friend's annotations; generally, a higher trust weight reflects higher confidence.
Where trust weights are defined, this information may also be used in defining the trust network. For example, a trust propagation algorithm may be used to assign a "confidence coefficient" p to users in the social network; the confidence coefficient p_XA of user X relative to user A is typically based on the trust weights that user A has assigned to his friends, the trust weights that user A's friends have assigned to their friends, and so on. Examples of trust propagation algorithms are known in the art and may be used to generate the confidence coefficients. The confidence coefficients of other users relative to user A may also be determined based on degree of separation, for example by assuming equal trust weights for each friend of user A and then using a trust propagation algorithm to determine the confidence coefficient for each trust network member, or by assigning equal confidence coefficients to each user having a given degree of separation from user A. In one embodiment, membership in user A's trust network is limited to users X whose confidence coefficient p_XA exceeds a certain threshold, regardless of their degree of separation from user A. Other applications of the trust weights and confidence coefficients are described below.
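The description does not specify a particular propagation algorithm. As one simple possibility, offered only as an assumption, each trust weight can be normalized to the range (0, 1], the confidence of a path taken as the product of its normalized weights, and p_XA taken as the best such product over all paths from A to X. The function names and the `scale_max` parameter below are hypothetical.

```python
import heapq


def confidence_coefficients(weights, start, scale_max=10):
    """Best path-product confidence of every reachable user relative to `start`.

    `weights[u][v]` is the trust weight user u assigned to friend v on a
    1..scale_max scale. This max-product rule is an assumed stand-in for
    whichever propagation algorithm an implementation actually uses.
    """
    best = {start: 1.0}
    heap = [(-1.0, start)]  # max-heap via negated confidence
    while heap:
        neg_conf, user = heapq.heappop(heap)
        conf = -neg_conf
        if conf < best.get(user, 0.0):
            continue  # stale heap entry; a better path was already found
        for friend, weight in weights.get(user, {}).items():
            candidate = conf * (weight / scale_max)
            if candidate > best.get(friend, 0.0):
                best[friend] = candidate
                heapq.heappush(heap, (-candidate, friend))
    return best


def trust_network_by_confidence(weights, start, threshold, scale_max=10):
    """Users whose confidence coefficient relative to `start` exceeds `threshold`."""
    coeffs = confidence_coefficients(weights, start, scale_max)
    return {user for user, p in coeffs.items() if p > threshold}
```

Because every normalized weight is at most 1, a path can never gain confidence by growing longer, which is what allows the best-first search above to settle each user once.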
2. Explicit identification of friends
In one embodiment, trust network module 165 (FIG. 2) provides an interface that a user (e.g., user A) can use to explicitly identify other users as friends in order to define their trust network. The interface may include a web page that is provided to the user upon request, and the user is advantageously required to log into the search server 160 before receiving the interface page.
FIG. 6 is an example of a trust network interface page 600 according to an embodiment of the present invention. Page 600 provides various mechanisms by which a user can view and modify their friends list in order to define a trust network using the social network model. The current list of friends for user A is displayed in section 602. For each friend, list entry 604 includes a user ID, a description, and a trust weight. The description field may be populated by user A with any desired information, such as the real name of the friend, the friend's relationship to user A, and so forth. Section 602 can be implemented to support sorting by any of its fields and can include other information about each friend, such as the number of friends each friend has or a timestamp (not shown) indicating when the friend was added to the list. The information used to populate section 602 may be stored, for example, in an appropriate record within personalization database 166 and may be retrieved by trust network module 165 in response to a user request.
Other information may also be provided. For example, in some embodiments, each entry 604 in section 602 includes an "active" flag 605 indicating whether the friend is to be included in user A's trust network (smiley icon) or ignored ("not" icon). This allows user A to ignore a friend's annotations without removing the friend from the list. For example, user A's friends list may also be used in another social networking context, and user A may wish another user (e.g., user D) to be on his friends list in that other context but not for the purpose of viewing annotations. In some embodiments, user A may also be able to select whether to include or ignore annotations from the friends of each friend, and entry 604 may display this information.
Each entry is accompanied by an "Edit" button 606 and a "Delete" button 608. Activating button 606 opens a dialog box (or form page) via which user A can update any information about the friend and then save or cancel the change. Activating button 608 removes the friend from user A's list.
A "View Network" button 609 is also provided. Activating button 609 launches an interactive display of user A's trust network, including his friends and friends of friends, up to a maximum degree of separation, a minimum confidence coefficient, or other limiting parameter used to define the trust network. The display advantageously includes all users that are in user A's trust network (i.e., all users whose annotations are visible to user A), and may also show users that user A has blocked from his trust network (e.g., user D).
In one embodiment, the display includes a network diagram similar to FIG. 5, and the diagram or other display is editable. For example, user A may be allowed to delete a node, indicating that the user represented by that node should be excluded from user A's trust network. Where the node represents a friend of user A (e.g., if user A, as the editing user, were to delete node 504, which represents user D), in one embodiment deleting the node removes the friend from user A's friends list; in another embodiment, deleting the node simply sets the friend's "active" flag 605 to an inactive state. Where the node represents a friend of a friend (any node with a degree of separation greater than 1 from user A), deleting the node has the effect of making that user's annotations invisible to user A but does not change any trust relationship. Instead, a special entry identifying the particular user as "blocked" is advantageously added to the friends list maintained for user A in personalization database 166. For example, if user A, as the editing user, were to delete node 507, user G would cease to be a member of user A's trust network, but the trust relationship between user C and user G would not be affected, and user G would still be in user C's trust network. Thus, user A can adjust his trust network by selectively blocking individual members whose annotations user A finds unhelpful. In some embodiments, blocking a member also has the effect of blocking other members that are connected to the trust network only via the blocked member.
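By way of illustration only, blocking can be modeled as a traversal that refuses to enter blocked nodes, so that members connected only via a blocked member also drop out, while the underlying trust edges are left untouched. The function name, the graph representation, and the example edges are assumptions.

```python
from collections import deque


def visible_members(trust, start, blocked, max_degree):
    """Users whose annotations `start` can see, honoring a per-user block list.

    Blocked users are never entered, so anyone reachable only through a
    blocked user is excluded as well. The `trust` mapping is not modified.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, n = queue.popleft()
        if n == max_degree:
            continue
        for friend in trust.get(user, ()):
            if friend in blocked or friend in seen:
                continue
            seen.add(friend)
            queue.append((friend, n + 1))
    return seen
```

In this sketch the block list belongs to the viewing user alone: calling the function for user C with an empty block list still returns user G, mirroring the statement that the trust relationship between users C and G is unaffected.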
Referring again to FIG. 6, page 600 also includes section 610, via which user A can add a new friend. User A enters the user ID of the new friend in text box 612, a description in text box 614, and a trust weight in box 616. In some embodiments, the trust weight may have a default value (e.g., 3 on a scale of 1-5). User A may also select, via checkbox 618, whether to include friends of the new friend in his trust network. Activating the "Add" button 620 completes the operation, and the list in section 602 is advantageously refreshed to include the new friend.
Once defined, user A's friends list is stored, for example, in personalization database 166 in association with other user-specific information for user A. This information may then be accessed and used to personalize or customize the response to the user's queries.
It should be appreciated that the interfaces described herein are exemplary and that variations and modifications are possible. For example, in some embodiments, a friend may be added only if the friend agrees to be added. Thus, activation of the add button 620 by user A might not immediately add any friend to user A's list. Instead, an invitation may be sent to the user designated by user A (e.g., user K) via email, instant message, or other suitable communication medium, and user K may respond with an indication as to whether he accepts the invitation. If user K accepts, a two-way friendship between users A and K is established, for example by adding each user to the friends list of the other; if not, no new friendship is established.
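By way of illustration only, the mutual-consent variant may be sketched as follows. The function name and the dictionary-of-sets representation of the friends lists are hypothetical, and delivery of the invitation itself (email, instant message, and so on) is outside the sketch.

```python
def respond_to_invitation(friends, inviter, invitee, accepted):
    """Record the outcome of an invitation from `inviter` to `invitee`.

    `friends` maps a user ID to the set of user IDs on that user's friends
    list. On acceptance each user is added to the other's list; on refusal
    nothing changes.
    """
    if accepted:
        friends.setdefault(inviter, set()).add(invitee)
        friends.setdefault(invitee, set()).add(inviter)
    return friends
```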
3. Automatic identification of friends
In some embodiments, the trust network module 165 may also automatically generate a friends list for user A by mining various information sources to identify other users with whom user A chooses to have contact.
For example, in one embodiment, the provider of search server 160 also provides communication services such as email, instant messaging (IM), and the like. As is known in the art, these services may allow user A to maintain a list of users with whom user A desires to have contact. For example, if user A registers with the provider's IM service, user A may define a "friends" list (sometimes also referred to as a "buddy" list), which is a list of user identifiers of other registered users with whom user A wishes to exchange instant messages. Including user B (or any other user) on user A's IM friends list indicates a connection from user A to user B and suggests that user B may be a friend of user A. Similarly, if user A is registered with the provider's email service, user A may maintain a personal email address book that identifies users with whom user A exchanges email. Including user C (or any other user registered on search server 160) in user A's address book likewise indicates a connection from user A to user C and suggests that user C may be a friend of user A.
In another embodiment, the provider of the search server 160 also allows registered users to join an online community, the members of which may communicate with each other using bulletin boards, chat rooms, email distribution lists, and the like. If both users (e.g., A and B) are members of the same online community, it may be inferred that there is some association between the users, and a two-way friendship may be appropriate.
Any or all of these techniques may be used to automatically populate the user's friends list. In some embodiments, the user's friends list may be pre-populated with any of the above or other sources of relationship information, and the user may then edit the list, for example, via page 600 (as described above). Where the relationship is automatically defined, page 600 advantageously indicates (e.g., in a description field) the source from which the relationship was inferred, and may also indicate that the relationship is automatically defined. In embodiments that require mutual consent to establish a friendship, any source of relationship data may be mined and used as a basis for inviting various pairs of users to become friends, wherein relationships are established once both users accept.
In other embodiments, the user's friends list is not pre-populated by default, and the user can select which sources of relationship information (if any) should be used to auto-populate the list (e.g., IM friends list and/or email address book and/or community membership information). Thereafter, the user may edit the list.
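By way of illustration only, pre-populating a friends list from user-selected sources, while recording the source from which each relationship was inferred, may be sketched as follows. The function name and the source labels are hypothetical.

```python
def prepopulate_friends(sources, enabled):
    """Build a candidate friends list from the relationship sources the user enabled.

    `sources` maps a source label (e.g., "im_friends") to the user IDs found
    there. The result maps each candidate friend to the list of sources that
    suggested the relationship, so an interface can display its provenance.
    """
    candidates = {}
    for label in enabled:
        for user_id in sources.get(label, ()):
            candidates.setdefault(user_id, []).append(label)
    return candidates
```

Passing an empty `enabled` list corresponds to the default described above, in which nothing is pre-populated until the user selects a source.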
4. Selection of a collection of friends
In other embodiments, the trust network is defined based on explicitly defined groups of users, or communities, with trust relationships implied among the members of a group. As used herein, a "community" refers to any ongoing forum for which the search server 160 can obtain a list of user IDs of the members and associate those IDs with annotation authors. Typically (but not necessarily), the community uses at least one network-based communication medium managed by the provider of the search server 160, such as a subscription-based email distribution list, a members-only chat room, a bulletin board, and the like. In one embodiment, communities correspond to Yahoo! Groups, but any other online community whose membership can be determined by the search server 160 may be used; more generally, any organization or forum that provides a well-defined membership list may be used as a community, so long as the search server 160 can map user identifiers in the membership list to user identifiers of participants in the annotation system.
In some embodiments, user A's trust network is defined to include all users that are currently members of a community to which user A belongs. In some embodiments, user A may be able to select, via a suitable interface (not shown in FIG. 6), one or more communities of which user A is a member to use as his trust network. Some embodiments may allow user A to view and edit a personal friends list derived from the member list of the selected community or communities (e.g., as described above), but it is not required that user A be able to edit or even view the community member list. Thus, user A may select any community to which he belongs as his trust network, even if user A has no information about the other members of that community, and the membership of user A's trust network may change automatically (with or without user A's knowledge) as members join and leave the selected community.
In the case where user A's trust network is defined by a reference community, user A may be able to block annotations from individual members, effectively removing those annotations from their trust network. For example, when displaying annotations of trust network members, the display interface may include a control via which user A may instruct the search server 160 to block the author's annotations in the future. In such an embodiment, personalization database 166 may include (for each user) a list of community(s) defining the user's trust network and a "blacklist" of users whose annotations should be blocked.
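By way of illustration only, a community-defined trust network with a per-user blacklist may be sketched as follows. The function name, the argument names, and the use of plain sets for membership lists are assumptions.

```python
def community_trust_network(user, selected, memberships, blacklist):
    """Trust network implied by the communities `user` selected, minus blocked authors.

    `memberships` maps a community name to its current set of member IDs.
    Because membership is looked up at call time, the result changes
    automatically as members join and leave. Communities the user does not
    belong to are ignored.
    """
    members = set()
    for community in selected:
        roster = memberships.get(community, set())
        if user in roster:
            members |= roster
    return (members - set(blacklist)) | {user}
```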
In the case where user A's trust network is defined by reference to a community, all community members may be considered to have the same degree of separation from user A (e.g., N = 1). In some embodiments, all members are also initially assigned equal trust weights, and user A may or may not be able to manually adjust the trust weights of individual members via a suitable interface (e.g., similar to page 600 described above).
In other embodiments, each community member may be assigned a "reputation score" within the community, and the reputation score of a given member may be used as a confidence coefficient for that member. The reputation score may be determined in various ways. In one embodiment, the reputation score of a community member is based on the member's level of participation in the community (e.g., frequency of posting to a bulletin board or email distribution list, frequency of participation in a chat room, etc.). In another embodiment, community members may be able to explicitly rate the reliability of other members, and the reputation score of each member may be based on such ratings (see, e.g., Section IV.C below). In another embodiment, a community member may be able to rate (but not edit) the annotations of other members, and a member's reputation score may be based on the ratings given to his annotations by other members of the community.
5. User preference for trust networks
In some embodiments, the trust network module 165 allows each user to specify various parameters relating to how their trust network should be defined and used. For example, in page 600 of FIG. 6, portion 624 allows the user to control the settings of the trust network. Using the radio button 626, the user may indicate whether trust network membership should be determined based on degree of separation or on confidence coefficient. In some embodiments, the user may also be able to specify, within a certain range, a maximum degree of separation (e.g., N_max = 1, 2 or 3) or a minimum confidence coefficient (e.g., p_min = 0.2, 0.4 or 0.8). Checkboxes 628, 630, and 632 allow the user to specify the situations in which information obtained from their trust network should be displayed. For example, the user may select whether to have search results highlighted and/or sorted based on information obtained from their trust network (boxes 628, 630), and whether the browser toolbar should indicate whether the displayed page has been annotated by someone in their trust network (box 632). Examples of these operations are described below.
It should be appreciated that other user preferences and combinations of preferences may be supported. For example, a user may be able to specify whether their trust network should be built from a social network model with an explicit friends list or implicitly from the communities to which they belong.
B. Toolbar interface to trust network annotations
FIG. 7 is an example of a toolbar-based interface, according to an embodiment of the present invention, for annotating any page that a user happens to be viewing and/or for viewing existing annotations of that page by trust network members. The Web browser window 700 includes conventional elements such as a viewing area 702 for displaying Web content, a default toolbar 704 that provides navigation buttons (back, forward, etc.), and a navigation area 704 that displays the URL of the currently displayed page and also allows the user to enter the URL of a different page to display in the viewing area 702. Browser window 700 also includes a search toolbar 706, which may be provided as an add-on to a conventional browser program or as a standard feature of a browser program.
Search toolbar 706 advantageously includes a text box 708 and a "Search Web" button 709 via which a user may submit a query to search server 160 (FIG. 2), as well as a "Saved List" button 710 and a "Save This" button 712. The "Saved List" button 710 allows the user to view their own saved annotations and navigate to the pages they have annotated, and the "Save This" button 712 opens a page or dialog box that allows the user to annotate the currently displayed page. These aspects of the search toolbar 706 may be generally similar to the features described in application No. ____ (attorney docket No.017887-013720US) cited above. As used herein, "saving" a page refers to creating and storing an annotation of the page, and may or may not include saving a copy of the page's content.
In some embodiments, search toolbar 706 also includes a "Show My Web" button 714 that appears active whenever the browser is displaying a page that has been previously annotated by the browsing user or another member of his trust network; the browsing user may operate button 714 to view previous annotations entered by any member of his trust network. Where the annotations include ratings, the appearance of button 714 may depend in part on the ratings given to the currently displayed page by trust network members. For example, the average rating across all trust network members may be reflected by an icon included in button 714. In a preferred embodiment, button 714 is operable only if the currently displayed page has been annotated by at least one member of the user's trust network.
Fig. 8 illustrates a dialog box or overlay 800 that may be launched upon activation of button 714. Overlay 800 provides annotation information about a currently displayed page based on annotations from members of the browsing user's (e.g., user A) trust network. In section 802, metadata from annotations saved by the "nearest" (closest) member of user A's trust network is displayed.
The "nearest" member may be defined in various ways. In one embodiment, nearness is based primarily on degree of separation (N), so that the trust network member with the smallest N relative to user A is defined as nearest. (Note that if user A has annotated the page, user A's own annotation will be displayed in section 802, since by definition user A is the only member of A's trust network with N = 0.) Where defining the nearest user by reference to N results in a tie, other parameters (e.g., trust weight, confidence coefficient, or how long the relationship has existed) may be used to determine which member is nearest. In another embodiment, confidence coefficients may be used to define nearness, while other parameters (e.g., degree of separation) are used to break ties. It should be appreciated that the specific definition of "nearest member" is not critical to the present invention.
Below section 802 is a list of other trust network members who have annotated the displayed page. Clickable links for displaying the annotations of each such member are advantageously provided. In a preferred embodiment, the browsing user is not allowed to edit annotations entered by other users, but may be allowed to edit their own annotations (e.g., by including in overlay 800 an "edit" button that launches an editing interface and that is operable only if the browsing user's own annotations are displayed in section 802).
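By way of illustration only, one of the orderings described above (smallest N first, with confidence coefficient as the tie-breaker) may be sketched as follows. The `Annotator` record and the function name are hypothetical, and other tie-breakers could be substituted.

```python
from typing import NamedTuple, Optional, Sequence


class Annotator(NamedTuple):
    """Hypothetical record for a trust network member who annotated the page."""
    user: str
    degree: int        # degree of separation N from the browsing user
    confidence: float  # confidence coefficient relative to the browsing user


def nearest_annotator(annotators: Sequence[Annotator]) -> Optional[Annotator]:
    """Pick the annotator with the smallest N; break ties by higher confidence."""
    if not annotators:
        return None
    return min(annotators, key=lambda m: (m.degree, -m.confidence))
```

Because the browsing user is the only member with N = 0, his own annotation, if present, is always selected under this ordering.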
Portion 806 provides metadata aggregated across the browsing user's trust network. In one embodiment, the aggregated metadata includes an average rating of the page or site and a list of keywords describing the page or site. The average rating may be calculated, for example, as a weighted average of ratings, where the rating from each trust network member is weighted by that member's confidence coefficient relative to the browsing user. (Any trust network member who has not annotated the page is advantageously ignored for purposes of computing the average rating.) The list of keywords may be generated by identifying the most frequently occurring keywords among the annotations of all trust network members; the frequency of occurrence of each keyword may be calculated by adding the confidence coefficients of the trust network members who used that keyword. In other embodiments, the aggregation algorithm may also take into account other factors, such as how recent a given annotation is (with older annotations receiving lower weight), and so forth.
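By way of illustration only, the weighted average rating and the confidence-weighted keyword list may be computed as in the following sketch. The function name, the dictionary layout of an annotation, the `top_k` parameter, and the use of +1 and -1 as example ratings are assumptions.

```python
from collections import defaultdict


def aggregate_annotations(annotations, confidence, top_k=3):
    """Return (weighted average rating, top keywords) for one page.

    `annotations` maps each member who annotated the page to a dict with an
    optional "rating" and optional "keywords"; members who did not annotate
    the page are simply absent. `confidence` maps a member to his confidence
    coefficient relative to the browsing user.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    keyword_score = defaultdict(float)
    for member, note in annotations.items():
        p = confidence.get(member, 0.0)
        if note.get("rating") is not None:
            weighted_sum += p * note["rating"]
            weight_total += p
        for keyword in note.get("keywords", ()):
            keyword_score[keyword] += p  # frequency = sum of confidence coefficients
    average = weighted_sum / weight_total if weight_total else None
    ranked = sorted(keyword_score, key=lambda k: (-keyword_score[k], k))
    return average, ranked[:top_k]
```

An age-based discount of the kind mentioned above could be added by multiplying `p` by a decay factor before it is used; that refinement is omitted here.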
The "Close" button 808 closes the overlay 800, and the overlay 800 can be reopened at any time by activating the button 714.
It should be appreciated that the toolbar interface described herein is exemplary and that variations and modifications are possible. The search toolbar 706 may include other components in addition to (or instead of) those shown above. In addition, any other persistent interface (i.e., an interface that is accessible while the user is viewing any Web page) may be substituted; a search toolbar is not required. In alternative embodiments, an interface element that notifies the browsing user of the presence of an annotation may convey other information. For example, the interface element may identify the nearest trust network member who has annotated the page and/or indicate the number of trust network members who have annotated the page. Such information may also be included in the overlay 800. The element may also indicate whether the nearest member is the browsing user or another user. Annotation data need not be displayed in an overlay; dialog boxes, new browser windows, new tabs in existing browser windows, etc. may also be used, or annotation data may be added inline to the page. Alternatively, the current browser window may be redirected to a page containing the annotation data.
In some embodiments, search toolbar 706 may be configured so that it can be used in a "general" state by users who are not logged into search server 160 and in a "personalized" state by users who are logged in. In the general state, the toolbar provides access to basic search services (e.g., via text box 708 and "Search Web" button 709) and also provides a button that allows the user to log in to access personalized services. In the personalized state, personalized features may be supported through the toolbar. For example, the "save" button 712 may be provided only in the personalized state of the toolbar 706; alternatively, the button 712 may also be provided in the general state, with the browser being redirected to a login page if the button 712 is activated while the toolbar is in the general state.
C. Search report interface to trust network annotations
In some embodiments, the presence of annotations by a user's trust network member may be included in a page reporting search results for a query entered by the user. FIG. 9A is an example of a search results page 900 enhanced with annotation information according to an embodiment of the invention. The results page 900 may be generated by the query response module 162 in response to a user's query. In this embodiment, results page 900 includes banner portion 902. In addition to the page identification information, the banner section 902 also includes a search box 904 that shows the current query (e.g., "Chinese food sunnyvale") in editable form and a search button 906 that enables the user to change the query and perform a new search. These features may be of conventional design.
Section 908 is a personalized ("My Web") results area in which any hits that have previously been annotated by a member of the querying user's trust network are displayed. In some embodiments, section 908 may display only those hits for which the aggregated rating of the trust network (e.g., as described above with reference to FIG. 8) is positive; in other embodiments, all annotated hits may be listed in section 908. Each annotated hit is advantageously accompanied by a "Show My Web" button 910 that the user can activate to view the members' annotations. In one embodiment, activation of button 910 launches an overlay similar to overlay 800 of FIG. 8 described above.
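By way of illustration only, selecting the hits for the "My Web" area may be sketched as follows. The function name and arguments are hypothetical; `aggregated_rating` is assumed to map each annotated URL to the trust network's aggregated rating, with unannotated URLs absent.

```python
def my_web_hits(hits, aggregated_rating, positive_only=False):
    """Select, in their original order, the hits annotated by trust network members.

    When `positive_only` is true, only hits whose aggregated rating is
    positive are kept, corresponding to the first variant described above.
    """
    selected = []
    for url in hits:
        if url not in aggregated_rating:
            continue  # not annotated by anyone in the trust network
        if positive_only and aggregated_rating[url] <= 0:
            continue
        selected.append(url)
    return selected
```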
The "All Results" section 916 shows some or all of the hits (including both annotated and unannotated hits) in a ranking determined by the query response module 162. Conventional ranking algorithms may be used to generate the ranking. Each entry 918 in section 916 corresponds to one of the hits and includes a title of the hit page (or site) and a brief excerpt (or summary) of the page's content. The excerpts or summaries may be generated using conventional techniques. The URL (uniform resource locator) of the hit is also shown. For hits that have not been annotated by any member of the trust network, a "save" button 919 may be displayed, and while viewing page 900 the user may choose to annotate an unannotated hit by activating button 919. The "save" button 919 is advantageously similar in operation to button 712 in FIG. 7, described above.
Any annotated hits in section 916 may be visually highlighted to indicate the presence of annotations, and may also include a "Show My Web" button 910. In addition, a "save" button 919 may also be provided for each hit that has been annotated by other members of the querying user's trust network but not by the querying user.
Various designs for highlighting annotated hits may be used, including for example, borders, shading, special fonts, colors, and the like. In some embodiments where the annotations include ratings, the type of highlighting depends on the aggregated rating of the trust network, and the aggregated rating may also be displayed on page 900. For example, hit 920 has a positive rating and hit 922 has a negative rating. In other embodiments, other aggregated metadata and/or metadata from individual members of the trust network may also be included on page 900.
In other embodiments, more information than just highlighting may appear on the search results page. FIG. 9B is an example of another search results page 940 in which snippets of opinions provided by trust network members are shown in "My Web" section 948. Each hit 950 is accompanied by opinions 952 extracted from the annotations made by the trust network members. In this embodiment, two opinions are shown; additional comments or more information about the annotations may be viewed by clicking on the "More" button 954. In the event that the querying user has not yet annotated a hit, a "Save" button 956 may be provided. Search results page 940 may also include an "All Results" section (not shown) and other information.
It should be appreciated that the search results pages described herein are exemplary and that variations and modifications are possible. Any report in any format suitable for transmission to a user may replace search results page 900, and the various interface control elements for interacting with a search report may vary from what is shown here. Any portion (including all) of the annotation metadata may be included inline in the page and/or may be accessed via a suitable interface control. In some embodiments, the user may be able to set personal preferences regarding the appearance of annotation-related information in the search reports sent to them.
D. Enhanced Web search
In one embodiment, search server 160 (FIG. 2) accesses the annotation stores of the user's trust network members to provide additional information when responding to a query from the user. For example, as indicated above, a separate list of annotated hits (i.e., hits corresponding to annotated pages in the repository of at least one trust network member) may be included in the search results, or the annotated hits may be highlighted wherever they happen to appear in the results list. Where the annotations include ratings, a separate list of positively rated hits may be provided, the rated hits may be highlighted in a manner that reflects the rating, or rating data may be used as one factor in ranking the hits.
FIG. 10 is a flow diagram of a process 1000 that may be implemented in query processing module 162 (FIG. 2) for incorporating annotations of trust network members into responses to a current query from a querying user. At step 1002, a query is received. At step 1004, a hit list corresponding to the query is obtained, for example, from page index 170 (FIG. 2). At step 1006, the query processing module 162 ranks the hits, for example, using conventional algorithms.
At step 1008, the query processing module 162 determines whether the querying user is logged in. If not, the query processing module 162 may send the results page to the querying user at step 1010 without personalization, thereby enabling the user to perform the search and obtain results without logging into (or even registering with) the search server 160. If the user is logged in, the results page is customized for the user based on the information in personalization database 166.
More specifically, at step 1012, query processing module 162 provides the ID of the querying user to personalization database 166 and retrieves a list of the user's trust network members. In one embodiment, step 1012 includes dynamically building a list of trust network members with the trust network module 165. For example, where a trust network is to be built from a friends list and extended to a maximum degree of separation (Nmax) from the querying user, step 1012 may include creating a representation of the network graph by first obtaining the querying user's friend list from personalization database 166 and defining a network node for each friend. If Nmax = 1, the identification of trust network members may stop there; for Nmax > 1, a list of friends is obtained for each friend, additional nodes are defined, and so on until the maximum degree of separation is reached. It should be noted that for sufficiently large Nmax, the number of trust network members may extend to all users of the search system, and it may be desirable to limit Nmax or the total number of trust network members to avoid unduly inundating the querying user with annotations.
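The friend-list expansion described for step 1012 can be sketched as a breadth-first traversal. This is only an illustrative sketch: the function name `build_trust_network`, the `friends_of` mapping (standing in for friend lists read from personalization database 166), and the optional `max_members` cap are assumptions, not names taken from the disclosure. Friend lists are treated as directed, so A listing B does not imply that B lists A.

```python
from collections import deque

def build_trust_network(user, friends_of, n_max, max_members=None):
    """Return {member: degree of separation} for members within n_max hops.

    friends_of is a hypothetical mapping user -> list of friends.
    max_members is an assumed optional cap on the network size.
    """
    degrees = {}
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == n_max:
            continue  # do not expand beyond the maximum degree of separation
        for friend in friends_of.get(current, ()):
            if friend in seen:
                continue
            if max_members is not None and len(degrees) >= max_members:
                return degrees  # cap reached; stop adding members
            seen.add(friend)
            degrees[friend] = depth + 1
            queue.append((friend, depth + 1))
    return degrees

friends = {"A": ["B", "C"], "B": ["D"], "C": ["A"], "D": ["E"]}
network = build_trust_network("A", friends, n_max=2)
```

With `n_max=1` only the direct friends are returned; raising `n_max` pulls in friends of friends, which is why a cap on either `n_max` or the member count may be useful.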
In other embodiments, where the trust network is defined by reference to a community, step 1012 may include retrieving the current membership list for the community from the personalization database 166 or another data store accessible to the search system 160. In other embodiments, step 1012 includes retrieving a pre-constructed list of the querying user's trust network members from personalization database 166.
Where trust weights and/or confidence coefficients are used to identify trust network members or to use trust network information, step 1012 may also include determining trust weights and/or confidence coefficients.
The annotations created by the trust network members are retrieved from the personalization database 166 at step 1013, and the URLs of the retrieved annotations are compared to the hit URLs at step 1014 to detect any hits whose URLs match a URL for which at least one trust network member has previously created an annotation. These hits are referred to herein as annotated hits. For annotations with the host flag set to "site," a match (also referred to herein as a "partial match") is detected if the beginning of the hit URL matches a URL (or partial URL) stored in the annotation (e.g., in URL field 308 of FIG. 3). If the host flag is set to "page," then an "exact" match between the annotated URL and the hit URL is required. As used herein, unless otherwise specified, "match" includes both partial and exact matches.
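The partial and exact matching rules of step 1014 can be illustrated as follows. This is a minimal sketch: the dictionary keys `url` and `host_flag` and both function names are assumptions made for the example, and no URL normalization (case, trailing slashes, and so on) is attempted.

```python
def annotation_matches(hit_url, annotation_url, host_flag):
    """Return True if a hit URL matches an annotation under its host flag."""
    if host_flag == "site":
        # Partial match: the hit URL begins with the stored (partial) URL.
        return hit_url.startswith(annotation_url)
    if host_flag == "page":
        # Exact match required.
        return hit_url == annotation_url
    raise ValueError(f"unknown host flag: {host_flag!r}")

def annotated_hits(hit_urls, annotations):
    """Map each annotated hit URL to the list of annotations matching it."""
    matches = {}
    for url in hit_urls:
        found = [a for a in annotations
                 if annotation_matches(url, a["url"], a["host_flag"])]
        if found:
            matches[url] = found
    return matches
```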
In embodiments where the annotations include a rating, an average rating or aggregate rating is calculated for each annotated hit at step 1015. As described above, the aggregate rating may be a weighted average (weighted by a confidence coefficient) over all trust network members who have annotated the hit. The ratings may also be weighted based on recency or other criteria. At step 1016, it is determined whether the aggregate rating is positive. If so, the hit is added to the positive results ("My Web") list. In other embodiments, all annotated hits (regardless of their ratings) may be added to the "My Web" list.
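One way to picture the confidence-weighted aggregate rating of step 1015 is the short sketch below. The `(rating, confidence)` pair representation and the function name are assumptions for illustration; the disclosure does not prescribe a particular rating scale or formula.

```python
def aggregate_rating(ratings):
    """Confidence-weighted average of (rating, confidence) pairs.

    Returns None when there is nothing to aggregate (no ratings, or a
    total confidence of zero).
    """
    ratings = list(ratings)
    total_weight = sum(weight for _, weight in ratings)
    if total_weight == 0:
        return None
    return sum(rating * weight for rating, weight in ratings) / total_weight

# A closer member (confidence 1.0) rates +1; a more distant member
# (confidence 0.5) rates -1. The aggregate remains positive.
example = aggregate_rating([(1, 1.0), (-1, 0.5)])
is_positive = example is not None and example > 0
```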
At step 1020, the results list is optionally re-ranked using the aggregate rating. For example, during ranking, a base score may be generated for each hit (whether annotated or not) using conventional ranking algorithms. For hits with positive or negative aggregate ratings, a "bonus" may be determined from the ratings. Bonuses are advantageously defined such that positively rated sites tend to move up in the ranking, while negatively rated sites tend to move down. For example, if a low score corresponds to a high ranking, the bonus for a positive rating may be defined as a negative number, while the bonus for a negative rating may be defined as a positive number. In some embodiments, partial URL matches may receive a smaller bonus than exact URL matches. Unrated (or neutrally rated) hits do not receive a bonus. The bonus may be added (algebraically) to the base score to determine a final score for each hit, and the new ranking may be based on the final score.
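The bonus-based re-ranking of step 1020 might look like the sketch below, which adopts the convention that a lower score means a higher rank. The dictionary keys, the linear bonus formula, and the `bonus_scale` and `partial_factor` parameters are all assumptions chosen for illustration; the disclosure leaves the exact definition of the bonus open.

```python
def rerank(hits, bonus_scale=1.0, partial_factor=0.5):
    """Re-rank hits by base score plus a rating-derived bonus.

    Each hit is a dict with 'base_score', an optional 'aggregate_rating',
    and an optional 'exact_match' flag (hypothetical field names).
    Lower final scores rank higher.
    """
    def final_score(hit):
        rating = hit.get("aggregate_rating")
        if not rating:
            return hit["base_score"]  # unrated or neutral: no bonus
        bonus = -bonus_scale * rating  # positive rating -> negative bonus
        if not hit.get("exact_match", True):
            bonus *= partial_factor    # partial URL match: smaller bonus
        return hit["base_score"] + bonus

    return sorted(hits, key=final_score)

hits = [
    {"url": "a", "base_score": 1.0},
    {"url": "b", "base_score": 2.0, "aggregate_rating": 1.5},
    {"url": "c", "base_score": 0.8, "aggregate_rating": -1.0,
     "exact_match": False},
]
order = [h["url"] for h in rerank(hits)]
```

In this example the positively rated hit "b" moves ahead of the unrated hit "a", while the negatively rated hit "c" drops below it despite having the best base score.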
In some embodiments, re-ranking at step 1020 may also include discarding from the list of hits to be displayed any annotated hits that have a negative aggregate rating. In these embodiments, the search results page delivered to the user may include an indication of the number of hits discarded due to a negative aggregate rating and/or a "Show all hits" button (or other control) that allows the user to view the search results including the negatively rated hits. In another variation, the user may click on a link to see only hits that are negatively rated.
At step 1022, the "My Web" list is ranked and added to the search results page. In some embodiments, the ranking may be based on the base score or the final score described above. In other embodiments, hits in the "My Web" list are sorted by aggregate rating; hits with the same rating may be further sorted according to the base score described above. In other embodiments, hits in the "My Web" list are sorted based primarily on the number of trust network members who have annotated the hit, on which hit has an annotation from the closest member, and so on.
At step 1024, the search results page is modified based on the presence of annotations; for example, the highlighting and/or "Show My Web" button described above may be added to the annotated hits. The modified search results page (which in this case includes a personalized "My Web" section) is sent to the user at step 1010.
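As a small illustration of one of the orderings mentioned for step 1022 (aggregate rating first, base score as the tie-breaker), the sketch below sorts a "My Web" list. The field names are assumptions, and a lower base score is assumed to mean a better conventional rank.

```python
def sort_my_web(hits):
    """Sort by aggregate rating (highest first), then base score (lowest first)."""
    return sorted(hits, key=lambda h: (-h["aggregate_rating"], h["base_score"]))

my_web = [
    {"url": "p", "aggregate_rating": 0.5, "base_score": 1.0},
    {"url": "q", "aggregate_rating": 1.0, "base_score": 3.0},
    {"url": "r", "aggregate_rating": 1.0, "base_score": 2.0},
]
sorted_urls = [h["url"] for h in sort_my_web(my_web)]
```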
It should be appreciated that the processes described herein are exemplary and that variations and modifications are possible. The steps described as sequential may be performed in parallel, the order of the steps may be varied, and the steps may be modified or combined. In some embodiments, some or all of the annotated content of the hits, or the aggregated metadata of the hits, may be displayed in-line in a search results page prior to an explicit request from the querying user. For example, a visual highlighting element indicating a positive or negative aggregate rating may be displayed, or an aggregate keyword may appear under an automatically generated summary, and so forth. Additionally or alternatively, annotated metadata from individual trust network members (whether attributed to their respective authors or not) may be displayed. In other embodiments, the search results page may indicate which trust network members have annotated each annotated hit.
In other embodiments, annotations of trust network members may be used to identify hits during search operations. For example, in addition to searching the page index 170, the query response module 162 may search selected fields of the annotations of the trust network members using some or all of the same search terms as were used to search the page index 170. In one such embodiment, the annotated keyword and/or description fields are searched, and if a search term occurs in one of these fields, the annotated page is identified as a hit, regardless of whether the annotated page has been identified as a hit in the search of page index 170. In another embodiment, aggregated metadata (e.g., keywords aggregated over a trust network as described above) may also be searched.
E. Searching in personal Web
In some embodiments, a querying user may search for content that has been annotated by members of their trust network, rather than the entire Web. For example, search toolbar 706 of FIG. 7 includes a text box 706 and a "Search Web" button 709 that may be used to submit a query for searching the entire Web. Search toolbar 706 also includes a "My Web" button 720 that may be used to search for content annotated by the user's trust network members. Such content is referred to herein as a "personal Web"; in general, different users will have different personal Webs to the extent that they have different trust networks. In one embodiment, a user logged into search server 160 may enter a query in text box 706 and then activate button 709 to search the entire Web or activate button 720 to search their personal Web. In the latter case, the search may be generally similar to a conventional Web search, except that only hits having associated annotations from at least one member of the querying user's trust network are displayed. Personal Web search options may also be provided through other interfaces, such as from a conventional search interface page or from a search results page.
In another embodiment, a querying user can search for annotations of their personal Web in addition to or instead of page content. For example, search toolbar 706 may include a button (not explicitly shown) that launches a personal Web search interface via which a querying user may define a desired scope of a search.
FIG. 11 is an example of a personal Web search interface page 1100 in accordance with an embodiment of the present invention. Page 1100 provides a user interface for field-specific searches within the user's personal Web. Scope section 1102 allows the user to indicate whether the search should include annotated content from other trust network members, or just the user's own annotated content, or the entire Web, including annotated content from all users. The "Show My Trust Network" button 1104 advantageously allows the user to navigate to a "My Trust Network" page 600 (FIG. 6) or similar page, where the user can view and modify their current trust network definition, and then return to page 1100. In some embodiments, the user may also be able to view a list of their trust network members and select one or more individual members, thereby limiting the search to annotations made by those members.
The query portion 1112 of page 1100 provides various text boxes into which a user may enter search terms to search page content and/or particular fields of the annotations. In this example, the user may individually specify search terms for page content (text box 1114), annotation title (text box 1116), keyword field (text box 1118), description (text box 1120), and/or presentation (text box 1121). Radio buttons 1122 may be used to constrain the rating of hits (e.g., the aggregate rating or average rating described above). By default, "any rating" is selected so that the rating does not limit the search; the user may instead choose to limit the search to hits with positive ratings or hits with negative ratings, for example. The "Search" button 1126 submits the query for processing, and the "Reset" button 1128 clears all fields in the query portion 1112.
It should be understood that the user may leave some or all of the text boxes in portion 1112 empty; where a text box is empty, the corresponding field is not used to constrain the search. For example, a user may search the page content of their personal Web by entering search terms in text box 1114 and leaving the other text boxes empty; the actual search may be performed using the page index 170, and any hits that do not correspond to annotated pages or sites are discarded before the results are sent to the user. Search results are advantageously delivered using a search results page similar to page 900 (FIG. 9A) or 940 (FIG. 9B) described above, except that when the search is restricted to the user's personal Web, there is at least one annotation for each hit.
FIG. 12 is a flow diagram of a process 1200 for responding to a query submitted via page 1100 or another interface for searching the personal Web, according to an embodiment of the present invention. At step 1202, a query is received from a user. At step 1204, a determination is made whether the querying user is logged in. If not, the user may be prompted to log in at step 1206, or the operation may be aborted. At step 1208, the user's trust network members are identified; this step may be generally similar to step 1012 of process 1000 (FIG. 10) described above. At step 1210, annotations made by the trust network members (including the querying user) are retrieved from the personalization database 166.
At step 1212, depending on the query, search hits are identified based on the page content and/or annotation content. Where page content is to be searched, information about the page content may be obtained from the page index 170, or from annotations in the personalization database 166 (if a representation of the page content is stored therein). The other fields are searched using the annotations of the trust network members obtained from the personalization database 166. Regardless of the particular search algorithm, a page is advantageously identified as a hit only if at least one member of the querying user's trust network has annotated the page. For example, where page content is to be searched, the search may be performed on the entire corpus represented in the page index 170, with the resulting global list of hits filtered based on the presence or absence of annotations, or the annotations retrieved at step 1210 may be used to generate a pool of documents represented in the page index 170 to be searched.
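The treatment of empty text boxes in a field-specific search can be sketched as follows. This is an illustrative simplification: annotations are plain dictionaries, matching is a case-insensitive substring test, and the function and field names are assumptions rather than part of the disclosure.

```python
def search_annotations(annotations, field_terms):
    """Return URLs of annotations satisfying every non-empty field term.

    field_terms maps a field name (e.g. 'title', 'keywords') to the text
    the user typed; empty entries do not constrain the search.
    """
    active = {field: term.lower()
              for field, term in field_terms.items() if term}
    urls = []
    for ann in annotations:
        if all(term in ann.get(field, "").lower()
               for field, term in active.items()):
            if ann["url"] not in urls:
                urls.append(ann["url"])
    return urls

library = [
    {"url": "http://example.com/bikes", "title": "Road bikes",
     "keywords": "cycling gear", "description": "Good reviews"},
    {"url": "http://example.com/boots", "title": "Hiking boots",
     "keywords": "hiking gear", "description": ""},
]
results = search_annotations(library, {"title": "", "keywords": "gear",
                                       "description": "reviews"})
```

Because the title box is empty, only the keyword and description terms constrain the result here.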
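The first strategy mentioned for step 1212, searching the whole corpus and then filtering by the presence of annotations, reduces to a simple filter. In this sketch the global hit list stands in for results from page index 170, exact URL equality is assumed for brevity (site-level partial matching is omitted), and the function name is hypothetical.

```python
def filter_to_personal_web(global_hits, annotated_urls):
    """Keep only hits annotated by at least one trust network member.

    The ranking order of the global hit list is preserved.
    """
    annotated = set(annotated_urls)
    return [url for url in global_hits if url in annotated]

ranked_hits = ["http://a.example/", "http://b.example/", "http://c.example/"]
personal = filter_to_personal_web(ranked_hits,
                                  ["http://c.example/", "http://a.example/"])
```

The alternative strategy in the text (building the document pool from the annotations first) would yield the same set of hits but limits the search to a smaller corpus up front.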
In some embodiments, hits are re-ranked or highlighted based on average ratings. Thus, at step 1214, an average rating for each hit is calculated, similar to step 1015 of process 1000 (FIG. 10) above. At step 1216, similar to step 1020 of process 1000, hits are re-ranked using average ratings. At step 1218, any desired highlighting or metadata may be added to the hit list. For example, as described above, visual highlighting may be applied to each hit to reflect an average rating for that hit; a "show my Web" button may be associated with each hit to allow the user to view an annotation of the individual trust network member; or metadata extracted from individual annotations and/or aggregated metadata (e.g., average ratings or aggregated keyword sets) may be added to the list. At step 1218, the search results page (including the hit list) is returned to the querying user.
It should be appreciated that the search interface and search process described herein are exemplary and that variations and modifications are possible. The process steps described as sequential may be performed in parallel, the order of the steps may be varied, and the steps may be modified or combined.
The query interface may vary. For example, in another interface, a single text box is provided and the user is prompted to select whether the search terms in the text box should be searched in the page content and/or in various fields of the annotation record (e.g., title, keywords, description, and/or other fields). In another embodiment, a "basic" search interface with a single text box is provided by default, and a search is performed on the page content and one or more pre-selected annotation fields. The user may accept the basic search configuration or choose to view query portion 1112 (or another query interface) to enter a more complex query. Other query interfaces and combinations of interfaces are also possible.
In some embodiments, the search page 1100 may also be accessed via a button on a toolbar (e.g., button 720 of toolbar 706 in FIG. 7) or other suitable element of the persistent user interface, or from the main page of the search provider. If a user that is not logged into search server 160 attempts to access page 1100, the user may be prompted to log in before page 1100 is displayed.
Additionally, although the term "personal Web" is used above, it should be appreciated that in a manner similar to that described above, a "personal" version of any document corpus accessed by multiple users may also be defined and searched.
F. Browsing personal Web
In some embodiments, a user may browse their personal Web without entering a query. For example, using a suitably configured interface, a user can browse by folder through their own annotations or through the annotations made by their trust network members.
In other embodiments, the user is able to search for other documents (e.g., pages or sites) that are similar to or related to a page or site that has been annotated by their trust network members. A "similar" document is a document that contains content that meets some similarity criterion with respect to the annotated page. Examples of similarity criteria include: having a certain number of words, phrases, or other multi-word units in common; having similar patterns of occurrence of words, phrases, or other multi-word units; belonging to the same category or closely related categories in a system-defined taxonomy; and so on. Algorithms for determining similarity between two pages are known in the art and may be used with the present invention. A "related" document shares a portion of its URL (e.g., at least the domain name) with the annotated page; again, known algorithms for determining the degree of relatedness may be used.
In another embodiment, the user may be able to browse by correlations among annotations. For example, a user may be able to select a "start" page or site and obtain a list of other pages or sites that are most frequently annotated by those users who have annotated the start page or site.
Whenever an annotated document is displayed, a user may be able to initiate a search for similar, related, or correlated documents from a search results page or toolbar interface. For example, the overlay 800 of FIG. 8 or the toolbar 706 of FIG. 7 may include control elements through which these searches may be initiated.
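The "start page" example can be sketched as a co-annotation count: for every user who annotated the start page, tally the other pages that user annotated. The `annotations_by_user` mapping and the function name are assumptions for illustration only.

```python
from collections import Counter

def co_annotated_pages(start_page, annotations_by_user, top_n=10):
    """Return (page, count) pairs for pages most often annotated by the
    users who also annotated start_page, most frequent first."""
    counts = Counter()
    for pages in annotations_by_user.values():
        if start_page in pages:
            counts.update(p for p in set(pages) if p != start_page)
    return counts.most_common(top_n)

annotations_by_user = {
    "u1": ["start", "x", "y"],
    "u2": ["start", "x"],
    "u3": ["x", "z"],  # did not annotate the start page, so ignored
}
related = co_annotated_pages("start", annotations_by_user)
```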
In other embodiments, a user may be able to view information about behavior in their personal Web. For example, page 1100 (FIG. 11) or another personal Web interface can include various controls (not shown in FIG. 11) that allow a user to view a list of information. In one embodiment, a user may view a list of pages or annotations recently added to their personal Web. In another embodiment, the user may view a list of pages that have been annotated by the maximum number of trust network members, or a list of pages that have the highest average rating or aggregate rating within their trust network. In another embodiment, a user may view a list of pages most frequently visited by their trust network members over a certain period of time. Any one of these lists or others may also include metadata from the annotations, summaries or aggregations of metadata from the annotations, and so forth.
In yet another embodiment, such information may be used to respond to queries. For example, a list of annotated pages or sites may be provided where a user's query (or keywords from a user query) matches an introductory field in at least one trust network member's annotation. Other variations, additions, and modifications are also possible.
G. Personal Web statistical information
In some embodiments, a user may be able to view statistical information about the behavior of their trust network members.
For example, a user may be able to view statistics about queries submitted by their trust network members to the search server 160 over a certain period of time, such as the most popular queries within their trust network, the queries whose popularity has changed most dramatically, and so forth. Such a list may be similar to the existing "Buzz" feature provided by Yahoo! (the assignee of the present application), but includes only queries submitted by the user's trust network members.
In other embodiments, other statistical information may be obtained. For example, a user may be able to view a list of the most popular pages (or sites) among their trusted network members, such as measured by the number of members who have annotated the same page or site or by the average rating given by the members who have annotated the page. Another list may include pages or sites that have been recently annotated by members; entries in such a list may indicate who has annotated the page and may also provide links to view the page and/or new annotations. The user may also be able to filter these lists, for example by specifying that the annotation should include a particular keyword (or keywords).
H. Restricting access to annotations
As described above, in embodiments of the invention, some or all of one user's annotations may be visible to other users connected to the first user through a trust relationship. Although each user typically has the ability to identify their friends, in some embodiments, the user may not have the ability to prevent other users from identifying them as friends. Thus, it may be desirable to allow a user to establish privacy settings to control whether other users may view any or all of their annotations. In some embodiments, a folder record (see, e.g., FIG. 4) or annotation record includes two additional fields related to managing access: a privacy level (field 416) and an access list (field 418). Where a privacy level is established for a folder, the privacy level applies to all annotations within the folder. In some embodiments, the user may establish a default privacy level for the folder and then override the default value for individual annotations within the folder.
In one embodiment, the privacy level may be set to "public," "shared," or "private." If an annotation (or its folder) is marked as "public," the annotation is (at least potentially) visible to any other registered user of the system whose trust network includes the annotating user. "Visible to a user" in this context means that the annotation may be presented to that user in a display, such as overlay 800, or may be used to determine the aggregate metadata of that user's trust network. For example, referring to the trust relationships shown in FIG. 5, if user A's trust network is defined to include all users up to two degrees of separation, user G will be within user A's trust network, and user A will be able to see whichever of user G's annotations have been marked "public" by user G.
If an annotation (or its folder) is marked as "shared," the annotation can be seen by another user only if: (1) the annotating user is in the other user's trust network; and (2) the other user is in the annotating user's trust network. For example, referring again to FIG. 5, even if user G is in user A's trust network, user A cannot see any of user G's annotations that user G has marked as "shared" because user A is not in user G's trust network. On the other hand, users A and C will be able to see each other's "shared" annotations.
If an annotation (or its folder) is marked as "private," the annotation can be seen by another user only if: (1) the annotating user is in the other user's trust network; and (2) the other user is on the access list of the annotation (or folder). Like other privacy settings, the access list of a private annotation is advantageously maintained by the author of the annotation. For example, referring again to FIG. 5, user A can see an annotation that user C has marked as "private" only if user C has placed user A on the access list for that annotation. Thus, a user may hide certain annotations from some or all of their friends.
In a preferred embodiment, any annotations are always visible to their authors, regardless of their privacy level.
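The three privacy levels and the author rule can be combined into a single visibility check, sketched below. The dictionary layout, the `trust_network_of` mapping (viewer to the set of members in that viewer's trust network), and the function name are assumptions made for illustration.

```python
def is_visible(annotation, viewer, trust_network_of):
    """Decide whether an annotation is visible to a viewer.

    annotation is a dict with 'author', 'privacy' ('public', 'shared' or
    'private') and an optional 'access_list' (hypothetical field names).
    """
    author = annotation["author"]
    if viewer == author:
        return True  # authors always see their own annotations
    if author not in trust_network_of.get(viewer, set()):
        return False  # the author must be in the viewer's trust network
    level = annotation["privacy"]
    if level == "public":
        return True
    if level == "shared":
        # The viewer must also be in the author's trust network.
        return viewer in trust_network_of.get(author, set())
    if level == "private":
        return viewer in annotation.get("access_list", set())
    return False  # unknown levels are treated as not visible

# B is in A's trust network, but A is not in B's.
networks = {"A": {"B"}, "B": {"C"}}
public = {"author": "B", "privacy": "public"}
shared = {"author": "B", "privacy": "shared"}
private_open = {"author": "B", "privacy": "private", "access_list": {"A"}}
private_closed = {"author": "B", "privacy": "private", "access_list": set()}
visible = [is_visible(a, "A", networks)
           for a in (public, shared, private_open, private_closed)]
```

Applying such a check while retrieving annotations keeps invisible annotations out of both the display and any aggregate rating computed for the viewer.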
To further illustrate the use of folder privacy settings, reference is made to FIG. 13, in which list 1302 shows privacy levels for various folders (Main and F1-F4) that may be defined by user B, together with the annotations (J1-J10) created by user B that may be included in each folder; list 1304 shows user B's trust network members; and list 1306 shows user A's trust network members. Assume that user A enters a query that is processed according to process 1000 (FIG. 10) described above. At step 1012, it will be determined that user B is a member of user A's trust network. At step 1013, user B's folder tree (see list 1302) will be traversed to retrieve user B's annotations. Folder "Main" is marked as "public"; thus, annotations J1-J3 are visible to user A and will be retrieved for use in responding to user A's query. Folder "F1" is marked as "private" and user A is not granted access; thus, annotations J4 and J5 are not visible to user A and will not be retrieved. Folder "F2" is also marked as "private," but user A is granted access; thus, annotation J6 is visible to user A and will be retrieved. Folder "F3" is marked as "public"; thus, annotations J7 and J8 will be retrieved. Folder "F4" is marked as "shared," but it is not visible to user A because user A is not in user B's trust network; thus, annotations J9 and J10 are not visible to user A and will not be retrieved. Thus, in process 1000, visible annotations J1-J3 and J6-J8 would be retrieved and used in responding to user A's query, while invisible annotations J4, J5, J9, and J10 would not. From user A's perspective, it is as if the invisible annotations did not exist; at step 1015 of process 1000, the aggregate trust network rating for any hit that user B has rated in an invisible annotation is computed as if user B had not annotated that hit.
It should be appreciated that other privacy mechanisms may be provided in addition to or in lieu of those described herein. More or fewer privacy levels may be defined. In some embodiments, access to an author's "shared" annotations or folders may be determined with reference to data other than the author's trust network, such as the author's IM friends list, email address book, members of a Yahoo! group or other online community selected by the author, and so forth.
In another embodiment, information sharing may be controlled based on keywords used in a particular annotation. For example, the annotating user may be able to specify that all annotations containing the keyword "cycling" should be treated as public, all annotations containing the keyword "football" should be treated as shared, and so on. Where an annotation includes keywords assigned different privacy levels, a system-wide rule may be applied to determine whether sharing of the annotation should be governed by the more restrictive or the less restrictive privacy level.
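Keyword-based privacy with a system-wide conflict rule could be sketched as below. Several things here are assumptions rather than part of the disclosure: the ordering public < shared < private used to compare restrictiveness, the `default` level applied when no keyword has an assigned level, and all names.

```python
# Assumed ordering of restrictiveness, used only to resolve conflicts.
RESTRICTIVENESS = {"public": 0, "shared": 1, "private": 2}

def privacy_from_keywords(keywords, keyword_levels, default="private",
                          prefer_restrictive=True):
    """Pick a privacy level for an annotation from its keywords.

    keyword_levels maps a keyword to the level its author assigned.
    prefer_restrictive models the system-wide rule for conflicts.
    """
    levels = [keyword_levels[k] for k in keywords if k in keyword_levels]
    if not levels:
        return default
    pick = max if prefer_restrictive else min
    return pick(levels, key=RESTRICTIVENESS.__getitem__)

rules = {"cycling": "public", "football": "shared"}
level = privacy_from_keywords(["cycling", "football"], rules)
```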
In some embodiments, the metadata may be aggregated on a global scale (e.g., annotations relating to all registered users of search server 160). For example, a global rating for a page may be determined by averaging all user-provided ratings for the page. In some embodiments, privacy settings established by the author are considered during global aggregation; for example, only annotations marked as "public" may be used. In other embodiments, the privacy settings are ignored and all annotations are used.
Static sharing of annotations
In some embodiments of the invention, users may also share their annotations by distributing copies of their annotations to other users. Unlike the dynamic sharing described above, static sharing advantageously results in the receiving user obtaining their own copy of the annotations, which the receiving user can edit, delete, or otherwise modify without affecting the sharing user's annotations.
A. Exporting and importing annotations
In some embodiments, the user may export and import annotations. For example, an "export" user may send all annotations (or a selected subset of such annotations) in their library to another user, who may then choose to "import" such annotations into their own library. An embodiment supporting annotation export and import will be described below.
In one embodiment, an interface page is provided via which a user can view and edit their own annotations. FIG. 14 is an example of a library interface page 1400; similar interfaces are described in the above-referenced application No. ____ (attorney docket No.017887-013720 US). By operating the view option in control section 1402, the user can create a customized list of their own annotations in list section 1404.
Each annotation displayed in the list section 1404 has a check box 1406 that is operable to select the annotation for export. Once the selection has been made (by checking or unchecking the various boxes 1406), the user may operate a button 1408 to export the selected annotations. Alternatively, the user may operate button 1410 to export all annotations listed in section 1404 without regard to check boxes 1406.
When the user activates button 1408 or 1410, a transmissible representation of the selected annotations is created. For example, some or all of the metadata for each annotation being exported may be retrieved from personalization database 166, reformatted as necessary (e.g., inserted into one or more web pages), and placed into a temporary storage area from which the metadata may be retrieved using an appropriate resource identifier (e.g., a URL).
The exporting user is prompted to identify the delivery method (e.g., IM, email) and provide the appropriate identifier (e.g., IM screen name, email address) of the recipient or recipients. In a preferred embodiment, no trust relationship between the exporting user and the recipient is required; the exporting user can export his annotations to anyone of his choice. The exported annotations, or other data indicating that the exported annotations are available, are communicated to the identified recipients. The notification mechanism depends on the delivery method; for example, suitably configured email messages or instant messages may be used.
Each recipient has the option of importing the annotations into their own library. In one embodiment, an email or IM client may be configured to recognize that an incoming message contains one or more annotations and ask the user whether to import the annotations. In another embodiment, the exported annotations are encapsulated into a displayable web page, and the URL of the page is passed to the recipient, e.g., via email or IM. The recipient can view the exported annotations and choose which annotations, if any, to import. FIG. 15 is an import interface page 1500 that can be referenced by a URL sent to the recipient. If the recipient is not logged in when he navigates to page 1500, he may be prompted to log in before viewing the page or importing any annotations.
A header 1502 identifies the source of the annotations (e.g., by displaying the user ID of the exporting user). The list 1504 includes fields selected from each annotation. In this example, Title, URL, Keywords, Description, and Rating fields are shown. In other embodiments, other fields may be displayed in addition to or in place of the fields shown in FIG. 15, and the importing or exporting user may select the fields to be displayed. Each entry may include an active link via which the recipient may navigate from page 1500 to the subject page.
Each entry in list 1504 includes a check box 1506 that the recipient can check or clear. Control buttons are provided that enable the recipient to import the selected items (button 1508) or import all of the items (button 1510). Other controls may also be provided.
When the recipient imports an annotation, a new annotation record (e.g., as described above with reference to FIG. 3) is advantageously created and added to the personalization database 166. The author of the new annotation is the importing user (not the exporting user), and the "introduction" field of each imported annotation advantageously identifies the exporting user as the source of the annotation. The "old introduction" field may include introduction information from the exporting user's annotation or may be reset to a default (e.g., null) value. The "last update" field may be updated to reflect when the annotation was imported, and any counters or other statistical information associated with the annotation (e.g., last access, number of accesses) may be reset for the importing user. Thereafter, the imported annotation is treated as if it had been created by the importing user. For example, it is visible to the importing user, regardless of any privacy settings, and the importing user can edit or delete it.
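The import step described above can be illustrated with a minimal sketch. The record layout here is an assumption made for illustration only: the field names (`introduction`, `old_introduction`, `last_update`, `access_count`) and the function `import_annotation` are hypothetical and are not taken from FIG. 3 or from any actual implementation.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class Annotation:
    # Hypothetical record layout; the real fields are those of FIG. 3.
    author: str
    url: str
    title: str
    keywords: Tuple[str, ...]
    description: str
    rating: Optional[int]
    introduction: Optional[str] = None      # user the annotation came from
    old_introduction: Optional[str] = None  # earlier source, if any
    last_update: Optional[datetime] = None
    access_count: int = 0


def import_annotation(exported, importing_user, exporting_user, now=None):
    """Return a new record owned by the importing user.

    The exported record is left untouched, so later edits by the importer
    cannot affect the exporting user's copy.
    """
    now = now or datetime.now(timezone.utc)
    return replace(
        exported,
        author=importing_user,                   # importer becomes the author
        introduction=exporting_user,             # exporter is recorded as the source
        old_introduction=exported.introduction,  # could instead be reset to None
        last_update=now,
        access_count=0,                          # statistics restart for the importer
    )
```

Because the sketch copies the record rather than linking to it, it reflects the static nature of this kind of sharing: the two users' annotations evolve independently after the import.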
B. Publishing annotations
In addition to exporting annotations to other users, users may publish their annotations. As used herein, "publishing" annotations refers to the automatic distribution of a user's annotations via any suitable channel, and may include periodic republication to reflect changes made by the publishing user. The republishing of annotations or the publishing of updates may occur at regular intervals, in response to changes in information, or on some other schedule. For some publication channels, the publishing user may have some control over who receives the data; for other channels, the receiving user determines which published information to view.
In one embodiment, the user may specify some or all of his folders for publication using the publication flag described above (see FIG. 4); in other embodiments, the user may specify individual annotations for publication, or may control publication based on the presence or absence of keywords in the annotations. An automated distribution process performed by the search server 160 of FIG. 2, or another suitably configured server, identifies any annotations to publish (or republish) and generates a publish message appropriate for the publication channel.
Various technologies and channels may be used to support publication. In one embodiment, the annotations selected for publication may be used to periodically update an RSS feed (RSS stands for Really Simple Syndication, also known as Rich Site Summary or RDF (Resource Description Framework) Site Summary). Subscribers to the RSS feed will receive notification of the updated annotations and will be able to select whether to import the annotations, e.g., using an interface similar to the import page 1500 described above. In another embodiment, URLs pointing to the publishing user's updated annotation list (e.g., pointing to an import web page such as page 1500) can be periodically distributed to email lists identified by the user, periodically posted to a community's bulletin boards or chat rooms, and so forth. Each user on the email list may then follow the URL and import any or all of the listed annotations. In another embodiment, the list (or updates to the list) may be automatically published onto a blog (Web log) maintained for the publishing user. In another embodiment, the user may maintain a publicly accessible web page incorporating annotations, and the web page may be updated automatically from time to time.
Annotations in user communities
A. Expert filtering of content
In some embodiments of the invention, a user may search within a library of pages or sites that have been annotated by members of a community; such a library is referred to herein as a "community Web". Users may or may not be affiliated with the community, and community members may or may not have well-defined trust relationships between them.
For example, in one embodiment, registered users of search server 160 (FIG. 2) may actively join an online community (e.g., a Yahoo! group), whose members may communicate via a dedicated message board, email list, chat room, or the like maintained or hosted by the provider of search server 160. The personalization database 166 (or another database) advantageously includes a list of user identifiers for the members of each such community. Another user (whether or not a member of the community) may perform a search of the community's content. Users who are exploring popular topics with which they are unfamiliar may be interested in this feature. Thus, for example, a user unfamiliar with the "Harry Potter" books may be interested in searching for information about them. Searching the Web with the query "Harry Potter" will return millions of hits (too many for the user to visit in a reasonable time), and the user does not know which of these millions of pages or sites are worth visiting. By limiting the search to pages or sites that have been evaluated by members of a community of Harry Potter fans, the user can draw on the knowledge and opinions of these fans, thereby quickly finding content that is likely to be reliable and useful.
FIG. 16A illustrates an interface page 1600 for searching the community Web, according to an embodiment of the present invention. The user may access page 1600, for example, by operating an appropriate button on a search toolbar or from a search interface page.
Section 1602 enables the user to specify which community or communities to use to define the community Web to be searched. At 1604, the currently selected community or communities are listed, and a button 1606 can be used to change the selection.
More specifically, FIG. 16B illustrates a community selection page 1610 according to an embodiment of the invention. Page 1610 may be displayed when the user operates button 1606. On the left side, a list 1612 of communities ("ABC" and "QRS") of which the querying user is a member is presented. Next to each group name is a checkbox 1614, and the user can check the checkbox 1614 to select the group, or uncheck the checkbox 1614 so that the group is not selected. In this embodiment, the user may select multiple communities; in other embodiments, the user may be limited to selecting only one community at a time.
On the right side is a search interface 1616 that enables a user to find and select communities of which the user is not a member. The user may search for communities by name using text box 1618 and/or by keywords using text box 1620. The search is performed when the user presses the Submit button 1622. The search for the community is advantageously performed on a searchable directory of communities (e.g., the Yahoo! group directory) maintained by the provider of the search server 160. The directory advantageously includes the name of each community and a brief description of that community. In one embodiment, the search term entered into the text box 1618 is matched to the community name, and the search term entered into the text box 1620 is matched to the description as well as the name.
The search results, in this case the name and (optionally) brief description of any community matching the query, are displayed in area 1624. The number of listed communities may be limited to, for example, 10 (or some other number), and communities may be selected for listing or ranking within a list based on various criteria. In some embodiments, the criteria relate to the likelihood that the community will provide a useful library of annotated content. For example, a community may be selected based on the number of members, the total number of pages or sites that have been rated by the members, the amount of activity in the community's message board, email list or chat room, and so forth. These or similar types of statistics may be displayed in area 1624.
The user may select one or more of the listed communities using checkbox 1626. In a preferred embodiment, checking checkbox 1626 does not result in the user joining the community, and does not provide the user with any information about the individual community members. The "Finished" button 1628 allows the user to return to page 1600 (FIG. 16A) with the one or more newly selected communities; the new selection will be shown at 1602 when page 1600 is redisplayed. A "Cancel" button 1630 on page 1610 allows the user to return to page 1600 without changing the selection.
Referring again to FIG. 16A, at page 1600, the user enters a query in query section 1630. Query section 1630 provides various boxes in which the user can enter search terms that are specific to particular fields of metadata in the annotation. In this example, the user may specify search terms for the page content (text box 1632) and/or annotation fields, such as title (text box 1634), keywords (text box 1636), description (text box 1638), and/or introduction (text box 1640). It should be understood that the user need not enter search terms in all text boxes of section 1630; fields corresponding to boxes without search terms are not used to constrain the search. The user may also specify a desired rating using radio button 1642. A "search" button 1644 submits the query for processing, and a "reset" button 1646 clears all fields in the query section 1630. Thus, the query section 1630 for searching the community Web may generally resemble a personal Web query interface (e.g., FIG. 11).
The process for searching the community Web may be generally similar to the process for searching the personal Web (e.g., FIG. 12). However, the query received from the user will identify the selected community (or communities) whose community Web is to be searched, and step 1208 will include identifying all members of the specified community rather than the members of the querying user's trust network. The identification of the community members may be independent of trust relationships. The search is limited to documents that have been annotated by at least one member of the selected community.
In a preferred embodiment, the privacy settings of the community members may be applied during the community Web search, and the community members are treated as if they were members of the querying user's trust network. With the privacy settings described above, each community member's "public" annotations are used in all cases; "shared" annotations are used only if the querying user is in that community member's trust network; and "private" annotations are used only if the querying user is on the access list for that annotation.
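The three privacy rules above can be sketched as a simple filter. This is an illustrative sketch only: the `CommunityAnnotation` record, its `privacy` and `access_list` fields, and the `trust_networks` mapping are hypothetical names introduced here, not structures defined in this description.

```python
from dataclasses import dataclass


@dataclass
class CommunityAnnotation:
    # Hypothetical record; only the fields needed for the privacy check.
    author: str
    url: str
    privacy: str = "public"               # "public", "shared", or "private"
    access_list: frozenset = frozenset()  # users allowed to see a private annotation


def visible_annotations(annotations, querying_user, trust_networks):
    """Keep the annotations that the querying user is allowed to draw on.

    trust_networks maps each author to the set of users in that
    author's trust network (an assumed representation).
    """
    visible = []
    for a in annotations:
        if a.privacy == "public":
            visible.append(a)  # public annotations are always usable
        elif a.privacy == "shared":
            # usable only if the querying user is in the author's trust network
            if querying_user in trust_networks.get(a.author, set()):
                visible.append(a)
        elif a.privacy == "private":
            # usable only if the querying user is on the annotation's access list
            if querying_user in a.access_list:
                visible.append(a)
    return visible
```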
Additionally, the use of annotation metadata in identifying and reporting hits may vary. For example, the search for keywords may be based on an aggregation of keywords for community members. In one embodiment, a keyword match is detected only if some minimal portion of the community members that annotate the page use the keyword. In another embodiment, a keyword match is detected if the keyword is used by at least one community member. Similarly, whether a page meets the rating requirements may be determined based on an average rating of the community members that annotated the page, or based on whether a minimum number of community members have given a specified rating to the page, or based on whether at least one of the community members has given a specified rating to the page.
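As a rough sketch of the alternatives just described, the following functions test a keyword match and a rating requirement against the annotations of the community members who annotated one page. The function names, the `min_fraction` and `min_count` parameters, and the reading of "a specified rating" as "at least the required rating" are all assumptions made for illustration.

```python
def keyword_matches(keyword, member_keywords, min_fraction=0.0):
    """member_keywords holds one set of keywords per annotating member.

    With min_fraction == 0 a single member using the keyword suffices;
    otherwise at least that fraction of annotating members must use it.
    """
    if not member_keywords:
        return False
    users = sum(1 for kws in member_keywords if keyword in kws)
    if min_fraction <= 0:
        return users >= 1
    return users / len(member_keywords) >= min_fraction


def meets_rating(ratings, required, mode="average", min_count=1):
    """ratings holds one rating per annotating member.

    mode="average": the mean rating must reach the required value.
    mode="count":   at least min_count members must have given the
                    required rating or better (min_count=1 corresponds
                    to the "at least one member" variant).
    """
    if not ratings:
        return False
    if mode == "average":
        return sum(ratings) / len(ratings) >= required
    if mode == "count":
        return sum(1 for r in ratings if r >= required) >= min_count
    raise ValueError(f"unknown mode: {mode}")
```

For example, with four annotating members of whom two used the keyword "potter", `keyword_matches("potter", ..., min_fraction=0.5)` succeeds, while a keyword used by only one of the four fails the same threshold but still matches under the "at least one member" rule.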
In some embodiments, the annotations of each community member may be given equal weight. In other embodiments, the weight given to each rater's annotations may be determined by the total trust weight assigned to the rater by other members of the community, by the number of community members whose friends lists include the rater, by the rater's community-specific or global reputation score (e.g., as described below), or by other factors.
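A weighted aggregate of this kind might be computed as in the sketch below. The function name and the dictionary-based inputs are hypothetical; how the weights themselves are derived (trust weights, friends-list counts, reputation scores) is left open, as in the text.

```python
def weighted_average_rating(ratings, weights=None):
    """ratings maps rater -> rating; weights maps rater -> weight.

    Raters missing from weights get weight 1.0, so passing no weights
    gives every annotation equal weight. Returns None if nothing counts.
    """
    weights = weights or {}
    total = 0.0
    total_weight = 0.0
    for rater, rating in ratings.items():
        w = weights.get(rater, 1.0)
        total += w * rating
        total_weight += w
    if total_weight == 0:
        return None
    return total / total_weight
```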
When search results are reported to a user, the user is advantageously restricted from accessing metadata from individual community members. For example, in one embodiment, the search results provide only an aggregate list of the average ratings and/or keywords for each hit, and may also indicate information such as the number or proportion of community members that have annotated the hit. Such information may allow a querying user to assess the quality of the information that they are retrieving without exposing any information about the identity or annotation of the individual community members.
In another embodiment, the search results may provide anonymous excerpts from individual annotations. For example, an excerpt from the description field may be included without attribution to a particular author, or a list of all keywords may be reported (alphabetically or by frequency) without attributing the keywords to individuals, or an unattributed list of ratings (e.g., in chronological order) may be included.
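The aggregated, non-attributed reporting described in the two preceding paragraphs can be sketched as follows. The dictionary keys (`author`, `rating`, `keywords`) and the shape of the returned summary are assumptions for illustration; the point of the sketch is that author identities are used only for counting and never appear in the output.

```python
from collections import Counter


def aggregate_hit(annotations, community_size):
    """Summarize the annotations for one hit without exposing authors.

    annotations is a list of dicts with 'author', 'rating', and
    'keywords' entries (an assumed layout).
    """
    ratings = [a["rating"] for a in annotations if a.get("rating") is not None]
    # Count each keyword once per annotation, then order by frequency,
    # breaking ties alphabetically.
    counts = Counter(kw for a in annotations for kw in set(a.get("keywords", ())))
    keywords = [kw for kw, _ in sorted(counts.items(), key=lambda item: (-item[1], item[0]))]
    annotators = len({a["author"] for a in annotations})
    return {
        "average_rating": sum(ratings) / len(ratings) if ratings else None,
        "keywords": keywords,
        "annotator_count": annotators,
        "annotator_fraction": annotators / community_size if community_size else None,
    }
```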
In other embodiments, the user may be able to view information about activity in the community Web. For example, page 1600 (FIG. 16A) or another interface page may include various controls (not shown) that allow the user to view lists of information. In one embodiment, the user may view a list of pages or annotations recently added to the community Web. In another embodiment, the user may view a list of pages that have been annotated by the largest number of community members or a list of pages that have the highest average rating within the community. In another embodiment, the user may view a list of pages most frequently visited by the community members. Any of these lists, or others, may also include aggregated or anonymous annotation information, similar to the community Web search results described above. Privacy settings established by the community members are also advantageously taken into account in this context.
It should be appreciated that the community Web is similar in many respects to the personal Web, particularly where the trust network of the user's personal Web is defined with reference to a community rather than individual friends. Thus, any of the above-described search and browse operations described for the personal Web can also be extended to the community Web. However, in the case where the user accessing the community Web is not a community member, the information identifying the individual community member is advantageously not available to the accessing user.
B. Suggesting communities
In some embodiments, the search provider may analyze patterns in user A's annotations and, based on these patterns, identify communities that user A may be interested in joining. For example, a search provider may select an interest-based community G (e.g., a Yahoo! group) and identify the pages included in the community Web of that community; the provider may also determine the average rating that members of community G have given to some number of annotated pages.
Assuming that user A is not already a member of community G, user A's annotation library can then be compared to the community Web of community G to detect affinity between the two. As used herein, "affinity" generally refers to a pattern of common interest and/or taste, and can be measured in a variety of ways. For example, the number of pages in the community Web of community G that user A has also annotated may be used to measure affinity. As another example, the correlation between the ratings given by user A and by community G to the same pages may be measured. The correlation between user A's keywords for a specific page and the aggregated keywords of community G for that page may also be used. In another embodiment, if a log of each user's queries is maintained, the patterns in user A's queries may also be compared to the patterns in queries entered by members of community G to determine whether user A and the members of community G have similar interests and tastes. If the affinity is sufficiently high, the provider issues a suggestion (e.g., via email) that user A consider joining community G. Alternatively, the provider may issue a suggestion to a representative of community G that the representative consider inviting user A to join.
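Two of the affinity measures mentioned above, page overlap and rating correlation, can be sketched as follows. The text does not name a particular correlation statistic; the Pearson coefficient is used here purely as an assumption, and the function names and dictionary inputs are hypothetical.

```python
from math import sqrt


def overlap_affinity(user_pages, community_pages):
    """Number of pages in the community Web that the user has also annotated."""
    return len(set(user_pages) & set(community_pages))


def rating_correlation(user_ratings, community_ratings):
    """Pearson correlation over the pages rated by both sides.

    user_ratings maps page -> the user's rating; community_ratings maps
    page -> the community's average rating. Returns None when fewer than
    two pages are shared or when either side has no variation.
    """
    common = sorted(set(user_ratings) & set(community_ratings))
    n = len(common)
    if n < 2:
        return None
    xs = [user_ratings[p] for p in common]
    ys = [community_ratings[p] for p in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)
```

A provider following this sketch would compare one or both values against thresholds of its own choosing before issuing a suggestion; the thresholds are not specified here.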
In one embodiment, user A has the option of whether to receive these suggestions. For example, the user may be able to select, via a user profile page, whether to receive suggestions about communities to join. If the user chooses not to receive, no suggestion is generated for the user.
Although the system could automatically add user A to the suggested community, in a preferred embodiment, user A controls the final decision as to whether to join the suggested community. For example, the suggestion may be sent in an email message, which may include a link via which user A may obtain more information about the community or join the community, contact information (e.g., email address or IM screen name) for current members of the community, and so forth. Thus, user A may decide whether and how to follow up on any suggestions received.
In some embodiments, user A may receive a suggestion to join any community that user A could join on his or her own initiative (e.g., a Yahoo! group). In other embodiments, existing members of the community may decide whether to participate in an affinity-based introduction procedure for obtaining new members. For example, online communities typically have an "owner," who is a member of the community that has been designated as a point of contact for the provider of the online community service, and who has the authority to set various operational rules or preferences for the community (e.g., whether the email list associated with the community is moderated, whether new members should be approved, etc.). In the case where the service provider provides affinity-based introductions, the owner of each community can indicate whether the community wants to participate, and the service provider adheres to the expressed preference.
C. Meta-evaluation (meta-rating)
In some embodiments, when a querying or browsing user views an annotation, that user may be prompted to rate the annotation, e.g., as to whether the user finds the annotation helpful. For example, the overlay 800 of FIG. 8 may include a set of feedback buttons via which a user may submit a rating (referred to herein as a "meta-rating") for an annotation. The meta-ratings submitted by users are advantageously stored in personalization database 166 (FIG. 2) in association with the annotation being rated, the author of the annotation, and the user who submitted the meta-rating. Meta-ratings can be used in a variety of ways.
In some embodiments, meta-ratings may be used to determine which annotations are displayed first. For example, where a large number of members of user A's trust network have annotated a page, it may not be practical to display all annotations at once; even if all annotations are to be displayed simultaneously, a display order still needs to be selected. The order is advantageously determined in such a way as to maximize the likelihood that a prominently placed annotation will be helpful to the user to whom it is displayed. In the case where user A has annotated the page, it may be assumed that user A will find his or her own annotation helpful, and that annotation may be displayed first. In the case where user A has not annotated the page, or where annotations of other users are displayed in addition to user A's own annotation, meta-ratings may be used to determine how to order the annotations of other users.
Thus, in some embodiments, an aggregate meta-rating may be calculated for each annotation of a particular page or search hit, and the annotation with the most positive aggregate meta-rating may be displayed to user A first (after user A's own annotation, where available). The aggregate meta-rating may be, for example, a weighted average of the meta-ratings given by members of user A's trust network; the weights may be determined from the confidence coefficient of each member relative to user A, the degree of separation from user A, etc. Alternatively, the aggregate meta-rating may be, for example, an average of meta-ratings from all users who have rated the annotation (whether or not they are in user A's trust network).
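The ordering just described might be sketched as follows, using the trust-network-weighted variant. The data layout (dicts with `author` and `meta_ratings` entries), the function names, and the choice to place annotations without any usable meta-rating last are assumptions for illustration.

```python
def aggregate_meta_rating(meta_ratings, weights):
    """Weighted average of the meta-ratings given to one annotation.

    meta_ratings maps rater -> meta-rating; weights maps trust network
    member -> weight (e.g., a confidence coefficient). Raters with no
    weight are treated as outside the trust network and ignored.
    """
    numerator = 0.0
    denominator = 0.0
    for rater, value in meta_ratings.items():
        w = weights.get(rater, 0.0)
        numerator += w * value
        denominator += w
    return numerator / denominator if denominator else None


def order_annotations(annotations, viewer, weights):
    """Viewer's own annotation first, then the rest by descending aggregate."""
    def sort_key(annotation):
        if annotation["author"] == viewer:
            return (0, 0.0)
        score = aggregate_meta_rating(annotation["meta_ratings"], weights)
        # Annotations with no usable meta-rating sort after all rated ones.
        return (1, -(score if score is not None else float("-inf")))
    return sorted(annotations, key=sort_key)
```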
In other embodiments, an aggregate meta-rating for each user X that annotates a page may be calculated and used to determine a reputation score for user X. The aggregate meta-rating may be calculated, for example, by averaging the ratings given to the annotations of user X. The reputation score of user X may be determined globally, for example, by: all meta-ratings given to user X's annotations by all users of the annotation system are averaged or, per community, for example, the meta-ratings given to user X's annotations by members of each community to which user X belongs are averaged separately. Thus, each user may have one or more reputation scores.
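A minimal sketch of the global and per-community reputation scores follows. The input layout (a list of rater/author/value triples and a mapping from community to member set) and the function name are hypothetical; plain averaging is used because the text gives averaging as its example.

```python
def reputation_scores(meta_ratings, memberships, author):
    """Return (global_score, per_community_scores) for one author.

    meta_ratings is a list of (rater, annotation_author, value) triples;
    memberships maps community name -> set of member ids. A community
    score averages only the meta-ratings given by members of a community
    to which the author also belongs.
    """
    received = [(rater, value) for rater, a, value in meta_ratings if a == author]
    values = [value for _, value in received]
    global_score = sum(values) / len(values) if values else None

    per_community = {}
    for community, members in memberships.items():
        if author not in members:
            continue
        in_community = [value for rater, value in received if rater in members]
        if in_community:
            per_community[community] = sum(in_community) / len(in_community)
    return global_score, per_community
```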
Reputation scores can generally be used in the same manner as the confidence coefficients or trust weights described above. For example, the order in which annotations for a page or site are displayed may be determined based on the available reputation scores of their authors. The reputation score may also be used as a weight to determine an aggregate rating for a page or site in any context of interest in the aggregate rating. The reputation score may also be used in place of a trust weight or confidence coefficient during a community Web search, including where the querying user is not a member of the community for which the annotated content is being searched. Using community-specific reputation scores during community Web searches can provide a reliable indicator of what content the community as a whole finds interesting or valuable.
V. Further examples
While the invention has been described in conjunction with specific embodiments, those skilled in the art will recognize that numerous modifications are possible. For example, the appearance of the various search reports and user interfaces may differ from the examples shown herein. Interface elements are not limited to buttons, clickable regions of a page, text boxes, or other particular elements described herein; any interface implementation may be used.
It should also be understood that, where ratings are concerned, the invention is not limited to any particular rating scheme, and that some embodiments may provide the user with the option of selecting among alternative rating schemes (e.g., thumbs up/down or a numerical scale). In some embodiments, only positive or neutral ratings may be supported. In other embodiments, ratings may not be collected at all. Even where ratings are not collected, user annotations may still be collected, and they may provide other types of metadata that may be reported in a reverse search report, including but not limited to the various types of metadata described above.
In addition, in some embodiments, rather than using a single overall rating, a user may be able to rate specific dimensions of a page or site, including dimensions related to technical performance, content, and aesthetics. For example, the technical performance evaluation may include an evaluation that reflects the speed of accessing the page, the reliability of the server, whether a link outgoing from the page is working, and the like. Content ratings may include ratings that reflect whether the content is current, accurate, understandable, well-organized, and the like. The aesthetic rating may include a rating reflecting the user's opinion of the layout, readability, use of graphical elements, and the like. The user may be requested to rate the site in any number of these and other dimensions. In some embodiments, the user may also be able to give an overall rating, or the overall rating may be calculated from the ratings given for each aspect.
The annotations may include any number of fields in any combination, and may include more fields, fewer fields, or different fields than those described above. For example, the user may also be able to indicate whether the annotated page or site belongs to a certain general category of content, such as "adult," "foreign," or "spam." The user may then choose to include or exclude, during a search, content identified (by the user and/or by a member of his or her trust network) as belonging to that category. In addition, information about which pages or sites have been placed in one or another of these categories by different users can be used to infer how the page or site in question should be treated on a global basis. Thus, for example, if a large number of users identify a particular page as spam, that page may be excluded from, or given a lower ranking in, all future search results.
In some embodiments, the annotations may also include metadata that is not user-specific. For example, the metadata may also include a real-world location (e.g., latitude and longitude coordinates, street address, etc.) or phone number associated with the subject page or site, a UPC (universal product code), ISBN (international standard book number), or ISSN (international standard serial number) associated with the subject page or site, an indicator as to whether the page or site launches pop-up windows, and so forth. In addition, metadata related to various attributes of the subject page or site (e.g., whether it includes adult content or is in a foreign language) may also be incorporated into the annotation independently of user input.
Other interfaces for viewing and interacting with the annotations may also be provided. For example, in one embodiment, the annotation data is automatically displayed (e.g., embedded in the page content or in an overlay) each time the annotated page is displayed in the user's browser. The automatic display of annotation data may be limited to the browsing user's own annotations, or extended to include annotation data from some or all of the other members of the user's trust network. In some embodiments, each user may be able to indicate preferences as to which other users' annotations should be automatically displayed.
As described above, some embodiments allow a user to control whether an annotation should be applied to a single page or to a group of pages (or a site). Additionally, in some embodiments, the user is also able to apply an annotation to all pages registered to the same domain name registrar as the annotated page. The existence of a common domain registrar may be determined using WHOIS or another similar service.
In other embodiments, the provider of search server 160 may also provide sponsored links, where a content provider pays to have links to its site provided in the search results. Sponsored links are typically displayed in a designated portion of the results page, separate from the conventional search results. In one embodiment of the invention, any sponsored links that the user, the user's trust network, or a community (if applicable) has annotated may also be marked. For example, a sponsored link may be highlighted to indicate that at least one member of the user's trust network has annotated the page, and the trust network's average or aggregate rating (if any) of the sponsored link may be used to determine the highlighting, as with the conventional search results described above. The sponsored links may also be accompanied by a "save" button, a "show my Web" button, or a similar button or interface control.
In some embodiments, a user may be able to define multiple friends lists, such as for searches on different (but possibly overlapping) corpora. For example, a Web search provider may allow users to search within different "attributes," such as a Shopping attribute (primarily including sites that provide goods and services for sale), a News attribute (primarily including sites that report current events and post opinions), and so on. In one such embodiment, a user may define one friends list for a general Web search, another friends list for a search within the shopping attribute, another friends list for a search within the news attribute, and so on. When the lists are different, the user will have a different trust network for each search category. If the user searches within an attribute for which he or she has not defined an attribute-specific friends list, the user's general list can be used.
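The fallback to the general friends list can be sketched in a few lines. The dictionary layout and the reserved key `"general"` are assumptions made for this illustration.

```python
def friends_list_for(user_lists, category):
    """Pick the friends list used to build the trust network for a search.

    user_lists maps a search category (e.g., "shopping", "news") or the
    hypothetical key "general" to a list of friends. A category without
    its own list falls back to the general list.
    """
    if category in user_lists:
        return user_lists[category]
    return user_lists.get("general", [])
```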
In other embodiments, the user may be able to associate different friends with a particular keyword, where the particular friend is included in the trust network only if the user's query includes the keyword as a search term.
In some embodiments, the user may also be able to define a friends list for applications other than searching. For example, many email account providers include various spam filters and give users the option of reporting whether an incoming message is spam or non-spam (e.g., so that the performance of the spam filter can be reviewed and improved). Assume that user A has defined a friends list for email and that the trust network defined from user A's friends list includes user B. Assume further that user B reports a particular message as spam and that user A subsequently receives the same (or a very similar) message. User A may receive some indication that someone in user A's email trust network (who may or may not be identified as user B) considers the message to be spam, or the message may be redirected to user A's "spam" email folder, or some other action may be taken to alert user A that the message is likely to be spam.
The embodiments described above may relate to websites, URLs, links, and other terms specific to the case where the world wide web (or a subset thereof) is used as a search corpus. However, it should be understood that the systems and methods described above may also be applicable to different search corpuses (such as electronic databases or document repositories), and that search reports or annotations may include content as well as links or references to locations where content may be found.
Computer programs incorporating various features of the present invention may be encoded on various computer-readable media for storage and/or transmission; suitable media include magnetic disks or tapes, optical storage media such as CDs or DVDs, flash memory, and carrier wave signals suitable for transmission over wired, optical, and/or wireless networks conforming to a variety of protocols, including the Internet. The computer readable medium encoded with the program code may be packaged within a compatible device or provided separately from other devices (e.g., via internet download).
Although the present invention has been described in connection with specific hardware and software components, those skilled in the art will recognize that different combinations of hardware and/or software components may be used and that particular operations described as being implemented in hardware may also be implemented in software, and vice versa.
Therefore, while the invention has been described in connection with specific embodiments thereof, it will be understood that the invention is intended to cover all modifications and equivalents within the scope of the appended claims.

Claims (55)

1. A method for responding to a user query, the method comprising:
receiving a query submitted by a querying user of a plurality of users;
searching a corpus comprising a plurality of documents to identify one or more hits, wherein each hit is a document in the corpus that is determined to be relevant to the query;
building a trust network for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user;
accessing a storage device that stores annotations created by the plurality of users, each annotation being associated with a subject document of the documents of the corpus and with a creating user of the plurality of users, each annotation comprising user-specific metadata related to the subject document;
identifying as an annotated hit each hit that is the subject document of at least one matching annotation, wherein the creating user of each matching annotation is one of the members of the trust network;
generating a search report comprising a list of hits, wherein for each annotated hit, the search report comprises information about at least one of the matching annotations; and
sending the search report to the querying user.
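For illustration only, the step of identifying annotated hits in claim 1 might be sketched in Python as below. The `Annotation` record, its field names, and the function `annotate_hits` are assumptions introduced for this sketch and are not part of the claims.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Annotation:
    creator: str   # creating user
    document: str  # subject document (e.g., a URL)
    metadata: str  # user-specific metadata


def annotate_hits(
    hits: List[str], annotations: List[Annotation], trust_network: Set[str]
) -> Dict[str, List[Annotation]]:
    """Map each hit to its matching annotations created by trust network members.

    A hit with a non-empty list is an annotated hit.
    """
    matching: Dict[str, List[Annotation]] = {hit: [] for hit in hits}
    for ann in annotations:
        if ann.document in matching and ann.creator in trust_network:
            matching[ann.document].append(ann)
    return matching


hits = ["u1", "u2"]
stored = [Annotation("B", "u1", "great"), Annotation("Z", "u2", "bad")]
report = annotate_hits(hits, stored, trust_network={"B"})
```

In this sketch, "u1" is an annotated hit because user B belongs to the trust network, while "u2" is not, because its only annotation comes from a user outside the network.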
2. The method of claim 1, wherein the members of the trust network comprise at least one other user explicitly identified by the querying user as a friend.
3. The method of claim 2, further comprising:
providing a trust network interface operable by the plurality of users to identify other users of the plurality of users as friends;
receiving, via the trust network interface, identifications of friends from a plurality of input users including the querying user; and
storing a list of identified friends for each input user.
4. The method of claim 3, wherein building the trust network for the querying user comprises:
retrieving a list of identified friends of the querying user; and
adding at least one of the querying user's identified friends as a member of the trust network.
5. The method of claim 4, wherein building the trust network for the querying user further comprises:
retrieving a list of identified friends of a first one of the trust network members; and
adding at least one of the identified friends of the first one of the trust network members as a member of the trust network.
6. The method of claim 5, wherein building the trust network for the querying user further comprises:
adding identified friends of the trust network members as members of the trust network, the identified friends connected to the querying user with a degree of separation not exceeding a maximum.
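Claims 4 to 6 can be read as a bounded traversal of the friends graph. The following Python sketch uses a breadth-first search as one possible way to do this; the adjacency-dictionary representation and the name `build_trust_network` are illustrative assumptions, and the querying user is left out of the result (claim 7 would add the querying user as well).

```python
from collections import deque
from typing import Dict, List


def build_trust_network(
    user: str, friends: Dict[str, List[str]], max_degree: int
) -> Dict[str, int]:
    """Breadth-first walk of friends lists; returns member -> degree of separation."""
    degrees: Dict[str, int] = {}
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        current, degree = queue.popleft()
        if degree == max_degree:
            continue  # do not expand beyond the maximum degree of separation
        for friend in friends.get(current, []):
            if friend not in seen:
                seen.add(friend)
                degrees[friend] = degree + 1
                queue.append((friend, degree + 1))
    return degrees


# Hypothetical friends lists: A -> B, C; B -> D; D -> E; C -> A.
friends = {"A": ["B", "C"], "B": ["D"], "D": ["E"], "C": ["A"]}
network = build_trust_network("A", friends, max_degree=2)
```

With a maximum of two degrees, user D (a friend of a friend) is included, while user E (three degrees away) is not.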
7. The method of claim 4, wherein building the trust network for the querying user further comprises:
adding the querying user to the trust network as a member.
8. The method of claim 3, further comprising:
assigning a trust weight to each friend in the list of identified friends.
9. The method of claim 8, wherein the trust weights are assigned based on user input received via the trust network interface.
10. The method of claim 9, wherein building the trust network for the querying user comprises:
selecting a user to add to the trust network as a member based at least in part on the trust weight.
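As a hedged illustration of claims 8 to 10, the sketch below selects trust network members by comparing each friend's trust weight to a threshold. The numeric weight scale, the threshold rule, and the name `select_by_weight` are assumptions; the claims only require that selection be based at least in part on the trust weight.

```python
from typing import Dict, Set


def select_by_weight(weights: Dict[str, float], threshold: float) -> Set[str]:
    """Keep friends whose trust weight meets or exceeds a (hypothetical) threshold."""
    return {friend for friend, weight in weights.items() if weight >= threshold}


# Hypothetical weights entered by the user through the trust network interface.
weights = {"B": 0.9, "C": 0.4, "D": 0.7}
trusted = select_by_weight(weights, threshold=0.5)
```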
11. The method of claim 2, wherein building the trust network for the querying user comprises:
automatically populating a list of identified friends of the querying user from a list of users with whom the querying user communicates.
12. The method of claim 11, wherein the list of users with whom the querying user communicates comprises a list of instant messaging contacts maintained by the querying user.
13. The method of claim 11, wherein the list of users with whom the querying user communicates comprises an email address book maintained by the querying user.
14. The method of claim 11, wherein the list of users with whom the querying user communicates comprises a list of community members to which the querying user belongs.
15. The method of claim 11, further comprising:
providing a trust network interface operable by the querying user to edit the automatically populated friends list.
16. The method of claim 1, wherein the members of the trust network are members of a selected community of users, the selected community being selected by the querying user.
17. The method of claim 16, wherein the querying user is also a member of the selected community.
18. The method of claim 16, wherein the querying user is not a member of the selected community.
19. The method of claim 16, further comprising:
receiving an identifier of the selected community from the querying user.
20. The method of claim 1, wherein the querying user is one of the members of the trust network.
21. The method of claim 1, wherein the search report includes a visual highlighting element applied to each hit that is an annotated hit.
22. The method of claim 21, wherein the user-specific metadata included in the annotation comprises a rating, the method further comprising:
for each annotated hit, extracting a rating from each matching annotation and computing an average rating,
wherein the visual highlighting element applied to each annotated hit depends on the average rating.
23. The method of claim 1, wherein the user-specific metadata included in the annotation comprises a rating, the method further comprising:
for each annotated hit, extracting a rating from each matching annotation and computing an average rating,
wherein generating the search report comprises determining an order of the hit list based at least in part on an average rating of the annotated hits.
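The averaging and ordering of claims 22 and 23 might look like the following Python sketch. Placing unannotated hits after annotated ones and keeping the original order among ties are design assumptions of this sketch, not requirements of the claims, and `order_hits` is a hypothetical name.

```python
from statistics import mean
from typing import Dict, List, Tuple


def order_hits(hits: List[str], ratings: Dict[str, List[float]]) -> List[str]:
    """Order hits by descending average rating; unannotated hits follow.

    sorted() is stable, so hits with equal keys keep their original relative order.
    """

    def key(hit: str) -> Tuple[int, float]:
        hit_ratings = ratings.get(hit)
        if hit_ratings:
            return (0, -mean(hit_ratings))  # annotated: higher average first
        return (1, 0.0)                     # unannotated: after all annotated hits

    return sorted(hits, key=key)


# Hypothetical ratings extracted from matching annotations, keyed by hit.
ratings = {"b": [4, 5], "c": [1], "d": [5, 4]}
ordered = order_hits(["a", "b", "c", "d"], ratings)
```

The same average could also drive the visual highlighting element of claim 22, for example by mapping rating ranges to different highlight styles.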
24. The method of claim 1, wherein generating the search report further comprises:
for each annotated hit, providing in the search report a control element operable by the user to request display of the user-specific metadata of at least one of the matching annotations.
25. The method of claim 1, wherein generating the search report further comprises:
for each annotated hit, providing in the search report information extracted from at least one of the matching annotations.
26. The method of claim 1, wherein generating the search report further comprises:
generating a separate list that includes only the annotated hits.
27. The method of claim 1, further comprising:
searching the storage to identify one or more additional annotated hits, wherein each additional annotated hit corresponds to a document in the corpus for which the storage includes an associated annotation for which the creating user is one of the trust network members, and the associated annotation includes user-specific metadata determined to be relevant to the query; and
incorporating the additional annotated hits into a hit list of the search results page.
28. The method of claim 27, wherein searching the corpus comprises:
extracting search terms from the user query; and
identifying as a hit each document in the corpus that contains the search term.
29. The method of claim 28, wherein searching the storage comprises identifying as an additional annotated hit each document in the corpus for which the user-specific metadata includes the search term.
30. The method of claim 1, wherein the storage device further comprises at least one annotation associated with a set of documents in the corpus, and any hit that is one of the set of documents is identified as an annotated hit.
31. The method of claim 1, wherein the user-specific metadata comprises an item of information explicitly entered by the user.
32. The method of claim 31, wherein the item of information is a rating of the subject document.
33. The method of claim 31, wherein the item of information is a keyword describing the subject document.
34. The method of claim 31, wherein the item of information is a tag selected from a predefined vocabulary.
35. The method of claim 31, wherein the item of information is a description of the subject document.
36. The method of claim 1, wherein the corpus comprises a plurality of World Wide Web pages.
37. The method of claim 1, wherein the user is a human.
38. The method of claim 1, wherein the user is a computer.
39. A method for responding to a user query, the method comprising:
receiving a query submitted by a querying user of a plurality of users;
building a trust network for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user;
accessing a storage device storing annotations created by the plurality of users, each annotation being associated with a subject document of a plurality of documents belonging to a corpus and with a creating user of the plurality of users, each annotation further comprising user-specific metadata related to the subject document;
identifying one or more hits, wherein each hit is a document in the corpus that is determined to be relevant to the query, and each hit is also a subject document of at least one matching annotation, wherein a creating user of each matching annotation is one of the trust network members;
generating a search report including a hit list; and
sending the search report to the querying user.
40. The method of claim 39, wherein the members of the trust network include at least one other user explicitly identified by the querying user as a friend.
41. The method of claim 39, wherein the members of the trust network are members of a selected community of users, the community being selected by the querying user.
42. The method of claim 39, wherein identifying the one or more hits comprises comparing the query to document content in the corpus.
43. The method of claim 39, wherein identifying the one or more hits comprises comparing the query to the user-specific metadata of annotations in a search pool of annotations for which the creating user is one of the trust network members.
44. The method of claim 43, wherein identifying one or more hits further comprises:
extracting search terms from the query; and
for each annotation in the search pool, detecting whether the search term is present in the user-specific metadata,
wherein a subject document is identified as a hit in the event that the search term is present in the user-specific metadata.
45. The method of claim 44, further comprising:
for each document that is a subject document of at least one annotation in the search pool, detecting whether the search term is present in the document,
wherein the document is identified as a hit if the search term is present in the document.
46. The method of claim 44, wherein the user-specific metadata includes a plurality of fields, and the query specifies which fields are to be considered during the act of detecting.
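Claims 43, 44, and 46 describe searching the annotation metadata itself. The Python sketch below is one hedged interpretation: the `Annotation` record, the whitespace-based term matching, and the optional `fields` argument of `search_annotations` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Annotation:
    creator: str            # creating user
    document: str           # subject document
    fields: Dict[str, str]  # user-specific metadata, keyed by field name


def search_annotations(
    term: str,
    annotations: Iterable[Annotation],
    trust_network: Set[str],
    fields: Optional[List[str]] = None,
) -> List[str]:
    """Return subject documents whose trusted annotations contain the search term.

    If `fields` is given, only those metadata fields are considered.
    """
    term = term.lower()
    hits: List[str] = []
    for ann in annotations:
        if ann.creator not in trust_network:
            continue  # annotation is outside the search pool
        names = fields if fields is not None else list(ann.fields)
        if any(term in ann.fields.get(name, "").lower().split() for name in names):
            if ann.document not in hits:
                hits.append(ann.document)
    return hits


pool = [
    Annotation("B", "u1", {"keywords": "camera lens", "description": "nice shop"}),
    Annotation("B", "u2", {"description": "camera store"}),
    Annotation("Z", "u3", {"keywords": "camera"}),
]
all_field_hits = search_annotations("camera", pool, {"B"})
keyword_hits = search_annotations("camera", pool, {"B"}, fields=["keywords"])
```

Restricting the search to the hypothetical "keywords" field narrows the result to documents whose trusted annotations use the term as a keyword.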
47. The method of claim 39, wherein for each hit, the search report further comprises a control element operable by the user to request display of user-specific metadata for at least one matching annotation.
48. The method of claim 39, wherein for each hit, the search report further includes at least some user-specific metadata from at least one matching annotation.
49. The method of claim 39, wherein the user-specific metadata included in each matching annotation comprises a rating of the subject document, and the hits in the list are placed in an order determined based at least in part on the rating of the hits.
50. The method of claim 39, wherein the storage device further comprises at least one annotation associated with a set of documents in the corpus, and any document that is one of the set of documents is identified as a hit.
51. The method of claim 39, wherein the corpus is the World Wide Web.
52. The method of claim 39, wherein the user is a human.
53. The method of claim 39, wherein the user is a computer.
54. A system for responding to a user query, the system comprising:
means for receiving a query submitted by a querying user of a plurality of users;
means for searching a corpus comprising a plurality of documents to identify one or more hits, wherein each hit is a document in the corpus that is determined to be relevant to the query;
means for establishing a trust network for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user;
means for accessing a storage that stores annotations created by the plurality of users, each annotation being associated with a subject document of the documents of the corpus and with a creating user of the plurality of users, each annotation comprising user-specific metadata related to the subject document;
means for identifying as an annotated hit each hit that is the subject document of at least one matching annotation, wherein the creating user of each matching annotation is one of the members of the trust network;
means for generating a search report comprising a list of hits, wherein for each annotated hit, the search report comprises information about at least one of the matching annotations; and
means for sending the search report to the querying user.
55. A system for responding to a user query, the system comprising:
means for receiving a query submitted by a querying user of a plurality of users;
means for establishing a trust network for the querying user, the trust network having as members a subset of the plurality of users including at least one user other than the querying user;
means for accessing a storage that stores annotations created by the plurality of users, each annotation being associated with a subject document of a plurality of documents belonging to a corpus and with a creating user of the plurality of users, each annotation further comprising user-specific metadata related to the subject document;
means for identifying one or more hits, wherein each hit is a document in the corpus that is determined to be relevant to the query, and each hit is also a subject document of at least one matching annotation, wherein a creating user of each matching annotation is one of the trust network members;
means for generating a search report including a hit list; and
means for sending the search report to the querying user.
HK08106875.9A | 2004-03-15 | 2005-03-15 | Search system and methods with integration of user annotations from a trust network | HK1116557B (en)

Applications Claiming Priority (5)

Application Number | Priority Date | Filing Date | Title
US55357704P | 2004-03-15 | 2004-03-15 |
US60/553,577 | 2004-03-15 | |
US62328204P | 2004-10-28 | 2004-10-28 |
US60/623,282 | 2004-10-28 | |
PCT/US2005/008487, WO2005089291A2 (en) | 2004-03-15 | 2005-03-15 | Search system and methods with integration of user annotations from a trust network

Publications (2)

Publication Number | Publication Date
HK1116557A1 (en) | 2008-12-24
HK1116557B (en) | 2010-12-31
