RELATED APPLICATIONS
- This application is a divisional of U.S. patent application Ser. No. 12/208,251, filed Sep. 10, 2008, entitled “Associating Website Clicks With Links On A Web Page,” which is a continuation of U.S. patent application Ser. No. 10/794,809, filed Mar. 3, 2004, entitled “Associating Website Clicks with Links on a Web Page”, now U.S. Pat. No. 7,441,195, which in turn claims priority from U.S. Provisional Application Ser. No. 60/452,084, filed Mar. 4, 2003, entitled “Associating Website Clicks with Links on a Web Page”, and from U.S. Provisional Application Ser. No. 60/452,085, filed Mar. 4, 2003, entitled “Delayed Data Collection Using Web Beacon-Based Tracking Methods”, which are incorporated herein by reference in their entirety.
- This application is related to U.S. patent application Ser. No. 10/608,515 entitled “Efficient Click-Stream Data Collection” (Attorney Docket No. OMN7132), filed Jun. 26, 2003; U.S. patent application Ser. No. 10/608,442 entitled “Custom Event and Attribute Generation for Use in Website Traffic Data Collection” (Attorney Docket No. OMN7133), filed Jun. 26, 2003; U.S. patent application Ser. No. 10/609,008 entitled “Capturing and Presenting Site Visitation Path Data” (Attorney Docket No. OMN8054), filed Jun. 27, 2003; and U.S. application Ser. No. 10/795,079 entitled “Delayed Transmission of Website Usage Data” (Attorney Docket No. OMN7761), filed Mar. 4, 2003. The contents of these related patent applications are incorporated herein by reference. 
FIELD OF THE INVENTION
- The present invention is related to tracking website usage, and more particularly to accurately identifying and associating objects activated by a user during the course of navigating a website.
Description of the Background Art
- In an on-line environment, website usage and other customer behavior may be tracked by a website server, or by another server such as a data collection server (also known as a data collector), which may be remotely located. The data collection server is notified of activity on a website so that it can monitor and track the activity. One method of achieving this notification is through the use of a request for embedded content.
- Embedded content is part of a web page, such as an image, that is requested as a separate file from the file containing the web page. The separate file may be requested from the website server or from a remote server, such as a remote content server or data collection server. For example, when a user requests a web page from a website server, the website server sends the web page file to the user's client. The client, such as a web browser, then attempts to render the file as a viewable web page. However, upon rendering the web page file, the client may find a reference to a separate file located on the website server or a remote server. After the content is located and sent to the client, the client renders the separate file containing the embedded content along with the original web page. 
- A web beacon (also known as a web bug) is a particular type of embedded content where the content itself is irrelevant, but the request for content carries useful information. For example, a web beacon is often a transparent image having very small dimensions, such as 1 pixel by 1 pixel. This image is small enough to be invisible to the user. When a client is rendering a web page that includes a web beacon, the web beacon causes the client to send a resource request to a server such as a data collection server. The web beacon may include a script (or other code) that causes the client to include, in the resource request, additional information about the user and the user's environment. The additional information can include the data from a cookie, or other information about the client's operating environment or status. Where the server indicated by the web beacon code is a data collection server, the data collection server may, in response to the request, cause the client to set an additional cookie for identification for tracking purposes. In this manner, the web beacon request can be used to indicate to a data collection server that a particular web page is being rendered. 
- One method for including the request is to write the request as a static image tag in Hypertext Markup Language (HTML). The following is an example of an image tag in HTML: 
<img src="http://ad.datacollectionserver.com/tracker.exe?AID=14658&PID=259294&banner=0.gif" width=1 height=1 border=0>
- Here, the term “ad.datacollectionserver.com” refers to the address of the data collection server. 
- Another common method of including the request is to use a scripting language such as JavaScript so as to cause the browser to dynamically generate a request to the data collection server. One advantage of using a script instead of a static image tag is that the script can cause the browser to perform other functions including gathering additional data and sending it along with the request. In either case, the result is a request sent to the data collection server upon the occurrence of an event, such as the loading and rendering of a web page. 
- Once the request has been sent to the data collection server, the data collection server can perform various types of tracking functions. For example, the data collection server can count the number of requests associated with a web page so as to monitor traffic on the web page. By counting the number of times the web beacon element has been requested from the data collection server, the server can determine the number of times a particular page was viewed. By using JavaScript to dynamically construct the request for the web beacon and encode additional information, other identifying information can be obtained for further analysis. 
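By way of illustration only, the page-view counting described above might be sketched as follows in Python. The sketch assumes that each beacon request carries a hypothetical `page` query parameter naming the page that embedded the beacon; that parameter, the function name `count_page_views`, and the sample log entries are assumptions made for this example and are not part of the request format shown earlier.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def count_page_views(request_urls):
    """Tally beacon requests per page.

    Assumes each request carries a hypothetical `page` query parameter
    naming the page that embedded the beacon.
    """
    views = Counter()
    for url in request_urls:
        query = parse_qs(urlparse(url).query)
        # Requests without the parameter are grouped under a placeholder.
        page = query.get("page", ["(unknown)"])[0]
        views[page] += 1
    return views


# Illustrative log entries; the host is borrowed from the image tag example.
beacon_requests = [
    "http://ad.datacollectionserver.com/tracker.exe?page=home",
    "http://ad.datacollectionserver.com/tracker.exe?page=home",
    "http://ad.datacollectionserver.com/tracker.exe?page=products",
]
page_views = count_page_views(beacon_requests)
```

Each request increments the counter for its page, so the resulting tallies correspond to the number of times each page was rendered with the beacon.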
- Other types of website usage tracking are also well known, such as log file analysis. In such an approach, statistical analysis is performed on server logs in order to detect and analyze website traffic and usage patterns.
- In addition to tracking web page visits, it is often desirable to track user actions, such as object activations, on web pages. In general, existing approaches for collecting and tracking website usage fail to provide a means for tracking the actual links a user clicks on during the course of navigating a site. In some circumstances, the link clicked on can be inferred if the start page has only one link that leads to the destination page. However, where there is more than one link between pages, the determination of which link was clicked is more difficult or impossible. Additionally, even when there is only one link between two pages, it is often difficult or impossible to determine whether the user actually clicked on the link or navigated to the page via some other method (such as typing in the URL). 
- Such information is useful in many ways, including for example collecting feedback that leads to improved web page design; determining the effect of various degrees of prominence of links and graphic elements on web pages; and the like. What is needed, then, is a method and system for reliably and accurately tracking the actual links a user clicks on (and other objects the user activates) during the course of navigating a site. What is further needed is a mechanism for automatically and uniquely identifying links on a page so that the user's interactions with the links can be accurately tracked. What is further needed is a mechanism for accurately reporting web page object usage statistics. What is further needed is an improved report format that visually depicts web page object usage statistics. 
SUMMARY OF THE INVENTION
- According to the present invention, objects (such as links) on a web page are uniquely identified by virtue of an Object ID (assigned to the element by some browsers as a part of the Document Object Model), as well as other identifying indicia, such as an element type descriptor and an action descriptor.
- Using the combination of these indicia, the present invention allows a web tracking system to associate historical clicks on various objects of a web page with the objects currently being viewed in the browser. If an exact match is not found for an object, a search factor can be applied in order to account for slight variations in Object IDs; such variations are common, particularly when Object IDs are assigned by different browsers or on different platforms, or when a web page has been altered or edited. Accordingly, the present invention accounts for such differences and allows matches to be made even when Object IDs are not identical. In addition, if Object IDs are not present (for example, if the browser in use does not generate Object IDs), objects are matched using the other identifying indicia. 
- By identifying objects according to the techniques of the present invention, web behavior tracking systems can more accurately detect, record, and analyze user actions with respect to objects (such as links) on a web page. A report can then be generated, showing indications of relative popularity of various objects on a web page by superimposing visual indicators, such as color-coded shading, on a representation of the web page. 
BRIEF DESCRIPTION OF THE DRAWINGS
- FIG. 1 is a block diagram depicting a system for website traffic data collection.
- FIG. 2 is a screen shot depicting an example of a page analysis report facilitated by the present invention. 
- FIG. 3 is a flow diagram depicting a method of associating website clicks with links, according to an embodiment of the present invention. 
- FIG. 4 is an example of a web page having objects to be associated with user actions. 
- The Figures depict a preferred embodiment of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein. 
DESCRIPTION OF AN EMBODIMENT OF THE INVENTION
System Architecture
- Referring now to FIG. 1, there is shown a block diagram depicting a system for website traffic data collection according to one embodiment of the present invention. User 112 interacts with client machine 107, which runs a software application such as browser 110 for accessing and displaying web pages. Client machine 107 may be an ordinary personal computer, including well-known components such as a CPU running an operating system such as Microsoft Windows, a keyboard, mouse, display screen, and Internet connection (not shown). Client machine 107 may run various software applications in addition to browser 110. Browser 110 includes scripting engine 116, such as JavaScript, as is commonly found in commercially available browsers. In response to a user 112 action such as clicking on a link or typing in a URL, client machine 107 issues a web page request 111 that is transmitted via the Internet to content server 101. In response to request 111, content server 101 transmits web page 102 (in the form of HTML code, for example) to client machine 107. Browser 110 displays the requested web page 102 on client machine 107.
- Web page 102 includes beacon code, which in one embodiment is a pointer to a beacon (such as a 1 pixel by 1 pixel transparent image). The beacon is typically invisible to the user, such as a transparent one-pixel image. For purposes of the following description, a beacon is any element that is embedded in a web page 102 which is loaded automatically by browser 110 that references an external server 106 and is used to monitor traffic. The beacon code can be provided as a script (such as a JavaScript script) to be executed by scripting engine 116. The beacon code causes client machine 107 to generate resource requests 105 to data collection server 106. These resource requests 105 are usually dynamically generated according to the script instructions. Data collection server 106 records such requests in a log 108, and can also record additional information associated with the request (such as the date and time, and possibly some identifying information that may be encoded in the resource request). Thus, tracking server 106 records the occurrence of a “hit” to web page 102. Tracking server 106 also transmits the requested one-pixel image to client machine 107 so that the resource request is satisfied.
- Analysis module 113 retrieves stored tracking data from log 108, filters the data, and outputs reports 114. Reports 114 may be provided in hard copy, or via a display screen (not shown), or by some other means. Reports 114 include, for example, overviews and statistical analyses describing the relative frequency with which various site paths are being followed through the website. Examples of such reports are described below.
- Module 113 may be implemented in software running on server 106 or on another computer that can access log 108.
- In one embodiment, communications between client machine 107, content server 101, and data collection server 106 are accomplished using well known network protocols, such as TCP/IP and HTTP, for communication across the Internet. Other communication methodologies and protocols can also be used.
Method
- In the following description, the invention is set forth in the context of identifying user-activated objects on a web page; however, one skilled in the art will recognize that the techniques described herein can be used in any context where it is desirable to determine a match between a web page object and stored records.
- In one embodiment, the present invention is implemented using a client-side script encoded in the beacon code that is sent as part of web page 102. This script iterates through the Document Object Model (DOM) of the web page 102, looking for actionable items such as HREF links and form submit buttons. The script overrides the default action of these links to include a call to a click-tracking function in addition to executing the normally expected action.
- The click-tracking function is called, for example, when the user 112 activates an HTML object on page 102 by clicking on it. This function sends to data collection server 106 a unique identifier of the page 102 where the object is found (such as a URL or unique page name), the action performed by the user-activated object, an OBJECT ID of the user-activated object, and a TYPE associated with the user-activated object.
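As a non-limiting illustration, the four items sent by the click-tracking function could be encoded as query parameters of a resource request. The Python sketch below shows only the shape of such a request; in practice the function would run as a client-side script. The parameter names (`page`, `action`, `oid`, `type`), the function name `build_click_request`, and the reuse of the collector address from the background example are all assumptions made for this example.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed collector address, borrowed from the background image tag example.
COLLECTOR_URL = "http://ad.datacollectionserver.com/tracker.exe"


def build_click_request(page_id, action, object_id, object_type):
    """Encode the four reported items as a query string (names are hypothetical)."""
    params = {
        "page": page_id,      # unique identifier of the page
        "action": action,     # target referenced by the activated object
        "oid": object_id,     # OBJECT ID of the activated object
        "type": object_type,  # TYPE of the activated object
    }
    return COLLECTOR_URL + "?" + urlencode(params)


click_url = build_click_request("Page A", "http://www.one.com", 1, "IMG")
# Decoding the query string recovers the reported items as strings.
decoded = parse_qs(urlparse(click_url).query)
```

Because `urlencode` percent-encodes reserved characters, an action that is itself a URL can be carried safely inside the request.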
- In one embodiment, the action performed by the user-activated object is specified in terms of a target referenced by the object. The action of an HREF tag, for example, is the page pointed to by the tag. For a form submit button, the action is the document that the form will be submitted to, as defined in the <FORM> tag. Alternatively, the action can be specified as an ACTION parameter of a form, or alternatively a JavaScript function. 
- In one embodiment, the OBJECT ID is an identifier assigned to the object by browser 110 as a part of the Document Object Model (DOM). The OBJECT ID may be, for example, an integer sequentially assigned to each element as it is encountered by browser 110, according to techniques that are well known in the art.
- In one embodiment, the TYPE is an indication of the type of object the user has activated. For example, it may be the TYPE parameter of an HTML element. The TYPE of the object may be, for example, an image, a form element, a standard HREF tag, a JavaScript element, or the like. By checking the TYPE of a link, the method of the present invention ensures, for example, that image and text links pointing to the same location can be easily distinguished from one another. 
- One skilled in the art will recognize that these information items are merely illustrative of the data sent to server 106 according to one embodiment, and that other information may be sent to server 106, including or omitting any of these and/or any other types of information describing the user-activated object.
- Using the provided items of information, the present invention is able to detect matches between user-activated objects and stored records of previous activity such as historical clicks on various links in the page. According to the techniques of the present invention, matches can be found even if an exact OBJECT ID match may not exist. For any of a variety of reasons, an OBJECT ID as indicated in a stored record for an object may not exactly match a detected OBJECT ID for the same object when the user activates it. This OBJECT ID “drift” may occur, for example, when page content is changed (for example by a web author) and particularly when elements are added to or removed from a web page. Also, different browser models, and even different versions of the same browser, can assign OBJECT IDs slightly differently or may not assign OBJECT IDs at all. Accordingly, as described below, the present invention provides techniques for using other identifying indicia, such as TYPE and action, to more effectively match user-activated web page objects. 
- Referring now to FIG. 3, there is shown a flow diagram illustrating a method of associating website clicks with links according to one embodiment. In one embodiment, the steps of FIG. 3 are performed by data collection server 106; in another embodiment, the steps may be performed by client machine 107 or by some other component of the system.
- Server 106 detects 302 user activation of an object on web page 102, for example by receiving a request 105 or other message from client machine 107. Server 106 obtains 303, from the received request 105, information describing the user-activated object, including for example an OBJECT ID, a TYPE, and an action.
- Server 106 then searches 304 stored records that have a TYPE that matches the TYPE of the user-activated object. In one embodiment, server 106 performs this search on records in log 108 or in some other repository of historical usage data. In one embodiment, server 106 searches 304 all stored records, without restricting the search to records having a matching TYPE.
- If, in 305, any of the stored records have an OBJECT ID, Action and TYPE matching those of the user-activated object, server 106 indicates 307 that a match has been found.
- If no match is found in 305, server 106 searches for stored records that have a matching Action and TYPE, and have an OBJECT ID that is close to the OBJECT ID of the user-activated object. In one embodiment, this search is performed iteratively using successively larger “search factors”: First, a search is made for stored records having an OBJECT ID that differs by 1 or less from the user-activated object's OBJECT ID and having matching Action and TYPE. Then (assuming no match has yet been found), a search is made for stored records having an OBJECT ID that differs by 2 or less from the user-activated object's OBJECT ID and having matching Action and TYPE. This process is repeated with successively larger search factors until a match is found, or until the search factor exceeds a predetermined tolerance.
- If, in 306, any matches are found, server 106 indicates 307 that a match has been found. Otherwise, server 106 indicates 308 that no match was found.
- One skilled in the art will recognize that the method can be generalized by considering the comparison performed in 305 to be a special case of that performed in 306, but with a search factor of zero (in other words, the difference in OBJECT IDs must be zero for a match to be found in step 305).
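The generalized search, in which the exact comparison is treated as a search factor of zero, can be sketched as follows. This Python example is illustrative only: the `Record` structure, the function name `find_match`, and the default tolerance of 3 are assumptions, and the stored records are assumed to have already been narrowed to a single page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    object_id: int
    action: str
    obj_type: str
    clicks: int = 0


def find_match(records: List[Record], object_id: int, action: str,
               obj_type: str, tolerance: int = 3) -> Optional[Record]:
    """Search with successively larger search factors, starting at zero.

    A factor of zero is the exact comparison; each later pass admits
    records whose OBJECT ID differs by at most the current factor.
    The default tolerance is an arbitrary value chosen for this sketch.
    """
    # Action and TYPE must always match exactly.
    candidates = [r for r in records
                  if r.action == action and r.obj_type == obj_type]
    for factor in range(tolerance + 1):
        for record in candidates:
            if abs(record.object_id - object_id) <= factor:
                return record  # match found
    return None  # no match within the tolerance


stored = [
    Record(1, "http://www.one.com", "IMG"),
    Record(3, "http://www.three.com", "IMG"),
]
near = find_match(stored, 2, "http://www.one.com", "IMG")
far = find_match(stored, 9, "http://www.one.com", "IMG")
```

Here the activation with OBJECT ID 2 has no exact counterpart, so the first pass fails and the second pass (search factor 1) returns the record with OBJECT ID 1; the activation with OBJECT ID 9 lies outside the tolerance and yields no match.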
- One skilled in the art will further recognize that, in an alternative embodiment, the search is performed non-iteratively, so that any records having an OBJECT ID within the predetermined tolerance (and having matching action and TYPE) are considered potential matches. In one embodiment, server 106 identifies as a match the stored record(s) that, among potential matches, has (have) an OBJECT ID closest to that of the user-activated object.
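A minimal sketch of the non-iterative alternative is given below in Python, for illustration only. Stored records are represented as plain dictionaries, and the function name `closest_matches` and the default tolerance of 3 are assumptions of this example. Because more than one potential match may be equally close, the sketch returns a list.

```python
def closest_matches(records, object_id, action, obj_type, tolerance=3):
    """Return the potential match(es) whose OBJECT ID is nearest.

    Potential matches have a matching action and TYPE and an OBJECT ID
    within the tolerance (an arbitrary value chosen for this sketch).
    """
    potential = [
        r for r in records
        if r["action"] == action
        and r["type"] == obj_type
        and abs(r["object_id"] - object_id) <= tolerance
    ]
    if not potential:
        return []
    best = min(abs(r["object_id"] - object_id) for r in potential)
    # Keep every record tied for the smallest difference.
    return [r for r in potential if abs(r["object_id"] - object_id) == best]


stored = [
    {"object_id": 1, "action": "http://www.one.com", "type": "IMG"},
    {"object_id": 3, "action": "http://www.one.com", "type": "IMG"},
]
tied = closest_matches(stored, 2, "http://www.one.com", "IMG")
single = closest_matches(stored, 4, "http://www.one.com", "IMG")
```

How ties are resolved is not specified above; returning all equally close records leaves that choice to the caller.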
- In one embodiment, server 106 records the user action in log 108 according to whether a match was indicated. For example, if a match was indicated, server 106 increments a value in the matching record indicating the number of times the object was activated. If no match was indicated, server 106 creates a new record for the object.
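The recording step can be sketched as follows, again in Python and for illustration only. The log is modeled as a list of dictionaries, the function name `record_activation` is hypothetical, and starting a new record at a count of one is an assumption of this example.

```python
def record_activation(log, activation, matched_record=None):
    """Update the log for one user activation.

    If a match was indicated, increment its activation count; otherwise
    create a new record (assumed here to start at a count of one).
    """
    if matched_record is not None:
        matched_record["clicks"] += 1
        return matched_record
    new_record = dict(activation, clicks=1)
    log.append(new_record)
    return new_record


log = [{"object_id": 1, "action": "http://www.one.com", "type": "IMG", "clicks": 2}]

# A matched activation increments the existing record.
record_activation(log, {"object_id": 1, "action": "http://www.one.com", "type": "IMG"},
                  matched_record=log[0])

# An unmatched activation creates a new record.
record_activation(log, {"object_id": 7, "action": "http://www.seven.com", "type": "IMG"})
```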
- Links may optionally be tagged with a unique “name”. In this case, in one embodiment, neither the OBJECT ID nor the search factor is employed, but rather only the page name and link name are used to make the association. 
- If Object IDs are not present (for example, if browser 110 does not generate Object IDs), objects are matched using whatever identifying indicia are available, such as action and TYPE.
- The values stored in log 108 can then be used to generate reports indicating statistics summarizing historical website usage. One example of a type of report that can be generated is a representation of a web page wherein visual indicators of usage are superimposed. For example, the report can depict links on a web page colored with different color densities and/or hues to indicate relative frequency with which the links have been activated.
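One possible way to derive a color density from the stored counts is sketched below in Python. The linear scaling relative to the most-activated object, the function name `click_densities`, and the sample counts are assumptions made for illustration; other scalings could equally be used.

```python
def click_densities(clicks_by_object):
    """Scale each object's click count to a density between 0.0 and 1.0.

    Linear scaling against the most-activated object is an assumption
    of this sketch.
    """
    peak = max(clicks_by_object.values(), default=0)
    if peak == 0:
        # No recorded activity: nothing is shaded.
        return {name: 0.0 for name in clicks_by_object}
    return {name: count / peak for name, count in clicks_by_object.items()}


densities = click_densities({"link_a": 5, "link_b": 0, "link_c": 8})
```

A report generator could then use each density as, for example, the opacity of a shaded overlay drawn over the corresponding link.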
- In one embodiment, the steps of FIG. 3 are performed in response to individual user actions (such as web page clicks). In another embodiment, the steps of FIG. 3 are performed after a number of user actions have taken place, rather than after each individual action. For example, data describing user actions can be stored locally at client machine 107 (using, for example, the techniques described in related U.S. application Ser. No. 10/795,079 entitled “Delayed Transmission of Website Usage Data” (Attorney Docket No. OMN7761), filed Mar. 4, 2003), and can be transmitted to server 106 when a new page is loaded or upon detection of some other triggering event. The object matching technique can thus be used to determine which stored record(s) match a number of user-activated objects or a number of user activations of a single object.
EXAMPLE
- The techniques of the present invention are applicable in any situation where it is desirable to identify and associate a web page object, particularly where OBJECT IDs may not exactly match.
- The following example depicts an embodiment of the invention where user interactions with a number of objects are being matched with objects on a web page. As described above, the method of the present invention can be performed in response to each individual user interaction, or after a series of user interactions has taken place. 
- Referring now to FIG. 4, there is shown an example of a web page 202 having three objects 401A, 401B, 401C. Object 401A has an OBJECT ID of 1 and an action of http://www.one.com. Object 401B has an OBJECT ID of 2 and an action of http://www.two.com. Object 401C has an OBJECT ID of 3 and an action of http://www.three.com. Web page 202 has a Page ID of “Page A”.
- The present invention provides a technique for matching the objects on page 202 with a record of user activity. For example, the invention can be used in response to an individual user action, in order to update stored data with current user activity.
- Suppose, for example, that the following data is to be associated with the objects on page 202 as shown in FIG. 4. For illustrative purposes, all the objects have a TYPE of IMG (image).
| Page ID | OBJECT ID | Action | TYPE | # of Clicks |
| --- | --- | --- | --- | --- |
| Page A | 1 | http://www.one.com | IMG | 2 |
| Page A | 2 | http://www.one.com | IMG | 3 |
| Page A | 3 | http://www.three.com | IMG | 8 |

- In matching the indicated data with web page 202 as shown in FIG. 4, the following steps are performed:
- For the first listed data item, determine whether any of the objects 401 have matching OBJECT ID (1), action (http://www.one.com), and TYPE (IMG). Object 401A satisfies these criteria; therefore object 401A is considered to match the first listed data item. Log 108 can thus be updated to indicate that object 401A has been activated twice.
- For the second listed data item, determine whether any of the objects 401 have matching OBJECT ID (2), action (http://www.one.com), and TYPE (IMG). None of the objects 401 is an exact match; object 401B has a matching OBJECT ID but does not match the action. Therefore, search for objects 401 having an OBJECT ID that is within a search factor of 1 (in other words, objects 401 having an OBJECT ID of 1 or 3), and which match the action and TYPE listed above. Object 401A satisfies these criteria, having an OBJECT ID of 1, an action of http://www.one.com, and a TYPE of IMG. Log 108 can thus be updated to indicate that object 401A has been activated an additional three times.
- For the third listed data item, determine whether any of the objects 401 have matching OBJECT ID (3), action (http://www.three.com), and TYPE (IMG). Object 401C satisfies these criteria; therefore object 401C is considered to match the third listed data item. Log 108 can thus be updated to indicate that object 401C has been activated eight times.
- After the above steps, therefore, log 108 would indicate activity for objects 401A (five clicks) and 401C (eight clicks).
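The three matching steps of this example can be reproduced with a short Python sketch. The object and data values are taken from FIG. 4 and the table above; the dictionary layout, the function name `match_object`, and the use of a tolerance of 1 are assumptions made for illustration.

```python
# Objects on web page 202, keyed by reference numeral.
page_objects = {
    "401A": {"object_id": 1, "action": "http://www.one.com", "type": "IMG"},
    "401B": {"object_id": 2, "action": "http://www.two.com", "type": "IMG"},
    "401C": {"object_id": 3, "action": "http://www.three.com", "type": "IMG"},
}

# Data items to be associated with the page objects.
data_items = [
    {"object_id": 1, "action": "http://www.one.com", "type": "IMG", "clicks": 2},
    {"object_id": 2, "action": "http://www.one.com", "type": "IMG", "clicks": 3},
    {"object_id": 3, "action": "http://www.three.com", "type": "IMG", "clicks": 8},
]


def match_object(item, objects, tolerance=1):
    """Try an exact match (factor 0), then widen the search factor."""
    for factor in range(tolerance + 1):
        for label, obj in objects.items():
            if (obj["action"] == item["action"]
                    and obj["type"] == item["type"]
                    and abs(obj["object_id"] - item["object_id"]) <= factor):
                return label
    return None


totals = {label: 0 for label in page_objects}
for item in data_items:
    label = match_object(item, page_objects)
    if label is not None:
        totals[label] += item["clicks"]

print(totals)  # {'401A': 5, '401B': 0, '401C': 8}
```

The second data item fails the exact comparison because object 401B has a different action, and is then matched to object 401A with a search factor of 1, which accounts for the five clicks attributed to object 401A.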
Output Format
- Referring now to FIG. 2, there is shown an example of a page analysis report 201, displayed alongside an image of the web page 102 being analyzed. In one embodiment, report 201 is provided to a site administrator or owner interacting with data collection server 106.
- In the example of FIG. 2, report 201 includes identification 202 of the website and web page being analyzed; report date 203; report options and settings 204; page metrics 205; and links 206 to related reports. In addition, variable levels and shades of color density are superimposed on the displayed view of web page 102, in order to visually represent the relative number of clicks each item 208 or screen region has received. Color key 207 is a legend to indicate the meaning of various superimposed colors.
- One skilled in the art will recognize that other formats and output mechanisms can be used, including for example hard copy output, text or graphical reports, and the like. 
- In the above description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art that the invention can be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order to avoid obscuring the invention. 
- Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment. 
- Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. 
- It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other such information storage, transmission or display devices. 
- The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
- The algorithms and displays presented herein are not inherently related to any particular computer, network of computers, or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform the required method steps. The required structure for a variety of these systems appears from the description. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein. 
- As will be understood by those familiar with the art, the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. For example, the particular architectures depicted above are merely exemplary of one implementation of the present invention. The functional elements and method steps described above are provided as illustrative examples of one technique for implementing the invention; one skilled in the art will recognize that many other implementations are possible without departing from the present invention as recited in the claims. Likewise, the particular capitalization or naming of the modules, protocols, features, attributes, or any other aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names or formats. In addition, the present invention may be implemented as a method, process, user interface, computer program product, system, apparatus, or any combination thereof. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.