SYSTEM, METHOD AND DEVICE FOR LOCALLY CACHING DATA
FIELD OF THE PRESENT SYSTEM:
The present system relates to a system, method and device for locally caching data that utilizes one or more aspects of user interaction, attention and affinity to content to determine caching rules.
BACKGROUND OF THE PRESENT SYSTEM:
In interacting with data over a network such as the Internet, numerous delays are introduced by the network between the time when a request for data is made and the end of delivery of the data to the requesting device. For example, in a Web based application, the performance a user experiences depends on elements such as network speed and load, Internet server load, client side memory, processor speed, proxies and browser limitations that may restrict active connections, data transfer size, etc.
Typically, when a client requests web content, the content is retrieved either directly from the original server, from a browser cache on a local hard drive or from a nearby cache server. In other applications, a peer to peer approach may be utilized to reduce server overload by accessing data from one or more peers that duplicate data that may otherwise be available from a server. By providing multiple sources for data in the form of peers, loads on servers are greatly reduced, yet other bottlenecks in accessing data still exist.
Computer systems interacting with data often utilize local cache memory for maintaining portions of the content to facilitate quick interaction. This is particularly true when the content is being accessed over a network wherein limitations in transfer speed between the location where the content is provided and the location where the content is accessed often cause delays in the interaction. However, effective use of local caching of data requires rules that control what data is cached and for how long the data is cached. For example, a web 2.0 page that does not have an appropriate cache directive in an AJAX environment will have poor performance, particularly when there are multiple AJAX calls, network latency (e.g., related to a transatlantic hop) and proxies. Existing caching rules are typically statistically based, relying on object sizes and retrieval latencies. One problem that exists today is that there are no caching algorithms that adequately take into account anticipated user behavior based on likelihoods of data access. Some prior systems have tried to utilize anticipated behavior to make caching determinations, although these systems still do not adequately make caching decisions or alleviate loads on resources responsible for delivering data to a user. U.S. Patent No. 6,622,168, U.S. Patent Publication No. 2007/0255844, and U.S. Patent No. 6,871,218, each incorporated herein as if set out in its entirety, utilize a system to prefetch cache data based on an anticipated user's next content request. While this type of system will generally improve a user's browsing experience, since next data may be locally cached prior to an access to the next data, this type of system does nothing to alleviate a load on data servers and access points and in fact may worsen those loads. For example, since the system is predictive in downloading expected next data, at times when the prediction is inaccurate, both the anticipated next data and the actual desired next data need to be transferred to the user, resulting in additional loads on all portions of the system.
U.S. Patent No. 7,039,683 ("the '683 patent"), incorporated herein as if set out in its entirety, utilizes a collective system including anticipating requests for data and frequency or volume of access requests from a plurality of users to ameliorate a volume of transfers. By utilizing collective access requests and making locally stored data available in response to requests for that data, the data may in theory be provided in response to two or more requests, reducing the volume of transfer by the number of times the data is transferred locally. Although anticipating requests may be based on past requests for information, popularity of the subject matter, and direct feedback by access requestors regarding interest in data, the '683 patent fails to appreciate other characteristics of usage and therefore results in additional transfers of data due to local storage of data that may be of no current interest to a user.
It is an object of the present system to overcome disadvantages and/or make improvements in the prior art.
SUMMARY OF THE PRESENT SYSTEM:
The present system includes a system, method, processor and client device for controlling local caching of data available from a remote server. Data sources are identified that are repetitively accessed including a time and frequency of data access. A rate of updating the data source at the remote server is determined and the data from the data source is retrieved and stored in a local cache based on the identified time and frequency of data access and the determined rate of updating the data source. A proxy may receive a request for data, make a determination if the data corresponds to data stored in the local cache, retrieve the data from the local cache if it is determined that the data is stored in the local cache and retrieve the data from the remote server if it is determined that the data is not stored in the local cache.
In one embodiment of the present system, the rate of updating the data source at the remote server may be determined by periodically comparing a stored signature of a data page with a current data page retrieved from the remote server. In accordance with an embodiment, changes in the data that occur at times that do not correspond to the identified time and frequency of data access may not be downloaded. Retrieving additional data may be based on an analysis of top level and lower level link data related to the data and based on user affinity to the additional data. The time of data access may include a time of day and a day of week of the data access.
User interactions with an application utilized to access remote servers may be analyzed to determine and store usage characteristics in a user log. User interactions with the application utilized to access remote servers may be analyzed to determine at least one of a user's affinity to the data and user attention to the data. A device in accordance with the present system may be arranged in a peer to peer configuration to distribute the stored data to a peer device that has an affinity to the stored data. The stored data may be distributed to peer devices based on an identified time and frequency of data access that is associated with the peer device. The stored data may be distributed to the peer device in response to a request from the peer device for the stored data or it may be pushed by the present device to the peer device.
BRIEF DESCRIPTION OF THE DRAWINGS:
The invention is explained in further detail, and by way of example, with reference to the accompanying drawings wherein:
FIG. 1 shows a system in accordance with an embodiment of the present system;
FIG. 2 shows a portion of a system in accordance with an embodiment of the present system;
FIG. 3 shows a portion of a system in accordance with an embodiment of the present system; and
FIG. 4 shows a process flow diagram in accordance with an embodiment of the present system.
DETAILED DESCRIPTION OF THE PRESENT SYSTEM:
The following are descriptions of illustrative embodiments that, when taken in conjunction with the following drawings, will demonstrate the above noted features and advantages, as well as further ones. In the following description, for purposes of explanation rather than limitation, illustrative details are set forth such as architecture, interaction between portions of the system, interfaces, techniques, etc. However, it will be apparent to those of ordinary skill in the art that other embodiments that depart from these details would still be understood to be within the scope of the appended claims. Moreover, for the purpose of clarity, detailed descriptions of well-known devices, circuits, techniques and methods are omitted so as not to obscure the description of the present system. It should be expressly understood that the drawings are included for illustrative purposes and do not represent the scope of the present system. Further, while portions such as a system, method, device, etc. are illustratively described, each one of the system, method, device, etc. described should be understood to similarly apply to each other of the system, method, device, etc.
For purposes of simplifying a description of the present system, the term data and formatives thereof as utilized herein should be understood to relate to any type of data that may be accessed by a user including textual data, audio data, visual data, audio/visual data, link data, application data, etc., for rendering and/or operation thereof. The term "operatively coupled" and formatives thereof as utilized herein refer to a connection between devices or portions thereof that enables operation in accordance with the present system. For example, an operative coupling may include one or more of a wired connection and/or a wireless connection between two or more devices that enables a one and/or two-way communication path between the devices or portions thereof.
FIG. 1 shows a system 100, one or more portions of which may be utilized for operation in accordance with the present system. The present system 100 provides a mechanism that considers user interaction, attention and affinity to types of data when caching and prefetching web data of a particular content service locally on a user's device. The present system enhances user experience in terms of content access (browser or application based) and makes highly relevant content accessible to a user offline, in both fixed and mobile network connections. Although the portions of the present system are shown as individual portions, it may be readily appreciated that each of these portions may be comprised of hardware and/or software portions including memory storage portions that store instructions for configuring a processor 110 for operation in accordance with the present system.
Further, while the present system 100 is illustratively shown as discrete operating portions, as is readily appreciated, these discrete operating portions are merely provided to facilitate the following discussion and should not be understood to require corresponding physically discrete portions of the system. For example, as is readily appreciated, the portions of the system 100 may be readily implemented as one or more programming portions that configure the system 100 for operation as described herein. In one embodiment of the present system, the processor 110 may be operationally coupled to a memory, such as represented by one or more of portions 120, 130, 145, 150, 155, 160, 165, 170, 175, 180. The one or more memories may be any type of device for storing application data as well as other data related to the described operation. The application data and other data are received by the processor 110 for configuring the processor 110 to perform operation acts in accordance with the present system. Accordingly, the present system may be embodied as a computer software program, such program containing modules corresponding to one or more of the individual steps, acts and/or portions described and/or envisioned by the present system. The operation acts may include operation in accordance with one or more of the portions 120, 130, 145, 150, 155, 160, 165, 170, 175, 180. The processor 110 may operate utilizing a program portion, multiple program segments, and/or may be a hardware device utilizing a dedicated or multi-purpose integrated circuit.
Accordingly, the system 100 for example may be readily implemented on a properly configured general use personal computer. As such, the system 100 may be provided as a locally operated system as opposed to a centrally-based server system or central caching system as disclosed in prior systems. The present system is directed to facilitating local access to cached data in place of accessing data from a remote location. It has been observed that oftentimes, users have repetitive and habitual behavior (e.g., time of day usage) when it comes to accessing data from a network. As may be readily appreciated, a user typically accessing data over a remote network such as the Internet does not make requests for data directly, but in fact enters and/or selects URLs that direct an application, such as a web browser, to a remote server data location wherein the data is available for download or streaming so that the data may be retrieved and/or consumed by the user. Oftentimes, URLs may point to default locations for data, such as a home page, as opposed to a URL solely associated with given data. In this way, default URLs may be utilized to facilitate a review and retrieval of data that currently is accessible from the default URL. For example, a URL associated with nyt.com may be utilized to facilitate accessing data that currently is available from a data server associated with the default URL, although the data provided by the data server may change at any time. By utilizing default URLs, users are provided a reliable means of accessing data that is available from the default URL, regardless of the particular data provided by the data server. This is consistent with how users typically browse data available from a remote server, such as through the Internet. For example, a user may have a habit of browsing the home page of the New York Times™ every weekday at the beginning of a work day, yet the user has no expectation of what data will be provided by the New York Times data server.
In accordance with the present system, a web based proxy 120 operatively coupled to the processor 110 acts as an internal service for intercepting calls from a user application, such as the browser application. The calls are typically requests for data from an external network 135, such as may be provided in response to entrance of a Uniform Resource Locator (URL) into a browser address bar and/or as may be provided by selection of a URL provided on a web page, such as an XML and/or HTML based web page. The web based proxy 120 may utilize an interceptor pattern to identify and intercept all or pre-configured data identifiers related to data requests which are either directed to a remote destination server or are directed to a retrieval service under control of the processor 110. Interceptor patterns that are known in the art, such as may be utilized by a local anti-phishing program, may be readily employed by the web based proxy 120 in accordance with an embodiment of the present system.
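By way of illustration only, a minimal sketch of such an interceptor follows; the class and method names (LocalProxy, handle_request, the retrieval_service and passthrough hooks) are hypothetical and are not part of the present system, which would typically hook into the browser or operating system networking stack.

```python
from urllib.parse import urlparse

class LocalProxy:
    """Illustrative web based proxy intercepting data requests (names are hypothetical)."""

    def __init__(self, retrieval_service, passthrough, intercept_domains=None):
        self.retrieval_service = retrieval_service  # local storage and retrieval service
        self.passthrough = passthrough              # callable forwarding a URL to the network
        self.intercept_domains = intercept_domains  # None means intercept every request

    def handle_request(self, url):
        host = urlparse(url).hostname or ""
        if self.intercept_domains is None or host in self.intercept_domains:
            # Route intercepted requests through the local service, which may answer
            # from the local cache or retrieve the data from the remote server.
            return self.retrieval_service.fetch(url)
        return self.passthrough(url)  # non-intercepted requests go to the remote destination
```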
Regardless of whether requests for content are fulfilled by locally cached content 140 as provided by a cache control 145 or are fulfilled by accessing the network 135, data analysis is performed, such as analysis of related URLs, pages, media, content blocks and links, by a data analysis portion 150 to determine a portion of caching rules for the user. In accordance with the present system, the data analysis portion 150 may utilize caching rules to determine data refresh rates and/or server update rates. As may be readily appreciated, many systems are available that may help determine whether data at a remote server is updated. Web pages may be directly compared (e.g., through bit-by-bit comparison, title analysis, data size comparison, identification of newly present links, and/or analysis of Really Simple Syndication (RSS) feed metadata), for example ignoring potential changes that may occur to advertising content and focusing on changes to the underlying data. In addition to simply determining that data has changed, the present system (e.g., the data analysis portion 150) may also determine a rate or typical timing of data change. For example, content that changes more rapidly than other content may need to be cached more often if user usage warrants such caching, as described in more detail below. For example, the data analysis portion may determine that data present at NYT.com changes more frequently than data present at sjmercurynews.com based on data signatures of data pages from the remote servers (e.g., the NYT.com server and the sjmercurynews.com server) that are stored locally and that are compared (e.g., periodically) with data pages currently retrieved from the remote servers. In this way, a time pattern of data changes may be used to assist in a determination of a portion of enhanced caching and prefetch rules 175 to assist in a determination of when data should be pulled from a given remote data server. By determining how often and/or when data servers accessed by a given user are updated, the present system determines a portion of the enhanced caching and prefetch rules 175 (e.g., to determine whether cached data should be updated) to ensure that when desired, the locally cached data 140 currently reflects data stored on the data servers. It is important to note that at times, it is desirable that locally stored content does not reflect the content stored on a remote server to help ensure that unnecessary local caching of content is minimized, unlike prior systems.
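A minimal sketch of the signature comparison and update-rate estimation described above follows; the use of a content hash as the stored signature and a simple average of observed change intervals are illustrative assumptions rather than a prescribed implementation.

```python
import hashlib
import time

class ChangeTracker:
    """Tracks a per-URL page signature and estimates how often the source changes."""

    def __init__(self):
        self.signatures = {}     # url -> last seen signature
        self.change_times = {}   # url -> timestamps at which a change was detected

    @staticmethod
    def signature(page_bytes):
        # A content hash stands in for the stored "signature" of a data page.
        return hashlib.sha256(page_bytes).hexdigest()

    def record(self, url, page_bytes, now=None):
        if now is None:
            now = time.time()
        sig = self.signature(page_bytes)
        if self.signatures.get(url) != sig:
            self.signatures[url] = sig
            self.change_times.setdefault(url, []).append(now)

    def mean_update_interval(self, url):
        times = self.change_times.get(url, [])
        if len(times) < 2:
            return None  # not enough observations to estimate a change rate yet
        gaps = [b - a for a, b in zip(times, times[1:])]
        return sum(gaps) / len(gaps)
```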
To determine whether or not to update server content that has not yet been locally updated/cached, an affinity analysis portion 155 performs an analysis of user interaction patterns with a web application, such as a browser, including data accessed and locations of data access (e.g., domain, etc.). Click stream analysis may be performed to identify links that are accessed, data that is consumed, and a time and periodicity of data consumption to determine what data to prefetch and cache and a frequency of such prefetching and caching. This analysis may be performed on a log of user interactions and/or may be performed substantially in real-time as the user interaction occurs. For example, the log of user interaction may be examined to determine user interaction characteristics such as periodicity of accessing the data, including a time of typical interaction, days of week, months, etc., when content is accessed, including whether given data locations are repetitively accessed. In this way, user interaction with data that is not repeated may be discarded as a fleeting interest not resulting in prefetching of data.
However, isolated instances of accessing data and data locations (such as domains) may be stored as a portion of the user log to facilitate identification of a new behavior that is subsequently repeated. Further, by identifying how and when data is accessed, prefetching and caching of data may be timed to precede a request by the user to access the data while enabling the present system to ignore changes in the data that occur between access requests.
For example, while during the week a user may access a default URL (a home page) of a given website every morning at 9 A.M., the user may not access the given website during weekends and holidays. In accordance with the present system, by determining the user's affinity for given data items, such as when a given website is likely to be accessed by the user, changes in the data available from the website that occur over the weekend or a holiday may be ignored as long as a prefetching of new data is performed prior to when the website is again accessed by the user.
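One way such habitual access patterns might be extracted from a user log is sketched below; the log representation (a list of access timestamps per data location) and the day-of-week/hour bucketing with a minimum occurrence count are assumptions chosen for brevity.

```python
from collections import Counter
from datetime import datetime

def habitual_slots(access_times, min_occurrences=3):
    """Return (day_of_week, hour) slots in which a data location is repeatedly accessed.

    access_times: iterable of datetime objects taken from the user log.
    Slots seen fewer than min_occurrences times are treated as fleeting interest.
    """
    counts = Counter((t.weekday(), t.hour) for t in access_times)
    return {slot for slot, n in counts.items() if n >= min_occurrences}

# Example: accesses at about 9 A.M. on four consecutive Mondays yield the slot (0, 9).
log = [datetime(2008, 3, d, 9, 2) for d in (3, 10, 17, 24)]
print(habitual_slots(log))  # {(0, 9)}
```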
The affinity analysis portion 155 may include an analysis of user attention to data, for example, by monitoring a user's clicking on hyperlinks or other menus on a website and/or how long a user dwells on given web pages. Affinity is measured by a user's explicit or implicit interaction with data sites. An explicit example of a user's affinity for content may include monitoring a user's clicks on "WiMAX" in a search application, such as a content aggregator available at bubbletop.com or pageflakes.com, and/or RSS feeds added by a user that have metadata including WiMAX. Another explicit example of a user's affinity for content may be related to a high frequency of WiMAX and related content access by a user. Other examples of the user's attention and interest are what is being shared between users, for example in a web 2.0 community based application, and user tags and derived tags on content downloaded from the web (e.g., on applications such as digg or del.icio.us, or embedded logic on web applications such as bubbletop.com). In addition, there is an emerging trend, such as proposed by attentiontrust.org, where users can export and transfer their attention and affinity data to other systems, including the present system. Dwell time may also serve as an indicator of a user's attention. For example, user dwell time may be utilized to determine whether minor updates (e.g., changes in less than 5% or 10% of the data) should result in prefetching of updated data. As a user's dwell time increases, the percentage of change that results in a prefetching of new data in accordance with the present system may decrease, reflecting the realization that the longer data is scrutinized by the user, the more important the data is to the user and the more important it is that changes in the data be accurately reflected in the cached data. For example, for data that is viewed by the user for a period of one minute, a five percent change in the data may be acceptable before the present system determines that the new data is ready for prefetching. However, for data that the user typically views for less than 30 seconds, a 10% change in the data may be acceptable before any detected new data is prefetched.
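The dwell-time heuristic just described might be captured, under these illustrative assumptions, in a small helper such as the following; the one-minute and 30-second breakpoints and the 5%/10% thresholds mirror the example above, while the intermediate value is an assumption.

```python
def change_threshold(avg_dwell_seconds):
    """Fraction of changed content required before new data is prefetched.

    Longer dwell times indicate greater importance, so smaller changes
    should trigger a refresh of the cached copy.
    """
    if avg_dwell_seconds >= 60:
        return 0.05   # heavily scrutinized data: refresh on a 5% change
    if avg_dwell_seconds < 30:
        return 0.10   # briefly viewed data: tolerate up to a 10% change
    return 0.075      # assumed intermediate value for dwell times in between

def should_prefetch(changed_fraction, avg_dwell_seconds):
    return changed_fraction >= change_threshold(avg_dwell_seconds)
```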
In accordance with the present system, an enhanced caching and prefetching portion 175 utilizes the affinity and content analysis portions 150, 155 in connection with other caching techniques to determine which data to prefetch and cache. For example, the enhanced caching and prefetching portion 175 may determine which data is static (e.g., data associated with a given link that does not change) and which data is dynamic (e.g., data and/or link data that does change). Given data that is archived in a data server may have an associated URL that does not change and thereby may be determined to be static data. To simplify caching and prefetching rule generation, analysis related to static data may be eliminated or minimized since the static data need only be prefetched and cached once irrespective of user affinity. However, dynamic data is analyzed to determine prefetching and caching rules since the more the data associated with an access request changes, the closer user usage and affinity need to be examined to ensure that current data is provided to the user when an access request is made. The enhanced caching and prefetching portion 175 may also set expiration rules for cached data determining, for example, when cached data may be overwritten, based on one or more of:
- a time of the day when data is typically accessed;
- periodicity of access to particular portions of the data and the web application;
- a specific user directive to the system, including rules for prefetch and cache duration;
- directives from the peer to peer management portion 180, where a content change frequency detected from peers may influence a given user's cache and prefetch rules;
- directives from the peer to peer management portion 180 which indicate that users with similar interests often prefetch and cache given content; and
- other user behavior.
These considerations determine the prefetching and caching rules utilized by the system 100, as sketched below.
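A hedged sketch of how such expiration and prefetch rules might be combined is shown below; the particular policy of refreshing shortly before the next habitual access, and only when the source is expected to have changed, is one possible reading of the description rather than the only implementation.

```python
from datetime import timedelta

def next_refresh_time(next_expected_access, mean_update_interval, last_cached_at,
                      lead_time=timedelta(minutes=15)):
    """Decide when (if at all) cached data should be refreshed before its next use.

    next_expected_access: datetime predicted from the user's access habits.
    mean_update_interval: timedelta estimating how often the server content changes, or None.
    last_cached_at: datetime at which the local copy was stored.
    Returns a datetime at which to prefetch, or None if no refresh is needed.
    """
    if mean_update_interval is None:
        return next_expected_access - lead_time  # unknown change rate: refresh defensively
    if next_expected_access - last_cached_at < mean_update_interval:
        return None  # content is unlikely to have changed before it is next needed
    return next_expected_access - lead_time
```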
Accordingly, the present system may make a qualified estimate on the client device (e.g., system 100) to determine if the content from a target web data/application needs to be prefetched taking into account usage behavior and web site content change frequency. In this way, prefetching of content that is not updated or not of current interest to the user may be avoided unlike prior systems.
The enhanced caching and prefetching portion 175 also may operate with one or more other common caching techniques to manage the size of the cached data 140. A Least Recently Used (LRU) algorithm that is known to discard the least recently used data items first may be employed. The LRU algorithm requires keeping track of what was used when, which is expensive in terms of processor resources if one wants to make sure the LRU algorithm always discards the least recently used item. In a further embodiment, a Most Recently Used (MRU) algorithm may be utilized that discards, in contrast to the LRU algorithm, the most recently used items first. This caching mechanism may be used when data access is unpredictable by the affinity analysis portion 155, and when determining the least recently used section of the cache system may require long and complex operations. Typical database memory caches may employ the MRU algorithm to simplify caching operation and the system 100 may similarly employ the MRU algorithm. A Pseudo-LRU (PLRU) algorithm may be employed for cached data items that have a large associativity (generally >4 ways), wherein the implementation cost of LRU may become prohibitive. In a case wherein a probabilistic scheme that almost always discards one of the least recently used items is sufficient, the PLRU algorithm may be utilized by the enhanced caching and prefetching portion 175, only requiring one bit per cache data item to operate, as may be readily appreciated. The enhanced caching and prefetching portion 175 may utilize a Least Frequently Used (LFU) algorithm that counts how often a data item is accessed by the user. Those data items that are used least often may be discarded first. Further, the enhanced caching and prefetching portion 175 may utilize an Adaptive Replacement Cache (ARC) algorithm that constantly balances between LRU and LFU to improve the combined result, as may be readily appreciated. A cache control portion 145 is operably coupled between the processor 110 and the cached data 140. The cached data 140 is stored in a memory device that is local to the system 100 such that the cached data 140 may be accessible to the user of the system 100 regardless of whether the system 100 is (currently) connected to the network 135. The cache control portion 145 operates in response to instructions from the processor 110 to refresh cached content and prefetch content not previously cached. The cache control operates to establish a connection with the network 135 to retrieve and store fresh data, in accordance with prefetch and cache rules stored in the enhanced caching and prefetching portion 175.
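For reference, a compact form of the LRU eviction policy referred to above can be expressed as follows; this is a generic textbook sketch rather than the cache control portion 145 itself.

```python
from collections import OrderedDict

class LRUCache:
    """Least Recently Used cache: evicts the item that has gone longest without access."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # discard the least recently used item
```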
A storage and retrieval service portion 160 is operably coupled to the cache control 145 and controls prefetching and caching of data from web applications, including caching of data associated with top-level links (URLs that directly link to data) and depth based link data retrieval (URLs provided together with data accessible from a top-level link and/or lower level link data). Further, link analysis may be performed together with an analysis of a user's affinity to the data accessible from the top-level link and/or lower level link data to determine if the data accessible from the top-level link and/or lower level link data should be locally cached. In a case wherein prefetching of data requires authentication and/or other security measures (e.g., a user password, etc.) to access the data on a host server, the storage and retrieval service portion 160 may utilize an authentication and security portion 165 to enable access to the data without requiring user intervention in the prefetching process. As may be readily appreciated, the user may be enabled, through use of the authentication and security portion 165, to enter security information to make the security information available for use by the system 100 for prefetching of data that requires security and authentication information.
In operation, when the user logs in to a web application such as an Internet web browsing application, the web based proxy 120 redirects all content requests to the storage and retrieval service portion 160. Through use of the cache control 145, the storage and retrieval service portion 160 determines whether requested data (e.g., data associated with a URL that is provided in the web browser address bar) is available as cached data 140. In a case wherein the requested data is available as cached data 140, the cached data 140 is provided to the web browser for rendering. In a case wherein the requested data is not available as cached data 140, the storage and retrieval service portion 160 enables access to the network 135 for retrieval of the requested data and rendering by the web browser. In a case wherein the requested data is not available as cached data, the storage and retrieval service portion 160 may utilize the authentication and security portion 165 to access data that requires authentication for access, or the authentication and security portion 165 may enable prompting of the user for the authentication information.
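Taken together, the request path just described might look roughly as follows; the cache_control, network and auth objects are placeholders for the cache control portion 145, the network transport and the authentication and security portion 165, and their method names are assumptions made for this sketch.

```python
class StorageAndRetrievalService:
    """Illustrative request path: answer from the local cache, otherwise go to the network."""

    def __init__(self, cache_control, network, auth=None):
        self.cache_control = cache_control   # wraps the locally cached data 140
        self.network = network               # placeholder for the network transport
        self.auth = auth                     # optional authentication and security portion

    def fetch(self, url):
        cached = self.cache_control.lookup(url)
        if cached is not None:
            return cached                    # serve locally cached data to the browser
        credentials = self.auth.credentials_for(url) if self.auth else None
        data = self.network.get(url, credentials=credentials)
        self.cache_control.store(url, data)  # cache for subsequent requests
        return data
```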
The storage and retrieval service portion 160 in one embodiment may control portions of the system in an offline mode (e.g., when the network 135 is not available) to operate together with the web-based proxy portion 120 as a companion application to control access to cached data 140 and to record application commands from other portions of the system 100, such as the data analysis portion 150 and the affinity analysis portion 155, to enable continued operation when the user again connects to online applications through the network 135. Further, the storage and retrieval service portion 160 disables real-time command applications, such as a request from the cache control portion 145 to prefetch data, when the system 100 is in offline mode (e.g., when the network 135 is unavailable) . As should be readily appreciated, while the discussion above describes illustrative interaction between portions of the system 100 for purposes of illustrating an operation in accordance with the present system, numerous other interactions between the portions of the system 100 may be readily arranged while still being considered within the described system. Accordingly, other configurations that would occur to a person of ordinary skill in the art should be understood to be encompassed by the described system and claims that follow.
FIG. 2 shows a portion of a system 200 in accordance with an embodiment of the present system that utilizes a peer to peer network to further reduce bandwidth demands on Internet gateways (e.g., network connections that are utilized for accessing external data servers). As shown, the system 200 includes client (peer) devices 250A, 250B, 250C, 250D, ... 250N, with each of two or more client devices 250A, 250B, 250C, 250D, ... 250N connected to an intranet 230 through corresponding proxy devices 240A, 240B, 240C, ..., 240N. The intranet 230 is connected illustratively to a video repository 210 through an Internet connection 220. This embodiment of the present system 200 provides a unique protocol where a system, such as provided by the illustrative system 100 shown in FIG. 1 and corresponding to each of the client peer devices 250A, 250B, 250C, 250D, ... 250N, may be enabled to discover peer client devices and communicate to peer client devices data items that are locally cached, including communicating changes in data items that are identified by the data analysis module 150 of a client device. To enable peer to peer operation, the system 100 may include a peer to peer management portion 180 operably coupled to the processor 110 as shown in FIG. 1. In accordance with an embodiment of the present system, the (identified) changed data item(s) may be propagated from one of the client devices, that has the changed data item(s) locally cached, to each of the other client devices that has an affinity for the changed data item(s) to reduce bandwidth usage and data access latency that would occur should each of the client devices that has an affinity for the changed data item(s) attempt to access the data item(s) individually over the Internet 220.
In accordance with the embodiment of the present system, a framework is provided that allows users to share download capacity and mutually benefit in a data sharing embodiment, particularly as a method, etc., of distributing large amounts of data items widely without repeatedly accessing an original distributor of the data (e.g., such as the video repository 210) and without incurring related hardware, hosting and bandwidth costs. The system 200 is configured to include a peer to peer data communications protocol that may include sharing of newsfeeds as well as other data items as may be readily appreciated by a person of ordinary skill in the art. In a peer to peer enabled embodiment of the present system, data items (e.g., data feeds such as newsfeeds, headlines, etc.) may be stored on each participating client device based on affinity determinations, such as frequency of data use, time of data use, etc., that may be made separately on each client device. In accordance with this embodiment of the present system, when a user of a client device requests access to a data item, such as in response to a user request and/or a prefetch request, that is not available local to the client device (e.g., stored locally as cached data of the client device), a peer to peer management module (e.g., the peer to peer management module 180 shown in FIG. 1) of the client device may search other client devices that are configured as peers to determine whether the requested data item is available on one or more of the client devices. Naturally, since the affinity of the requesting client device may be different than the affinity of the other client devices, a reconciliation of affinity data may be performed to ensure that the requested data is current data. In a case wherein the requested data is determined to be current on the one or more other client devices, the requesting client may download the requested data or corresponding portions thereof from the one or more other client devices as may be typical of a peer to peer data sharing network. While the above described embodiment is illustratively described operating utilizing a peer to peer pull model (e.g., peers request data from other peers), it should be readily appreciated that a peer to peer push model may be similarly utilized. For example, in a case where a given peer prefetches a data item from a remote server, the given peer may publish this information to each other peer or may simply maintain a database identifying data items that are commonly accessed by the given peer and at least one other peer, and thereby push the prefetched data item to the at least one other peer.
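A pull-model peer lookup along these lines is sketched below; the peer interface (a lookup method returning the data and the time at which it was cached) and the freshness check used to reconcile differing affinities are simplifying assumptions.

```python
def fetch_from_peers(url, peers, max_age_seconds, now):
    """Ask peer devices for a data item before falling back to the remote server.

    peers: objects exposing lookup(url) -> (data, cached_at_timestamp) or None.
    Returns the data if some peer holds a sufficiently fresh copy, else None.
    """
    for peer in peers:
        entry = peer.lookup(url)
        if entry is None:
            continue
        data, cached_at = entry
        if now - cached_at <= max_age_seconds:   # reconcile freshness across differing affinities
            return data
    return None  # the caller should retrieve the item from the remote server instead
```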
FIG. 3 shows a portion of a system 300 in accordance with a further embodiment of the present system that utilizes a peer to peer network. The system 300 includes a data server 350 that may be a publisher of data including application data, and/or may simply perform a function of a data aggregator and/or may simply identify locations of data. As may be readily appreciated, the illustrative system 300 is shown having a single server, the data server 350, for purposes of simplifying the following discussion, while in a typical system, numerous data servers are utilized. The data server 350 is shown operably coupled to peer devices 310, 320, 330 through a network 340. As such, the system 300 operates as a client server system wherein client devices (e.g., the peer devices 310, 320, 330) access content that originates from a server source. As may be readily appreciated, the network 340 may include one or more of Internet and intranet couplings, for example as shown in FIG. 2. The peer devices 310, 320, 330 may be similar to the system 100 as illustratively shown in FIG. 1.
Prior approaches to improve network performance have included server side caching (e.g., server caching of data most often accessed), Content Distribution Network (CDN) solutions like Akamai including data mirroring, and client side caching (e.g., directives to a browser to cache content with an indicated "time to live" (TTL) indicating when the content expires). While these systems all help reduce latency in data retrieval (e.g., audio and/or visual data retrieval), each does little to reduce gateway traffic related to repetitive retrieval of content amongst a plurality of users.
In accordance with the present system, a new peer to peer approach may be applied to caching content between multiple clients through a cooperative and pure P2P network of P2P caches as shown for each of the peer devices 310, 320, 330, including respective P2P caches 312, 322, 332 which may correspond to the cached content 140 shown in FIG. 1. Regarding FIG. 1, it is described above how the proxy 120 may intercept a client request for data and analyze the data request so that the data may be prefetched and cached prior to a user's request for the data in the future. The embodiments shown in FIGs. 2 and 3 further enhance this approach by offering a P2P cache extension supported by the respective P2P caches 312, 322, 332 under control of cache storage and retrieval portions 314, 324, 334. Each of the peer devices 310, 320, 330 is shown including peer data registration portions 316, 326, 336 and peer propagation portions 318, 328, 338, which are illustratively shown in FIG. 1 as the peer to peer management portion 180.
In operation, an application or its proxy may register with the respective peer data registration portions 316, 326, 336 to register application data that the peer devices 310, 320, 330 cache and prefetch. A determination of which data to register with the respective peer data registration portions 316, 326, 336 may be made through a corresponding data analysis portion and/or the enhanced caching rules portion as shown in FIG. 1. In accordance with an embodiment of the present system, applications and data registered with a given peer device may be propagated to other peer devices via the peer propagation portions such that peer devices are aware of other peers (e.g., nodes) within a peer group (e.g., in the proximity of a peer device and/or within a subnet) that have interest in the same application and/or data.
In this way, if data is requested from the data server 350, for example through a caching process in accordance with the present system and/or an explicit request from a user, the content may be propagated from a peer device that already has the data cached to the peer device requesting the data through the peer propagation portions. In accordance with one embodiment, data may be propagated from a peer that originally downloaded the data from the data server and/or may be propagated from a peer device that received the data from a device that received the data from the peer that downloaded the data from the data server. For example, in a case wherein the peer device 310 downloads content from the data server 350, the peer device 310 may distribute the data to the peer device 320 and the peer device 330. However, the peer device 330 may also receive the data from the peer device 320 in place of the peer device 310.
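The push model may be sketched symmetrically; the registration table mapping a URL to the peers that registered interest in it is a hypothetical stand-in for the peer data registration and propagation portions.

```python
def propagate(url, data, registrations, origin_peer):
    """Push a freshly cached item from the peer that downloaded it to interested peers.

    registrations: dict mapping a URL to the set of peers that registered interest in it.
    Each peer is assumed to expose a receive(url, data) method.
    """
    for peer in registrations.get(url, set()):
        if peer is not origin_peer:      # do not echo the item back to its source
            peer.receive(url, data)
```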
Naturally, the present system may make use of an indication when the cached data was downloaded from the data server 350 and time to live data to assist in determining whether the cached data is outdated and should therefore be refetched from the data server 350. In this way, the present system reduces the number of download requests made to the server and allows for cached data on a given peer device to be updated by other devices present within the peer to peer network. As may be readily appreciated, this system may be applied for data and device applications that are unsecured and/or that have group and individual security access control logic (e.g., see the authentication and security portion 165 shown in FIG. 1) . FIG. 4 shows a flow diagram 400 including details of an illustrative method in accordance with an embodiment of the present system. Initially, a user may boot up a client device during act 410. Sometime during the boot up process, an application corresponding to one or more portions of the system, for example as shown in one or more of FIGs. 1, 2, 3 is launched during act 420, such as a start-up application as may be readily appreciated. By configuring the present system to be instantiated as a portion of the boot up process, the user is alleviated from having to separately consider and launch the present system on the client device. In operation, the storage and retrieval portion may connect to other client devices and/or remote data sources, for example without requiring opening of a browser application, to prefetch data items during act 430. In accordance with the present system, "fresh" data items that are relevant to the user's preferences, profiles and affinity and that are not available locally as cached data are retrieved. Thereafter, the present system may periodically (e.g., based on user access request characteristics) , request updated data items from a peer client device, if available, or may request updated data items from a remote data server. In response to a request for a data item from the storage and retrieval portion, the data server and/or the peer network, may send the updated data item(s) . Sometime thereafter, the user may open a web browser to access a data item such as a newsfeed during act 440. During act 450, a determination is made whether the data item is locally cached or cached with a peer device in an embodiment that is peer to peer enabled. In a case wherein the data item is locally cached, the data item is retrieved from the locally cached data during act 460. If the data item is not found in the local cache but is found to be cached on a peer device, the data item may be retrieved from the peer device during act 460. In a case wherein the data item is not locally cached or cached on a peer device, the data item is retrieved from a remote server during act 470. During act 480, it is determined if more data items are requested by the user through operation of the web browser. In a case wherein a further data item is requested by the user, the process continues with act 450 as previously described. In a case wherein no further data items are currently requested, the process continues to prefetch and cache data items during act 430.
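The decision portion of the flow of FIG. 4 (acts 450 through 470) can be summarized as follows; the helper objects and the fetch_from_peers function are the hypothetical sketches introduced earlier, not defined interfaces of the present system.

```python
def serve_request(url, local_cache, peers, remote, now, max_peer_age=300):
    """Acts 450-470: try the local cache first, then peer caches, then the remote server."""
    data = local_cache.lookup(url)                           # act 450/460: locally cached?
    if data is not None:
        return data
    data = fetch_from_peers(url, peers, max_peer_age, now)   # act 460: cached on a peer?
    if data is not None:
        local_cache.store(url, data)
        return data
    data = remote.get(url)                                   # act 470: retrieve from remote server
    local_cache.store(url, data)
    return data
```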
Further variations of the present system would readily occur to a person of ordinary skill in the art and are encompassed by the following claims. Through operation of the present system, an improved prefetching and caching system is provided that considers user interaction, attention and affinity to types of data when caching and prefetching data from a remote data server. Websites from which the data is accessed may be analyzed to determine an ontology of the website and/or to categorize the website to assist with identifying other data that the user may have an affinity towards. For example, if several of the data servers accessed by the user were categorized as hosting right wing political data, the present system may determine that the user has an affinity to right wing political data and therefore prefetch right wing political data. Accordingly, the present system enhances a user's experience in terms of data access. In a peer to peer embodiment of the present system, the prefetch protocol of the present system enables data retrieval by one user to result in updating of cached data on multiple client devices.
Finally, the above-discussion is intended to be merely illustrative of the present system and should not be construed as limiting the appended claims to any particular embodiment or group of embodiments. Thus, while the present system has been described with reference to exemplary embodiments, it should also be appreciated that numerous modifications and alternative embodiments may be devised by those having ordinary skill in the art without departing from the broader and intended spirit and scope of the present system as set forth in the claims that follow. In addition, the section headings included herein are intended to facilitate a review but are not intended to limit the scope of the present system. Accordingly, the specification and drawings are to be regarded in an illustrative manner and are not intended to limit the scope of the appended claims.
In interpreting the appended claims, it should be understood that: a) the word "comprising" does not exclude the presence of other elements or acts than those listed in a given claim; b) the word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements; c) any reference signs in the claims do not limit their scope ; d) several "means" may be represented by the same item or hardware or software implemented structure or function; e) any of the disclosed elements may be comprised of hardware portions (e.g., including discrete and integrated electronic circuitry), software portions (e.g., computer programming), and any combination thereof; f) hardware portions may be comprised of one or both of analog and digital portions; g) any of the disclosed devices or portions thereof may be combined together or separated into further portions unless specifically stated otherwise; h) no specific sequence of acts or steps is intended to be required unless specifically indicated; and i) the term "plurality of" an element includes two or more of the claimed element, and does not imply any particular range of number of elements; that is, a plurality of elements may be as few as two elements, and may include an immeasurable number of elements.