Summary of the invention
In view of this, the present invention proposes a system for obtaining web page elements of a web page, whose purpose is to speed up the acquisition of web page elements on the user side. Another object of the present invention is to propose a method for obtaining web page elements of a web page.
To achieve the above objects, the present invention provides a system for obtaining web page elements of a web page, the system comprising:
a web server, configured to provide the web page elements of a web page and their identifiers;
a plurality of P2P servers, each P2P server storing the identifiers of downloaded web page elements that belong to a different segment, together with the client information of the clients that have downloaded those web page elements;
a segmentation server, configured to provide clients with the correspondence between the segments of identifiers and the P2P servers;
a client, comprising a browser and an acceleration client, wherein the browser is configured to submit, over HTTP, a proxy request for downloading a web page element to the web server; and the acceleration client is configured to listen for the browser's proxy requests; to download from the segmentation server the correspondence between the segments of identifiers and the P2P servers; to determine the corresponding P2P server from this correspondence and from the identifier of the web page element carried in the browser's proxy request; to query the determined P2P server for the client information corresponding to the identifier of the web page element; to download the web page element in P2P mode according to that client information; and, after the web page element has been downloaded, to publish the identifier of the web page element and the corresponding information of this client to the P2P server.
The browser of the client displays the web page according to the downloaded web page elements. The client is further configured to judge whether a downloaded web page element is the latest version and to download it again if it is not; and/or the client is further configured to verify a downloaded web page element and to download it again if verification fails.
The client further downloads the web page element over HTTP, and obtains the web page element by combining the results of the HTTP download and the P2P download.
The present invention also provides a method for obtaining web page elements of a web page. A segmentation server and a plurality of P2P servers are set up in advance; each P2P server stores the identifiers of a different segment and the corresponding client information, and the segmentation server stores the correspondence between the segments of identifiers and the P2P servers.
This method comprises:
the browser in the client submits, over HTTP, a proxy request for downloading a web page element to the web server;
the acceleration client in the client listens for the browser's proxy requests; downloads from the segmentation server the correspondence between the segments of identifiers and the P2P servers; determines the corresponding P2P server from this correspondence and from the identifier of the web page element carried in the browser's request; queries the determined P2P server for the client information corresponding to the identifier of the web page element; and downloads the web page element in P2P mode according to that client information;
after the web page element has been downloaded, the client publishes the identifier of the web page element and the corresponding information of this client to the P2P server.
In the above technical solution, the segmentation is performed according to the HASH value of the identifier.
The method further comprises: the client judges whether a downloaded web page element is the latest version and downloads it again if it is not; and/or the client verifies a downloaded web page element and downloads it again if verification fails.
Preferably, the client also downloads the web page element over HTTP, and obtains the web page element by combining the results of the HTTP download and the P2P download.
As can be seen from the above solution, when a client in the present invention needs to download a web page element, it queries the P2P server for the client information corresponding to the URL of that web page element, sets up a P2P channel according to the client information, and downloads the web page element in P2P mode. It then displays the web page according to the downloaded web page elements and caches them locally, and, after the download, publishes the URL of the web page element and the corresponding information of this client to the P2P server for later use by other clients. In this way a client need not download the required web page element directly from the WEB SERVER, but can download it from other nearby clients, which speeds up the download of web page elements and thus further improves the speed of web browsing. Moreover, because P2P technology is used, the bottlenecks caused by massive numbers of users, massive traffic, and regional differences can also be relieved.
Embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in more detail below by way of embodiments.
A web page element can be distinguished by various kinds of identifier, for example a URL, an IP address, or another character string. The following description uses the URL as an example, but the present invention is not limited to this.
The core concept of the present invention is as follows: after one client has downloaded the web page element of a certain URL, it caches the element locally; when another client needs to download the web page element of the same URL, it can download it from this client in P2P mode. In other words, a peer-to-peer (P2P) SERVER is set up in the system, storing the URLs of downloaded web page elements and the corresponding client information. When another client needs to access the web page element of a certain URL, it queries the P2P SERVER for the corresponding client information, sets up a P2P channel, and downloads the web page element from the corresponding clients in P2P mode. In addition, after a client has finished downloading a web page element, it publishes (PUB) the URL of that web page element and its own client information to the P2P SERVER.
To combine better with existing protocols and technology, the cache mechanism in the embodiment of the invention fully follows the caching rules of the HTTP protocol; that is, the following embodiments accelerate only those files that the HTTP response header marks as cacheable. Of course, in a specific implementation, the acceleration described below may be applied to any file.
Fig. 1 is a schematic structural diagram of the system for downloading web page elements in an embodiment of the invention.
Referring to Fig. 1, the system comprises a WEB SERVER, a P2P SERVER, and clients. The system may further include a segmentation SERVER.
The WEB SERVER is an HTTP server that provides web pages, the URLs of the web page elements in those web pages, and so on. It is the same as in the prior art and is not described further here.
The P2P SERVER stores the URLs of web page elements and the client information of the clients that have downloaded them. Preferably, client information is stored in a separate list for each network operator, according to the operator the client belongs to, so that clients can perform P2P downloads within the same operator's network, which further improves browsing speed. For example, as shown in Fig. 2, the client information of telecom users is stored in a telecom user list, that of Netcom users in a Netcom user list, and that of education network users in an education network user list.
Because the number of URLs is enormous, the P2P SERVER can store only KEYURLs in order to reduce the number of URLs it has to handle. KEYURL is briefly explained as follows. To save the network and time overhead of opening new TCP connections, the HTTP 1.1 protocol defines persistent connections: if every node on the path from the browser to the WEB SERVER (including proxy servers) supports persistent connections, the browser may request several files over one connection. Preferably, the client in the embodiment of the invention supports persistent connections, and the first URL requested on a persistent connection is called the KEYURL. Further, clients publish and query client information only by KEYURL. For example, if the browser requests three web page elements from the WEB SERVER over one persistent connection, with URLs URLA, URLB, and URLC in that order, then URLA is the KEYURL. Client information is queried from and published to the SERVER only for URLA, but the client information returned by the query can be reused for URLB and URLC.
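The KEYURL rule can be illustrated with a minimal Python sketch. The function name `assign_key_urls` and the representation of a persistent connection as an ordered list of URLs are assumptions made for illustration only; they do not appear in the embodiment.

```python
from typing import Dict, List


def assign_key_urls(connections: List[List[str]]) -> Dict[str, str]:
    """Map every requested URL to the KEYURL of its persistent connection.

    Each inner list holds the URLs requested over one persistent
    connection, in request order (a hypothetical representation).
    """
    key_of: Dict[str, str] = {}
    for urls in connections:
        if not urls:
            continue
        key_url = urls[0]  # the first URL on the connection is the KEYURL
        for url in urls:
            # Keep the first KEYURL seen for a URL that appears again later.
            key_of.setdefault(url, key_url)
    return key_of


if __name__ == "__main__":
    mapping = assign_key_urls([["URLA", "URLB", "URLC"]])
    print(mapping["URLC"])
```

With such a mapping, a query or publish for URLB or URLC would be issued under URLA, so the P2P SERVER only needs to keep one entry per persistent connection.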
As a rule, the number of entries in each user list in Fig. 2 is limited; when a list exceeds a predetermined size, the P2P SERVER deletes the oldest client information, that is, the client information that has been stored longest. The number of KEYURLs stored at the same time is also limited; once it exceeds a predetermined number, the entries that have gone longest without being updated are deleted.
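The two limits described here (bounded per-operator user lists and a bounded number of KEYURL entries) can be sketched as follows. This is only an illustrative sketch: the class name `PeerDirectory`, its method names, and the default limits are hypothetical, and the actual P2P SERVER data structures are not specified in the embodiment.

```python
from collections import OrderedDict, deque


class PeerDirectory:
    """Toy in-memory model of the per-KEYURL, per-operator user lists."""

    def __init__(self, max_peers_per_list=3, max_key_urls=2):
        self.max_peers = max_peers_per_list
        self.max_key_urls = max_key_urls
        # key_url -> {operator -> deque of client info}; insertion order of
        # the OrderedDict tracks which KEYURL was updated least recently.
        self.entries = OrderedDict()

    def publish(self, key_url, operator, peer):
        lists = self.entries.setdefault(key_url, {})
        self.entries.move_to_end(key_url)  # mark as most recently updated
        peers = lists.setdefault(operator, deque(maxlen=self.max_peers))
        if peer in peers:
            peers.remove(peer)  # refresh an already known client
        peers.append(peer)  # a full deque drops its oldest item
        while len(self.entries) > self.max_key_urls:
            self.entries.popitem(last=False)  # drop the stalest KEYURL

    def query(self, key_url, operator):
        lists = self.entries.get(key_url, {})
        return list(lists.get(operator, ()))


if __name__ == "__main__":
    directory = PeerDirectory(max_peers_per_list=2)
    for peer in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
        directory.publish("URLA", "telecom", peer)
    print(directory.query("URLA", "telecom"))
```

A `deque` with `maxlen` gives oldest-first eviction inside one user list, while the `OrderedDict` gives least-recently-updated eviction across KEYURL entries.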
When a client shown in Fig. 1 finishes downloading the web page element of a certain URL, it can publish the URL of the web page element and its own client information to the P2P SERVER, so that the SERVER records the client's network information for later queries by other clients that want to download the same web page element. When a new client needs to access a certain URL and download its web page element, it queries the P2P SERVER for the information of clients that have already downloaded the element, and then interacts with those clients to download the web page element via P2P.
On the client, the cached files can all be stored under one local folder, with each URL corresponding to one file. The content stored in a file includes data such as the HTTP response header and the URL. In addition, to ensure the security of the system, information such as the HASH value of the file is also stored, to prevent users from tampering with the data in the cache. The cache file format is shown in Table 1:
| Field | Description |
| Expires | Expiry time of the cache entry |
| Last_Modified | Last_Modified from the HTTP response header |
| LastValidateTime | Time at which the cache entry was last generated |
| FileDataLen | Length of the actual file part |
| FileHashLen | Length of the hash value of the file part |
| UrlHashLen | Length of the hash value of the URL |
| RespHeadLen | Length of the response header |
| FileData | The actual file part |
| UrlHash | Hash value of the URL |
| FileHash | Hash value of the file |
| RespHead | The response header |
Table 1
In Table 1, FileHashLen is the length of the hash value of the file part, UrlHashLen is the length of the hash value of the URL, UrlHash is the hash value of the URL, and FileHash is the hash value of the file; the remaining fields are content from the existing HTTP protocol and are not described further here.
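One possible serialization of the Table 1 layout is sketched below in Python. The embodiment does not give field widths, byte order, or a hash algorithm, so the 64-bit timestamps, 32-bit lengths, little-endian order, and SHA-1 used here are all assumptions, as are the function names `pack_entry` and `unpack_entry`.

```python
import hashlib
import struct

# Assumed widths: three signed 64-bit timestamps (Expires, Last_Modified,
# LastValidateTime) followed by four unsigned 32-bit lengths (FileDataLen,
# FileHashLen, UrlHashLen, RespHeadLen), little-endian, no padding.
HEADER = struct.Struct("<qqqIIII")


def pack_entry(expires, last_modified, last_validate, url, file_data, resp_head):
    """Serialize one cache entry in the field order of Table 1."""
    url_hash = hashlib.sha1(url.encode("utf-8")).digest()  # assumed algorithm
    file_hash = hashlib.sha1(file_data).digest()
    header = HEADER.pack(expires, last_modified, last_validate,
                         len(file_data), len(file_hash),
                         len(url_hash), len(resp_head))
    return header + file_data + url_hash + file_hash + resp_head


def unpack_entry(blob):
    """Parse a cache entry and reject it if FileData does not match FileHash."""
    (expires, last_modified, last_validate,
     file_len, file_hash_len, url_hash_len, head_len) = HEADER.unpack_from(blob, 0)
    pos = HEADER.size
    file_data = blob[pos:pos + file_len]
    pos += file_len
    url_hash = blob[pos:pos + url_hash_len]
    pos += url_hash_len
    file_hash = blob[pos:pos + file_hash_len]
    pos += file_hash_len
    resp_head = blob[pos:pos + head_len]
    if hashlib.sha1(file_data).digest() != file_hash:
        raise ValueError("cache entry failed its hash check")
    return {"expires": expires, "last_modified": last_modified,
            "last_validate": last_validate, "file_data": file_data,
            "url_hash": url_hash, "resp_head": resp_head}
```

Storing the lengths before the variable-size parts lets a reader locate FileData, UrlHash, FileHash, and RespHead without delimiters, and the stored FileHash lets the client detect a cache file that was modified on disk.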
In addition, to integrate better with the prior art, the above client can be composed of an existing browser (Browser) and a newly added acceleration client. As shown in Fig. 3, the browser in the embodiment of the invention is the same as an existing browser, and the acceleration client is configured in the browser as its HTTP proxy server. The acceleration client performs the functions of the embodiment of the invention, such as downloading and caching web page elements and publishing this client's information to the P2P SERVER.
Because the number of URLs is at least in the hundreds of millions, and the number of KEYURLs is also very large, a single P2P SERVER would face an enormous processing load, and the improvement in browsing speed would not be obvious. To further improve processing speed, a plurality of P2P SERVERs can be provided in the system, each handling the web page elements of a part of the KEYURLs and the corresponding client information. In other words, the embodiment of the invention divides the URLs into segments, and each P2P SERVER is responsible for the processing related to one segment of URLs. The segmentation can be done in many ways, for example by the lexicographic order of the URLs, that is, into a*.*, b*.* to c*.*, da*.* to dk*.*, and so on, where * is a wildcard. The method of segmenting according to the HASH value of the URL is described in detail below.
Each URL contains a host name; for example, the host name of http://game.qq.com/ad.swf is game.qq.com. The HASH value could be computed directly from the host name, but the number of host names is enormous, and doing so would produce a very large amount of data. To reduce the amount of data, the description here computes the HASH value from the second-level domain qq.com as an example; a third-level domain and so on could of course also be used.
For this purpose, as shown in Fig. 1, the system in the embodiment of the invention further comprises a segmentation SERVER. The segmentation SERVER divides the HASH values of the second-level domains described above into segments, and each P2P SERVER is responsible only for storing and querying the user information of one segment.
For example, suppose the HASH algorithm maps into the integer (INT) data space and the segmentation SERVER divides it into 4 segments: [0, 1 billion), [1 billion, 2 billion), [2 billion, 3 billion), and [3 billion, 4 billion), corresponding to 4 P2P SERVERs respectively. This segmentation is shown in Fig. 4.
Suppose the HASH value computed for qq.com falls in the segment [1 billion, 2 billion); then the publish and query requests of clients for URLs whose second-level domain is qq.com are all sent to P2P SERVER B for processing.
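The lookup from URL to P2P SERVER can be sketched in Python as follows. Several points are assumptions made only for illustration: the embodiment does not name a HASH algorithm, so CRC32 reduced modulo 4 billion stands in for it; the second-level domain is taken naively as the last two labels of the host name; and the server names A, C, and D are inferred by analogy with P2P SERVER B.

```python
import bisect
import zlib
from urllib.parse import urlsplit

# Segment table as a client might receive it from the segmentation SERVER:
# the lower bound of each half-open segment and the server responsible for it.
SEGMENT_STARTS = [0, 1_000_000_000, 2_000_000_000, 3_000_000_000]
SERVERS = ["P2P SERVER A", "P2P SERVER B", "P2P SERVER C", "P2P SERVER D"]
HASH_SPACE = 4_000_000_000


def second_level_domain(url):
    """Naive extraction: keep the last two labels of the host name."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def domain_hash(domain):
    """Stand-in hash (assumption): CRC32 folded into [0, 4 billion)."""
    return zlib.crc32(domain.encode("utf-8")) % HASH_SPACE


def server_for_hash(value):
    """Find the segment whose lower bound is the largest one <= value."""
    return SERVERS[bisect.bisect_right(SEGMENT_STARTS, value) - 1]


def server_for_url(url):
    return server_for_hash(domain_hash(second_level_domain(url)))


if __name__ == "__main__":
    print(second_level_domain("http://game.qq.com/ad.swf"))
    print(server_for_hash(1_500_000_000))
```

Because the hash is taken over the second-level domain, every URL under qq.com is routed to the same P2P SERVER, whichever segment its hash happens to fall in.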
In addition, to obtain an up-to-date and accurate segmentation, the acceleration client can log in to the segmentation SERVER at startup to obtain the correspondence between URLs and P2P SERVERs, or between the HASH values of URL second-level domains and P2P SERVERs, for example in the form of a segmentation table. At the same time it obtains other runtime parameters, such as the time intervals for publishing and querying.
The flow by which a client downloads web page elements and publishes information in the embodiment of the invention is described below with reference to Fig. 5.
As shown in Fig. 5, the flow mainly comprises the following steps:
Step 101: the acceleration client starts and listens on the proxy port, ready to handle the browser's proxy requests.
Step 102: the acceleration client logs in to the segmentation SERVER and obtains the segmentation table and other runtime parameters, such as the time intervals for publishing and querying.
Step 103: the acceleration client receives the URL requested by the browser; for example, the requested URL is http://game.qq.com/ad.swf.
Step 104: the acceleration client computes the HASH value of the URL's second-level domain, determines from the segmentation table obtained from the segmentation SERVER that the URL is handled by P2P SERVER B, and then sends a seed query request to P2P SERVER B, that is, a request for the client information corresponding to the URL. The request carries at least the URL of the web page element.
After receiving the request, P2P SERVER B returns the client information corresponding to the above URL to the requesting acceleration client.
Step 105: the acceleration client receives the response to the seed query, creates P2P connection channels according to the client information in it, and begins downloading the web page element in P2P mode.
While steps 104 to 105 are being executed, the following step 106 can be executed at the same time, so that the web page element is downloaded cooperatively over HTTP and P2P. Of course, it is also possible to execute only steps 104 to 105.
Step 106: the acceleration client starts an HTTP download of the web page element from the WEB SERVER according to the URL of the web page element.
Note that, according to this embodiment, the current HTTP standard supports downloading a file only from front to back. Therefore, when HTTP and P2P are used cooperatively, the HTTP download can proceed from front to back while the P2P download proceeds from back to front; when the two reach the meeting point, the download is complete. This avoids wasting resources and saves download traffic in both HTTP and P2P modes.
Step 107: the download of the web page element, in P2P mode or in cooperative HTTP and P2P mode, completes.
Step 108: after the download completes, the acceleration client caches the downloaded web page element locally and publishes download-complete information to P2P SERVER B, that is, it publishes the URL and the corresponding information of this client.
In addition, after the download completes, the browser of the client can display the web page according to the downloaded web page elements.
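The cooperative download noted under step 106, with HTTP working from the front of the file and P2P from the back until the two meet, can be simulated with a short Python sketch. The function name, the fixed-size chunk model, and the per-round speed parameters are assumptions for illustration; no real network transfer is performed.

```python
def cooperative_download(total_chunks, http_speed=1, p2p_speed=1):
    """Simulate which source fetches each chunk of a file.

    HTTP takes chunks from index 0 upwards and P2P from the last index
    downwards; the download is complete when the two cursors cross.
    """
    front = 0                  # next chunk index for the HTTP download
    back = total_chunks - 1    # next chunk index for the P2P download
    source = [None] * total_chunks
    while front <= back:
        for _ in range(http_speed):
            if front > back:
                break
            source[front] = "HTTP"
            front += 1
        for _ in range(p2p_speed):
            if front > back:
                break
            source[back] = "P2P"
            back -= 1
    return source


if __name__ == "__main__":
    print(cooperative_download(6, http_speed=1, p2p_speed=2))
```

Each chunk is fetched by exactly one of the two modes, which is the traffic saving the flow aims at; a faster P2P swarm simply moves the meeting point closer to the front of the file.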
In addition, because web page elements are updated very frequently, it is necessary to determine whether a file downloaded via P2P is the latest version of the file in the web page, or in other words whether the file downloaded via P2P is the same file as the one downloaded over HTTP. The HTTP protocol has a Last_Modified field that gives the modification time of a file, and the embodiment of the invention uses this field to judge whether the downloaded file is the latest. The flow, shown in Fig. 6, comprises the following steps:
Step 201: when the download begins, the acceleration client sets Last_Modified to an initial default value, time 1.
Step 202: the P2P and HTTP downloads are started according to steps 104 to 105 and step 106 above.
Step 203: judge whether a new Last_Modified has been obtained during the interaction, that is, whether the Last_Modified obtained differs from the local Last_Modified; if so, execute step 204 and its subsequent steps; otherwise execute step 205 and its subsequent steps.
Step 204: update the local Last_Modified to the new Last_Modified, and execute step 202.
Step 205: the download completes; execute step 206.
Step 206: judge whether a Last_Modified has been obtained over HTTP; if so, execute step 207; otherwise execute step 208.
Step 207: taking the Last_Modified obtained over HTTP as the standard, verify whether the downloaded file is the latest; if so, execute step 209, confirm that the download is finally complete, and end the flow; otherwise, execute step 204.
Step 208: judge whether the HTTP request has timed out; if so, execute step 207; otherwise execute step 206 again.
With the above flow, the web page element downloaded by the client is the latest web page element on the WEB SERVER.
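A simplified reading of steps 201 to 209 can be sketched in Python as below. This is not the flow of Fig. 6 itself: the callback `fetch`, the round limit, and the treatment of an HTTP timeout as "no Last_Modified available, accept the P2P result" are all assumptions introduced to keep the sketch small and runnable.

```python
def download_latest(fetch, http_last_modified, default="time 1", max_rounds=5):
    """Repeat the download until its Last_Modified is stable and confirmed.

    fetch(expected) is a hypothetical callback that performs one download
    attempt for the version `expected` and returns a tuple
    (data, observed_last_modified).
    http_last_modified is the value obtained over HTTP, or None if the
    HTTP request timed out (an assumed simplification of step 208).
    """
    local = default                      # step 201
    for _ in range(max_rounds):
        data, observed = fetch(local)    # step 202
        if observed != local:            # step 203: a new value was seen
            local = observed             # step 204, then download again
            continue
        # Steps 205 to 207: download finished, check against the HTTP value.
        if http_last_modified is None or http_last_modified == local:
            return data, local           # step 209: finally complete
        local = http_last_modified       # not the latest: back to step 204
    raise RuntimeError("no stable Last_Modified within the round limit")


if __name__ == "__main__":
    versions = {"T2": b"old copy", "T3": b"new copy"}

    def fake_fetch(expected):
        # Peers serve the requested version if they hold it, else an old one.
        if expected in versions:
            return versions[expected], expected
        return versions["T2"], "T2"

    print(download_latest(fake_fetch, "T3"))
```

In the demo, the first attempt learns the stale timestamp T2 from peers, the second completes with T2, and the HTTP value T3 then forces a third attempt that yields the current copy.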
In addition, verification has always been a prominent problem in P2P downloading. The conventional approach is for the WEB SERVER to store the HASH value of each file for clients to download; the client then compares the HASH value downloaded from the WEB SERVER with the HASH value it computes over the downloaded file. If the two match, the downloaded file is verified as correct; otherwise the downloaded file is incorrect and must be downloaded again. In the web browsing scenario, however, there are too many files: even if the WEB SERVER could store the HASH values, the volume of packets a client would exchange to obtain them would be excessive, greatly delaying downloading and browsing.
In the embodiment of the invention, a multi-client verification strategy is preferably adopted. That is, when a web page element is downloaded in P2P mode, its HASH value is downloaded at the same time, and the downloaded HASH value is checked against the HASH value computed over the downloaded data. Further, a web page element obtained in P2P mode is confirmed as correct only after its HASH value has been verified by more than 2 other clients; otherwise it is downloaded again.
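The multi-client check can be sketched in Python as follows. The function name `verify_element`, the use of SHA-1, and the way the threshold is expressed as a `min_confirmations` parameter are assumptions; the embodiment states only that the HASH value must be verified by several other clients.

```python
import hashlib


def verify_element(data, advertised_hash, peer_hashes, min_confirmations=3):
    """Accept a P2P download only if enough other clients report its hash.

    data:            bytes of the downloaded web page element
    advertised_hash: hash downloaded together with the element
    peer_hashes:     hashes reported by other clients for the same URL
    min_confirmations=3 encodes "more than 2 other clients"; pass 2 if the
    intended threshold is "2 or more" (the wording is ambiguous).
    """
    computed = hashlib.sha1(data).hexdigest()  # assumed algorithm
    if computed != advertised_hash:
        return False  # data does not match the hash it came with
    confirmations = sum(1 for h in peer_hashes if h == computed)
    return confirmations >= min_confirmations


if __name__ == "__main__":
    payload = b"<swf bytes>"
    good = hashlib.sha1(payload).hexdigest()
    print(verify_element(payload, good, [good, good, good]))
```

A result of `False` would lead the client to download the web page element again, as described above; this replaces a per-file hash lookup on the WEB SERVER with agreement among peers.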
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.