Technical Field
The present invention relates to the field of computer technology, and in particular to a method and device for acquiring page data.
Background
After a website goes live, users browse, navigate between pages, comment, share, and perform other operations on its pages. To analyze the website's data and propose improvements to the website, the operation data of users on the website's pages must be acquired, and the acquisition must work in real time under high concurrency and heavy traffic.
The prior art acquires page data by means of HTTP requests. HTTP, the Hypertext Transfer Protocol, is the most widely used network protocol on the Internet. A web page of the website establishes a network connection with an interface of the system's back end over HTTP, and the back end then exchanges information with the web page to complete the acquisition of the page data.
In the course of realizing the present invention, the inventor found at least the following problems in the prior art. First, the existing HTTP protocol is a short-connection protocol: every HTTP request must establish a connection with the back-end server, so user operations on a single page cause connections to the back-end server to be established and torn down frequently, which multiplies the load on the back-end server. Second, as website traffic grows, the load on the back-end server rises in step with it, so more machines must be added to form a load-balanced cluster, which raises hardware costs and manual maintenance costs.
Summary of the Invention
In view of this, embodiments of the present invention provide a method and device for acquiring page data that allow multiple data exchanges over a single connection, reducing the load on the server and lowering hardware costs and manual maintenance costs.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for acquiring page data is provided.
A method for acquiring page data according to an embodiment of the present invention includes: monitoring a page of a client; establishing a connection with the page according to a page loading request and based on a two-way communication protocol; and, before the connection is disconnected, acquiring the data of the page multiple times.
Optionally, before establishing the connection with the page according to the page loading request and based on the two-way communication protocol, the method further includes: the client embedding JavaScript code in the page, the JavaScript code being used to introduce a client of the two-way communication protocol. Establishing the connection with the page according to the page loading request and based on the two-way communication protocol includes: upon the page loading request, the client using the JavaScript code to access a server end of the two-way communication protocol through an Internet Protocol address and a port, after which the server end of the two-way communication protocol establishes a connection with the page. After establishing the connection with the page according to the page loading request and based on the two-way communication protocol, the method further includes: the client sending the data of the page by means of the JavaScript code.
Optionally, acquiring the data of the page multiple times before the connection is disconnected includes: acquiring attribute data of the page at the time the connection with the page is established; and, after the connection with the page is established, acquiring operation data of the page multiple times until the connection is disconnected.
Optionally, the attribute data includes at least one of: a unique identifier of the user, a unique identifier of the page, the uniform resource locator of the page, the type of browser to which the page belongs, and the type of client to which the page belongs; and the operation data includes at least one of: mouse scrolling on a computer or finger swiping on a mobile device, clicking a hyperlink on the page, clicking a picture, and entering text.
Optionally, after the data of the page is acquired, the method further includes: sending a message indicating that the data was acquired successfully, and saving the data.
To achieve the above object, according to another aspect of the embodiments of the present invention, a device for acquiring page data is provided.
A device for acquiring page data according to an embodiment of the present invention includes: a monitoring module, configured to monitor a page of a client; a connection module, configured to establish a connection with the page according to a page loading request and based on a two-way communication protocol; and an acquisition module, configured to acquire the data of the page multiple times before the connection is disconnected.
Optionally, the device further includes an embedding module, by which the client embeds JavaScript code in the page, the JavaScript code being used to introduce a client of the two-way communication protocol. The connection module is further configured so that, upon the page loading request, the client uses the JavaScript code to access a server end of the two-way communication protocol through an Internet Protocol address and a port, after which the server end of the two-way communication protocol establishes a connection with the page. The device further includes a sending module, by which the client sends the data of the page by means of the JavaScript code.
Optionally, the acquisition module is further configured to: acquire attribute data of the page at the time the connection with the page is established; and, after the connection with the page is established, acquire operation data of the page multiple times until the connection is disconnected.
Optionally, the attribute data includes at least one of: a unique identifier of the user, a unique identifier of the page, the uniform resource locator of the page, the type of browser to which the page belongs, and the type of client to which the page belongs; and the operation data includes at least one of: mouse scrolling on a computer or finger swiping on a mobile device, clicking a hyperlink on the page, clicking a picture, and entering text.
Optionally, the device further includes a storage module, configured to send a message indicating that the data was acquired successfully and to save the data.
To achieve the above object, according to a further aspect of the embodiments of the present invention, an electronic device is provided.
An electronic device according to an embodiment of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for acquiring page data of the embodiments of the present invention.
To achieve the above object, according to yet another aspect of the embodiments of the present invention, a computer-readable medium is provided.
A computer-readable medium according to an embodiment of the present invention stores a computer program which, when executed by a processor, implements the method for acquiring page data of the embodiments of the present invention.
An embodiment of the above invention has the following advantages or beneficial effects. A connection between the back-end server and the front-end page can be established based on a two-way communication protocol, and multiple data exchanges can be carried out over a single connection, which reduces the load on the server and lowers hardware costs and manual maintenance costs. In the embodiments of the present invention, JavaScript code is embedded in the page, so the JavaScript code can be used to access the server and send page data to it. In the embodiments of the present invention, the data of the page is acquired multiple times before the connection is disconnected, so a single page can transmit data many times over one connection, which reduces the number of connection requests to the back-end server. In the embodiments of the present invention, the attribute data of the page is the attribute information of the page, so a page opened by a user can be distinguished from other pages. In the embodiments of the present invention, after acquiring the data of the page, the back-end server sends a message indicating successful acquisition and saves the acquired data, so the data can be used to analyze the website.
The further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.
Brief Description of the Drawings
The accompanying drawings are provided for a better understanding of the present invention and do not constitute an undue limitation of it. In the drawings:
FIG. 1 is a schematic diagram of the main steps of a method for acquiring page data according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the Socket communication model;
FIG. 3 is a schematic diagram of the main flow of a method for acquiring page data according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main modules of a device for acquiring page data according to an embodiment of the present invention;
FIG. 5 is a diagram of an exemplary system architecture to which embodiments of the present invention can be applied;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server of an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the invention. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
Every website business system needs to analyze user behavior and the website's traffic data. Acquiring the data is the foundation of the entire analysis, and how to acquire page data without adding to the burden on the business system is the problem to be solved. Traditional event-tracking techniques based on HTTP requests do not account for the pressure that frequently establishing connections between the browser and the server places on the server. The present invention connects the front-end web page and the back-end server based on a two-way communication protocol; once the connection succeeds, data can be sent to the server many times, which relieves the pressure of frequently creating connections. Here, two-way communication means that both parties to the communication can send and receive information at the same time.
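The difference in connection pressure can be illustrated with a small count. The sketch below is only an illustration under stated assumptions: the function name connectionSetups, the mode labels, and the figures of 20 operations per page and 1000 open pages are hypothetical and do not come from the embodiments.

```javascript
// Hypothetical helper: counts how many connections must be set up.
// "perRequest" models one connection per reported operation, as described
// for the HTTP-based prior art; "persistent" models one connection per page.
function connectionSetups(operationsPerPage, pages, mode) {
  if (mode === "perRequest") {
    return operationsPerPage * pages;
  }
  if (mode === "persistent") {
    return pages;
  }
  throw new Error("unknown mode: " + mode);
}

// Illustrative figures only: 20 operations on each of 1000 open pages.
const perRequest = connectionSetups(20, 1000, "perRequest");
const persistent = connectionSetups(20, 1000, "persistent");
console.log(perRequest, persistent); // 20000 1000
```

Under these assumed figures the number of connection setups drops by a factor equal to the number of operations per page, which is the effect the two-way connection is meant to achieve.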
FIG. 1 is a schematic diagram of the main steps of a method for acquiring page data according to an embodiment of the present invention. As shown in FIG. 1, the method for acquiring page data according to an embodiment of the present invention mainly includes the following steps:
Step S101: monitor the page of the client. In the embodiments of the present invention, the page of the client is monitored so that requests from the page can be received in real time and the data of the page can be acquired.
Step S102: establish a connection with the page according to a page loading request and based on a two-way communication protocol. JavaScript code that introduces the protocol client is embedded in the page of the client (JavaScript is an interpreted scripting language widely used on the client side). Because the server monitors the page of the client in step S101, when the user opens the page, the JavaScript code embedded in the page accesses the server through an IP address (Internet Protocol address) and a port; that is, the page establishes a connection with the server.
Step S103: before the connection is disconnected, acquire the data of the page multiple times. After the connection is established in step S102, the page can communicate with the server and send the data of the page to the server.
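Steps S101 to S103 can be sketched from the server's point of view as follows. This is a minimal in-memory model, not the implementation of the embodiments: no real network is involved, and the class name PageDataCollector, its method names, and the page identifier are hypothetical.

```javascript
// In-memory model of steps S101-S103; no real network is involved.
class PageDataCollector {
  constructor() {
    this.listening = false;
    this.connections = new Map(); // pageId -> data received on that connection
  }

  // S101: start monitoring client pages.
  listen() {
    this.listening = true;
  }

  // S102: a page loading request arrives; open one connection for the page.
  connect(pageId) {
    if (!this.listening) throw new Error("collector is not listening");
    this.connections.set(pageId, []);
  }

  // S103: acquire data over the existing connection, any number of times.
  acquire(pageId, data) {
    const received = this.connections.get(pageId);
    if (!received) throw new Error("no open connection for " + pageId);
    received.push(data);
    return received.length;
  }

  // The page was closed: drop its connection but keep listening.
  disconnect(pageId) {
    const received = this.connections.get(pageId) || [];
    this.connections.delete(pageId);
    return received;
  }
}

const collector = new PageDataCollector();
collector.listen();
collector.connect("page-1");
collector.acquire("page-1", { op: "scroll" });
collector.acquire("page-1", { op: "click-link" });
const collected = collector.disconnect("page-1");
console.log(collected.length); // 2
```

The point of the model is that connect is called once per page while acquire may be called any number of times before disconnect.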
In the embodiments of the present invention, the two-way communication protocol may be the WebSocket network protocol. WebSocket was developed from Socket. Through a socket, an application sends requests to the network or responds to network requests, so Socket enables network communication between applications. FIG. 2 is a schematic diagram of the Socket communication model. As shown in FIG. 2, the Socket communication model is as follows. First, the server sets up a server-side listener and waits for connection requests from clients; each connection creates a new thread, and the client creates a connection with the server through Socket during page initialization. Once the connection is established, two-way data communication between the client and the server is possible, and as long as neither party actively disconnects, either party can send messages to and return messages from the other at will during the valid period. When either party closes the Socket, the communication connection ends.
Socket provides duplex network communication between arbitrary applications, whereas WebSocket is a new network protocol based on TCP (Transmission Control Protocol, a connection-oriented, reliable, byte-stream-based transport-layer communication protocol) that implements duplex communication between a browser and a server. That is, after the browser and the server establish a connection, messages can flow in both directions between them without any limit on the number of exchanges, achieving the goal of connecting once and exchanging data many times. In the embodiments of the present invention, when the two-way communication protocol is the WebSocket network protocol, the method for acquiring page data may include: the server sets up a WebSocket server end and uses it to monitor the page of the client; when the JavaScript code embedded in the page in the client's browser accesses the WebSocket server end through the IP address and port, the WebSocket client connects to the WebSocket server end and sends the data of the page to the server multiple times. The WebSocket server end is a separately built back-end service that does not belong to any business system. The purpose of having the server set up the WebSocket server end is to establish connections with front-end pages without adding to the burden on other business systems.
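The page-side half of this flow might look like the sketch below. It relies only on the standard browser WebSocket interface (the constructor, onopen, readyState, send, and close). The function name startPageReporter, the queueing behavior before the connection opens, and the address and port in the comments are assumptions for illustration and are not taken from the embodiments.

```javascript
// Sketch of the page-side script. SocketCtor defaults to the browser's
// WebSocket; passing a constructor explicitly makes the sketch testable.
function startPageReporter(url, SocketCtor = globalThis.WebSocket) {
  const socket = new SocketCtor(url);
  const queue = []; // data produced before the connection is open

  // Once the single connection is open, flush anything queued so far.
  socket.onopen = () => {
    while (queue.length > 0) socket.send(queue.shift());
  };

  return {
    // May be called many times; every call reuses the same connection.
    report(data) {
      const text = JSON.stringify(data);
      // readyState 1 means OPEN in the WebSocket interface.
      if (socket.readyState === 1) socket.send(text);
      else queue.push(text);
    },
    close() {
      socket.close();
    },
  };
}

// In a browser page (placeholder address and port, not from the source):
// const reporter = startPageReporter("ws://203.0.113.10:8080/collect");
// window.addEventListener("scroll", () => reporter.report({ op: "scroll" }));
```

Queueing before the connection opens is a design choice of this sketch; it avoids losing operations that happen while the handshake is still in progress.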
In the embodiments of the present invention, before step S102, the method for acquiring page data may further include: the client embedding JavaScript code in the page, where the JavaScript code is used to introduce a client of the two-way communication protocol. Step S102 may include: upon the page loading request, the client uses the embedded JavaScript code to access the server end of the two-way communication protocol through an Internet Protocol address and a port, after which the server end of the two-way communication protocol establishes a connection with the page. After step S102, the method for acquiring page data may further include: the client sending the data of the page by means of the JavaScript code.
In the embodiments of the present invention, acquiring the data of the page multiple times before the connection is disconnected may include: acquiring the attribute data of the page at the time the connection with the page is established; and, after the connection with the page is established, acquiring the operation data of the page multiple times until the connection is disconnected. In the present invention, the attribute data of a page refers to the attribute information of the page. This information is sent to the server when the connection is established, which avoids resending it every time operation data is sent. The operation data of a page refers to the data generated by the user's operations on the page, that is, user behavior data, and may include at least one of: mouse scrolling on a computer or finger swiping on a mobile device, clicking a hyperlink on the page, clicking a picture, and entering text. When such an operation occurs, the JavaScript code on the page captures the operation data and sends it to the server over the connection. These user operations are all operations on the pages of the business system and have no effect on the original business system; the JavaScript code sends the operation data to the server while the page is being operated. If the page is closed, the connection between the page and the server is broken and data transmission stops.
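One way to picture "attributes once, operations many times" is a per-connection session on the server that stores the attribute data when it arrives and attaches it to every later operation. The sketch below assumes hypothetical message shapes ({type: "attributes"} and {type: "operation"}) and a hypothetical PageSession class; the embodiments do not specify a message format.

```javascript
// Hypothetical message shapes: {type: "attributes", ...} is sent once when
// the connection opens; {type: "operation", ...} is sent for every action.
class PageSession {
  constructor() {
    this.attributes = null; // filled in by the first message
    this.records = [];      // one joined record per operation
  }

  receive(message) {
    if (message.type === "attributes") {
      this.attributes = message.payload;
    } else if (message.type === "operation") {
      if (this.attributes === null) {
        throw new Error("attributes not received yet");
      }
      // Join the stored attributes with the operation on the server side,
      // so the page never has to resend them.
      this.records.push({ ...this.attributes, ...message.payload });
    } else {
      throw new Error("unknown message type: " + message.type);
    }
  }
}

const session = new PageSession();
session.receive({ type: "attributes", payload: { userId: "u-1", pageId: "p-9" } });
session.receive({ type: "operation", payload: { op: "scroll" } });
session.receive({ type: "operation", payload: { op: "click-image" } });
console.log(session.records.length); // 2
```

Each stored record carries both the page attributes and the operation, even though the attributes crossed the connection only once.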
In the embodiments of the present invention, the attribute data of the page may include at least one of: a unique identifier of the user, a unique identifier of the page, the uniform resource locator of the page, the type of browser to which the page belongs, and the type of client to which the page belongs. The unique identifier of the user may be a user ID (identifier). The unique identifier of the page is generated by the system for the current page when that page is opened; it uniquely identifies the current client page and distinguishes it from the same page opened by other users. The type of browser to which the page belongs may be Chrome (a web browser developed by Google), IE (Internet Explorer, a web browser released by Microsoft), Firefox (a free and open-source web browser), Opera (a web browser supporting tabbed browsing, made by the Norwegian company Opera Software ASA), Safari (the browser in Apple's Mac OS operating system), and so on. The type of client to which the page belongs may be a mobile-device client or a computer client.
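A possible shape for the attribute data is sketched below. The field names, the page identifier scheme, and the user-agent heuristics are all assumptions made for illustration: the embodiments only list which attributes may be included, not how they are derived. The heuristics are deliberately simplified and would, for example, report other Chromium-based browsers as Chrome.

```javascript
// Simplified user-agent heuristics; real detection needs more cases.
// Order matters: Opera and Chrome user agents also contain "Safari/".
function browserType(userAgent) {
  if (/OPR\/|Opera/.test(userAgent)) return "Opera";
  if (/MSIE |Trident\//.test(userAgent)) return "IE";
  if (/Firefox\//.test(userAgent)) return "Firefox";
  if (/Chrome\//.test(userAgent)) return "Chrome";
  if (/Safari\//.test(userAgent)) return "Safari";
  return "unknown";
}

function clientType(userAgent) {
  return /Mobi|Android|iPhone|iPad/.test(userAgent) ? "mobile" : "desktop";
}

let pageCounter = 0;
// Hypothetical scheme for the per-open page identifier: user, time, counter.
function newPageId(userId, now = Date.now()) {
  pageCounter += 1;
  return `${userId}-${now}-${pageCounter}`;
}

// Builds the record sent once when the connection is established.
function buildAttributes(userId, url, userAgent) {
  return {
    userId,
    pageId: newPageId(userId),
    url,
    browser: browserType(userAgent),
    client: clientType(userAgent),
  };
}

const sampleUa =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";
const attrs = buildAttributes("u-1", "https://example.com/item/42", sampleUa);
console.log(attrs.browser, attrs.client); // Chrome desktop
```

In a browser, the user-agent string would typically come from navigator.userAgent and the URL from location.href.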
In the embodiments of the present invention, after the data of the page is acquired, the method for acquiring page data may further include: sending a message indicating that the data was acquired successfully, and saving the data. In the present invention, one server can connect to multiple clients; that is, each opened page produces one connection. After receiving the data sent by multiple pages, the server saves all of it to a log or a database, and the present invention places no limitation on which is used. Because the server only performs a log-saving operation after receiving the data, one server can receive data from multiple pages concurrently.
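The save-then-confirm behavior with several pages connected at once can be sketched as follows. Everything here is an in-memory stand-in: the array plays the role of the log or database, each connection object only needs an id and a send(text) method, and the message fields seq and status are hypothetical.

```javascript
// In-memory stand-ins: "log" plays the log or database, and a connection
// only needs an id and a send(text) method.
function makeReceiver(log) {
  return function onMessage(connection, text) {
    const data = JSON.parse(text);
    // Save first ...
    log.push({ connectionId: connection.id, data });
    // ... then confirm successful acquisition to the page.
    connection.send(JSON.stringify({ status: "ok", seq: data.seq }));
  };
}

const log = [];
const onMessage = makeReceiver(log);
const makeConnection = (id) => ({
  id,
  outbox: [],
  send(text) { this.outbox.push(text); },
});

// Two pages are open at once; each has its own connection.
const pageA = makeConnection("A");
const pageB = makeConnection("B");
onMessage(pageA, JSON.stringify({ seq: 1, op: "scroll" }));
onMessage(pageB, JSON.stringify({ seq: 1, op: "input-text" }));
onMessage(pageA, JSON.stringify({ seq: 2, op: "click-link" }));
console.log(log.length, pageA.outbox.length, pageB.outbox.length); // 3 2 1
```

Because the handler does nothing beyond appending a record and replying, the per-message work stays small, which is consistent with the statement that one server can serve many pages concurrently.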
For ease of understanding, the method for acquiring page data is described in detail below, taking the WebSocket network protocol as the two-way communication protocol. FIG. 3 is a schematic diagram of the main flow of a method for acquiring page data according to an embodiment of the present invention. As shown in FIG. 3, the main flow of the method may include: step S301, the server sets up a WebSocket server end independent of the business systems; step S302, a line of JavaScript code that introduces the WebSocket client is embedded in the page; step S303, the server uses the WebSocket server end to monitor the page; step S304, when the page is opened, the JavaScript code embedded in the page creates a WebSocket client; step S305, the WebSocket client automatically connects to the WebSocket server end using the server's IP address and port number and sends the attribute information of the page to the server; step S306, after the user operates on the page, the JavaScript code embedded in the page automatically collects the user's operation data and sends it to the server; step S307, the server saves the acquired page data to the log system; step S308, after successfully saving the page data, the server sends an acquisition-success message to the WebSocket client through the WebSocket server end; step S309, after receiving the acquisition-success message, the WebSocket client ends the current data-sending process, keeps communicating with the server over the WebSocket network protocol, and waits for the next sending of page operation data; step S310, when the user closes the page, the WebSocket server end deregisters the connection and continues to monitor connection requests from pages.
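Steps S305 to S310 can be sketched from the client's point of view as follows. This sketch assumes that every message carries a sequence number and that the success message echoes it, so the client can tell which send has completed; the embodiments do not specify such a field. The class name ReportingClient and the injected transport, which only needs a send(text) method, are likewise hypothetical.

```javascript
// Client-side view of S305-S310 with an injected transport. Success
// messages from the server are fed in through onAck(text).
class ReportingClient {
  constructor(transport, attributes) {
    this.transport = transport;
    this.nextSeq = 1;
    this.pending = new Set(); // sequence numbers awaiting confirmation
    this.open = true;
    this.send({ kind: "attributes", ...attributes }); // S305
  }

  send(body) {
    if (!this.open) throw new Error("connection closed");
    const seq = this.nextSeq++;
    this.pending.add(seq);
    this.transport.send(JSON.stringify({ seq, ...body }));
    return seq;
  }

  // S306: called whenever the user operates on the page.
  reportOperation(op) {
    return this.send({ kind: "operation", op });
  }

  // S309: the success message ends this send; the connection stays open.
  onAck(text) {
    const { seq } = JSON.parse(text);
    this.pending.delete(seq);
  }

  // S310: the page was closed.
  close() {
    this.open = false;
  }
}

const sent = [];
const client = new ReportingClient({ send: (text) => sent.push(text) }, { pageId: "p-1" });
client.reportOperation("scroll");
client.onAck(JSON.stringify({ status: "ok", seq: 1 }));
console.log(sent.length, client.pending.size); // 2 1
```

After the first confirmation arrives, the attribute message is complete while the scroll operation is still awaiting its own confirmation, and the same connection remains available for the next operation.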
The method for acquiring page data is explained in detail below, taking crowdfunding business systems as an example. After each crowdfunding business system goes into formal operation, statistics must be gathered on its actual data, including user operation data, page exposure rate, click-through rate, PV (page views, usually the main metric for measuring an online news channel, a website, or even a single online news item), and UV (unique visitors, that is, the users who access and browse the page over the Internet). If every business system gathered its own statistics, much of the development work would be duplicated, so a separate system that can acquire the user operation data of all business systems is needed. A system for acquiring operation data based on the WebSocket network protocol was therefore built, independent of the other business systems. The system consists mainly of two parts. One is the WebSocket server-end program, which creates the WebSocket server end, monitors connection requests from clients, receives the data sent, and saves the data to the log. The other is the JavaScript program in the front-end page, which creates the WebSocket client, collects the user's operations on the page, and sends the captured operation data to the WebSocket server end over the WebSocket connection. This JavaScript code can be embedded in the pages of other business systems with a single statement. In actual use, the server side for acquiring page data used only two Tomcat servers (Tomcat is a free, open-source, lightweight application server commonly used in small and medium-sized systems and where the number of concurrent users is not large) to support data collection for five business systems. The business systems average up to 5 million PV per day, which used 40% of network transmission capacity and 30% of CPU, far from the servers' bottleneck. By estimate, therefore, two Tomcat servers can support a data volume of more than ten million PV per day, whereas before the WebSocket network protocol was used, each business system needed two servers of its own to support data collection.
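The estimate can be reproduced with simple arithmetic. The sketch below assumes that resource use grows linearly with PV, which the text does not state; the function name estimatedDailyCapacity is hypothetical, while the figures (5 million PV per day, 40% network, 30% CPU, five business systems, two servers each) come from the paragraph above.

```javascript
// Linear-scaling estimate: assumes resource use grows in proportion to PV.
// Utilizations are given as whole percentages to keep the arithmetic exact.
function estimatedDailyCapacity(currentPv, utilizationPercents) {
  const tightest = Math.max(...utilizationPercents); // the limiting resource
  return (currentPv * 100) / tightest;
}

// 5 million PV per day at 40% network and 30% CPU utilization.
const capacity = estimatedDailyCapacity(5_000_000, [40, 30]);

// Server counts before and after, for five business systems.
const serversBefore = 5 * 2; // two servers per business system
const serversAfter = 2;      // two shared Tomcat servers

console.log(capacity, serversBefore, serversAfter); // 12500000 10 2
```

Under the linear assumption, the network is the limiting resource and the two servers would saturate at roughly 12.5 million PV per day, which is consistent with the stated estimate of more than ten million.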
As can be seen from the technical solution for acquiring page data according to the embodiments of the present invention, a connection between the back-end server and the front-end page can be established based on a two-way communication protocol, and multiple data exchanges can be carried out over a single connection, which reduces the load on the server and lowers hardware costs and manual maintenance costs. In the embodiments of the present invention, JavaScript code is embedded in the page, so the JavaScript code can be used to access the server and send page data to it. In the embodiments of the present invention, the data of the page is acquired multiple times before the connection is disconnected, so a single page can transmit data many times over one connection, which reduces the number of connection requests to the back-end server. In the embodiments of the present invention, the attribute data of the page is the attribute information of the page, so a page opened by a user can be distinguished from other pages. In the embodiments of the present invention, after acquiring the data of the page, the back-end server sends a message indicating successful acquisition and saves the acquired data, so the data can be used to analyze the website.
FIG. 4 is a schematic diagram of the main modules of an apparatus for acquiring page data according to an embodiment of the present invention. As shown in FIG. 4, the apparatus 400 for acquiring page data mainly includes the following modules: a listening module 401, a connection module 402, and an acquisition module 403.
The listening module 401 may be used to listen to the client's page. The connection module 402 may be used to establish a connection with the page in response to a page loading request and on the basis of a two-way communication protocol. The acquisition module 403 may be used to acquire the page's data multiple times before the connection is closed.
In the embodiment of the present invention, the apparatus 400 for acquiring page data may further include an embedding module (not shown in the figure). The embedding module may be used by the client to embed JavaScript code in the page, and the JavaScript code may be used to bring in the client side of the two-way communication protocol. The connection module 402 may further be used as follows: when a page loading request is made, the client uses the JavaScript code to access the server side of the two-way communication protocol through an Internet Protocol (IP) address and port, and the server side of the two-way communication protocol then establishes a connection with the page. The apparatus 400 may also include a sending module (not shown in the figure), which the client may use to send the page's data by means of the JavaScript code.
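A minimal sketch of the client-side behavior described above, under stated assumptions: the names `createPageTracker`, `onOpen`, and `track`, the message shape, and the example URL are all hypothetical and not taken from the source. The socket is injected so the sketch runs outside a browser; in a page it would be a `WebSocket` opened against the server's IP address and port.

```javascript
// Hypothetical client-side collector. `socket` is any object with a send(string)
// method; in a browser it would be a WebSocket connected to the server's IP and port.
function createPageTracker(socket, attributes) {
  let opened = false;
  const pending = []; // operations recorded before the connection is open
  const sendJson = (message) => socket.send(JSON.stringify(message));
  return {
    // Call from the socket's "open" handler: send the attributes once, then flush.
    onOpen() {
      opened = true;
      sendJson({ kind: "attributes", data: attributes });
      while (pending.length > 0) sendJson(pending.shift());
    },
    // Call from page event handlers (scroll, click, input).
    track(type, detail) {
      const message = { kind: "operation", type, detail, ts: Date.now() };
      if (opened) sendJson(message);
      else pending.push(message);
    },
  };
}

// Demo with a fake socket that records what would be sent.
const sent = [];
const fakeSocket = { send: (text) => sent.push(JSON.parse(text)) };
const tracker = createPageTracker(fakeSocket, { pageId: "p-1", url: "https://example.com/item" });
tracker.track("scroll", { y: 120 }); // queued: connection not open yet
tracker.onOpen(); // attributes first, then the queued scroll
tracker.track("click", { target: "a#buy" });
console.log(sent.map((m) => m.kind)); // attributes, operation, operation
```

In a real page, `socket` would be created with the standard `new WebSocket(url)` constructor and `tracker.onOpen` registered on its `open` event; every message for the page then travels over that one connection.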
In the embodiment of the present invention, the acquisition module 403 may further be used to: acquire the page's attribute data when the connection with the page is established; and, after the connection is established, acquire the page's operation data multiple times until the connection is closed.
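The acquisition sequence above (attribute data once at connection time, then operation data repeatedly until disconnection) can be pictured as a small per-connection session. This is an illustrative sketch only; `createSession` and its method names are hypothetical and do not come from the source.

```javascript
// Hypothetical per-connection session kept by the server side.
function createSession() {
  let state = "connecting";
  let attributes = null;
  const operations = [];
  return {
    // Connection established: the attribute data is acquired once.
    open(attributeData) {
      attributes = attributeData;
      state = "open";
    },
    // While the connection stays open, operation data may arrive any number of times.
    receive(operation) {
      if (state !== "open") throw new Error("no open connection");
      operations.push(operation);
    },
    // Disconnection ends acquisition for this page.
    close() {
      state = "closed";
      return { attributes, operationCount: operations.length };
    },
  };
}

const session = createSession();
session.open({ pageId: "p-1" });
session.receive({ type: "scroll" });
session.receive({ type: "linkClick" });
session.receive({ type: "textInput" });
const summary = session.close();
console.log(summary.operationCount); // 3
```

One `open` and one `close` bracket three acquisitions here, which is the contrast with issuing a separate request, and a separate connection, for each operation.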
In the embodiment of the present invention, the specific data may include at least one of: a unique identifier of the user, a unique identifier of the page, the page's uniform resource locator (URL), the type of browser the page belongs to, and the type of client the page belongs to. The operation data may include at least one of: mouse scrolling on a desktop computer or finger swiping on a mobile device, clicking a hyperlink on the page, clicking an image, and entering text.
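As an illustration of the two kinds of data listed above, the sketch below filters an incoming object down to the named fields and checks operation types. It assumes the "specific data" corresponds to the page attribute data mentioned earlier; the field names (`userId`, `pageId`, and so on) and the type labels are hypothetical choices, not identifiers from the source.

```javascript
// Hypothetical field names for the five listed items.
const ATTRIBUTE_FIELDS = ["userId", "pageId", "url", "browserType", "clientType"];
// Hypothetical labels: "scroll" covers mouse scrolling and finger swiping.
const OPERATION_TYPES = ["scroll", "linkClick", "imageClick", "textInput"];

// Keep only the listed fields; the text requires at least one of them.
function pickAttributes(raw) {
  const record = {};
  for (const field of ATTRIBUTE_FIELDS) {
    if (raw[field] !== undefined) record[field] = raw[field];
  }
  if (Object.keys(record).length === 0) {
    throw new Error("at least one attribute field is required");
  }
  return record;
}

// True when the message carries one of the listed operation types.
function isOperation(message) {
  return OPERATION_TYPES.includes(message.type);
}

const attrs = pickAttributes({ pageId: "p-1", url: "https://example.com/item", theme: "dark" });
console.log(Object.keys(attrs).join(", ")); // pageId, url
console.log(isOperation({ type: "imageClick" }), isOperation({ type: "hover" })); // true false
```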
In the embodiment of the present invention, the apparatus for acquiring page data may further include a storage module (not shown in the figure). The storage module may be used to send a message indicating that the data was acquired successfully and to save the data.
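A minimal sketch of the storage module's two duties, acknowledging and saving. The name `createStorage`, the in-memory array standing in for the log, and the acknowledgment format are assumptions for illustration; the source does not specify them, nor does it fix whether saving or acknowledging comes first (this sketch saves first).

```javascript
// Hypothetical storage module: keeps received data and acknowledges each item.
function createStorage() {
  const saved = []; // stands in for the server-side log
  return {
    saved,
    // `reply` is whatever sends text back over the open connection.
    store(data, reply) {
      saved.push({ savedAt: new Date().toISOString(), data });
      reply(JSON.stringify({ status: "success", count: saved.length }));
    },
  };
}

const replies = [];
const storage = createStorage();
storage.store({ type: "scroll" }, (text) => replies.push(text));
storage.store({ type: "linkClick" }, (text) => replies.push(text));
console.log(replies[1]); // {"status":"success","count":2}
```

Because the connection stays open, the acknowledgment goes back over the same channel the data arrived on, with no new request needed.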
As can be seen from the above description, a connection between the backend server and the front-end page can be established on the basis of a two-way communication protocol, and a single connection can carry multiple data exchanges, which reduces the load on the server and lowers hardware and manual maintenance costs. In the embodiment, JavaScript code is embedded in the page, so the JavaScript code can be used to access the server and send page data to it. In the embodiment, the page's data is acquired multiple times before the connection is closed, so a single page needs to establish only one connection to transmit data many times, which reduces the number of connection requests to the backend server. In the embodiment, the page's attribute data is the page's attribute information, so the page opened by the user can be distinguished from other pages. In the embodiment, after acquiring the page's data, the backend server sends a message indicating successful acquisition and saves the acquired data, so the data can be used to analyze the website.
FIG. 5 shows an exemplary system architecture 500 to which the method or the apparatus for acquiring page data according to an embodiment of the present invention can be applied.
As shown in FIG. 5, the system architecture 500 may include terminal devices 501, 502, and 503, a network 504, and a server 505. The network 504 is the medium that provides communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
Users may use the terminal devices 501, 502, 503 to interact with the server 505 through the network 504, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping applications, web browsers, search applications, instant messaging tools, email clients, and social platform software (examples only).
The terminal devices 501, 502, 503 may be any of various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 505 may be a server that provides various services, for example a backend management server that supports the shopping websites users browse with the terminal devices 501, 502, 503 (an example only). The backend management server may analyze and otherwise process received data such as product information query requests, and return the processing results (for example targeted push information or product information; examples only) to the terminal devices.
It should be noted that the method for acquiring page data provided by the embodiment of the present invention is generally executed by the server 505, and accordingly the apparatus for acquiring page data is generally located in the server 505.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 5 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.
Referring now to FIG. 6, a schematic structural diagram is shown of a computer system 600 suitable for implementing a terminal device of an embodiment of the present invention. The terminal device shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random-access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the functions defined above in the system of the present invention are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of these.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including a listening module, a connection module, and an acquisition module. The names of these modules do not, in some cases, limit the modules themselves; for example, the listening module may also be described as "a module that listens to the client's page".
In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments or may exist on its own without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: listen to the client's page; establish a connection with the page in response to a page loading request and on the basis of a two-way communication protocol; and acquire the page's data multiple times before the connection is closed.
According to the technical solution of the embodiment of the present invention, a connection between the backend server and the front-end page can be established on the basis of a two-way communication protocol, and a single connection can carry multiple data exchanges, which reduces the load on the server and lowers hardware and manual maintenance costs. In the embodiment, JavaScript code is embedded in the page, so the JavaScript code can be used to access the server and send page data to it. In the embodiment, the page's data is acquired multiple times before the connection is closed, so a single page needs to establish only one connection to transmit data many times, which reduces the number of connection requests to the backend server. In the embodiment, the page's attribute data is the page's attribute information, so the page opened by the user can be distinguished from other pages. In the embodiment, after acquiring the page's data, the backend server sends a message indicating successful acquisition and saves the acquired data, so the data can be used to analyze the website.
The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810036749.XA (published as CN110347945A) | 2018-01-15 | 2018-01-15 | The method and apparatus for obtaining the data of the page |
| Publication Number | Publication Date |
|---|---|
| CN110347945A | 2019-10-18 |
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN201810036749.XA | Pending | CN110347945A (en) | 2018-01-15 | 2018-01-15 | The method and apparatus for obtaining the data of the page |
| Country | Link |
|---|---|
| CN (1) | CN110347945A (en) |
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB02 | Change of applicant information | Address before and after (all entries): 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant after: Jingdong Technology Holding Co.,Ltd.; applicant before: Jingdong Digital Technology Holding Co.,Ltd. Applicant after: Jingdong Digital Technology Holding Co.,Ltd.; applicant before: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co.,Ltd. Applicant after: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co.,Ltd.; applicant before: BEIJING JINGDONG FINANCIAL TECHNOLOGY HOLDING Co.,Ltd. |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2019-10-18 |