Technical Field
The present invention relates to the technical field of data processing, and in particular to a data processing method and apparatus.
Background
With the rapid development of the Internet, users can obtain more and more information from pages displayed on the Internet. For Internet advertisers, analyzing how users click on online advertisements is of great importance to their own advertisement placement. Users' click behavior on the Internet can be recorded in a display click log. The display click log is indispensable training data and is widely used in click-through rate (CTR) prediction models for advertisements.
In the related art, the display click log of a search engine is used directly as the input data of an advertisement CTR prediction model. It is assumed by default that every advertisement loaded on a page has been displayed to the user: an advertisement the user clicks is a positive example, and any other advertisement is a negative example.
However, compared with a personal computer (PC), a mobile terminal has a smaller screen and can display only limited content, so the advertisements loaded on a page cannot all be displayed to the user. Using the display click log directly therefore not only makes the input data of the CTR model large, but also fails to reflect the user's actual browsing and clicking behavior. The log contains a large amount of data that does not correspond to real behavior, which reduces the prediction accuracy of the CTR model.
Summary of the Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to provide a data processing method that can reduce the amount of input data of a CTR model and improve the prediction accuracy of the model.
Another object of the present invention is to provide a data processing apparatus.
To achieve the above objects, the data processing method according to an embodiment of the first aspect of the present invention includes: displaying search results, the search results being obtained according to a search term input by a user; acquiring information on the content displayed on the screen of a mobile terminal, the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user; and sending the information on the displayed content to a server so that the server records it, the information on the displayed content being used to determine the content in the search results that was not browsed by the user and to remove the information on that content from the display click log.
In the data processing method according to the embodiment of the first aspect of the present invention, information on the content displayed on the screen of the mobile terminal is sent to the server, the content not browsed by the user is then determined according to that information, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
To achieve the above objects, the data processing method according to an embodiment of the second aspect of the present invention includes: acquiring information, recorded in a server, on the content displayed on the screen of a mobile terminal, the information on the displayed content being obtained by the mobile terminal after it displays search results and being sent by the mobile terminal to the server, the search results being obtained according to a search term input by a user, and the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user; determining, according to the information on the displayed content, the content in the search results that was not browsed by the user; and removing the information on the content not browsed by the user from the display click log.
In the data processing method according to the embodiment of the second aspect of the present invention, the content not browsed by the user is determined according to the information on the content displayed on the screen of the mobile terminal, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
To achieve the above objects, the data processing apparatus according to an embodiment of the third aspect of the present invention includes: a display module, configured to display search results, the search results being obtained according to a search term input by a user; an acquisition module, configured to acquire information on the content displayed on the screen of a mobile terminal, the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user; and a sending module, configured to send the information on the displayed content to a server so that the server records it, the information on the displayed content being used to determine the content in the search results that was not browsed by the user and to remove the information on that content from the display click log.
In the data processing apparatus according to the embodiment of the third aspect of the present invention, information on the content displayed on the screen of the mobile terminal is sent to the server, the content in the search results that was not browsed by the user is then determined according to that information, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
To achieve the above objects, the data processing apparatus according to an embodiment of the fourth aspect of the present invention includes: an acquisition module, configured to acquire information, recorded in a server, on the content displayed on the screen of a mobile terminal, the information on the displayed content being obtained by the mobile terminal after it displays search results and being sent by the mobile terminal to the server, the search results being obtained according to a search term input by a user, and the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user; a determining module, configured to cause the server to determine, according to the information on the displayed content, the content in the search results that was not browsed by the user; and a removing module, configured to remove the information on the content not browsed by the user from the display click log.
In the data processing apparatus according to the embodiment of the fourth aspect of the present invention, the content not browsed by the user is determined according to the information on the content displayed on the screen of the mobile terminal, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
To achieve the above objects, the mobile terminal according to an embodiment of the fifth aspect of the present invention includes a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged inside the space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the mobile terminal; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to: display search results, the search results being obtained according to a search term input by a user; acquire information on the content displayed on the screen of the mobile terminal, the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user; and send the information on the displayed content to a server so that the server records it, the information on the displayed content being used to determine the content in the search results that was not browsed by the user and to remove the information on that content from the display click log.
In the mobile terminal according to the embodiment of the fifth aspect of the present invention, information on the content displayed on the screen of the mobile terminal is sent to the server, the content not browsed by the user is then determined according to that information, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
To achieve the above objects, the data processing apparatus according to an embodiment of the sixth aspect of the present invention includes a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged inside the space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the apparatus; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to: acquire information, recorded in a server, on the content displayed on the screen of a mobile terminal, the information on the displayed content being obtained by the mobile terminal after it displays search results and being sent by the mobile terminal to the server, the search results being obtained according to a search term input by a user, and the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user; determine, according to the information on the displayed content, the content in the search results that was not browsed by the user; and remove the information on the content not browsed by the user from the display click log.
In the data processing apparatus according to the embodiment of the sixth aspect of the present invention, the content not browsed by the user is determined according to the information on the content displayed on the screen of the mobile terminal, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a data processing method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a data processing method according to another embodiment of the present invention;
Fig. 3 is a schematic flowchart of a data processing method according to another embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a data processing apparatus according to another embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a data processing apparatus according to another embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents that fall within the spirit and scope of the appended claims.
Fig. 1 is a schematic flowchart of a data processing method according to an embodiment of the present invention. The method includes:
S11: The mobile terminal displays search results, the search results being obtained according to a search term input by the user.
As in the related art, after the user enters a search term in a browser on the mobile terminal, search results corresponding to that search term are obtained.
The search results may include one item of content or at least two items. When the search results contain many items, the limited screen size of the mobile terminal usually prevents them from being displayed to the user all at once, and the user has to keep swiping to obtain more information. After each swipe, the content that was not displayed previously can be displayed, continuing from the previous display.
In addition, swiping in the embodiments of the present invention is not limited to swiping performed by touching the screen. It also covers the conventional approach of using keys to change the displayed content progressively.
S12: The mobile terminal acquires information on the content displayed on its screen, the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user.
In this embodiment, JS (JavaScript) code may be embedded in the web page when a search is performed on the mobile terminal. The code monitors the user's swipe operations so as to acquire the information on the content displayed at the initial display and the information on the content displayed after each swipe.
S13: The mobile terminal sends the information on the displayed content to the server so that the server records it. The information on the displayed content is used to determine the content in the search results that was not browsed by the user and to remove the information on that content from the display click log.
The information on the content displayed on the screen of the mobile terminal may be sent to the server by the JS code mentioned above; for example, after each swipe by the user, the information on the content displayed after that swipe is sent to the server. On receiving the information on the displayed content, the server may record it, for example in a swipe log. In a subsequent process, for example during CTR modeling, the information on the displayed content can be obtained from the server, the content in the search results that was not browsed by the user can be determined according to it, and the information on that content can be removed from the display click log.
Further, the information on the displayed content may specifically be the information on the last item of content displayed on the screen of the mobile terminal. For example, if three items of content are displayed from top to bottom on the screen of the mobile terminal, only the information on the third item may be sent to the server.
Further, the information on the last displayed item may specifically include the number of the last displayed item and the displayed height of the last displayed item. Because the search results are provided by the server, the server records the information on every item in the search results. The server may number the items in their top-to-bottom display order. For example, after obtaining multiple search results, the server can determine their top-to-bottom order according to a predetermined algorithm and then number the search results in ascending order from top to bottom. In this way, when the server receives the number of the last item displayed on the screen (for example, 3), it can determine that the items with smaller numbers (for example, 1 and 2) have already been browsed by the user. In addition, the server records the overall height of each search result. On the mobile terminal, the last search result may not be displayed completely; for example, only part of its overall height may be displayed. The server can determine, from the displayed height and the overall height, whether the last item counts as content browsed by the user.
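The reported pair (number of the last displayed item, displayed height of that item) can be derived from simple viewport geometry. The following is a minimal sketch of that computation, written in Python rather than the in-page JS mentioned above so that it can be run on its own. The function name `last_visible_item` and its parameters (`item_heights`, `scroll_top`, `viewport_height`) are hypothetical and are not taken from the original text; the sketch also assumes the items are stacked vertically with no gaps.

```python
from typing import List, Optional, Tuple


def last_visible_item(
    item_heights: List[float],
    scroll_top: float,
    viewport_height: float,
) -> Optional[Tuple[int, float]]:
    """Return (number, displayed_height) of the lowest item that is at least
    partly on screen, or None if no item is visible.

    Items are numbered from 1 in top-to-bottom order, mirroring the
    numbering described in the text. Layout is assumed to be a plain
    vertical stack (an assumption made for this sketch).
    """
    viewport_bottom = scroll_top + viewport_height
    result: Optional[Tuple[int, float]] = None
    top = 0.0
    for number, height in enumerate(item_heights, start=1):
        bottom = top + height
        # Height of the part of this item that lies inside the viewport.
        shown = min(bottom, viewport_bottom) - max(top, scroll_top)
        if shown > 0:
            result = (number, shown)
        top = bottom
    return result


# Four items of height 100 on a 250-high screen: item 3 is the last one
# visible, and only 50 of its 100 units of height are on screen.
print(last_visible_item([100, 100, 100, 100], scroll_top=0, viewport_height=250))
```

Sending only this pair keeps each report small, since the server already holds the numbering and the overall heights.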
The information on the displayed content sent by the mobile terminal can be recorded in the server for use in subsequent processes. For example, during CTR modeling, the content not browsed by the user can be determined according to this information, and the information on that content can then be removed from the display click log, which avoids feeding inaccurate data to the CTR model.
In this embodiment, information on the content displayed on the screen of the mobile terminal is sent to the server so that the server holds a record of it. In a subsequent process, the content not browsed by the user can then be determined according to that information, and the information on the content not browsed by the user can be removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
Fig. 2 is a schematic flowchart of a data processing method according to another embodiment of the present invention. The method includes:
S21: Acquire information, recorded in the server, on the content displayed on the screen of the mobile terminal. The information on the displayed content is obtained by the mobile terminal after it displays the search results and is sent by the mobile terminal to the server. The search results are obtained according to a search term input by the user, and the displayed content includes the content displayed on the initial screen and the content displayed on the screen after each swipe by the user.
The information on the displayed content may be the information on the last item of content displayed on the screen of the mobile terminal.
Further, the information on the last displayed item includes the number of the last displayed item and the displayed height of the last displayed item.
S22: Determine, according to the information on the displayed content, the content in the search results that was not browsed by the user.
For example, the items whose numbers precede the number of the last displayed item are determined to be content already browsed by the user; and
whether the last displayed item is content browsed by the user is determined according to the displayed height of the last displayed item.
Further, the last displayed item may be determined to be content browsed by the user when the ratio of its displayed height to its overall height is greater than a preset threshold.
Further, the preset threshold may specifically be 1/4.
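The height-ratio rule can be written as a single comparison. The sketch below is illustrative only: the function name `last_item_was_browsed` and the constant `VIEW_THRESHOLD` are hypothetical, and the comparison is strict because the text says the ratio must be greater than the threshold.

```python
# Threshold from the text: the last item counts as browsed when more than
# one quarter of its overall height was on screen.
VIEW_THRESHOLD = 1 / 4


def last_item_was_browsed(
    displayed_height: float,
    overall_height: float,
    threshold: float = VIEW_THRESHOLD,
) -> bool:
    """Apply the ratio rule: displayed_height / overall_height > threshold."""
    if overall_height <= 0:
        raise ValueError("overall_height must be positive")
    return displayed_height / overall_height > threshold


print(last_item_was_browsed(30, 100))  # ratio 0.30 exceeds 1/4
print(last_item_was_browsed(25, 100))  # ratio equals 1/4, so not counted
```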
S23: Remove the information on the content not browsed by the user from the display click log.
In the related art, the display click log records the information on every search result, regardless of whether that search result was browsed by the user. For example, after the user enters a search term, the server loads all the search results corresponding to that search term. Suppose there are 8 search results in total; in the related art, the display click log then contains the information on all 8 search results, and a search result the user clicks is a positive example while any other is a negative example. However, some of these 8 search results may never have appeared on the screen at all, so the user cannot have browsed them, and the data is therefore inaccurate.
In this embodiment, by contrast, the information on content the user did not browse is removed from the display click log, which avoids making predictions from inaccurate data.
The method of this embodiment may be executed by a data processing apparatus, which produces a processed display click log. The processed display click log can be used in scenarios such as CTR modeling.
In this embodiment, the content not browsed by the user is determined according to the information on the content displayed on the screen of the mobile terminal, and the information on the content not browsed by the user is removed from the display click log. On the one hand, removing part of the information reduces the amount of input data of the CTR model. On the other hand, removing the information on content the user never browsed avoids using inaccurate information as input data of the model, which improves the prediction accuracy of the model.
Fig. 3 is a schematic flowchart of a data processing method according to another embodiment of the present invention. This embodiment takes as an example the case in which the mobile terminal is a mobile phone and the search results are advertisements. Referring to Fig. 3, this embodiment includes:
S301: The mobile phone receives a search term input by the user.
S302: The mobile phone sends the search term to the server.
S303: The server searches for the search term using a preset algorithm and obtains multiple advertisements. The server also numbers the advertisements in order from top to bottom as they are to be displayed, and records the overall height of each advertisement.
S304: The server sends the advertisements to the mobile phone.
S305: The mobile phone performs the initial display of the advertisements.
S306: The mobile phone sends to the server the number and the displayed height of the last advertisement on the screen after the initial display.
S307: The server records the received information, namely the number and the displayed height, in the swipe log.
S308: After each swipe by the user, the mobile phone acquires the number and the displayed height of the last advertisement on the screen after that swipe.
S309: After each swipe, the mobile phone sends to the server the number and the displayed height of the last advertisement on the screen after that swipe.
S310: The server records the received information, namely the number and the displayed height of the last advertisement after each swipe, in the swipe log.
The swipe log records the information on the displayed content for each state (the initial state and the state after each swipe), for example the number and the displayed height of the last advertisement.
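One possible shape for such a swipe (sliding screen) log is sketched below. The original text does not specify a storage format, so the class names `SwipeRecord` and `SwipeLog` and the field names are hypothetical. A real log would presumably also carry an identifier tying the records to one search session; that is omitted here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SwipeRecord:
    """One state of the screen, as reported by the mobile phone."""
    state_index: int       # 0 = initial display, k = after the k-th swipe
    last_ad_number: int    # number of the last advertisement on screen
    displayed_height: float


class SwipeLog:
    """In-memory stand-in for the server-side swipe log (illustrative only)."""

    def __init__(self) -> None:
        self._records: List[SwipeRecord] = []

    def record(self, last_ad_number: int, displayed_height: float) -> SwipeRecord:
        # States are indexed in arrival order: the first report is the
        # initial display, each later report follows one swipe.
        entry = SwipeRecord(len(self._records), last_ad_number, displayed_height)
        self._records.append(entry)
        return entry

    def records(self) -> List[SwipeRecord]:
        return list(self._records)


log = SwipeLog()
log.record(3, 40.0)   # initial display (S306/S307)
log.record(5, 60.0)   # after the first swipe (S309/S310)
for entry in log.records():
    print(entry)
```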
S311: The data processing apparatus obtains the swipe log from the server.
The data processing apparatus may be an apparatus used by offline post-processing personnel to process the display click log. It may be located in the server or outside the server.
S312:数据处理装置根据滑屏日志中记录的所有状态时的信息,确定出用户所有滑屏后都没有被用户浏览到的广告。S312: The data processing device determines the advertisements that have not been browsed by the user after all the user swipes, according to the information of all states recorded in the swipe log.
For example, suppose the server's search results contain 10 advertisements and the user swipes twice. The information recorded for the last advertisement is (num3, h1) at the initial display, (num5, h2) after the first swipe, and (num9, h3) after the second swipe. Because the last advertisement displayed after the final swipe is number 9, advertisements 1 to 8 have been browsed and advertisement 10 has not. Whether advertisement 9 has been browsed is decided from its displayed height and its overall height: if the overall height of advertisement 9 is H, the advertisement counts as browsed when the ratio h3/H exceeds a preset threshold, and as not browsed otherwise.
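The decision in this example can be sketched in a few lines. The function name, the concrete heights, and the default threshold of 0.25 are assumptions for illustration. Taking the deepest state with `max` rather than the last one is also an assumption, made so that a user who scrolls back up is handled the same way.

```python
def unviewed_ads(total_ads, states, total_heights, threshold=0.25):
    """Return the numbers of ads treated as not browsed.

    states: (last_ad_number, shown_height) for the initial display and each swipe.
    total_heights: mapping from ad number to its overall height.
    """
    # Deepest state reached: tuples compare by ad number first, then shown height.
    last_number, shown_height = max(states)
    # Every ad above the last one on screen has been browsed.
    viewed = set(range(1, last_number))
    # The last ad counts only if enough of it was displayed.
    if shown_height / total_heights[last_number] > threshold:
        viewed.add(last_number)
    return [n for n in range(1, total_ads + 1) if n not in viewed]


heights = {n: 100 for n in range(1, 11)}  # hypothetical overall heights
states = [(3, 40), (5, 60), (9, 20)]      # (num3, h1), (num5, h2), (num9, h3)
result = unviewed_ads(10, states, heights)
```

With these hypothetical numbers h3/H is 0.2, which does not exceed the threshold, so advertisements 9 and 10 are both treated as not browsed.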
S313: The data processing device removes from the display click log the information about advertisements that the user did not browse.
Continuing the example above: in the related art, the display click log records information for advertisements 1 to 10. In this embodiment, assuming advertisement 9 is also judged not to have been browsed, the display click log records information for advertisements 1 to 8 only and no longer contains the last two advertisements.
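The removal step S313 amounts to a filter over the display click log. The sketch below assumes each log row is a dictionary with an `ad_number` key; that row layout is hypothetical and not specified in the description.

```python
def filter_click_log(click_log, unviewed_numbers):
    """Return a new log without rows for ads the user did not browse."""
    unviewed = set(unviewed_numbers)
    return [row for row in click_log if row["ad_number"] not in unviewed]


# Ten loaded ads; the click on ad 2 is invented for illustration.
click_log = [{"ad_number": n, "clicked": n == 2} for n in range(1, 11)]
filtered = filter_click_log(click_log, [9, 10])
```

The original log is left untouched, so the unfiltered data remains available for comparison.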
For the CTR model, this means that records for advertisements displayed near the top of the page, such as advertisements 1 to 8, are taken unchanged from the original display click log, while records for advertisements displayed near the bottom, such as advertisements 9 and 10, are filtered. Handling the input data separately in this way avoids feeding incorrect data to the CTR model and helps preserve prediction accuracy.
In this embodiment, the mobile phone obtains the information about the displayed content, the server records it, and the data processing device uses it to remove from the display click log the content that the user did not browse. Removing part of the information reduces the amount of input data for the CTR model, and removing information about content the user never browsed avoids using inaccurate information as model input, which improves the model's prediction accuracy.
FIG. 4 is a schematic structural diagram of a data processing device according to another embodiment of the present invention. The device 40 includes a display module 41, an acquisition module 42, and a sending module 43.
The display module 41 is configured to display search results, the search results being obtained according to a search term entered by the user.
As in the related art, when a user enters a search term in a browser on a mobile terminal, search results corresponding to that term are obtained.
The search results may contain one or more items. When they contain many items, the limited screen size of the mobile terminal usually prevents all of them from being displayed at once, so the user has to keep swiping to see more. After each swipe, items not yet displayed are shown following those displayed previously.
In addition, swiping in the embodiments of the present invention is not limited to swiping on a touch screen; it also covers the conventional use of keys to change the displayed content.
The acquisition module 42 is configured to acquire information about the content displayed on the screen of the mobile terminal, the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user.
In this embodiment, JS (JavaScript) code may be embedded in the web page when the mobile terminal performs a search. The code monitors the user's swipe operations in order to obtain information about the displayed content after the initial display and after each swipe.
The sending module 43 is configured to send the information about the displayed content to the server so that the server records it. The information is used to identify the content in the search results that the user did not browse and to remove the information about that content from the display click log.
The JS code mentioned above may send the information about the content displayed on the mobile terminal screen to the server, for example after each swipe by the user. On receiving the information, the server may record it, for example in a swipe log. In a subsequent process such as CTR modeling, the information can be retrieved from the server and used to identify the content in the search results that the user did not browse, whose information is then removed from the display click log.
Further, the information about the displayed content may specifically be information about the last item displayed on the mobile terminal screen. For example, if three items are displayed from top to bottom on the screen, only the information about the third item need be sent to the server.
Further, the information about the last displayed item may specifically include the number of that item and its displayed height.
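How the number and displayed height of the last item might be derived from page geometry is sketched below. In the description this work is done by JS code embedded in the page; the sketch uses Python only to show the arithmetic. It assumes the items are stacked vertically without gaps and that the scroll offset and viewport height are known; the function and parameter names are hypothetical.

```python
def last_visible_item(item_heights, scroll_top, viewport_height):
    """Return (number, shown_height) of the bottom-most item on screen, or None.

    item_heights: overall heights of the items in top-to-bottom order.
    scroll_top: distance scrolled from the top of the list.
    viewport_height: height of the visible screen area.
    """
    viewport_bottom = scroll_top + viewport_height
    top = 0
    result = None
    for number, height in enumerate(item_heights, start=1):
        if top >= viewport_bottom:
            break  # this item and all later ones lie below the screen
        if top + height > scroll_top:  # item overlaps the visible area
            visible_top = max(top, scroll_top)
            visible_bottom = min(top + height, viewport_bottom)
            result = (number, visible_bottom - visible_top)
        top += height
    return result


initial = last_visible_item([100, 100, 100, 100], scroll_top=0, viewport_height=250)
after_swipe = last_visible_item([100, 100, 100, 100], scroll_top=150, viewport_height=250)
```

In the initial state the third item is only half visible, so the pair reported for it is its number together with a displayed height of 50.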
Because the search results are provided by the server, the server records information about each item in them. The server may number the items in their top-to-bottom display order: after obtaining multiple search results, it determines their top-to-bottom order according to a predetermined algorithm and then numbers them in ascending order from top to bottom. When the server receives the number of the last item displayed on the screen (for example, 3), it can therefore determine that the items with smaller numbers (for example, 1 and 2) have already been browsed by the user. The server also records the overall height of each search result. The last search result on the mobile terminal screen may be only partly displayed, showing only part of its overall height, so the server can decide from the displayed height and the overall height whether the last item counts as browsed.
The information about the displayed content sent by the mobile terminal may be recorded in the server for use in subsequent processes. For example, during CTR modeling the information can be used to identify content that the user did not browse, and the information about that content can then be removed from the display click log so that inaccurate data is not fed to the CTR model.
In this embodiment, the information about the content displayed on the mobile terminal screen is sent to the server and recorded there, so that a subsequent process can use it to identify content that the user did not browse and remove the information about that content from the display click log. Removing part of the information reduces the amount of input data for the CTR model, and removing information about content the user never browsed avoids using inaccurate information as model input, which improves the model's prediction accuracy.
FIG. 5 is a schematic structural diagram of a data processing device according to another embodiment of the present invention. The device 50 includes an acquisition module 51, a determination module 52, and a removal module 53.
The acquisition module 51 is configured to acquire the information, recorded in the server, about the content displayed on the screen of the mobile terminal. The mobile terminal obtains this information after displaying the search results and sends it to the server. The search results are obtained according to a search term entered by the user, and the displayed content includes the content displayed on the initial screen and the content displayed on the screen after each swipe by the user.
The information about the displayed content may be information about the last item displayed on the mobile terminal screen.
Further, the information about the last displayed item includes the number of that item and its displayed height.
The determination module 52 is configured to identify, according to the information about the displayed content, the content in the search results that the user did not browse.
For example, the items numbered before the last displayed item are determined to be content that the user has already browsed; and
whether the last displayed item is content browsed by the user is determined according to its displayed height.
Further, the last displayed item may be determined to be content browsed by the user when the ratio of its displayed height to its overall height is greater than a preset threshold.
Further, the preset threshold may specifically be 1/4.
The removal module 53 is configured to remove from the display click log the information about the content that the user did not browse.
In the related art, the display click log records information about every search result, whether or not the user browsed it. For example, after the user enters a search term, the server loads all search results corresponding to that term. If there are 8 search results, the display click log in the related art contains information about all 8; a result the user clicks is a positive example, and any other result is a negative example. However, some of these 8 results may never have appeared on the screen, so the user cannot have browsed them, and the data is therefore inaccurate.
In this embodiment, by contrast, information about content that the user did not browse is removed from the display click log, which avoids making predictions from inaccurate data.
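The effect on the training labels can be illustrated with the 8-result example. The data below is hypothetical: it assumes the user clicked result 2 and that results 7 and 8 never appeared on the screen. The row layout and the function name are also assumptions.

```python
def label_counts(click_log):
    """Return (positives, negatives): a click is positive, a non-click negative."""
    positives = sum(1 for row in click_log if row["clicked"])
    return positives, len(click_log) - positives


# Related art: all 8 loaded results are logged, browsed or not.
raw_log = [{"number": n, "clicked": n == 2} for n in range(1, 9)]

# This embodiment: rows for results that were never browsed are removed first.
never_shown = {7, 8}
cleaned_log = [row for row in raw_log if row["number"] not in never_shown]

raw_counts = label_counts(raw_log)
cleaned_counts = label_counts(cleaned_log)
```

Under these assumptions the raw log yields 7 negative examples, 2 of which come from results the user could not have seen, while the cleaned log yields 5.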
In this embodiment, the information about the content displayed on the mobile terminal screen is received from the mobile terminal, which obtains it after displaying the search results, the search results being obtained according to a search term entered by the user. The content in the search results that the user did not browse is identified from this information, and the information about that content is removed from the display click log. Removing part of the information reduces the amount of input data for the CTR model, and removing information about content the user never browsed avoids using inaccurate information as model input, which improves the model's prediction accuracy.
An embodiment of the present invention further provides a mobile terminal. The mobile terminal includes a housing, a processor, a memory, a circuit board, and a power supply circuit. The circuit board is placed inside the space enclosed by the housing, and the processor and the memory are arranged on the circuit board. The power supply circuit supplies power to each circuit or component of the mobile terminal. The memory stores executable program code. The processor reads the executable program code stored in the memory and runs the corresponding program in order to perform the following steps:
S11’: The mobile terminal displays search results, the search results being obtained according to a search term entered by the user.
As in the related art, when a user enters a search term in a browser on a mobile terminal, search results corresponding to that term are obtained.
The search results may contain one or more items. When they contain many items, the limited screen size of the mobile terminal usually prevents all of them from being displayed at once, so the user has to keep swiping to see more. After each swipe, items not yet displayed are shown following those displayed previously.
In addition, swiping in the embodiments of the present invention is not limited to swiping on a touch screen; it also covers the conventional use of keys to change the displayed content.
S12’: The mobile terminal acquires information about the content displayed on its screen, the displayed content including the content displayed on the initial screen and the content displayed on the screen after each swipe by the user.
In this embodiment, JS (JavaScript) code may be embedded in the web page when the mobile terminal performs a search. The code monitors the user's swipe operations in order to obtain information about the displayed content after the initial display and after each swipe.
S13’: The mobile terminal sends the information about the displayed content to the server so that the server records it. The information is used to identify the content in the search results that the user did not browse and to remove the information about that content from the display click log.
The JS code mentioned above may send the information about the content displayed on the mobile terminal screen to the server, for example after each swipe by the user. On receiving the information, the server may record it, for example in a swipe log. In a subsequent process such as CTR modeling, the information can be retrieved from the server and used to identify the content in the search results that the user did not browse, whose information is then removed from the display click log.
Further, the information about the displayed content may specifically be information about the last item displayed on the mobile terminal screen. For example, if three items are displayed from top to bottom on the screen, only the information about the third item need be sent to the server.
Further, the information about the last displayed item may specifically include the number of that item and its displayed height.
Because the search results are provided by the server, the server records information about each item in them. The server may number the items in their top-to-bottom display order: after obtaining multiple search results, it determines their top-to-bottom order according to a predetermined algorithm and then numbers them in ascending order from top to bottom. When the server receives the number of the last item displayed on the screen (for example, 3), it can therefore determine that the items with smaller numbers (for example, 1 and 2) have already been browsed by the user. The server also records the overall height of each search result. The last search result on the mobile terminal screen may be only partly displayed, showing only part of its overall height, so the server can decide from the displayed height and the overall height whether the last item counts as browsed.
The information about the displayed content sent by the mobile terminal may be recorded in the server for use in subsequent processes. For example, during CTR modeling the information can be used to identify content that the user did not browse, and the information about that content can then be removed from the display click log so that inaccurate data is not fed to the CTR model.
In this embodiment, the information about the content displayed on the mobile terminal screen is sent to the server and recorded there, so that a subsequent process can use it to identify content that the user did not browse and remove the information about that content from the display click log. Removing part of the information reduces the amount of input data for the CTR model, and removing information about content the user never browsed avoids using inaccurate information as model input, which improves the model's prediction accuracy.
An embodiment of the present invention further provides a data processing device. The device includes a housing, a processor, a memory, a circuit board, and a power supply circuit. The circuit board is placed inside the space enclosed by the housing, and the processor and the memory are arranged on the circuit board. The power supply circuit supplies power to each circuit or component of the device. The memory stores executable program code. The processor reads the executable program code stored in the memory and runs the corresponding program in order to perform the following steps:
S21’: Acquire the information, recorded in the server, about the content displayed on the screen of the mobile terminal. The mobile terminal obtains this information after displaying the search results and sends it to the server. The search results are obtained according to a search term entered by the user, and the displayed content includes the content displayed on the initial screen and the content displayed on the screen after each swipe by the user.
The information about the displayed content may be information about the last item displayed on the mobile terminal screen.
Further, the information about the last displayed item includes the number of that item and its displayed height.
S22’: Identify, according to the information about the displayed content, the content in the search results that the user did not browse.
For example, the items numbered before the last displayed item are determined to be content that the user has already browsed; and
whether the last displayed item is content browsed by the user is determined according to its displayed height.
Further, the last displayed item may be determined to be content browsed by the user when the ratio of its displayed height to its overall height is greater than a preset threshold.
Further, the preset threshold may specifically be 1/4.
S23’: Remove from the display click log the information about the content that the user did not browse.
In the related art, the display click log records information about every search result, whether or not the user browsed it. For example, after the user enters a search term, the server loads all search results corresponding to that term. If there are 8 search results, the display click log in the related art contains information about all 8; a result the user clicks is a positive example, and any other result is a negative example. However, some of these 8 results may never have appeared on the screen, so the user cannot have browsed them, and the data is therefore inaccurate.
In this embodiment, by contrast, information about content that the user did not browse is removed from the display click log, which avoids making predictions from inaccurate data.
This embodiment may be executed by a data processing device, which produces a processed display click log that can be used for purposes such as CTR modeling.
In this embodiment, content that the user did not browse is identified from the information about the content displayed on the mobile terminal screen, and the information about that content is removed from the display click log. Removing part of the information reduces the amount of input data for the CTR model, and removing information about content the user never browsed avoids using inaccurate information as model input, which improves the model's prediction accuracy.
需要说明的是,在本发明的描述中,术语“第一”、“第二”等仅用于描述目的,而不能理解为指示或暗示相对重要性。此外,在本发明的描述中,除非另有说明,“多个”的含义是两个或两个以上。It should be noted that, in the description of the present invention, terms such as "first" and "second" are used for description purposes only, and should not be understood as indicating or implying relative importance. In addition, in the description of the present invention, unless otherwise specified, "plurality" means two or more.
流程图中或在此以其他方式描述的任何过程或方法描述可以被理解为,表示包括一个或更多个用于实现特定逻辑功能或过程的步骤的可执行指令的代码的模块、片段或部分,并且本发明的优选实施方式的范围包括另外的实现,其中可以不按所示出或讨论的顺序,包括根据所涉及的功能按基本同时的方式或按相反的顺序,来执行功能,这应被本发明的实施例所属技术领域的技术人员所理解。Any process or method descriptions described in flowcharts or otherwise herein may be understood as representing a module, segment or portion of code comprising one or more executable instructions for implementing specific logical functions or steps of the process , and the scope of preferred embodiments of the invention includes alternative implementations in which functions may be performed out of the order shown or discussed, including in substantially simultaneous fashion or in reverse order depending on the functions involved, which shall It is understood by those skilled in the art to which the embodiments of the present invention pertain.
应当理解,本发明的各部分可以用硬件、软件、固件或它们的组合来实现。在上述实施方式中,多个步骤或方法可以用存储在存储器中且由合适的指令执行系统执行的软件或固件来实现。例如,如果用硬件来实现,和在另一实施方式中一样,可用本领域公知的下列技术中的任一项或他们的组合来实现:具有用于对数据信号实现逻辑功能的逻辑门电路的离散逻辑电路,具有合适的组合逻辑门电路的专用集成电路,可编程门阵列(PGA),现场可编程门阵列(FPGA)等。It should be understood that various parts of the present invention can be realized by hardware, software, firmware or their combination. In the embodiments described above, various steps or methods may be implemented by software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented by any one or combination of the following techniques known in the art: Discrete logic circuits, ASICs with suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), etc.
本技术领域的普通技术人员可以理解实现上述实施例方法携带的全部或部分步骤是可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,该程序在执行时,包括方法实施例的步骤之一或其组合。Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by instructing related hardware through a program, and the program can be stored in a computer-readable storage medium. During execution, one or a combination of the steps of the method embodiments is included.
此外,在本发明各个实施例中的各功能单元可以集成在一个处理模块中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。所述集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。In addition, each functional unit in each embodiment of the present invention may be integrated into one processing module, each unit may exist separately physically, or two or more units may be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. If the integrated modules are realized in the form of software function modules and sold or used as independent products, they can also be stored in a computer-readable storage medium.
上述提到的存储介质可以是只读存储器,磁盘或光盘等。The storage medium mentioned above may be a read-only memory, a magnetic disk or an optical disk, and the like.
In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and shall not be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410198312.8A (CN103995852B) | 2014-05-12 | 2014-05-12 | Data processing method and device |
| Publication Number | Publication Date |
|---|---|
| CN103995852A | 2014-08-20 |
| CN103995852B | 2018-01-09 |
| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN201410198312.8A (CN103995852B) | Expired - Fee Related | 2014-05-12 | 2014-05-12 | Data processing method and device |
| Country | Link |
|---|---|
| CN (1) | CN103995852B (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104462278B (en)* | 2014-11-26 | 2017-12-08 | 百度在线网络技术(北京)有限公司 | The control method and system that content of pages shows |
| CN105528408B (en)* | 2015-12-03 | 2019-03-12 | 百度在线网络技术(北京)有限公司 | Page display method and device |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101369276A (en)* | 2008-09-28 | 2009-02-18 | 杭州电子科技大学 | A Forensics Method of Web Browser Cache Data |
| CN101923545A (en)* | 2009-06-15 | 2010-12-22 | 北京百分通联传媒技术有限公司 | Method for recommending personalized information |
| CN103530292A (en)* | 2012-07-02 | 2014-01-22 | 阿里巴巴集团控股有限公司 | Webpage displaying method and device |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7693856B2 (en)* | 2004-06-25 | 2010-04-06 | Apple Inc. | Methods and systems for managing data |
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2018-01-09 |