CN109086439B


Info

Publication number: CN109086439B
Application number: CN201810929455.XA
Authority: CN
Inventors: 张君; 翟俊杰; 杨月奎
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2018-08-15
Filing date: 2018-08-15
Publication date: 2022-02-25
Anticipated expiration: 2038-08-15
Also published as: CN109086439A

Abstract

The invention relates to an information recommendation method and device. The information recommendation method comprises the following steps: recalling candidate information from an information base for providing an information recommendation service; predicting the click rate of the candidate information, and generating a pre-ordering information set from the candidate information with a high click rate; performing diversity reordering on the candidate information in the pre-ordering information set according to the context features of that candidate information, to obtain a reordered information set; and providing the information recommendation service according to the candidate information in the reordered information set. The information recommendation method and device provided by the invention address the problem of the high repetition rate of information recommendation in the prior art.

Description

Information recommendation method and device

Technical Field

The invention relates to the technical field of computers, in particular to an information recommendation method and device.

Background

With the development of internet technology, millions of pieces of information can be pushed to users via the internet. For example, when a user visits a forum, recent hot topics are recommended to the user, or when a user reads a piece of news, information close to the content of that news is recommended to the user.

The existing information recommendation method recommends based on click rate. That is, after candidate information is obtained, the probability that a user clicks the candidate information (the click rate) is calculated, the candidate information is ranked by click rate, and the candidate information with a high click rate is preferentially recommended to the user.

Because candidate information with a high click rate may be highly repetitive, of poor quality and the like, the problem of a high repetition rate in information recommendation still urgently needs to be solved.

Disclosure of Invention

In order to solve the above technical problems, an object of the present invention is to provide an information recommendation method and apparatus.

The technical scheme adopted by the invention is as follows:

In a first aspect, an information recommendation method includes: recalling candidate information from an information base for providing information recommendation service; predicting the click rate of the candidate information, and generating a pre-ordering information set according to the candidate information with high click rate; performing diversity reordering on the candidate information in the pre-ordering information set according to the context characteristics of the candidate information in the pre-ordering information set to obtain a reordered information set; and providing the information recommendation service according to the candidate information in the reordering information set.

In a second aspect, an information recommendation apparatus includes: the information recalling module is used for recalling candidate information from the information base for providing information recommendation service; the information pre-sorting module is used for predicting the click rate of the candidate information and generating a pre-sorting information set according to the candidate information with high click rate; the information reordering module is used for performing diversity reordering on the candidate information in the pre-ordering information set according to the context characteristics of the candidate information in the pre-ordering information set to obtain a reordered information set; and the information recommendation module is used for providing the information recommendation service according to the candidate information in the reordering information set.

In an exemplary embodiment, the information reordering module comprises: the information acquisition unit is used for acquiring recommended information displayed in the conversation page; the context feature extraction unit is used for extracting context features of the candidate information in the pre-sorting information set by combining the acquired recommended information; the first click rate prediction unit is used for inputting the context characteristics of the candidate information in the pre-sorting information set into a first click rate prediction model and predicting the click rate of the candidate information in the pre-sorting information set; and the reordering set generation unit is used for ordering the candidate information in the pre-ordering information set according to the click rate of the candidate information in the pre-ordering information set to generate the reordering information set, wherein the reordering information set comprises a plurality of slots for storing the candidate information, and the candidate information stored in each slot corresponds to one piece of recommended information displayed in the session page.

In an exemplary embodiment, the context feature extracting unit includes: the set generation subunit is used for classifying the acquired recommended information according to the clicking behavior of the user on the recommended information in the session page, and/or classifying candidate information stored in a plurality of slots in the reordering information set to obtain a plurality of sets; the basic distribution characteristic operation subunit is used for respectively calculating corresponding basic distribution characteristics aiming at the information in the plurality of sets and the candidate information in the pre-sorting information set; the diversity characteristic operation subunit is used for performing diversity characteristic operation according to the basic distribution characteristics obtained by calculation; and the context feature generation subunit is used for generating the context features of the candidate information in the pre-ordering information set according to the basic distribution features and the diversity features.

In an exemplary embodiment, the set generating subunit includes: a first classification subunit, configured to divide the acquired recommended information into a clicked information set and an unclicked information set according to the user's click behavior on the recommended information in the session page; and/or a second classification subunit, configured to classify the most recently clicked recommended information into a last click information set.

In an exemplary embodiment, the set generating subunit includes: a slot determining subunit, configured to determine the current slot in the reordering information set; a third classification subunit, configured to classify the candidate information stored in the preceding several slots in the reordering information set into a current display information set; and/or a fourth classification subunit, configured to classify the candidate information stored in the previous slot in the reordering information set into a previous presentation information set; wherein the preceding several slots are located before the current slot in the reordering information set.
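The set generation described in the two embodiments above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `SessionSets` and `build_sets` are hypothetical, information is represented only by string identifiers, and "the previous slot" is assumed to mean the slot immediately before the current one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SessionSets:
    """Hypothetical container for the sets named in the embodiments."""
    clicked: List[str] = field(default_factory=list)
    unclicked: List[str] = field(default_factory=list)
    last_clicked: List[str] = field(default_factory=list)
    current_display: List[str] = field(default_factory=list)
    previous_presentation: List[str] = field(default_factory=list)


def build_sets(shown: List[str],
               clicked_in_order: List[str],
               slots: List[Optional[str]],
               current_slot: int) -> SessionSets:
    """Classify session-page information and already filled slots.

    shown            -- ids of recommended information displayed in the session page
    clicked_in_order -- ids the user clicked, oldest first
    slots            -- slots of the reordering information set (None = empty)
    current_slot     -- index of the slot currently being filled
    """
    clicked_ids = set(clicked_in_order)
    sets = SessionSets()
    # First classification: clicked versus unclicked recommended information.
    for info_id in shown:
        target = sets.clicked if info_id in clicked_ids else sets.unclicked
        target.append(info_id)
    # Second classification: the most recently clicked information.
    if clicked_in_order:
        sets.last_clicked = [clicked_in_order[-1]]
    # Third classification: everything stored in slots before the current one.
    filled = [s for s in slots[:current_slot] if s is not None]
    sets.current_display = filled
    # Fourth classification (assumption): only the immediately preceding slot.
    sets.previous_presentation = filled[-1:]
    return sets
```

Keeping the five sets separate lets each one yield its own distribution features later, which is what the basic distribution characteristic operation subunit appears to require.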

In an exemplary embodiment, the base distribution features include: a distribution feature of the first-level channel to which information belongs, a distribution feature of the second-level channel to which information belongs, an information topic distribution feature, an information tag distribution feature, and a recall reason distribution feature.

In an exemplary embodiment, the diversity features include: a difference degree feature, a combined feature of difference degree and user click rate, a similarity feature, a distribution entropy feature, and a cross entropy feature.
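The passage above names the features without giving formulas, so the following sketch substitutes standard definitions as assumptions: a normalized frequency distribution for the base distribution feature, total variation distance for the difference degree, cosine similarity for the similarity feature, and Shannon entropy and cross entropy for the last two. All function names are hypothetical, and the combined difference and click rate feature is omitted because its form is not stated here.

```python
import math
from collections import Counter
from typing import Dict, Iterable

Distribution = Dict[str, float]


def distribution(values: Iterable[str]) -> Distribution:
    """Base distribution feature: share of each value (e.g. topic) in a set."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()} if total else {}


def difference(p: Distribution, q: Distribution) -> float:
    """Assumed difference degree: total variation distance in [0, 1]."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def similarity(p: Distribution, q: Distribution) -> float:
    """Assumed similarity feature: cosine similarity of the two distributions."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def entropy(p: Distribution) -> float:
    """Distribution entropy (Shannon entropy, natural logarithm)."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)


def cross_entropy(p: Distribution, q: Distribution, eps: float = 1e-9) -> float:
    """Cross entropy of p relative to q; eps guards against log(0)."""
    return -sum(v * math.log(q.get(k, 0.0) + eps) for k, v in p.items())
```

For instance, comparing the topic distribution of a single candidate against the topic distribution of the clicked information set gives one difference value, one similarity value and one cross entropy value, which could then be concatenated with the base distributions to form a context feature vector.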

In an exemplary embodiment, the reordering set generating unit includes: a sorting subunit, configured to sort the candidate information in the pre-sorting information set according to the click rate of that candidate information; a storage subunit, configured to traverse the plurality of slots in the reordering information set and store the candidate information with the highest click rate in the traversed slot; and a deleting subunit, configured to delete the candidate information with the highest click rate from the pre-sorting information set and to notify the first click rate prediction unit.

In an exemplary embodiment, the information recall module includes: the request receiving unit is used for receiving an information recommendation request initiated by a client; and the request response unit is used for responding the information recommendation request and recalling the candidate information from the information base according to a recall reason.

In an exemplary embodiment, the information pre-ordering module includes: an information feature extraction unit, configured to extract an information feature from the candidate information; the second click rate prediction unit is used for inputting the information characteristics of the candidate information into a second click rate prediction model and predicting to obtain the click rate of the candidate information; and the pre-sorting set generating unit is used for sorting the candidate information according to the click rate of the candidate information to generate the pre-sorting information set.

In an exemplary embodiment, the click rate prediction model comprises a first click rate prediction model or a second click rate prediction model, the information sample comprises recommended information, and the input feature comprises an information feature or a context feature of the recommended information; the apparatus further includes a model training module, the model training module including: the sample acquisition unit is used for acquiring the information sample carrying a behavior tag, and the behavior tag is used for indicating the clicking behavior of a user for the information sample; an input feature extraction unit, configured to extract the input feature for the information sample; the model training unit is used for guiding a specified model to carry out model training according to the input characteristics and the behavior labels of the information samples; and the model definition unit is used for taking the specified model after model training as the click rate prediction model.
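The training flow of the model training module (acquire labelled samples, extract input features, train a specified model, use the trained model as the click rate prediction model) can be sketched as below. The text does not say which model is "specified", so logistic regression trained by stochastic gradient descent is used purely as an assumed stand-in; `train_ctr_model` and `predict_ctr` are hypothetical names, and the input features are assumed to be already extracted as numeric vectors.

```python
import math
import random
from typing import List, Sequence, Tuple

# (input features, behavior tag): tag 1 = clicked by the user, 0 = not clicked.
Sample = Tuple[Sequence[float], int]


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_ctr(weights: Sequence[float], features: Sequence[float]) -> float:
    """Predicted click rate; the last weight is the bias term."""
    z = sum(w * x for w, x in zip(weights, features)) + weights[-1]
    return _sigmoid(z)


def train_ctr_model(samples: Sequence[Sample],
                    epochs: int = 200,
                    learning_rate: float = 0.1,
                    seed: int = 0) -> List[float]:
    """Fit logistic regression weights to the behavior tags with plain SGD."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    weights = [0.0] * (dim + 1)  # feature weights followed by a bias
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for features, tag in data:
            # Gradient of the log loss with respect to the linear score.
            error = predict_ctr(weights, features) - tag
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
            weights[-1] -= learning_rate * error
    return weights
```

The same routine would serve both models under this assumption: feeding it information features yields a stand-in for the second click rate prediction model, and feeding it context features yields a stand-in for the first.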

In a third aspect, an information recommendation apparatus includes a processor and a memory, where the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, implement the information recommendation method as described above.

In a fourth aspect, a computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the information recommendation method as described above.

In the technical scheme, the candidate information is ranked twice for providing the information recommendation service, so that the problem of high repetition rate in information recommendation is solved.

Specifically, candidate information is recalled from an information base, click rate prediction is carried out on the candidate information, a pre-ordering information set is generated according to the candidate information with high click rate, then diversity reordering is carried out on the candidate information in the pre-ordering information set according to the context characteristics of the candidate information in the pre-ordering information set, a reordered information set is obtained, and finally information recommendation service is provided by the candidate information in the reordered information set.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

Fig. 1 is a schematic diagram of a specific implementation of an information recommendation method according to the prior art.

FIG. 2 is a schematic illustration of an implementation environment in accordance with the present invention.

Fig. 3 is a block diagram illustrating a hardware architecture of a server according to an example embodiment.

Fig. 4 is a flow chart illustrating an information recommendation method according to an example embodiment.

FIG. 5 is a flow chart of one embodiment of step 310 in the corresponding embodiment of FIG. 4.

FIG. 6 is a flow chart of one embodiment of step 330 of the corresponding embodiment of FIG. 4.

FIG. 7 is a flow chart of one embodiment of step 350 of the corresponding embodiment of FIG. 4.

FIG. 8 is a flow chart illustrating another method of information recommendation, according to an example embodiment.

FIG. 9 is a flowchart of one embodiment of step 353 of the corresponding embodiment of FIG. 7.

Fig. 10 is a diagram illustrating sets, pre-ordering information sets, and reordering information sets according to the corresponding embodiment of fig. 9.

FIG. 11 is a flowchart of one embodiment of step 357 in the corresponding embodiment of FIG. 7.

Fig. 12 is a schematic diagram of an architecture of an information recommendation service in an application scenario.

Fig. 13 is a schematic diagram of a specific implementation of multiple ranking of candidate information in the application scenario corresponding to fig. 12.

Fig. 14 is a schematic diagram of the generation of a recommended news list in the corresponding application scenario of fig. 12.

Fig. 15 is a block diagram illustrating an information recommendation apparatus according to an example embodiment.

Fig. 16 is a block diagram illustrating a hardware configuration of an information recommendation apparatus according to an exemplary embodiment.

While specific embodiments of the invention have been shown by way of example in the drawings and will be described in detail hereinafter, such drawings and description are not intended to limit the scope of the inventive concepts in any way, but rather to explain the inventive concepts to those skilled in the art by reference to the particular embodiments.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

As described above, in the conventional information recommendation method, candidate information is recommended according to the probability that a user clicks the candidate information (i.e., the click rate), and situations such as high repeatability and poor quality of the candidate information with a high click rate are likely to occur.

In order to solve the above-mentioned drawbacks, as shown in fig. 1, some diversity control strategies are formulated for candidate information with a high click rate, for example, the topics of adjacent candidate information recommended to the user are different, and in combination with these artificially formulated strategies, candidate information meeting these strategies is further screened from candidate information with a high click rate and is recommended to the user.

However, because formulating the diversity control strategy in this recommendation process depends mainly on manual work, it both increases the difficulty of recommendation and hinders recommendation efficiency.

Therefore, the present invention specifically provides an information recommendation method, which avoids manually making a diversity control strategy, reduces the difficulty of recommendation, is beneficial to improving recommendation efficiency, and can effectively avoid the repeatability of information recommendation.

Fig. 2 is a schematic diagram of an implementation environment related to an information recommendation method. The implementation environment includes a terminal 100 and a server 200.

The terminal 100 may be a desktop computer, a notebook computer, a tablet computer, a smart phone, or another electronic device on which a client (e.g., an information recommendation client) can run, and is not limited herein.

The terminal 100 and the server 200 establish a network connection in advance through a wireless or wired connection, so that data transmission between the terminal 100 and the server 200 is realized through the network connection. For example, the transmitted data includes recommended candidate information.

The server 200 may be a single server, a server cluster including a plurality of servers, or a cloud computing center including a plurality of servers. The server is an electronic device for providing background services to users; for example, the background services include multimedia recommendation services.

Through the interaction between the terminal 100 and the server 200, the client running in the terminal 100 sends an information recommendation request to the server 200, and the server 200 provides a multimedia recommendation service and pushes recommended candidate information to the client running in the terminal 100, so as to show the recommended candidate information to the user.

Fig. 3 is a block diagram illustrating a hardware architecture of a server according to an example embodiment. This server is suitable for the server in the implementation environment shown in fig. 2.

It should be noted that the server is only an example adapted to the present invention, and should not be considered as limiting the scope of the present invention in any way. Nor should the server be interpreted as needing to rely on, or as necessarily including, one or more components of the exemplary server 200 shown in fig. 3.

The hardware structure of the server may vary greatly with configuration or performance. As shown in fig. 3, the server 200 includes: a power supply 210, an interface 230, at least one memory 250, and at least one Central Processing Unit (CPU) 270.

The power supply 210 is used for providing an operating voltage for each hardware device on the server 200.

The interface 230 includes at least one wired or wireless network interface 231, at least one serial-to-parallel conversion interface 233, at least one input/output interface 235, and at least one USB interface 237, etc. for communicating with external devices.

The memory 250 serves as a carrier for resource storage, and may be a read-only memory, a random access memory, a magnetic disk, an optical disk, or the like. The resources stored thereon include an operating system 251, an application 253, data 255, etc., and the storage may be transient or permanent. The operating system 251 manages and controls each hardware device and the application 253 on the server 200, so that the central processing unit 270 can compute and process the mass data 255; it may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like. The application 253 is a computer program that performs at least one specific task on top of the operating system 251, and may include at least one module (not shown in fig. 3), each of which may include a series of computer-readable instructions for the server 200. The data 255 may be documents, audio, video, pictures, etc. stored on a disk.

The central processing unit 270 may include one or more processors and is arranged to communicate with the memory 250 via a bus, for computing and processing the mass data 255 in the memory 250.

As described in detail above, the server 200 to which the present invention is applied completes the information recommendation method by the central processing unit 270 reading a series of computer-readable instructions stored in the memory 250.

Furthermore, the present invention can be implemented by hardware circuits or by a combination of hardware circuits and software, and thus, the implementation of the present invention is not limited to any specific hardware circuits, software, or a combination of both.

Referring to fig. 4, in an exemplary embodiment, an information recommendation method is applied to a server in the implementation environment shown in fig. 2, and the structure of the server may be as shown in fig. 3.

The information recommendation method can be executed by a server side and can comprise the following steps:

Step 310, recalling candidate information from the information base for providing the information recommendation service.

The information recommendation service refers to that a server side recommends candidate information to a client side so that the client side can display the recommended candidate information for a user. The candidate information may be text information, video information, audio information, picture information, etc., and the present embodiment does not specifically limit the type of the candidate information.

Accordingly, since different types of candidate information may correspond to different application scenarios, for example, text information may correspond to a news reading scenario, video information may correspond to a user-requested movie program scenario, audio information may correspond to a user-requested song scenario, and picture information may correspond to a user-browsed picture scenario, the information recommendation method provided by the present embodiment may be applicable to different application scenarios according to different types of candidate information.

As shown in fig. 5, in a specific implementation of an embodiment, step 310 may include the following steps:

Step 311, receiving an information recommendation request initiated by a client.

Step 313, recalling the candidate information from the information base according to the recall reason in response to the information recommendation request.

For the client, the client provides a request initiation entry for the user, and if the user desires to perform information recommendation, a relevant operation can be triggered in the request initiation entry, so that the client detects the operation and generates an information recommendation request.

The request initiation entry differs according to the input component (such as a mouse, a keyboard, or a touch screen) configured for the terminal, and the related operation triggered in the request initiation entry differs accordingly. For example, the related operations include, but are not limited to, clicking, moving, dragging, sliding, and the like.

For example, if the terminal is a smart phone, the request initiation entry may be a session page presented on the touch screen of the smart phone. The session page shows several pieces of recommended candidate information, and the user may pull down the session page to make the client initiate an information recommendation request, so that the candidate information returned by the server is updated and shown in the session page. Here, the pull-down operation is the related operation triggered in the request initiation entry.

For the server, after the client initiates the information recommendation request, the information recommendation request can be received, so as to provide the information recommendation service for the user.

Specifically, the candidate information is recalled from the information base according to the recall reason, and is then ranked multiple times before being recommended.

Recall reasons include, but are not limited to: newly released, popular, and of interest to the user. That is, among the mass information stored in the information base, the recalled candidate information is newly released information, popular information, or information in which the user is interested.

It should be noted that mass information stored in the information base is actively uploaded by the information publisher. Namely, the information issuing party uploads the information to be issued to the information base of the server for storage, so that the server can recommend the information to the user.
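Recall by recall reason, as described above, can be sketched as a filter over the information base that records why each item was recalled. The `Info` fields, the function name `recall`, and the thresholds `fresh_hours` and `hot_clicks` are all hypothetical; the text gives no concrete criteria for "newly released" or "popular".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Info:
    """Hypothetical record for one piece of information in the information base."""
    info_id: str
    topic: str
    age_hours: float   # time since the publisher uploaded it
    clicks: int        # total clicks observed so far


def recall(info_base: Iterable[Info],
           user_interests: Set[str],
           fresh_hours: float = 24.0,
           hot_clicks: int = 1000) -> Dict[str, List[str]]:
    """Return {info_id: recall reasons} for every item with at least one reason."""
    recalled: Dict[str, List[str]] = {}
    for info in info_base:
        reasons: List[str] = []
        if info.age_hours <= fresh_hours:
            reasons.append("newly_released")
        if info.clicks >= hot_clicks:
            reasons.append("popular")
        if info.topic in user_interests:
            reasons.append("user_interest")
        if reasons:  # items with no recall reason are not candidates
            recalled[info.info_id] = reasons
    return recalled
```

Recording the reasons alongside each candidate is a design choice made here so that a recall reason distribution can be computed later during reordering.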

Step 330, predicting the click rate of the candidate information, and generating a pre-sorting information set according to the candidate information with high click rate.

It is to be understood that more than one piece of candidate information is recalled by recall reason, and that not all of the candidate information is necessarily recommended to the user. In this embodiment, the recalled candidate information is preliminarily screened based on the probability that it is clicked by the user, that is, the click rate, and a pre-sorting information set is generated.

Specifically, as shown in fig. 6, the generation process of the pre-ordered information set may include the following steps:

Step 331, extracting information features from the candidate information.

Step 333, inputting the information features of the candidate information into a second click rate prediction model, and predicting the click rate of the candidate information.

Step 335, sorting the candidate information according to the click rate of the candidate information, and generating a pre-sorting information set.

The information features accurately describe the candidate information so as to uniquely identify it. The information features include, but are not limited to, user features, information content features, and features of the degree of match between user interests and the information. Further, the user features describe the age, sex, occupation, interests and the like of the user; the information content features describe the information topic, the information keywords and the like; and the matching degree features indicate whether the information matches the user's interests, where a higher matching degree means that the information better matches the user's interests.

For example, when the information recommendation service is provided for the user a, firstly, the user characteristics of the user a are extracted, then, the corresponding information content characteristics are extracted for the recalled candidate information, and the matching degree characteristics of the recalled candidate information and the interests of the user a are calculated, so that the information characteristics of the candidate information for the user a can be obtained.

Therefore, if the users to which the information recommendation service is directed are different, even if the recalled candidate information is the same, the corresponding information characteristics are different, so that the differentiated information recommendation service is provided for different users, and the recommendation experience of the users is favorably improved.
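Steps 331 to 335 can be sketched as follows. The feature layout, the dictionary keys (`age`, `interests`, `topic`, `freshness`, `id`) and the function names are hypothetical, and `predict_ctr` is a placeholder for the second click rate prediction model, whose actual form is not given in this text.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def information_features(user: Dict, info: Dict) -> List[float]:
    """Step 331: combine user, content and matching degree features."""
    user_feature = user["age"] / 100.0           # user feature (assumed scaling)
    content_feature = info["freshness"]          # information content feature
    match = 1.0 if info["topic"] in user["interests"] else 0.0
    return [user_feature, content_feature, match]


def pre_rank(user: Dict,
             candidates: Sequence[Dict],
             predict_ctr: Callable[[List[float]], float],
             keep: int) -> List[Tuple[str, float]]:
    """Steps 333 and 335: score each candidate, sort, keep the top `keep`."""
    scored = [
        (info["id"], predict_ctr(information_features(user, info)))
        for info in candidates
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:keep]
```

Because the user features enter every feature vector, the same recalled candidates can produce different pre-sorting information sets for different users, which matches the differentiated service described above.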

Step 350, performing diversity reordering on the candidate information in the pre-ordering information set according to the context features of the candidate information in the pre-ordering information set, to obtain a reordered information set.

It can be understood that once candidate information has been recommended to the user, it is regarded as recommended information. To avoid repetition in information recommendation, the influence of the recommended information on the candidate information needs to be considered; for example, the topic of the candidate information should differ from the topic of the recommended information.

The context features describe the correlation between the candidate information and the recommended information, as well as the correlation among the pieces of candidate information themselves.

Thus, the diversity reordering considers not only the diversity of the candidate information relative to the recommended information, but also the diversity among the pieces of candidate information.

Step 370, providing information recommendation service according to the candidate information in the reordering information set.

For the client, after the candidate information in the reordering information set is pushed by the server, the candidate information in the reordering information set can be received and then recommended to the user.

For example, in a session page presented by the client, several candidate information in the reordering information set are presented to the user in the form of a session list.

Through the above process, diversified candidate information is provided to the user, diversity of information recommendation is achieved, and the recommendation of highly repetitive, poor-quality candidate information is effectively avoided.

Referring to FIG. 7, in an exemplary embodiment, step 350 may include the steps of:

Step 351, obtaining the recommended information displayed in the session page.

The session page is used for displaying the candidate information in the reordering information set pushed by the server.

For the client, as the client runs on the terminal, the session page correspondingly presents the candidate information pushed by the server in a screen configured by the terminal, so that the candidate information is recommended to the user.

For the server, with the presentation of the session page, the candidate information in the reordering information set can be regarded as the recommended information presented in the session page.

Step 353, performing context feature extraction on the candidate information in the pre-sorting information set in combination with the acquired recommended information.

Step 355, inputting the context characteristics of the candidate information in the pre-ranking information set into the first click rate prediction model, and predicting to obtain the click rate of the candidate information in the pre-ranking information set.

Step 357, sorting the candidate information in the pre-sorting information set according to the click rate of the candidate information in the pre-sorting information set, and generating a re-sorting information set.

The reordering information set comprises a plurality of slots for storing candidate information, and the candidate information stored in each slot corresponds to recommended information displayed in the session page.

It can be understood that, in the information recommendation process, the client side will continuously initiate an information recommendation request, and therefore, candidate information stored in a plurality of slots in the reordering information set will be continuously updated, so that recommended information displayed in the session page correspondingly changes continuously.

With this embodiment, the diversity of the candidate information in the re-ranking information set is realized, which provides a necessary basis for the diversity of information recommendation.

As described above, the first click rate prediction model and the second click rate prediction model are both used for click rate prediction, and differ only in their input objects and output objects.

Here, in order to better describe the commonality of the first click rate prediction model and the second click rate prediction model in the model training process, the following definition is made for the above differences.

The click rate prediction model comprises a first click rate prediction model or a second click rate prediction model.

The information sample includes recommended information.

Accordingly, the input features include information features or context features of the recommended information.

Accordingly, as shown in FIG. 8, in an exemplary embodiment, the method as described above may further include a model training process of the click-through rate prediction model, and the model training process of the click-through rate prediction model may include the following steps:

Step 410, obtaining an information sample carrying a behavior tag.

Wherein the behavior tag is used for indicating the clicking behavior of the user for the information sample.

That is, for recommended information presented in the session page, if the user clicks on the recommended information, the behavior tag indicates that the recommended information has been clicked by the user. Conversely, if the user does not click on the recommended information, the behavior tag indicates that the recommended information is not clicked on by the user.

Step 430, extracting input features of the information sample.

Step 450, training a specified model according to the input features and the behavior labels of the information samples.

Wherein, the specified model includes but is not limited to: machine learning models such as logistic regression, support vector machines, random forests, neural networks, and the like.

Model training means optimizing the model parameters of the specified model according to the input features and the behavior labels of the information samples, so as to learn the optimal model parameters with which the specified model converges.

Specifically, model parameters of a specified model are initialized randomly, input features and behavior labels of a current information sample are input into the specified model, and if the randomly initialized model parameters do not enable the specified model to be converged, the randomly initialized model parameters are updated, and the input features and the behavior labels of a later information sample are input into the specified model.

Iteration continues in this manner until the number of iterations reaches an iteration threshold or the updated model parameters cause the specified model to converge, at which point the model training of the specified model is complete.

The iteration threshold value can be flexibly adjusted according to the actual needs of the application scenario. For example, in an application scenario with a high requirement on prediction accuracy, a larger iteration threshold is set, or in an application scenario with a high requirement on prediction speed, a smaller iteration threshold is set.

Step 470, taking the specified model after model training as the click rate prediction model.

After the model training is finished, the specified model is converged into a click rate prediction model, and the optimal model parameter is used as an input parameter of the click rate prediction model, so that the click rate of the candidate information is obtained through prediction based on the click rate prediction model.
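By way of non-limiting illustration, the training process of steps 410 to 470 may be sketched as follows, assuming logistic regression as the specified model and per-sample gradient updates. The function names, the learning rate `lr`, the tolerance `tol`, and the convergence test (change in mean log loss after each pass over the samples) are assumptions made for this sketch and are not prescribed by the embodiment.

```python
import math
import random


def predict_click_prob(weights, bias, features):
    """Probability of behavior tag 1 (clicked) under a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(weights, bias, samples, labels):
    """Mean log loss over all samples; used here as the convergence signal."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(samples, labels):
        p = min(max(predict_click_prob(weights, bias, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(samples)


def train_click_model(samples, labels, max_iters=2000, lr=0.5, tol=1e-6, seed=0):
    """Train until the iteration threshold is reached or the loss stops changing."""
    rng = random.Random(seed)
    # Random initialization of the model parameters.
    weights = [rng.uniform(-0.01, 0.01) for _ in samples[0]]
    bias = 0.0
    prev_loss = mean_log_loss(weights, bias, samples, labels)
    for it in range(max_iters):  # max_iters plays the role of the iteration threshold
        i = it % len(samples)
        x, y = samples[i], labels[i]
        # Update the parameters from the current sample, then move to the next one.
        grad = predict_click_prob(weights, bias, x) - y
        weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
        bias -= lr * grad
        # Assumed convergence test, evaluated once per full pass over the samples.
        if i == len(samples) - 1:
            loss = mean_log_loss(weights, bias, samples, labels)
            if abs(prev_loss - loss) < tol:
                break
            prev_loss = loss
    return weights, bias
```

A larger `max_iters` corresponds to the accuracy-oriented setting described above, and a smaller one to the speed-oriented setting.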

It is noted that click rate prediction essentially invokes the click rate prediction model to predict, from the information features or context features of the candidate information, the probability that the candidate information belongs to each behavior tag.

For example, assume that the behavior tag is 0, indicating that the user will not click on the candidate information, and the behavior tag is 1, indicating that the user will click on the candidate information.

Further, let P0 be the probability that the candidate information belongs to behavior tag 0, and P1 be the probability that the candidate information belongs to behavior tag 1. If P0 > P1, the user is predicted not to click on the candidate information, and the click rate is 0. Conversely, if P0 < P1, the user is predicted to click on the candidate information, and the click rate of the candidate information is P1.
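By way of non-limiting illustration, this decision rule may be sketched as follows. The function name `click_rate_from_probs` is hypothetical, and the case P0 = P1, which the description leaves open, is treated here as "not clicked" purely as an assumption.

```python
def click_rate_from_probs(p0: float, p1: float) -> float:
    """Map the tag probabilities P0 (no click) and P1 (click) to a click rate."""
    # Predicted click: the click rate of the candidate information is P1.
    if p1 > p0:
        return p1
    # Predicted no click (ties included, by assumption): the click rate is 0.
    return 0.0
```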

Through the above embodiments, model training based on the specified model is realized. That is, through machine learning on the information samples, the click rate prediction model can accurately predict the user's click behavior on the candidate information, which effectively improves the accuracy of the click rate prediction model.

In addition, because the approach is based on machine learning, manually formulating a diversity control strategy is avoided, the difficulty of recommendation is reduced, and the server can provide a more accurate and personalized diversity recommendation service tailored to each individual user.

Referring to fig. 9, in an exemplary embodiment, step 353 may include the following steps:

Step 3531, classifying the acquired recommended information according to the clicking behavior of the user on the recommended information in the session page, and/or classifying the candidate information stored in a plurality of slots in the reordering information set, to obtain a plurality of sets.

As shown in fig. 10, a session page 501 shows a fixed number of pieces of recommended information. For example, the fixed number is 5.

The pre-ordered information set 502 includes a plurality of candidate information.

The reordering information set 503 comprises a fixed number of slots for storing candidate information. Each slot corresponds to a piece of recommended information in the session page 501. For example, the fixed number is 5.

The several sets 504 include a clicked information set 5041, an unclicked information set 5042, a current presentation information set 5043, a last clicked information set 5044, and a previous presentation information set 5045.

Taking news as an example of the candidate information, for the recommended information shown in the session page 501, if the recommended information is clicked by the user, it is divided into the clicked information set 5041. If the recommended information is not clicked by the user, it is divided into the unclicked information set 5042. The recommended information that was clicked last is divided into the last clicked information set 5044.

Through the three sets, the click distribution of the recommended information is reflected, so that the interest preference of the user is indicated.

In order to enable the session page 501 to display a fixed number of pieces of recommended information, a fixed number of slots in the reordering information set 503 need to store candidate information, and the candidate information stored in each slot is derived from the candidate information in the pre-sorting information set 502.

Assuming that the current slot 5031 does not yet store candidate information and that the 3 slots 5032, 5033 and 5034 before the current slot 5031 already store candidate information, the candidate information stored in these 3 slots 5032, 5033 and 5034 is divided into the current display information set 5043. The candidate information stored in the 1 slot 5032 immediately before the current slot is divided into the previous presentation information set 5045.

Through the two sets, the relevance between the candidate information recommended to the same user is reflected, and the candidate information with high repeatability is avoided from appearing in information recommendation.
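By way of non-limiting illustration, the division into the five sets may be sketched as follows. The function name, the dictionary keys, and the representation of each piece of information as a dictionary with an `id` field are assumptions made for this sketch.

```python
def build_context_sets(shown, clicked_ids, last_clicked_id, filled_slots):
    """Divide session-page information and filled slots into the five sets.

    shown: recommended information displayed in the session page.
    clicked_ids: ids of the recommended information the user clicked.
    last_clicked_id: id of the recommended information clicked last, or None.
    filled_slots: candidate information already stored in the slots before
        the current slot, in slot order.
    """
    clicked_ids = set(clicked_ids)
    return {
        "clicked": [i for i in shown if i["id"] in clicked_ids],
        "unclicked": [i for i in shown if i["id"] not in clicked_ids],
        "last_clicked": [i for i in shown if i["id"] == last_clicked_id],
        # All slots filled so far form the current display set.
        "current_display": list(filled_slots),
        # Only the slot immediately before the current slot forms the previous display set.
        "previous_display": list(filled_slots[-1:]),
    }
```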

Step 3533, corresponding basic distribution features are calculated for the information in the sets and the candidate information in the pre-ordered information set, respectively.

As mentioned above, the sets include a clicked information set, an unclicked information set, a current presentation information set, a last clicked information set, a previous presentation information set, and the like. Accordingly, the information in the sets may be recommended information or candidate information stored in a certain slot of the reordering information set.

The basic distribution features include: a distribution feature of the first-level channel to which the information belongs, a distribution feature of the second-level channel to which the information belongs, an information topic distribution feature, an information label distribution feature, and a recall reason distribution feature.

For example, primary channels include "politics," "military," "science," "sports," "entertainment," "education," "travel," "food," "health," and so forth.

The second level channel is a subdivision of the first level channel, for example, the first level channel "sports" may be subdivided into "football", "basketball", "swimming", "diving" and other second level channels.

The information topic is equivalent to the classification of information, for example, for massive information, similar massive information is divided into 500 classifications by clustering, and the 500 classifications can be regarded as the information topic of the massive information.

The information tag may be a keyword of the information, a publisher of the information, or an emotion expressed by the information. For example, if the information is a song and the emotion expressed by the song is sadness, then sadness can be regarded as an information tag of the song; alternatively, the singer, the lyricist, and the composer can be regarded as information tags of the song.

The recall reason, as previously described, may be trending, just released, of interest to the user, and so forth.

Of course, according to the actual needs of the application scenario, the basic distribution characteristics may further include information word distribution characteristics, information content distribution characteristics, and the like, which is not limited in this embodiment.

Specifically, for several sets and pre-ordering information sets, the following calculations are performed, respectively.

(1) Distribution feature of the first-level channel to which the information belongs:

where n_c is the total number of first-level channels.

(2) Distribution feature of the second-level channel to which the information belongs:

where n_s is the total number of second-level channels.

(3) Information topic distribution feature:

where n_t is the total number of information topics.

(4) Information label distribution feature:

where n_g is the total number of information labels.

(5) Recall reason distribution feature:

where n_r is the total number of recall reasons.
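The formulas of the basic distribution features are not reproduced in this text. By way of non-limiting illustration only, one plausible reading is that each basic distribution feature is a vector whose i-th component is the share of the information in a set that falls into the i-th category of the corresponding dimension. The following sketch rests entirely on that assumption; the function name `distribution` and the dictionary representation of information are hypothetical.

```python
from collections import Counter


def distribution(items, field, categories):
    """Share of items per category, in the fixed order given by `categories`.

    The vector length equals the total number of categories (for example
    n_c for first-level channels). A field may hold one value or several
    (for example multiple information labels).
    """
    counts = Counter()
    for item in items:
        value = item.get(field)
        values = value if isinstance(value, (list, tuple, set)) else [value]
        counts.update(v for v in values if v in categories)
    total = sum(counts.values())
    # An empty set yields an all-zero vector rather than a division by zero.
    return [counts[c] / total if total else 0.0 for c in categories]
```

The same function would be applied to each of the several sets and to the pre-ordering information set, once per dimension.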

Step 3535, performing diversity feature calculation according to the calculated basic distribution features.

The diversity features include: a difference degree feature, a combined difference degree and user click rate feature, a similarity feature, a distribution entropy feature, and a cross entropy feature.

Specifically, the following calculation is performed for the basic distribution characteristics corresponding to the sets and the pre-sorting information set, respectively.

For convenience of description, among the sets, the clicked information set is referred to as set 1, the unclicked information set as set 2, the current display information set as set 3, the last clicked information set as set 4, and the previous display information set as set 5.

(1) The difference degree characteristic:

The distribution differences between the pre-ordering information set and each of sets 1-5 are calculated for the first-level channel distribution feature, the second-level channel distribution feature, the information topic distribution feature, the information label distribution feature, and the recall reason distribution feature, respectively, to obtain difference degree features denoted Diff_C(j), Diff_S(j), Diff_T(j), Diff_G(j), Diff_R(j), 1≤j≤5.

Taking the distribution difference between the first-level channel distribution characteristics of the information corresponding to the set 1 and the pre-sorting information set as an example, the first-level channel distribution characteristics of the information of the pre-sorting information set are assumed to be

The first-level channel distribution characteristic of the information of the set 1 is

The distribution difference between the pre-ranking information set and the first-level channel distribution characteristics to which the information of the set 1 belongs, that is, the difference characteristic, is:

(2) the difference degree and the user click rate are combined:

According to the difference degree features obtained in (1) and the user's click rate distributions, over the first-level channel, the second-level channel, the information topic, the information label, and the recall reason, for the information in sets 1-5, combined difference degree and user click rate features are calculated, denoted DiffCTR_C(j), DiffCTR_S(j), DiffCTR_T(j), DiffCTR_G(j), DiffCTR_R(j), 1≤j≤5.

Taking the difference degree feature Diff_C(1) and the user's click rate distribution over the first-level channels of the information in set 1 as an example, assume that the distribution difference between the first-level channel distribution features of the pre-sorted information set and of set 1, that is, the difference degree feature, is

The click rate distribution of the user on the first-level channel to which the information in the set 1 belongs is

where n_c is the total number of first-level channels.

Correspondingly, the combined difference degree and user click rate feature is:

(3) similarity characteristics:

The similarities between the pre-sorting information set and each of sets 1-5 are calculated for the first-level channel distribution feature, the second-level channel distribution feature, the information topic distribution feature, the information label distribution feature, and the recall reason distribution feature, respectively, to obtain similarity features denoted Simi_C(j), Simi_S(j), Simi_T(j), Simi_G(j), Simi_R(j), 1≤j≤5.

Taking the similarity between the first-level channel distributions of the information corresponding to the set 1 and the pre-ordering information set as an example, assume that the first-level channel distribution characteristics of the information of the pre-ordering information set are

The similarity feature between the two is:

Simi_C(1) = cos(C_d, C_h).

where cos() is the standard vector cosine similarity function.
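By way of non-limiting illustration, the standard vector cosine similarity used for Simi_C(1) may be sketched as follows. Returning 0 when either vector is all zeros (for example when a set is empty) is an assumption of this sketch and is not specified in the description.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length distribution vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumed convention: an all-zero vector has similarity 0 to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```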

(4) Distribution entropy characteristics:

respectively calculating the distribution entropies of the first-level channel distribution characteristic, the second-level channel distribution characteristic, the information subject distribution characteristic, the information label distribution characteristic and the recall reason distribution characteristic of the information corresponding to the sets 1-5 to obtain distribution entropy characteristics which are respectively marked as Etp_C(j)，Etp_S(j)，Etp_T(j)，Etp_G(j)，Etp_R(j)，1≤j≤5。

Taking the distribution entropy of the first-level channel distribution characteristic to which the information of the set 1 belongs as an example, assume that the first-level channel distribution characteristic to which the information of the set 1 belongs is

The corresponding distribution entropy characteristic is then:

where log () is a logarithmic function.

(5) Cross entropy characteristics:

The cross entropies between the pre-ordering information set and each of sets 1-5 are calculated for the first-level channel distribution feature, the second-level channel distribution feature, the information topic distribution feature, the information label distribution feature, and the recall reason distribution feature, respectively, to obtain cross entropy features denoted Ctp_C(j), Ctp_S(j), Ctp_T(j), Ctp_G(j), Ctp_R(j), 1≤j≤5.

Taking the cross entropy of the first-level channel distribution to which the information of the set 1 belongs as an example, the first-level channel distribution characteristic to which the information of the pre-ordering information set belongs is assumed to be

The corresponding cross entropy feature is then:

where log () is a logarithmic function.
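The formulas of the distribution entropy feature and the cross entropy feature are not reproduced in this text. By way of non-limiting illustration only, the following sketch uses the standard Shannon definitions. The natural logarithm, the argument order of `cross_entropy`, and the small constant `eps` that guards against log(0) are all assumptions of this sketch.

```python
import math


def entropy(p):
    """Shannon entropy -sum(p_i * log p_i); zero-probability terms contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q, eps=1e-12):
    """Cross entropy -sum(p_i * log q_i) of distribution q relative to p.

    q_i is clamped to eps so that categories absent from q do not cause log(0).
    """
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)
```

Under these definitions a set concentrated on a single channel has entropy 0, and a set spread evenly across channels has the largest entropy.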

Step 3537, generating context characteristics of the candidate information in the pre-ordering information set according to the basic distribution characteristics and the diversity characteristics.

The context features of the candidate information are the combination of the basic distribution features and the diversity features.
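By way of non-limiting illustration, combining the two groups of features may be sketched as a simple concatenation into one flat vector. The function name, the use of dictionaries keyed by feature name, and the sorted key order are assumptions made for this sketch; the description only states that the context features combine the basic distribution features and the diversity features.

```python
def build_context_features(base_features, diversity_features):
    """Concatenate both feature groups into one flat vector in a stable order."""
    vector = []
    for group in (base_features, diversity_features):
        for name in sorted(group):
            value = group[name]
            # A feature may be a distribution (sequence) or a single number.
            if isinstance(value, (list, tuple)):
                vector.extend(float(v) for v in value)
            else:
                vector.append(float(value))
    return vector
```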

Through the above process, context feature extraction for the candidate information is realized, and the diversity among the recommended candidate information is described along multiple dimensions. Both the diversity among the candidate information within the reordering information set and the diversity between the recommended information and the candidate information are taken into account. As a result, repeated candidate information can be effectively avoided, the user's preference for diversity of interests can be effectively captured, different diversity recommendation services are provided for different users, and the user's recommendation experience is improved.

Referring to FIG. 11, in an exemplary embodiment, step 357 may include the following steps:

Step 3571, ranking the candidate information in the pre-ranked information set according to the click rate of the candidate information in the pre-ranked information set.

Step 3573, traversing a plurality of slots in the reordering information set, and storing the candidate information with the highest click rate to the traversed slots.

Step 3575, the candidate information with the highest click rate is deleted from the pre-sorted information set.

As mentioned above, the reordering information set includes a plurality of slots for storing candidate information, and each slot corresponds to a recommended piece of information displayed in the session page.

Generating the reordering information set therefore essentially consists of storing candidate information from the pre-ordering information set into the several slots of the reordering information set.

As shown in fig. 10, among the candidate information in the pre-ordering information set 502, the candidate information with the highest click rate is stored first, in the first slot 5034 of the reordering information set 503.

It is understood that, considering the correlation among the candidate information in the re-ranking information set 503, the candidate information with the second highest click rate will not necessarily be stored in the second slot 5033 of the re-ranking information set 503. For example, when the candidate information with the second highest click rate has the same topic as the candidate information with the highest click rate, the two may not both be recommended to the same user, because doing so would be repetitive.

Therefore, after the candidate information with the highest click rate is deleted from the pre-sorted information set, execution returns to step 353 to re-extract the context features of the candidate information remaining in the pre-sorted information set.

And so on, until the last slot 5035 in the reordering information set 503 stores candidate information.

The candidate information in the reordering information set 503 can then be pushed to the client for display in the session page 501 presented by the client, thereby completing the information recommendation service.
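By way of non-limiting illustration, the slot-filling loop of steps 3571 to 3575, together with the return to step 353, may be sketched as follows. The function name `rerank` and the two callables `extract_context` and `predict_ctr` are hypothetical stand-ins for the context feature extraction and the first click rate prediction model; breaking ties in favor of the earlier candidate is an assumption of this sketch.

```python
def rerank(pre_ranked, num_slots, extract_context, predict_ctr):
    """Fill the slots one at a time, re-scoring the remaining candidates each time.

    extract_context(candidate, slots) -> context features given the slots filled so far.
    predict_ctr(features) -> predicted click rate.
    """
    remaining = list(pre_ranked)
    slots = []
    while remaining and len(slots) < num_slots:
        # Re-extract context features, since they depend on the slots already filled.
        scored = [
            (predict_ctr(extract_context(cand, slots)), index)
            for index, cand in enumerate(remaining)
        ]
        # Highest click rate wins; ties go to the earlier candidate.
        _, best_index = max(scored, key=lambda pair: (pair[0], -pair[1]))
        # Store the winner in the current slot and delete it from the pre-ranked set.
        slots.append(remaining.pop(best_index))
    return slots
```

With a context function that counts how many filled slots share a candidate's topic and a scorer that penalizes such repeats, the second slot goes to a different topic even when a same-topic candidate has the higher raw click rate.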

Fig. 12 to 14 are schematic diagrams of a specific implementation of the information recommendation method in an application scenario. In this application scenario, the terminal is a smartphone on which a news reader runs.

As the news reader runs on the smartphone, it sends a news recommendation request to the server, so that the server provides a news recommendation service for the user.

For the server, the provided news recommendation service comprises three modules: a contextual feature extraction module 601, an offline model training module 602, and an online news recommendation module 603, as shown in fig. 12.

The contextual feature extraction module 601 extracts contextual features of candidate information according to the candidate news 605 recalled in real time and the historical recommended news 604.

The offline model training module 602 trains the click-through rate prediction model 606 based on massive information samples, so as to achieve click-through rate pre-ranking and diversity re-ranking of candidate news.

Specifically, as shown in fig. 13, by executing steps 701 to 702, a plurality of candidate information with a high click rate, for example 100 pieces of candidate information, are selected from the candidate information recalled in real time to form a pre-ranking information set.
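By way of non-limiting illustration, selecting the pre-ranking information set may be sketched as a top-k selection by predicted click rate. The function name `pre_rank` and the callable `predict_ctr`, which stands in for the second click rate prediction model applied to a candidate, are hypothetical; the default of 100 follows the example above.

```python
import heapq


def pre_rank(candidates, predict_ctr, top_k=100):
    """Keep the top_k candidates with the highest predicted click rate.

    The result is ordered from the highest to the lowest click rate.
    """
    return heapq.nlargest(top_k, candidates, key=predict_ctr)
```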

By executing steps 703 to 707, a plurality of candidate information with a high click rate are selected from the pre-ordered information set in combination with the context characteristics of the candidate information, so as to form a reordered information set.

The online news recommendation module 603 forms a recommended news list 607 based on the candidate information in the re-ordered information set, as shown in fig. 14, and finally pushes the recommended news list to the news reader, so that the news reader displays the recommended news list containing a plurality of candidate information to the user.

In the application scenario, by providing diversified candidate news, the reading experience of a user can be well improved, the click rate and the reading time of the user for the candidate news are improved, and better product economic benefits are created.

The following is an embodiment of the apparatus of the present invention, which can be used to execute the information recommendation method of the present invention. For details that are not disclosed in the embodiments of the apparatus of the present invention, please refer to the method embodiments of the information recommendation method according to the present invention.

Referring to fig. 15, in an exemplary embodiment, an information recommendation apparatus 900 includes, but is not limited to: an information recall module 910, an information pre-ordering module 930, an information reordering module 950, and an information recommendation module 970.

The information recall module 910 is configured to recall candidate information from the information base for providing the information recommendation service.

The information pre-ordering module 930 is configured to perform click rate prediction on the candidate information, and generate a pre-sorting information set according to the candidate information with a high click rate.

The information reordering module 950 is configured to perform diversity reordering on the candidate information in the pre-ordered information set according to the context characteristics of the candidate information in the pre-ordered information set, so as to obtain a reordered information set.

The information recommendation module 970 is configured to provide an information recommendation service according to the candidate information in the reordering information set.

It should be noted that, when the information recommendation apparatus provided in the foregoing embodiment performs the information recommendation process, only the division of the above functional modules is illustrated, and in practical applications, the functions may be distributed to different functional modules according to needs, that is, the internal structure of the information recommendation apparatus is divided into different functional modules to complete all or part of the functions described above.

In addition, the information recommendation apparatus provided in the above embodiment and the information recommendation method belong to the same concept, wherein the specific manner in which each module performs operations has been described in detail in the method embodiment, and is not described herein again.
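By way of non-limiting illustration, the cooperation of the four functional modules may be sketched as a simple pipeline. The class name, the field names, and the `handle` method are hypothetical, and each module is modeled as a plain callable only for the purposes of this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class InformationRecommender:
    """Chains the four modules: recall, pre-ordering, reordering, recommendation."""

    recall: Callable[[Any], List[Any]]          # corresponds to module 910
    pre_rank: Callable[[List[Any]], List[Any]]  # corresponds to module 930
    rerank: Callable[[List[Any]], List[Any]]    # corresponds to module 950
    recommend: Callable[[List[Any]], List[Any]]  # corresponds to module 970

    def handle(self, request: Any) -> List[Any]:
        candidates = self.recall(request)
        pre_ranked = self.pre_rank(candidates)
        reranked = self.rerank(pre_ranked)
        return self.recommend(reranked)
```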

Referring to fig. 16, in an exemplary embodiment, an information recommendation device 1000 includes at least one processor 1001, at least one memory 1002, and at least one communication bus 1003.

The memory 1002 has computer readable instructions stored thereon, and the processor 1001 reads the computer readable instructions stored in the memory 1002 through the communication bus 1003.

The computer readable instructions, when executed by a processor, implement the information recommendation method in the above embodiments.

In an exemplary embodiment, a computer-readable storage medium has a computer program stored thereon, and the computer program, when executed by a processor, implements the information recommendation method in the above embodiments.

The above-mentioned embodiments are merely preferred examples of the present invention, and are not intended to limit the embodiments of the present invention, and those skilled in the art can easily make various changes and modifications according to the main concept and spirit of the present invention, so that the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An information recommendation method, comprising:

recalling candidate information from an information base for providing information recommendation service;

predicting the click rate of the candidate information, and generating a pre-sequencing information set according to the candidate information with high click rate;

performing diversity reordering on the candidate information in the pre-ordering information set according to the context characteristics of the candidate information in the pre-ordering information set to obtain a reordered information set;

providing the information recommendation service according to the candidate information in the reordering information set;

the conducting diversity reordering on the candidate information in the pre-ordering information set according to the context characteristics of the candidate information in the pre-ordering information set to obtain a reordered information set comprises the following steps:

acquiring recommended information displayed in a session page;

classifying the obtained recommended information according to the clicking behavior of the user for the recommended information in the session page, and/or classifying candidate information stored in a plurality of slot positions in the reordering information set to obtain a plurality of sets;

respectively calculating corresponding basic distribution characteristics aiming at information in a plurality of sets and candidate information in the pre-sorting information set; the basic distribution characteristics include: a distribution characteristic of the first-level channel to which the information belongs, a distribution characteristic of the second-level channel to which the information belongs, an information topic distribution characteristic, an information label distribution characteristic, and a recall reason distribution characteristic;

performing diversity characteristic operation according to the calculated basic distribution characteristics; the diversity characteristics include: a difference degree characteristic, a combined difference degree and user click rate characteristic, a similarity characteristic, a distribution entropy characteristic, and a cross entropy characteristic;

generating context characteristics of candidate information in the pre-ordering information set according to the basic distribution characteristics and the diversity characteristics;

inputting the context characteristics of the candidate information in the pre-sorting information set into a first click rate prediction model, and predicting to obtain the click rate of the candidate information in the pre-sorting information set;

sorting the candidate information in the pre-sorting information set according to the click rate of the candidate information in the pre-sorting information set to generate a re-sorting information set; the reordering information set comprises a plurality of slots for storing candidate information, and the candidate information stored in each slot corresponds to recommended information displayed in the session page.

2. The method of claim 1, wherein the classifying the acquired recommended information according to the click behavior of the user on the recommended information in the session page to obtain a plurality of sets comprises:

according to the clicking behavior of the user on the recommended information in the session page, dividing the acquired recommended information into a clicked information set and an unclicked information set; and/or

dividing the recommended information which is clicked last into a last click information set.

3. The method of claim 1, wherein the classifying candidate information stored in a plurality of slots in the reordering information set to obtain a plurality of sets comprises:

determining a current slot position in the reordering information set;

dividing candidate information stored in the first several slots in the reordering information set into a current display information set; and/or

dividing candidate information stored in the previous slot in the reordering information set into a previous display information set;

wherein the first several slots are located before the current slot in the reordering information set.

4. The method of claim 1, wherein the sorting the candidate information in the pre-ordered information set according to the click-through rate of the candidate information in the pre-ordered information set to generate the re-ordered information set comprises:

sorting the candidate information in the pre-sorting information set according to the click rate of the candidate information in the pre-sorting information set;

traversing a plurality of slot positions in the reordering information set, and storing the candidate information with the highest click rate to the traversed slot positions;

deleting the candidate information with the highest click rate from the pre-sorting information set;

jumping back to execute the step of classifying the obtained recommended information according to the clicking behavior of the user on the recommended information in the session page, and/or classifying candidate information stored in a plurality of slots in the reordering information set to obtain a plurality of sets.

5. The method of any one of claims 1 to 4, wherein recalling candidate information from an information repository for providing an information recommendation service comprises:

receiving an information recommendation request initiated by a client;

and recalling the candidate information from the information base according to a recall reason in response to the information recommendation request.
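
As a rough illustration of recall by recall reason, the sketch below assumes the information base is an index from a recall reason to a list of information identifiers. The index layout, the reason names used in the usage note, and the `limit` parameter are assumptions made for the example and are not stated in the claims.

```python
from typing import Dict, Iterable, List


def recall(info_base: Dict[str, List[str]], reasons: Iterable[str], limit: int) -> List[str]:
    """Collect up to `limit` distinct candidates from an index keyed by recall reason."""
    if limit <= 0:
        return []
    seen = set()
    candidates: List[str] = []
    for reason in reasons:
        for info_id in info_base.get(reason, []):
            if info_id in seen:
                continue            # the same item may be recalled for several reasons
            seen.add(info_id)
            candidates.append(info_id)
            if len(candidates) == limit:
                return candidates
    return candidates
```

For example, with hypothetical reasons `"hot"` and `"interest"`, an item listed under both is returned once.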

6. The method according to any one of claims 1 to 4, wherein said predicting click-through rate of said candidate information, and generating a pre-ranked information set according to the candidate information with high click-through rate, comprises:

extracting information characteristics from the candidate information;

inputting the information characteristics of the candidate information into a second click rate prediction model, and predicting to obtain the click rate of the candidate information;

and sorting the candidate information according to the click rate of the candidate information to generate the pre-sorting information set.
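
The three steps of this claim can be sketched as below. The `extract` and `predict_ctr` callables are hypothetical placeholders for feature extraction and the second click rate prediction model, and the `top_k` cut-off is an assumption about how "candidate information with high click rate" is selected.

```python
from typing import Any, Callable, List, Mapping, Tuple


def pre_rank(
    candidates: Mapping[str, Any],
    extract: Callable[[Any], Any],
    predict_ctr: Callable[[Any], float],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score every candidate and keep the `top_k` with the highest predicted click rate."""
    # Extract information features, then predict a click rate for each candidate.
    scored = [(info_id, predict_ctr(extract(raw))) for info_id, raw in candidates.items()]
    # Sort by predicted click rate, highest first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```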

7. The method of claim 6, wherein a click rate prediction model comprises the first click rate prediction model or the second click rate prediction model, an information sample comprises recommended information, and input features comprise information features or context features of the recommended information;

the method further comprises the following steps:

acquiring the information sample carrying a behavior tag, wherein the behavior tag is used for indicating the clicking behavior of a user for the information sample;

extracting the input features from the information sample;

training a specified model according to the input features and the behavior tag of the information sample;

and taking the specified model after model training as the click rate prediction model.
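
The claims do not fix the type of the specified model. Purely as an illustration, the sketch below assumes logistic regression trained by stochastic gradient descent, with the behavior tag encoded as 1 (clicked) or 0 (not clicked); the function names and hyperparameters are hypothetical.

```python
import math
from typing import List, Sequence


def predict_ctr(w: Sequence[float], x: Sequence[float]) -> float:
    """Predicted click probability; the last entry of `w` is the bias."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))


def train_ctr_model(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 200,
) -> List[float]:
    """Fit logistic-regression weights from input features and behavior tags."""
    dim = len(features[0])
    w = [0.0] * (dim + 1)
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = predict_ctr(w, x) - y   # gradient of the log loss w.r.t. the logit
            for j in range(dim):
                w[j] -= lr * err * x[j]
            w[dim] -= lr * err            # bias update
    return w
```

The returned weights play the role of the trained click rate prediction model; the same routine could in principle be run once on information features and once on context features to obtain the two models.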

8. An information recommendation apparatus, comprising:

the information recalling module is used for recalling candidate information from the information base for providing information recommendation service;

the information pre-sorting module is used for predicting the click rate of the candidate information and generating a pre-sorting information set according to the candidate information with high click rate;

the information reordering module is used for performing diversity reordering on the candidate information in the pre-ordering information set according to the context characteristics of the candidate information in the pre-ordering information set to obtain a reordered information set;

the information recommendation module is used for providing the information recommendation service according to the candidate information in the reordering information set;

the information reordering module comprises:

the information acquisition unit is used for acquiring recommended information displayed in the session page;

the context feature extraction unit is used for extracting context features of the candidate information in the pre-sorting information set by combining the acquired recommended information;

the first click rate prediction unit is used for inputting the context features of the candidate information in the pre-sorting information set into a first click rate prediction model and predicting the click rate of the candidate information in the pre-sorting information set;

the reordering set generating unit is used for sorting the candidate information in the pre-sorting information set according to the click rate of the candidate information in the pre-sorting information set, and generating the reordering information set, where the reordering information set includes a plurality of slots for storing the candidate information, and the candidate information stored in each slot corresponds to one piece of recommended information displayed in the session page;

the context feature extraction unit includes:

the set generation subunit is used for classifying the acquired recommended information according to the clicking behavior of the user on the recommended information in the session page, and/or classifying candidate information stored in a plurality of slots in the reordering information set to obtain a plurality of sets;

the basic distribution characteristic operation subunit is used for respectively calculating corresponding basic distribution characteristics aiming at the information in the plurality of sets and the candidate information in the pre-sorting information set;

the diversity characteristic operation subunit is used for performing diversity characteristic operation according to the basic distribution characteristics obtained by calculation;

and the context feature generation subunit is used for generating the context features of the candidate information in the pre-ordering information set according to the basic distribution features and the diversity features.
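
The claim does not state how the basic distribution features or the diversity features are computed. The sketch below is therefore only an assumed example: it takes the basic distribution feature to be the share of each category within a set, and the diversity feature to be a flag indicating that the candidate's category is absent from that set. The category labels and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def category_distribution(categories: Sequence[str]) -> Dict[str, float]:
    """Assumed basic distribution feature: share of each category in a set."""
    total = len(categories)
    if total == 0:
        return {}
    return {cat: n / total for cat, n in Counter(categories).items()}


def context_features(candidate_category: str, sets: Dict[str, Sequence[str]]) -> Dict[str, float]:
    """Assumed context features of one candidate against each named set."""
    features: Dict[str, float] = {}
    for name, cats in sets.items():
        share = category_distribution(cats).get(candidate_category, 0.0)
        features[f"{name}_share"] = share                          # distribution feature
        features[f"{name}_novel"] = 1.0 if share == 0.0 else 0.0   # diversity feature
    return features
```

The resulting dictionary would be the input to the first click rate prediction model; the sets passed in correspond to the clicked, last-click and display sets produced by the set generation subunit.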

9. The information recommendation apparatus of claim 8, wherein

the set generating subunit includes: the first classification subunit is used for dividing the acquired recommended information into a clicked information set and an unclicked information set according to the clicking behavior of the user on the recommended information in the session page; and/or the second classification subunit is used for dividing the recommended information that was clicked last into a last click information set.

10. The information recommendation apparatus of claim 8, wherein

the set generating subunit includes: a slot position determining subunit, configured to determine a current slot position in the reordering information set; a third classification subunit, configured to divide the candidate information stored in the first several slots in the reordering information set into a current display information set; and/or a fourth classification subunit, configured to divide the candidate information stored in the previous slot in the reordering information set into a previous display information set; wherein the first several slots are located before the current slot in the reordering information set.

11. The information recommendation apparatus of claim 8, wherein

the reordering set generating unit includes: the sorting subunit is used for sorting the candidate information in the pre-sorting information set according to the click rate of the candidate information in the pre-sorting information set; the storage subunit is used for traversing a plurality of slot positions in the reordering information set and storing the candidate information with the highest click rate to the traversed slot positions; and the deleting subunit is used for deleting the candidate information with the highest click rate from the pre-sorting information set and notifying the context feature extraction unit.

12. An information recommendation apparatus, comprising:

a processor; and

a memory having stored thereon computer readable instructions which, when executed by the processor, implement the information recommendation method of any of claims 1-7.

13. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the method of any one of claims 1-7.