Disclosure of Invention
In view of the above, the present invention provides an information recommendation method and apparatus for selecting an appropriate channel to recommend information.
To achieve this purpose, the invention adopts the following technical solution:
an information recommendation method, comprising:
acquiring historical channels of a user, determining the sum of specific gravity values of all the historical channels of the user according to the specific gravity value of each historical channel, and determining the priority of the user according to the sum of specific gravity values of all the historical channels of the user, wherein the historical channels are historical channels for acquiring information by the user;
respectively calculating the similarity between a target current channel and each historical channel, and determining the specific gravity value of the target current channel according to the similarity between the target current channel and each historical channel and the specific gravity value of the corresponding historical channel, wherein the target current channel is each current channel;
sorting the current channels according to the specific gravity values of the current channels, and reordering the sorting of the current channels by utilizing a maximum boundary correlation algorithm (that is, the maximal marginal relevance, MMR, algorithm);
and recommending information through the reordered current channel according to the priority of the user.
Optionally, the calculating the similarity between the target current channel and each of the historical channels, and determining the specific gravity value of the target current channel according to the similarity between the target current channel and each of the historical channels and the specific gravity value of the corresponding historical channel respectively includes:
calculating the similarity between the target current channel and each historical channel to obtain n weight values, where n is the number of the historical channels;
multiplying each weight value by the specific gravity value of the corresponding historical channel to obtain n products;
and adding the n products to obtain the specific gravity value of the target current channel.
Optionally, the acquiring historical channels of a user and determining, according to the specific gravity value of each historical channel, the sum of the specific gravity values of the historical channels of the user includes:
acquiring historical channels of a user, and summarizing the historical channels;
determining a specific gravity value of each historical channel according to the user usage number of each historical channel;
and determining the sum of the specific gravity values of the historical channels of the user according to the specific gravity value of each historical channel.
Optionally, after the obtaining of the historical channels of the user, before the determining of the sum of the specific gravity values of the respective historical channels of the user according to the specific gravity value of each historical channel, the method further includes:
cleaning the data of the historical channels of the user, wherein the data cleaning comprises removing null values and outliers.
Optionally, the respectively calculating the similarity between each historical channel and the target current channel includes:
determining a feature vector of each historical channel according to the priority of the user and the user of each historical channel;
determining a feature vector of the target current channel according to the priority of the user and the user of the current channel;
and calculating cosine values of the feature vector of each historical channel and the feature vector of the target current channel, wherein each cosine value represents the similarity of each historical channel and the target current channel respectively.
An information recommendation apparatus comprising:
the acquiring unit is used for acquiring historical channels of a user, determining the sum of specific gravity values of all the historical channels of the user according to the specific gravity value of each historical channel, and determining the priority of the user according to the sum of specific gravity values of all the historical channels of the user, wherein the historical channels are historical channels for acquiring information by the user;
the calculating unit is used for calculating the similarity between a target current channel and each historical channel respectively, and determining the specific gravity value of the target current channel according to the similarity between the target current channel and each historical channel and the specific gravity value of the corresponding historical channel, wherein the target current channel is each of the current channels respectively;
the sorting unit is used for sorting the current channels according to the specific gravity values of the current channels and reordering the sorting of the current channels by utilizing a maximum boundary correlation algorithm;
and the recommending unit is used for recommending information through the reordered current channel according to the priority of the user.
Optionally, the calculating unit is specifically configured to calculate the similarity between the target current channel and each of the historical channels to obtain n weight values, where n is the number of the historical channels; multiply each weight value by the specific gravity value of the corresponding historical channel to obtain n products; and add the n products to obtain the specific gravity value of the target current channel.
Optionally, the obtaining unit is specifically configured to obtain a history channel of a user, and summarize the history channel; determining a specific gravity value of each historical channel according to the user usage number of each historical channel; and determining the sum of the specific gravity values of the historical channels of the user according to the specific gravity value of each historical channel.
Optionally, the apparatus further includes: a cleaning unit, configured to clean the data of the historical channels of the user, where the data cleaning includes removing null values and outliers.
Optionally, the computing unit is specifically configured to determine a feature vector of each history channel according to the priority of the user and the user of each history channel; determining a feature vector of the target current channel according to the priority of the user and the user of the current channel; and calculating cosine values of the feature vector of each historical channel and the feature vector of the target current channel, wherein each cosine value represents the similarity of each historical channel and the target current channel respectively.
The information recommendation method provided by the embodiment of the invention comprises the following steps: acquiring historical channels of a user, and determining the sum of specific gravity values of the historical channels of the user according to the specific gravity of each historical channel; determining the priority of the user according to the sum of the specific gravity values of all historical channels of the user, wherein the historical channels are the historical channels for the user to obtain information; respectively calculating the similarity of each historical channel and the target current channel, and determining the specific gravity value of the target current channel according to the similarity of each historical channel and the target current channel and the specific gravity value of the corresponding historical channel, wherein the target current channel is each of the current channels; and sequencing the current channels according to the specific gravity values of the current channels, reordering the sequencing of the current channels by using a maximum boundary correlation algorithm, and recommending information through the reordered current channels according to the priority of a user. 
In this way, the priority of each user is determined according to the specific gravity values of the historical channels, so that information can subsequently be recommended to higher-priority users first, which improves recommendation efficiency. The specific gravity value of each current channel is then obtained from the similarity between the current channel and the historical channels and from the specific gravity values of the historical channels, which determines the ranking of the current channels. The ranking is reordered to preserve the relevance of the ranking results, so that an appropriate channel is selected for the user. Finally, information is recommended through the reordered current channels according to the priority of the user, so that the user can receive the information effectively.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, the present invention may be practiced in other ways than those specifically described herein, and those of ordinary skill in the art may make similar generalizations without departing from the spirit of the present invention. Therefore, the present invention is not limited to the specific embodiments disclosed below.
As described in the background art, the development of information technology has brought many conveniences to people's lives and, at the same time, a huge amount of data. Current information recommendation methods mainly recommend information to users directly through each channel. However, users differ in how receptive they are to each channel, and when the channel is poorly chosen, the user's reception of the information suffers, so that the user cannot receive the information effectively. Therefore, there is a need for an information recommendation method that selects an appropriate channel to recommend information according to the needs of a user.
Therefore, an embodiment of the present application provides an information recommendation method, including: acquiring historical channels of a user, and determining the sum of specific gravity values of the historical channels of the user according to the specific gravity of each historical channel; determining the priority of the user according to the sum of the specific gravity values of all historical channels of the user, wherein the historical channels are the historical channels for the user to obtain information; respectively calculating the similarity of each historical channel and the target current channel, and determining the specific gravity value of the target current channel according to the similarity of each historical channel and the target current channel and the specific gravity value of the corresponding historical channel, wherein the target current channel is each of the current channels; and sequencing the current channels according to the specific gravity values of the current channels, reordering the sequencing of the current channels by using a maximum boundary correlation algorithm, and recommending information through the reordered current channels according to the priority of a user. 
In this way, the priority of each user is determined according to the specific gravity values of the historical channels, so that information can subsequently be recommended to higher-priority users first, which improves recommendation efficiency. The specific gravity value of each current channel is then obtained from the similarity between the current channel and the historical channels and from the specific gravity values of the historical channels, which determines the ranking of the current channels. The ranking is reordered to preserve the relevance of the ranking results, so that an appropriate channel is selected for the user. Finally, information is recommended through the reordered current channels according to the priority of the user, so that the user can receive the information effectively.
In order to facilitate understanding of the technical solutions and effects of the present application, specific embodiments will be described in detail below with reference to the accompanying drawings.
Referring to fig. 1, in step S01, historical channels of a user are obtained, a sum of specific gravity values of the historical channels of the user is determined according to the specific gravity value of each historical channel, and a priority of the user is determined according to the sum of specific gravity values of the historical channels of the user, where the historical channels are historical channels through which the user obtains information.
When information recommendation is performed, in order to improve user satisfaction, a proper information recommendation channel needs to be selected according to the requirements of different target groups. A historical channel is a channel through which the user obtained information in the past. After the historical channels of the user are obtained, the sum of the specific gravity values of the historical channels of the user is determined according to the specific gravity value of each historical channel. The data of the historical channels can be cleaned after the historical channels of the user are acquired, for example by removing null values and outliers, which improves the reliability of the data. Specifically, the historical channels of the users are obtained and summarized, the specific gravity value of each historical channel is determined according to the user usage number of each historical channel, and the sum of the specific gravity values of the historical channels of each user is determined according to the specific gravity value of each historical channel.
Take four users as an example, denoted as a first user, a second user, a third user and a fourth user. The historical channels of the first user include a first channel, a second channel and a third channel; the historical channels of the second user include the first channel, the second channel, a fourth channel and a fifth channel; the historical channels of the third user include the second channel, the fifth channel and a sixth channel; and the historical channels of the fourth user include the first channel, the second channel and the fourth channel. The historical channels are summarized as the first channel, the second channel, the third channel, the fourth channel, the fifth channel and the sixth channel. The users of the first channel are the first user, the second user and the fourth user, that is, the user usage number of the first channel is three; the users of the second channel are the first user, the second user, the third user and the fourth user, that is, the user usage number of the second channel is four; the user of the third channel is the first user, that is, the user usage number of the third channel is one; the users of the fourth channel are the first user, the second user and the fourth user, that is, the user usage number of the fourth channel is three; the users of the fifth channel are the second user and the third user, that is, the user usage number of the fifth channel is two; the user of the sixth channel is the third user, that is, the user usage number of the sixth channel is one.
Then, the channels are sorted by user usage number as the second channel, the first channel, the fourth channel, the fifth channel, the third channel and the sixth channel, where the user usage number of the first channel equals that of the fourth channel and the user usage number of the third channel equals that of the sixth channel. Since the user usage numbers total 14, the specific gravity value of the first channel may be 3/14, the specific gravity value of the second channel 4/14, the specific gravity value of the third channel 1/14, the specific gravity value of the fourth channel 3/14, the specific gravity value of the fifth channel 2/14, and the specific gravity value of the sixth channel 1/14.
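The calculation above can be sketched as follows. This is a minimal illustration, assuming that the specific gravity value of a channel is its user usage number divided by the total of all usage numbers, which matches the values in the example. The usage numbers are taken as stated in the text, and the names `usage_counts` and `specific_gravity` are hypothetical.

```python
from fractions import Fraction

# User usage numbers per historical channel, as stated in the example
# (assumption: used as given, not recomputed from the user histories).
usage_counts = {
    "channel_1": 3,
    "channel_2": 4,
    "channel_3": 1,
    "channel_4": 3,
    "channel_5": 2,
    "channel_6": 1,
}


def specific_gravity(counts):
    """Return each channel's share of the total usage number."""
    total = sum(counts.values())
    return {channel: Fraction(n, total) for channel, n in counts.items()}


weights = specific_gravity(usage_counts)
for channel, weight in weights.items():
    # Fraction prints in lowest terms, so 4/14 is shown as 2/7.
    print(channel, weight)
```

Exact fractions are used here only so that the results can be compared directly with the values in the text; floating-point numbers would serve equally well in practice.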
Then, the priority of each user is determined according to the specific gravity values of the historical channels of the user. Specifically, after the specific gravity value of each historical channel is determined, the sum of the specific gravity values of the historical channels used by the user is obtained from the historical channels used by the user and their corresponding specific gravity values, and the priority of the user is determined according to this sum. For example, all the historical channels include a first channel, a second channel, a third channel, a fourth channel, a fifth channel and a sixth channel, where the specific gravity value of the first channel may be 3/14, the specific gravity value of the second channel 4/14, the specific gravity value of the third channel 1/14, the specific gravity value of the fourth channel 3/14, the specific gravity value of the fifth channel 2/14 and the specific gravity value of the sixth channel 1/14. The historical channels of the first user include the first channel, the second channel and the third channel; the historical channels of the second user include the first channel, the second channel, the fourth channel and the fifth channel; the historical channels of the third user include the second channel, the fifth channel and the sixth channel; and the historical channels of the fourth user include the first channel, the second channel and the fourth channel. The sum of the specific gravity values of the historical channels of the first user is therefore 3/14 + 4/14 + 1/14 = 8/14, that of the second user is 3/14 + 4/14 + 3/14 + 2/14 = 12/14, that of the third user is 4/14 + 2/14 + 1/14 = 7/14, and that of the fourth user is 3/14 + 4/14 + 3/14 = 10/14. Thus, the users are prioritized in the order of the second user, the fourth user, the first user and the third user.
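The priority step can be sketched as follows, using the specific gravity values and the user histories listed in the example. This is an illustration only: it assumes that a larger sum means a higher priority, and the names `user_totals` and `prioritize` are hypothetical.

```python
from fractions import Fraction

# Specific gravity values of the six historical channels, as listed above.
weights = {
    1: Fraction(3, 14), 2: Fraction(4, 14), 3: Fraction(1, 14),
    4: Fraction(3, 14), 5: Fraction(2, 14), 6: Fraction(1, 14),
}

# Historical channels of each user, as listed above.
histories = {
    "user_1": [1, 2, 3],
    "user_2": [1, 2, 4, 5],
    "user_3": [2, 5, 6],
    "user_4": [1, 2, 4],
}


def user_totals(histories, weights):
    """Sum the specific gravity values of each user's historical channels."""
    return {
        user: sum(weights[channel] for channel in channels)
        for user, channels in histories.items()
    }


def prioritize(totals):
    """Order users from the largest sum to the smallest (assumed rule)."""
    return sorted(totals, key=totals.get, reverse=True)


totals = user_totals(histories, weights)
priority = prioritize(totals)
print(totals)
print(priority)
```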
In step S02, a similarity between each of the historical channels and a target current channel is calculated, and a specific gravity value of the target current channel is determined according to the similarity between each of the historical channels and the target current channel and the corresponding specific gravity value of the historical channel, where the target current channel is each of the current channels.
In this embodiment, the similarity between the target current channel and each historical channel is first calculated to obtain n weight values, where n is the number of the historical channels. For example, the historical channels include a first historical channel, a second historical channel, a third historical channel and a fourth historical channel. The similarity between the target current channel and the first historical channel is calculated to obtain a first weight value, the similarity between the target current channel and the second historical channel is calculated to obtain a second weight value, the similarity between the target current channel and the third historical channel is calculated to obtain a third weight value, and the similarity between the target current channel and the fourth historical channel is calculated to obtain a fourth weight value. Each historical channel thus corresponds to one weight value.
Then, each weight value is multiplied by the specific gravity value of the corresponding historical channel to obtain n products. For example, the similarity between the target current channel and the first historical channel is 0.5, that is, the first weight value is 0.5; the similarity between the target current channel and the second historical channel is 0.8, that is, the second weight value is 0.8; the similarity between the target current channel and the third historical channel is 0.2, that is, the third weight value is 0.2; and the similarity between the target current channel and the fourth historical channel is 0.7, that is, the fourth weight value is 0.7. If the specific gravity value of the first historical channel is 3/14, the specific gravity value of the second historical channel is 4/14, the specific gravity value of the third historical channel is 1/14 and the specific gravity value of the fourth historical channel is 3/14, then the product for the first historical channel and the first weight value is 3/28, the product for the second historical channel and the second weight value is 8/35, the product for the third historical channel and the third weight value is 1/70, and the product for the fourth historical channel and the fourth weight value is 3/20. The n products are added to obtain the specific gravity value of the target current channel, namely 3/28 + 8/35 + 1/70 + 3/20 = 1/2. Since the target current channel is each of the current channels in turn, when there are multiple current channels, the specific gravity value of each of them can be obtained in the same way.
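The weighted sum in this example can be checked with a short sketch. The function name `target_specific_gravity` is hypothetical, and exact fractions are used so that the result can be compared with the value in the text.

```python
from fractions import Fraction


def target_specific_gravity(similarities, gravities):
    """Multiply each weight value (similarity) by the specific gravity value
    of the corresponding historical channel and add the results."""
    if len(similarities) != len(gravities):
        raise ValueError("one similarity is required per historical channel")
    return sum(s * g for s, g in zip(similarities, gravities))


# Weight values 0.5, 0.8, 0.2 and 0.7 from the example, written as fractions.
similarities = [Fraction(1, 2), Fraction(4, 5), Fraction(1, 5), Fraction(7, 10)]
# Specific gravity values of the four historical channels from the example.
gravities = [Fraction(3, 14), Fraction(4, 14), Fraction(1, 14), Fraction(3, 14)]

result = target_specific_gravity(similarities, gravities)
print(result)  # 1/2
```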
In this embodiment, the feature vector of each historical channel may be determined according to the priority of the users and the users of each historical channel, and the feature vector of the target current channel may be determined according to the priority of the users and the users of the current channel. The cosine value of the feature vector of each historical channel and the feature vector of the target current channel is then calculated, where each cosine value represents the similarity between the corresponding historical channel and the target current channel. For example, the historical channels include a first historical channel, a second historical channel, a third historical channel and a fourth historical channel, and the users are ranked by priority as a first user, a second user, a third user, a fourth user, …, an Nth user. If the users of the first historical channel are the first user and the second user, the feature vector of the first historical channel is {1, 1, 0, …, 0}; if the users of the second historical channel are the second user, the third user and the fourth user, the feature vector of the second historical channel is {0, 1, 1, 1, …, 0}; if the users of the third historical channel are the first user, the second user and the third user, the feature vector of the third historical channel is {1, 1, 1, …, 0}; and if the users of the fourth historical channel are the first user, the second user, the third user and the fourth user, the feature vector of the fourth historical channel is {1, 1, 1, 1, …, 0}. If the users of the target current channel are the second user and the third user, the feature vector of the target current channel is {0, 1, 1, …, 0}.
Then, the cosine value of the first historical channel and the target current channel is calculated as 1/2, the cosine value of the second historical channel and the target current channel as √6/3, the cosine value of the third historical channel and the target current channel as √6/3, and the cosine value of the fourth historical channel and the target current channel as √2/2. Each of these cosine values represents the similarity between the corresponding historical channel and the target current channel.
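The cosine calculation can be sketched as follows. As an assumption for illustration, the feature vectors are truncated to the first four users, since every later component in the example is zero. The function name `cosine_similarity` is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a channel with no users is treated as dissimilar
    return dot / (norm_a * norm_b)


# Component i is 1 if the user with priority rank i uses the channel.
historical = {
    "first": [1, 1, 0, 0],
    "second": [0, 1, 1, 1],
    "third": [1, 1, 1, 0],
    "fourth": [1, 1, 1, 1],
}
target = [0, 1, 1, 0]

similarities = {
    name: cosine_similarity(vector, target)
    for name, vector in historical.items()
}
for name, value in similarities.items():
    print(name, round(value, 4))
```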
In step S03, the current channel is ranked according to its specific gravity value, and the ranking of the current channel is reordered by using a maximum boundary correlation algorithm.
After the specific gravity value of each current channel is determined for each user in step S02, the ranking of each user's current channels may be determined and then reordered.
For example, the current channels of the first user include a first current channel, a second current channel, a third current channel and a fourth current channel, where the specific gravity value of the first current channel is 1/2, the specific gravity value of the second current channel is 1/3, the specific gravity value of the third current channel is 3/4, and the specific gravity value of the fourth current channel is 1/4. Ranking the current channels by specific gravity value from largest to smallest gives the third current channel, the first current channel, the second current channel and the fourth current channel.
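The initial ranking in this example amounts to a descending sort, sketched here with the hypothetical name `gravity` for the mapping from current channel to specific gravity value.

```python
from fractions import Fraction

# Specific gravity values of the first user's current channels (from the example).
gravity = {
    "first": Fraction(1, 2),
    "second": Fraction(1, 3),
    "third": Fraction(3, 4),
    "fourth": Fraction(1, 4),
}

# Rank the channels from the largest specific gravity value to the smallest.
ranking = sorted(gravity, key=gravity.get, reverse=True)
print(ranking)  # ['third', 'first', 'second', 'fourth']
```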
Then, the ranking of the current channels of the first user is reordered by using the maximum boundary correlation algorithm (maximal marginal relevance, MMR), so as to preserve the relevance of the ranking results while reducing redundant results and increasing diversity. Specifically, the formula of the maximum boundary correlation algorithm is as follows:
$$\mathrm{MMR} = \arg\max_{D_i \in R \setminus S} \left[ \lambda \, \mathrm{Sim}_1(D_i, Q) - (1 - \lambda) \max_{D_j \in S} \mathrm{Sim}_2(D_i, D_j) \right]$$
In the formula, Q represents a user; S represents the set of channels already selected; R\S represents the set of channels in the channel set R that have not yet been selected; Sim1(Di, Q) represents the specific gravity value of the current channel Di; Sim2(Di, Dj) represents the similarity between the current channel and a historical channel; and λ is an adjustable weighting coefficient.
For example, let λ be 0.5 (the value of λ can be adjusted according to actual conditions). The specific gravity value of the first current channel is 1/2, the specific gravity value of the second current channel is 1/3, the specific gravity value of the third current channel is 3/4, and the specific gravity value of the fourth current channel is 1/4. The maximum similarity between the first current channel and the historical channels is 0.2, the maximum similarity between the second current channel and the historical channels is 0.5, the maximum similarity between the third current channel and the historical channels is 0.3, and the maximum similarity between the fourth current channel and the historical channels is 0.8. Sorting the current channels by specific gravity value gives the third current channel, the first current channel, the second current channel and the fourth current channel, so the channel set is { third current channel, first current channel, second current channel, fourth current channel }. For the third current channel, the MMR value is 0.5 × 3/4 - 0.5 × 0.3 = 0.225; for the first current channel, the MMR value is 0.5 × 1/2 - 0.5 × 0.2 = 0.15; for the second current channel, the MMR value is 0.5 × 1/3 - 0.5 × 0.5 ≈ -0.08; and for the fourth current channel, the MMR value is 0.5 × 1/4 - 0.5 × 0.8 = -0.275. Because the MMR value of the third current channel is the largest in the channel set, the third current channel is extracted from the channel set, and the channel set becomes { first current channel, second current channel, fourth current channel }.
For convenience of description, the channel set before the change is referred to as a first channel set, and the channel set after the change is referred to as a second channel set, that is, the second channel set is { first current channel, second current channel, fourth current channel }. A channel is then extracted from the second channel set; if, for example, the fourth current channel is extracted, the second channel set becomes { first current channel, second current channel }, which is recorded as a third channel set. A channel is then extracted from the third channel set, for example the second current channel. In this case, the reordered sequence of the current channels of the first user is determined as the third current channel, the fourth current channel, the second current channel and the first current channel. Similarly, the ranking of the current channels of the second user, of the third user, and of each other user can be reordered by using the maximum boundary correlation algorithm.
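The first selection round of this example can be sketched as follows. The sketch assumes the standard maximal marginal relevance score, λ · Sim1 - (1 - λ) · max Sim2, with the specific gravity value as Sim1 and the stated maximum similarity as the redundancy term. The names `mmr_score` and `candidates` are hypothetical, and only the first round is covered.

```python
def mmr_score(gravity, max_similarity, lam=0.5):
    """Standard MMR trade-off: relevance minus a redundancy penalty."""
    return lam * gravity - (1 - lam) * max_similarity


# (specific gravity value, maximum similarity) per current channel, from the example.
candidates = {
    "first": (1 / 2, 0.2),
    "second": (1 / 3, 0.5),
    "third": (3 / 4, 0.3),
    "fourth": (1 / 4, 0.8),
}

scores = {
    name: mmr_score(gravity, max_sim)
    for name, (gravity, max_sim) in candidates.items()
}

# The channel with the largest MMR value is extracted first.
selected = max(scores, key=scores.get)
remaining = [name for name in candidates if name != selected]
print(scores)
print(selected, remaining)
```

Raising λ toward 1 makes the score follow the specific gravity value more closely, while lowering it penalizes similarity more heavily.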
In step S04, information recommendation is performed through the reordered current channel according to the priority of the user.
In this embodiment, suppose that the current channels ranked by specific gravity value are the third current channel, the first current channel, the second current channel and the fourth current channel, and that the reordered current channels are the first current channel, the second current channel, the third current channel and the fourth current channel. If the priority order of the users is the first user, the second user, the third user and the fourth user, information is recommended first to the first user, sequentially through the first current channel, the second current channel, the third current channel and the fourth current channel; then to the second user, sequentially through the same four channels in the same order; then to the third user in the same way; and finally to the fourth user in the same way.
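The recommendation order described above can be sketched as a nested iteration: users in priority order, and for each user the reordered current channels. The names `recommendation_plan` and `channel_orders` are hypothetical, and the actual delivery of information over a channel is outside the scope of this sketch.

```python
def recommendation_plan(user_priority, channel_orders):
    """Yield (user, channel) pairs: users by priority, then each user's
    reordered current channels in sequence."""
    for user in user_priority:
        for channel in channel_orders[user]:
            yield user, channel


user_priority = ["user_1", "user_2", "user_3", "user_4"]
reordered = ["channel_1", "channel_2", "channel_3", "channel_4"]
# In the example every user has the same reordered channel sequence.
channel_orders = {user: reordered for user in user_priority}

plan = list(recommendation_plan(user_priority, channel_orders))
for user, channel in plan:
    print(f"recommend to {user} via {channel}")
```

Keeping a separate channel sequence per user allows each user's reordering result to differ, even though the example uses one sequence for everyone.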
As described above in detail with respect to the information recommendation method provided in the embodiment of the present application, an embodiment of the present application further provides an information recommendation apparatus, which is shown in fig. 2 and includes:
an obtaining unit 201, configured to obtain historical channels of a user, determine, according to a specific gravity value of each historical channel, a total specific gravity value of each historical channel of the user, and determine, according to a total specific gravity value of each historical channel of the user, a priority of the user, where the historical channel is a historical channel through which the user obtains information;
a calculating unit 202, configured to calculate similarities between a target current channel and each of the historical channels, respectively, and determine a specific gravity value of the target current channel according to the similarities between the target current channel and each of the historical channels and the specific gravity value of the corresponding historical channel, where the target current channel is each of the current channels;
a sorting unit 203, configured to sort the current channels according to the specific gravity values of the current channels, and reorder the sort of the current channels by using a maximum boundary correlation algorithm;
and a recommending unit 204, configured to recommend information through the reordered current channel according to the priority of the user.
In this embodiment, the calculating unit 202 is specifically configured to calculate the similarity between the target current channel and each of the historical channels to obtain n weight values, where n is the number of the historical channels; multiply each weight value by the specific gravity value of the corresponding historical channel to obtain n products; and add the n products to obtain the specific gravity value of the target current channel.
In this embodiment, the obtaining unit 201 is specifically configured to obtain the historical channels of a user and summarize the historical channels; determine a specific gravity value of each historical channel according to the user usage number of each historical channel; and determine the sum of the specific gravity values of the historical channels of the user according to the specific gravity value of each historical channel.
In this embodiment, the apparatus further includes: a cleaning unit, configured to clean the data of the historical channels of the user, where the data cleaning includes removing null values and outliers.
In this embodiment, the calculating unit 202 is specifically configured to determine a feature vector of each historical channel according to the priority of the user and the user of each historical channel; determine a feature vector of the target current channel according to the priority of the user and the user of the current channel; and calculate cosine values of the feature vector of each historical channel and the feature vector of the target current channel, where each cosine value represents the similarity between the corresponding historical channel and the target current channel.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points.
The foregoing is only a preferred embodiment of the present invention. Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the present invention. Those skilled in the art can, using the methods and technical content disclosed above, make many possible variations and modifications to the technical solution of the present invention, or modify it into equivalent embodiments, without departing from the scope of the technical solution of the present invention. Therefore, any simple modification, equivalent change or modification made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the scope of protection of the technical solution of the present invention.