CN109670855A - Scoring method and apparatus for information flow platform authors - Google Patents

Scoring method and apparatus for information flow platform authors

Info

Publication number
CN109670855A
Authority
CN
China
Prior art keywords
author
log
user
information flow
flow platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811299493.8A
Other languages
Chinese (zh)
Inventor
陈翔
张济显
唐传洋
韩振岭
张颖
李伟力
赵国振
范强
任宝鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd
Priority to CN201811299493.8A
Publication of CN109670855A
Pending

Abstract

The present invention provides a scoring method and apparatus for authors on an information flow platform. The method comprises: obtaining raw user logs of the information flow platform from multiple different channels; parsing the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the log parsing rules of each channel; obtaining author logs from the backend database of the information flow platform; computing, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, wherein the evaluation indices comprise quality, productivity, popularity, professionalism, and credit rating; and computing a weighted sum of the quality, productivity, popularity, professionalism, and credit rating to obtain the author's evaluation score. The invention achieves multi-source data fusion, ensures the accuracy, stability, and availability of the data, and ensures that authors are evaluated fairly, objectively, and accurately.

Description

Scoring method and apparatus for information flow platform authors
Technical field
The present invention relates to the field of Internet technology, and in particular to a scoring method for information flow platform authors, a scoring apparatus for information flow platform authors, a computer storage medium, and a computing device.
Background
An information flow (feed) is a content stream that can be browsed by scrolling. Information flow platforms currently have very large user bases. To give users a good reading experience, platform authors need to be evaluated according to user behavior data so that author quality can be controlled.
In current enterprises, understanding users more fully requires obtaining different user behavior data from multiple products and dimensions, so that authors can be evaluated more comprehensively. However, company size and technical capability vary, and multi-source data suffers from huge volume, chaotic rules, cumbersome processes, data delays and anomalies, and differing business requirements. As a result, it is difficult to quickly build a practical, accurate, and stable multi-source data fusion system.
In addition, existing author evaluation systems use machine learning algorithms to evaluate author performance. A typical implementation is as follows: select the data of a subset of authors as a data set; label the authors with scores according to user behavior data; split the selected data set into a training set and a test set; train different regression models on the training set; use the test set to select the best regression model; and use the selected model to make predictions on the data to be evaluated. However, evaluating author performance with a regression model amounts to learning the rule by which labels were assigned to authors. A regression model always has some error rate and cannot guarantee 100% accuracy, so fairness cannot be guaranteed for all authors.
Therefore, there is a need for an author evaluation method that ensures the accuracy and stability of multi-source data and ensures fairness toward authors.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a scoring method for information flow platform authors, a scoring apparatus for information flow platform authors, a computer storage medium, and a computing device that overcome the above problems or at least partially solve them.
According to one aspect of the embodiments of the present invention, a scoring method for information flow platform authors is provided, comprising:
obtaining raw user logs of the information flow platform from multiple different channels;
parsing the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the log parsing rules of each channel;
obtaining author logs from the backend database of the information flow platform; and
scoring the authors of the information flow platform according to the parsed user logs and the author logs.
Optionally, the different channels include a mobile device application (APP) client and/or a PC APP client.
Optionally, after the author logs are obtained from the backend database of the information flow platform, the method further comprises:
saving the parsed user logs and the author logs to the Hadoop distributed file system.
Optionally, the parsed user logs and the author logs are associated through the Uniform Resource Locator (URL) of the article/video published by the author.
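As a minimal sketch of the URL-based association, the following Python groups parsed user events by URL and attaches them to the author record with the same URL. The record layout (`url`, `action`, `author_id`) and the function name `join_logs_by_url` are illustrative assumptions; the source does not specify a log schema.

```python
from collections import defaultdict


def join_logs_by_url(user_events, author_records):
    """Attach parsed user events to the author record sharing the same URL.

    Field names are hypothetical; only the idea of using the article/video
    URL as the join key comes from the text.
    """
    events_by_url = defaultdict(list)
    for event in user_events:
        events_by_url[event["url"]].append(event)

    joined = []
    for record in author_records:
        # Author records with no user activity get an empty event list.
        joined.append({**record, "events": events_by_url.get(record["url"], [])})
    return joined


user_events = [
    {"url": "https://example.com/a/1", "action": "click"},
    {"url": "https://example.com/a/1", "action": "share"},
    {"url": "https://example.com/a/2", "action": "click"},
]
author_records = [
    {"url": "https://example.com/a/1", "author_id": "author_a"},
    {"url": "https://example.com/a/3", "author_id": "author_b"},
]
joined = join_logs_by_url(user_events, author_records)
```

User events whose URL matches no author record (here `/a/2`) are simply left out of the result; whether such events should be reported or retried is not stated in the source.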
Optionally, after the authors of the information flow platform are scored, the method further comprises:
saving the scoring results for the authors of the information flow platform to a MySQL data table.
Optionally, scoring the authors of the information flow platform according to the parsed user logs and the author logs comprises:
computing, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indices.
Optionally, the evaluation indices comprise quality, productivity, popularity, professionalism, and credit rating;
calculating the author's evaluation score from the evaluation indices comprises:
computing a weighted sum of the quality, productivity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
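The weighted sum over the five evaluation indices can be sketched as follows. The dictionary keys, the example index values, and the weights are all hypothetical; the source does not disclose concrete weights or the scale of each index.

```python
def weighted_score(indicators, weights):
    """Return the weighted sum of evaluation indices.

    Both arguments map an index name to a number. The names and the
    weight values used below are illustrative assumptions.
    """
    missing = set(weights) - set(indicators)
    if missing:
        raise ValueError(f"missing indicators: {sorted(missing)}")
    return sum(weights[name] * indicators[name] for name in weights)


indicators = {
    "quality": 0.8,
    "productivity": 0.5,
    "popularity": 0.6,
    "professionalism": 0.4,
    "credit": 1.0,
}
weights = {
    "quality": 0.3,
    "productivity": 0.2,
    "popularity": 0.2,
    "professionalism": 0.1,
    "credit": 0.2,
}
score = weighted_score(indicators, weights)
```

Because the score is an explicit formula rather than a learned model, every author with the same index values receives the same score, which is the fairness property the text contrasts with regression-based evaluation.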
Optionally, the quality of each author is computed from user evaluation parameters of the articles/videos published by that author, wherein the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
Optionally, the quality Q(X) of each author is computed according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, the reading/viewing duration, log(average performance), and log(best performance) are each normalized.
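A minimal sketch of the Q(X) formula above, under stated assumptions: the source says each term is normalized but does not say how, so min-max normalization across authors is assumed here; the logarithm base is not given, so the natural logarithm is used (with min-max normalization the base does not change the result). All field names are hypothetical, and counts must be positive for the logarithm to be defined.

```python
import math


def min_max(values):
    """Min-max normalize to [0, 1]; an assumed normalization, not from the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(authors):
    """Compute Q(X) for each author as the sum of four normalized terms."""
    conversion = [
        a["click_rate"] + a["share_rate"] + a["comment_rate"]
        + a["favorite_rate"] + a["like_rate"] - a["dislike_rate"]
        for a in authors
    ]
    duration = [a["duration"] for a in authors]
    # avg_counts / max_counts hold clicks, shares, comments, favorites, likes.
    log_avg = [math.log(sum(a["avg_counts"])) for a in authors]
    log_best = [math.log(sum(a["max_counts"])) for a in authors]

    terms = [min_max(t) for t in (conversion, duration, log_avg, log_best)]
    return [sum(per_author) for per_author in zip(*terms)]


authors = [
    {"click_rate": 0.20, "share_rate": 0.05, "comment_rate": 0.03,
     "favorite_rate": 0.02, "like_rate": 0.10, "dislike_rate": 0.01,
     "duration": 120.0,
     "avg_counts": [300, 50, 40, 30, 80],
     "max_counts": [3000, 500, 400, 300, 800]},
    {"click_rate": 0.10, "share_rate": 0.01, "comment_rate": 0.01,
     "favorite_rate": 0.01, "like_rate": 0.02, "dislike_rate": 0.05,
     "duration": 30.0,
     "avg_counts": [30, 5, 4, 3, 8],
     "max_counts": [100, 40, 20, 10, 30]},
]
scores = quality_scores(authors)
```

With only two authors, min-max normalization sends the stronger author to 1 and the weaker to 0 on every term, so the example is mainly useful for checking the term structure.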
Optionally, the productivity characterizes the output efficiency of an author;
the productivity of each author is computed from the number of articles/videos the author has published and the author's publishing efficiency.
Optionally, the productivity P1 of each author is computed according to the following formula:
P1 = log(publication count) + publishing efficiency;
where the publication count is the total number of articles or videos published by the author,
the publishing efficiency is the sum of two ratios: the number of days within the designated time period on which the author published an article or video divided by the total number of days in the designated time period, and the number of weeks within the designated time period in which the author published an article or video divided by the total number of weeks in the designated time period,
and log(publication count) and the publishing efficiency are each normalized.
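The two raw terms of P1 can be sketched as below, before normalization. How weeks are delimited is not stated in the source; this sketch assumes consecutive 7-day blocks counted from the start of the period, and it assumes the author published at least once so that the logarithm is defined. Function and parameter names are illustrative.

```python
import math
from datetime import date


def productivity_terms(publish_dates, period_start, period_end):
    """Return (log(publication count), publishing efficiency) for one author.

    Weeks are assumed to be 7-day blocks starting at period_start.
    Normalization across authors is left to the caller.
    """
    in_period = [d for d in publish_dates if period_start <= d <= period_end]
    if not in_period:
        raise ValueError("author has no publications in the period")

    total_days = (period_end - period_start).days + 1
    total_weeks = math.ceil(total_days / 7)
    active_days = set(in_period)
    active_weeks = {(d - period_start).days // 7 for d in in_period}

    efficiency = len(active_days) / total_days + len(active_weeks) / total_weeks
    return math.log(len(in_period)), efficiency


# Four publications on three distinct days, falling in two of four weeks.
dates = [date(2018, 10, 1), date(2018, 10, 1), date(2018, 10, 2), date(2018, 10, 15)]
log_count, efficiency = productivity_terms(dates, date(2018, 10, 1), date(2018, 10, 28))
```

The efficiency term rewards authors who publish regularly: two authors with the same publication count differ if one spreads the posts over more days and weeks.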
Optionally, the designated time period is one month.
Optionally, the popularity of each author is computed from the user follow count, user view count, and user share count of the articles/videos published by that author.
Optionally, the popularity P2 of each author is computed according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
where log(user follow count), log(user view count), and log(user share count) are each normalized.
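A minimal sketch of the P2 formula, again assuming min-max normalization across authors (the source does not name the normalization) and strictly positive counts. The field names `follows`, `views`, and `shares` are hypothetical.

```python
import math


def min_max(values):
    """Min-max normalize to [0, 1]; an assumed normalization, not from the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def popularity_scores(authors):
    """Compute P2 per author as the sum of three normalized log counts."""
    terms = [
        min_max([math.log(a[key]) for a in authors])
        for key in ("follows", "views", "shares")
    ]
    return [sum(per_author) for per_author in zip(*terms)]


authors = [
    {"follows": 10, "views": 100, "shares": 1},
    {"follows": 100, "views": 1000, "shares": 10},
    {"follows": 1000, "views": 10000, "shares": 100},
]
p2 = popularity_scores(authors)
```

Taking the logarithm before normalizing compresses the gap between very large and moderate counts, so the middle author here lands halfway between the other two on each term even though the raw counts differ by factors of ten.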
Optionally, the professionalism characterizes the influence of an author in different fields;
the professionalism of each author is computed from the author's quality and productivity in the different fields.
Optionally, the professionalism P3 of each author in each field is computed according to the following formula:
P3 = (sum of quality and productivity in a given field) / (sum of quality and productivity across all fields).
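The P3 ratio can be sketched as a per-field share of the author's combined quality and productivity. The field names and input values are illustrative, and returning zero when the denominator is zero is an assumption the source does not address.

```python
def professionalism_by_field(field_scores):
    """Compute P3 for each field of one author.

    field_scores maps a field name to a (quality, productivity) pair.
    """
    totals = {field: q + p for field, (q, p) in field_scores.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Assumed handling for an author with no measurable output.
        return {field: 0.0 for field in totals}
    return {field: total / grand_total for field, total in totals.items()}


p3 = professionalism_by_field({
    "technology": (1.5, 0.5),
    "sports": (0.5, 0.5),
    "food": (0.5, 0.5),
})
```

By construction the P3 values of one author sum to 1 across fields, so a high value in one field means the author's output is concentrated there.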
Optionally, the credit rating C of each author is computed according to the following formula:
C = 100 - audit deductions - user complaint deductions;
where the criteria for audit deductions include at least one of the following:
violating laws and regulations, violating social morality, and containing harmful information.
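A minimal sketch of the credit rating formula. The per-violation deduction amounts and the violation keys are hypothetical; the source lists the deduction criteria but not how many points each one costs, and it does not say whether the result is clamped at zero (no clamping is applied here).

```python
# Hypothetical deduction amounts per audit criterion.
AUDIT_PENALTIES = {
    "violates_laws_and_regulations": 20,
    "violates_social_morality": 10,
    "contains_harmful_information": 10,
}


def credit_rating(audit_violations, complaint_deduction):
    """Compute C = 100 - audit deductions - user complaint deductions."""
    audit_deduction = sum(AUDIT_PENALTIES[v] for v in audit_violations)
    return 100 - audit_deduction - complaint_deduction


credit = credit_rating(["contains_harmful_information"], complaint_deduction=5)
```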
Optionally, computing the evaluation indices of each author's output performance within the specific time period from the parsed user logs and the author logs, and calculating the author's evaluation score from the evaluation indices, comprises:
computing separately, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and the evaluation indices of each author's video output performance within the specific time period;
calculating the author's overall article evaluation score and overall video evaluation score from the evaluation indices of article output performance and the evaluation indices of video output performance, respectively; and
computing a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
Optionally, computing separately the evaluation indices of each author's article output performance and video output performance within the specific time period comprises:
computing separately, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and video output performance in each field within the specific time period;
and calculating the author's overall article evaluation score and overall video evaluation score from the respective evaluation indices comprises:
calculating the author's article evaluation score and video evaluation score in each field from the evaluation indices of the author's article output performance and video output performance in that field, respectively;
computing a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score; and
computing a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
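The two-level aggregation described above (per-field scores, then overall article and video scores, then a comprehensive score) can be sketched as follows. All weights and scores are made-up values, and using the same field weights for articles and videos is an assumption; the source only says that weighted sums are taken.

```python
def weighted_sum(scores, weights):
    """Weighted sum over the keys of scores; weights must cover every key."""
    return sum(weights[key] * scores[key] for key in scores)


def comprehensive_score(article_by_field, video_by_field, field_weights, media_weights):
    """Aggregate per-field scores into article, video, and comprehensive scores."""
    article_total = weighted_sum(article_by_field, field_weights)
    video_total = weighted_sum(video_by_field, field_weights)
    overall = (media_weights["article"] * article_total
               + media_weights["video"] * video_total)
    return article_total, video_total, overall


article_total, video_total, overall = comprehensive_score(
    article_by_field={"technology": 80.0, "sports": 60.0},
    video_by_field={"technology": 50.0, "sports": 70.0},
    field_weights={"technology": 0.5, "sports": 0.5},
    media_weights={"article": 0.5, "video": 0.5},
)
```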
According to another aspect of the embodiments of the present invention, a scoring apparatus for information flow platform authors is further provided, comprising:
a user log obtaining module, adapted to obtain raw user logs of the information flow platform from multiple different channels;
a user log parsing module, adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the log parsing rules of each channel;
an author log obtaining module, adapted to obtain author logs from the backend database of the information flow platform; and
an author scoring statistics module, adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
Optionally, the different channels include a mobile device application (APP) client and/or a PC APP client.
Optionally, the apparatus further comprises:
a log data saving module, adapted to save the parsed user logs and the obtained author logs to the Hadoop distributed file system.
Optionally, the parsed user logs and the author logs are associated through the Uniform Resource Locator (URL) of the article/video published by the author.
Optionally, the apparatus further comprises:
a scoring result saving module, adapted to save the scoring results for the authors of the information flow platform to a MySQL data table after the author scoring statistics module has scored the authors of the information flow platform.
Optionally, the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, and calculate the author's evaluation score from the evaluation indices.
Optionally, the evaluation indices comprise quality, productivity, popularity, professionalism, and credit rating;
the author scoring statistics module is further adapted to:
compute a weighted sum of the quality, productivity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
Optionally, the author scoring statistics module is further adapted to:
compute the quality of each author from user evaluation parameters of the articles/videos published by that author, wherein the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
Optionally, the author scoring statistics module is further adapted to:
compute the quality Q(X) of each author according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, the reading/viewing duration, log(average performance), and log(best performance) are each normalized.
Optionally, the productivity characterizes the output efficiency of an author;
the author scoring statistics module is further adapted to:
compute the productivity of each author from the number of articles/videos the author has published and the author's publishing efficiency.
Optionally, the author scoring statistics module is further adapted to:
compute the productivity P1 of each author according to the following formula:
P1 = log(publication count) + publishing efficiency;
where the publication count is the total number of articles or videos published by the author,
the publishing efficiency is the sum of two ratios: the number of days within the designated time period on which the author published an article or video divided by the total number of days in the designated time period, and the number of weeks within the designated time period in which the author published an article or video divided by the total number of weeks in the designated time period,
and log(publication count) and the publishing efficiency are each normalized.
Optionally, the designated time period is one month.
Optionally, the author scoring statistics module is further adapted to:
compute the popularity of each author from the user follow count, user view count, and user share count of the articles/videos published by that author.
Optionally, the author scoring statistics module is further adapted to:
compute the popularity P2 of each author according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
where log(user follow count), log(user view count), and log(user share count) are each normalized.
Optionally, the professionalism characterizes the influence of an author in different fields;
the author scoring statistics module is further adapted to:
compute the professionalism of each author from the author's quality and productivity in the different fields.
Optionally, the author scoring statistics module is further adapted to:
compute the professionalism P3 of each author in each field according to the following formula:
P3 = (sum of quality and productivity in a given field) / (sum of quality and productivity across all fields).
Optionally, the author scoring statistics module is further adapted to:
compute the credit rating C of each author according to the following formula:
C = 100 - audit deductions - user complaint deductions;
where the criteria for audit deductions include at least one of the following:
violating laws and regulations, violating social morality, and containing harmful information.
Optionally, the author scoring statistics module is further adapted to:
compute separately, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and the evaluation indices of each author's video output performance within the specific time period;
calculate the author's overall article evaluation score and overall video evaluation score from the evaluation indices of article output performance and the evaluation indices of video output performance, respectively; and
compute a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
Optionally, the author scoring statistics module is further adapted to:
compute separately, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and video output performance in each field within the specific time period;
calculate the author's article evaluation score and video evaluation score in each field from the evaluation indices of the author's article output performance and video output performance in that field, respectively;
compute a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score; and
compute a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
According to yet another aspect of the embodiments of the present invention, a computer storage medium is further provided. The computer storage medium stores computer program code which, when run on a computing device, causes the computing device to execute the scoring method for information flow platform authors according to any of the above.
According to still another aspect of the embodiments of the present invention, a computing device is further provided, comprising:
a processor; and
a memory storing computer program code;
wherein the computer program code, when run by the processor, causes the computing device to execute the scoring method for information flow platform authors according to any of the above.
In the scoring method and apparatus for information flow platform authors proposed by the embodiments of the present invention, after raw user logs of the information flow platform are obtained from multiple different channels, the multi-source raw user logs are first parsed by a rule parsing engine constructed according to the log formats and parsing rules of each channel, yielding parsed user logs. Authors are then scored according to the parsed user logs and the author logs obtained from the backend database. Using the rule parsing engine to parse user logs of different sources and formats solves the problems of large data volume, dirty data, and chaotic parsing rules that arise during parsing when user logs come from multiple channels. It thereby achieves multi-source data fusion and ensures the accuracy, stability, and availability of the data.
Further, evaluation indices of each author's output performance within a specific time period are computed from the parsed user logs and the author logs. The evaluation indices comprise quality, productivity, popularity, professionalism, and credit rating, and the author's evaluation score is calculated from these indices. Evaluating an author's output performance with these five indices provides a fair, objective, and accurate evaluation system that handles different information sources effectively and is broadly applicable.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the present invention are more readily apparent, specific embodiments of the present invention are set out below.
The above and other objects, advantages, and features of the present invention will become clearer to those skilled in the art from the following detailed description of specific embodiments of the present invention taken in conjunction with the accompanying drawings.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a schematic diagram of an application scenario of a prior-art method that uses a machine learning algorithm to evaluate authors;
Fig. 2 shows a flowchart of the scoring method for information flow platform authors according to an embodiment of the present invention;
Fig. 3 shows a schematic flowchart of the scoring method for information flow platform authors according to another embodiment of the present invention;
Fig. 4 shows a schematic diagram of the data flow in the scoring method for information flow platform authors shown in Fig. 3;
Fig. 5 shows a schematic flowchart of calculating an author's evaluation indices and final evaluation score in the scoring method for information flow platform authors according to yet another embodiment of the present invention;
Fig. 6 shows a schematic structural diagram of the scoring apparatus for information flow platform authors according to an embodiment of the present invention; and
Fig. 7 shows a schematic structural diagram of the scoring apparatus for information flow platform authors according to another embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Information flow platforms currently have very large user bases. To give users a good reading experience, platform authors need to be evaluated according to user behavior data so that author quality can be controlled. The author evaluations can in turn be used to recommend and distribute content to users and to provide incentives to authors. In the prior art, however, the different user behavior data obtained from multiple products and dimensions suffers from huge multi-source data volume, chaotic rules, cumbersome processes, data delays and anomalies, and differing business requirements. As a result, accurate and stable multi-source data fusion cannot be achieved, and data availability is poor.
Further, the prior art usually uses machine learning algorithms to evaluate author performance. Fig. 1 shows a schematic diagram of an application scenario of a prior-art method that uses a machine learning algorithm to evaluate authors, in which the author to be evaluated is a public account and the quality scoring of public accounts is handled by a regression model. As shown in Fig. 1, in this author evaluation method, authors are first labeled with scores according to user behavior data, and the author data is split into training sample data and test sample data. Manual feature engineering is then applied to the training sample data: the statistical index data of each author is screened according to feature importance, and the screened training sample data is fed into regression models for learning, which completes training and yields multiple regression models. Next, the same manual feature engineering is applied to the test sample data, the regression models are evaluated on it, and the best regression model is selected. The regression model is then iteratively optimized to obtain the final regression model. Finally, the final regression model is used to predict scores for the public accounts to be evaluated. However, the inventors found that evaluating author performance with a regression model amounts to learning the rule by which labels were assigned to authors. A regression model always has some error rate and cannot guarantee 100% accuracy, so the result is unfair to the authors for whom the prediction is wrong.
To solve the above technical problems, an embodiment of the present invention proposes a scoring method for information flow platform authors. Fig. 2 shows a flowchart of the scoring method for information flow platform authors according to an embodiment of the present invention. Referring to Fig. 2, the method may include at least the following steps S202 to S208.
Step S202: obtain raw user logs of the information flow platform from multiple different channels.
Step S204: parse the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the log parsing rules of each channel.
Step S206: obtain author logs from the backend database of the information flow platform.
Step S208: score the authors of the information flow platform according to the parsed user logs and the author logs.
In the scoring method for information flow platform authors proposed by the embodiment of the present invention, after raw user logs of the information flow platform are obtained from multiple different channels, the multi-source raw user logs are first parsed by a rule parsing engine constructed according to the log formats and parsing rules of each channel, yielding parsed user logs. Authors are then scored according to the parsed user logs and the author logs obtained from the backend database. Using the rule parsing engine to parse user logs of different sources and formats solves the problems of large data volume, dirty data, and chaotic parsing rules that arise during parsing when user logs come from multiple channels. It thereby achieves multi-source data fusion and ensures the accuracy, stability, and availability of the data.
To evaluate authors more comprehensively, more complete user behavior data must be obtained from multiple different channels. When user data comes from different channels, the user logs of each channel are recorded in a different form (or format), so their parsing rules also differ, and conventional log parsing inevitably runs into dirty data and chaotic parsing rules. The present invention solves this problem by constructing a rule parsing engine in advance according to the log parsing rules of each channel. After step S202 above has collected user logs from multiple different channels, step S204 uses the constructed rule parsing engine to parse the collected user logs according to the parsing rules for each log source, thereby achieving multi-source data fusion and ensuring the accuracy of the user data.
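One way to picture such a rule parsing engine is as a registry that maps each channel to its own parsing rule and emits records in one common schema. The sketch below is an assumption-laden illustration, not the patented engine: the channel names, the two raw log formats (JSON and tab-separated), and the output fields are all invented for the example.

```python
import json


def parse_android(line):
    """Hypothetical rule: Android clients log one JSON object per line."""
    record = json.loads(line)
    return {"user_id": record["uid"], "url": record["url"], "action": record["event"]}


def parse_pc(line):
    """Hypothetical rule: PC clients log tab-separated user, action, URL."""
    user_id, action, url = line.rstrip("\n").split("\t")
    return {"user_id": user_id, "url": url, "action": action}


# The registry is the "engine": one parsing rule per channel.
PARSERS = {"android": parse_android, "pc": parse_pc}


def parse_logs(raw_logs):
    """Parse (channel, line) pairs into a common schema.

    Lines from unknown channels or lines that violate their channel's
    rule are set aside as rejected instead of polluting the output.
    """
    parsed, rejected = [], []
    for channel, line in raw_logs:
        try:
            parser = PARSERS[channel]
            parsed.append({**parser(line), "channel": channel})
        except (KeyError, ValueError):
            rejected.append((channel, line))
    return parsed, rejected


raw_logs = [
    ("android", '{"uid": "u1", "url": "https://example.com/a/1", "event": "click"}'),
    ("pc", "u2\tshare\thttps://example.com/a/1"),
    ("pc", "corrupted line"),
    ("ios", "no rule registered for this channel"),
]
parsed, rejected = parse_logs(raw_logs)
```

Supporting a new channel then means registering one more parsing rule, while downstream scoring code keeps reading the same common schema.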
Difference channel mentioned above may include mobile device application APP client and/or PC APP visitorFamily end etc., for example, the cell phone application client based on Android (Android) system, based on the cell phone application client of IOS system, baseIn the PC PC APP client of Windows operating system, PC APP client based on (SuSE) Linux OS etc..WithRecord has user to data such as the click of article/video, browsing, comments in the log of family.
In step S206 above, author logs are obtained from the back-end database of the information flow platform. The author logs record information such as the number, time, and field of the articles/videos each author has published.
In an optional embodiment of the invention, after the original user logs are parsed to obtain the parsed user logs and the author logs are obtained from the back-end database, the scoring method may further include the following step:
Saving the parsed user logs and the obtained author logs to the Hadoop Distributed File System.
The Hadoop Distributed File System (HDFS) is a distributed file system suited to running on commodity hardware and is deployed with the Hadoop toolset; its main advantage is improved read efficiency for clients. An HDFS cluster consists of one Namenode running on the master and multiple Datanodes running on the slaves. The Namenode manages the file system namespace and client access to the file system, while the Datanodes manage the stored data. Files are stored on Datanodes as blocks; a replica count is set for each block, and replicas of the same block are stored on different Datanodes for redundancy, which prevents data loss when the disk of a single Datanode fails. HDFS is therefore highly fault-tolerant and is designed to be deployed on low-cost hardware. It also provides high-throughput access to application data, offers failure-tolerant storage for massive data, and makes it much easier for applications to process very large data sets.
In the embodiment of the present invention, saving the parsed user logs and the obtained author logs to HDFS provides highly fault-tolerant, read-efficient processing of massive data, thereby ensuring the stability of the data.
Further, when the parsed user logs and the author logs are stored in HDFS, the parsed user logs are associated with the author logs through the uniform resource locator (URL) of the article/video (which may be referred to as news) published by the author, which improves the efficiency of data reading and processing in the subsequent log statistics. Note that "news" here is to be understood as information in a broad sense, such as trending news, entertainment information, and social information, and not only event news disseminated on television or the Internet.
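The URL-based association can be illustrated with a small in-memory join. This is a sketch under stated assumptions: the record keys `url` and `author_id` are hypothetical, and a real deployment would perform the join over data stored in HDFS rather than over Python lists.

```python
from collections import defaultdict


def associate_by_url(user_logs, author_logs):
    """Group parsed user events under the author who published each URL.

    user_logs:   iterable of dicts with at least a "url" key (assumed schema)
    author_logs: iterable of dicts with "url" and "author_id" keys (assumed schema)
    """
    url_to_author = {rec["url"]: rec["author_id"] for rec in author_logs}
    events_by_author = defaultdict(list)
    for event in user_logs:
        author_id = url_to_author.get(event["url"])
        if author_id is not None:  # skip events for unknown URLs
            events_by_author[author_id].append(event)
    return dict(events_by_author)
```

Building the URL-to-author index once means each user event is matched with a single dictionary lookup, which is the kind of saving the association is meant to provide during later statistics.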
In step S208 above, the authors of the information flow platform are scored according to the parsed user logs and the author logs, so that the authors' output performance can be monitored and managed.
In an optional embodiment of the invention, step S208 may be implemented as the following step:
Computing, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indicators.
The specific time period mentioned here can be set to any target period according to statistical needs, such as the most recent week, month, or year.
Further, the evaluation indicators mentioned above may include quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C. In this case, the author's evaluation score can be calculated as a weighted sum of these five indicators according to the following formula:
Score = a1 × Q(X) + a2 × P1 + a3 × P2 + a4 × P3 + a5 × C;
where a1, a2, a3, a4, and a5 are the weights of quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, respectively.
In a specific embodiment, a1, a2, a3, a4, and a5 may all be set to 1, in which case the formula for the author's evaluation score becomes:
Score = Q(X) + P1 + P2 + P3 + C.
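The weighted sum above translates directly into code. The function name and argument order below are illustrative choices; the default weights of 1 correspond to the specific embodiment in which a1 through a5 are all set to 1.

```python
def evaluation_score(q, p1, p2, p3, c, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Score = a1*Q(X) + a2*P1 + a3*P2 + a4*P3 + a5*C."""
    a1, a2, a3, a4, a5 = weights
    return a1 * q + a2 * p1 + a3 * p2 + a4 * p3 + a5 * c
```

Passing a different `weights` tuple lets a platform emphasize, for example, quality over popularity without changing how the individual indicators are computed.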
Evaluating authors' output performance with the five indicators of quality, production capacity, popularity, professionalism, and credit rating provides a fair, objective, and accurate evaluation system that copes effectively with different information sources and is generally applicable.
In an optional embodiment of the invention, after the authors of the information flow platform are scored in step S208, the method may further include the following step:
Saving the scoring results for the authors of the information flow platform to a MySQL data table.
MySQL is one of the most popular relational database management systems. A relational database stores data in separate tables, which improves processing speed and management flexibility. Saving the calculated results to a MySQL data table facilitates later reading and use of the evaluation data and ensures the availability of the data.
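As a rough illustration of persisting scores in a relational table, the sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a database server. The table name `author_score` and its columns are assumptions; with MySQL the same idea would go through a MySQL client library, and the upsert statement would differ (SQLite's `INSERT OR REPLACE` is not MySQL syntax).

```python
import sqlite3


def save_scores(conn, scores):
    """Upsert {author_id: score} into a table (SQLite stand-in for MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS author_score ("
        "author_id TEXT PRIMARY KEY, score REAL NOT NULL)"
    )
    # INSERT OR REPLACE is SQLite-specific; it overwrites an existing row
    # that has the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO author_score (author_id, score) VALUES (?, ?)",
        list(scores.items()),
    )
    conn.commit()
```

Keying the table on the author identifier means that re-running the statistics for a new period overwrites the previous score instead of accumulating duplicate rows.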
The computation of the five evaluation indicators, quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, is described further below.
(1) Quality Q(X)
Quality Q(X) characterizes how good the articles/videos published by each author are. In an optional embodiment, an author's quality may be computed from the user evaluation parameters of the articles/videos the author has published. These user evaluation parameters may include one or more of: article/video reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
In a preferred embodiment, the quality Q(X) of each author may be computed according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance).
In the formula above, the conversion rate is the conversion data over all articles/videos published by the author, defined as: conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate.
The reading/viewing duration is the time users spend reading/viewing all articles/videos published by the author.
The average performance is the average of the user evaluation data over all articles/videos published by the author, defined as: average performance = average click count + average share count + average comment count + average favorite count + average like count.
The best performance is the highest user evaluation data among all articles/videos published by the author, defined as: best performance = highest click count + highest share count + highest comment count + highest favorite count + highest like count.
The conversion rate, reading/viewing duration, average performance, and best performance all need to be normalized. Normalization is a way of making quantities dimensionless: a dimensional expression is transformed into a dimensionless one, that is, a scalar, which simplifies calculation and reduces magnitudes.
Further, because values such as click, share, and comment counts are usually large, the logarithm of the average performance and the best performance is taken before normalization to reduce their order of magnitude and further simplify calculation. Specifically, the logarithm may be taken to base e or base 10.
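The quality formula can be sketched as below. Two choices in this sketch are assumptions rather than part of the source: the text does not name a normalization method, so min-max scaling across the authors being compared is used; and `log1p`, that is log(1 + x) to base e, replaces a plain logarithm so that a count of zero does not raise an error. The dictionary keys are hypothetical names for the four inputs.

```python
import math


def min_max(values):
    """Scale values to [0, 1]; an assumed normalization, not one the source names."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(authors):
    """Return Q(X) for each author, in input order.

    authors: list of dicts with the hypothetical keys "conversion_rate",
    "duration", "avg_performance", and "best_performance".
    """
    conv = min_max([a["conversion_rate"] for a in authors])
    dur = min_max([a["duration"] for a in authors])
    # The logarithm is taken first, then the result is normalized.
    avg = min_max([math.log1p(a["avg_performance"]) for a in authors])
    best = min_max([math.log1p(a["best_performance"]) for a in authors])
    return [c + d + m + b for c, d, m, b in zip(conv, dur, avg, best)]
```

With min-max scaling each of the four terms lies between 0 and 1, so Q(X) lies between 0 and 4 and no single raw count dominates the sum.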
(2) Production capacity P1
Production capacity P1 characterizes the author's output efficiency. In an optional embodiment, an author's production capacity may be computed from the number of articles/videos the author publishes and the author's publication efficiency.
In a preferred embodiment, the production capacity P1 of the author may be computed according to the following formula:
P1 = log(publication count) + publication efficiency.
In the formula above, the publication count is the total number of articles or videos published by the author.
The publication efficiency is the sum of two ratios: the number of days in a specified time period on which the author published an article or video divided by the total number of days in that period, and the number of weeks in the period in which the author published an article or video divided by the total number of weeks in that period. The specified time period mentioned here may be the same as or different from the specific time period mentioned above.
In a specific embodiment, the specified time period may be one month. The publication efficiency can then be expressed as:
publication efficiency = (days with publications in the month / days in the month) + (weeks with publications in the month / weeks in the month).
Both the publication count and the publication efficiency need to be normalized. Further, the logarithm of the publication count is taken before normalization to reduce its order of magnitude.
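The monthly publication efficiency can be computed as in the following sketch, which covers only the efficiency term of P1 and leaves out normalization. The source does not say how the weeks of a month are counted; counting the ISO calendar weeks that overlap the month is an assumption made here.

```python
import calendar
from datetime import date


def publication_efficiency(publish_dates, year, month):
    """(active days / days in month) + (active weeks / weeks in month)."""
    days_in_month = calendar.monthrange(year, month)[1]
    month_days = [date(year, month, d) for d in range(1, days_in_month + 1)]
    # Assumption: a "week" is an ISO (year, week) pair overlapping the month.
    weeks_in_month = {tuple(d.isocalendar())[:2] for d in month_days}
    # Distinct days in the target month on which something was published.
    active_days = {d for d in publish_dates if d.year == year and d.month == month}
    active_weeks = {tuple(d.isocalendar())[:2] for d in active_days}
    return len(active_days) / days_in_month + len(active_weeks) / len(weeks_in_month)
```

Because both ratios are at most 1, the efficiency term lies between 0 and 2; an author who publishes a little every week scores higher on the weekly ratio than one who publishes the same number of items in a single burst.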
(3) Popularity P2
Popularity P2 (also translated as "heat") characterizes how well received or followed the author is. In an optional embodiment, an author's popularity may be computed from the user follow count, the user view count (or media view count), and the user share count (or media share count) of the articles/videos the author has published.
In a preferred embodiment, the popularity P2 of the author may be computed according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count).
The user follow count, user view count, and user share count all need to be normalized. Further, their logarithms are taken before normalization to reduce their order of magnitude.
(4) Professionalism P3
Professionalism P3 characterizes the author's influence in different fields. In an optional embodiment, an author's professionalism may be computed from the author's quality and production capacity in the different fields.
In a preferred embodiment, the author's professionalism P3 in each field may be computed according to the following formula:
P3 = (sum of quality and production capacity in a given field) / (sum of quality and production capacity across all fields).
The different fields mentioned above can be divided according to actual needs, for example into current affairs, sports, entertainment, technology, and so on.
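The per-field ratio can be written as a short function. The input layout, a mapping from field name to a (quality, production capacity) pair, is a hypothetical choice for this sketch, as is returning 0 for every field when the overall sum is zero.

```python
def professionalism(field_stats):
    """Return {field: P3} given {field: (quality, production_capacity)}."""
    total = sum(q + p for q, p in field_stats.values())
    if total == 0:
        # Assumed convention: no output at all means no specialization.
        return {field: 0.0 for field in field_stats}
    return {field: (q + p) / total for field, (q, p) in field_stats.items()}
```

For an author with `{"sports": (1.0, 1.0), "technology": (3.0, 3.0)}`, the function returns 0.25 for sports and 0.75 for technology, so the values across fields sum to 1 and show where the author's output is concentrated.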
(5) Credit rating C
Credit rating C is intended to encourage original work and penalize violations.
In a preferred embodiment, the credit rating C of each author may be computed according to the following formula:
C = 100 - audit deductions - user complaint deductions.
The criteria for audit deductions may include at least one of: violating laws and regulations, violating social ethics, and containing harmful information (such as pornographic content).
The deduction values for the audit deduction and user complaint deduction items can be set by the platform itself. For example, 10 points may be deducted for violating laws and regulations, 5 points for containing harmful information, and 3 points for each user complaint.
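Using the example deduction values just given, the credit rating can be computed as follows. The violation keys are hypothetical labels; no value is given in the text for violations of social ethics, so that item is left out of the table rather than guessed.

```python
# Example values from the text; a platform would set its own.
AUDIT_DEDUCTIONS = {
    "illegal": 10,       # violating laws and regulations
    "harmful_info": 5,   # containing harmful information
}
COMPLAINT_DEDUCTION = 3  # per user complaint


def credit_rating(violations, complaint_count):
    """C = 100 - audit deductions - user complaint deductions."""
    audit = sum(AUDIT_DEDUCTIONS[v] for v in violations)
    return 100 - audit - COMPLAINT_DEDUCTION * complaint_count
```

An author with one of each listed violation and two user complaints would receive 100 - 15 - 6 = 79.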
Various implementations of each step of the embodiment shown in Fig. 2 have been described above. The implementation process of the scoring method for information flow platform authors of the present invention is described in detail below through a specific embodiment.
Fig. 3 shows a flow diagram of a scoring method for information flow platform authors according to another embodiment of the present invention. The scoring method of this embodiment is described below with reference to Fig. 3. As shown in Fig. 3, the scoring method may include the following steps:
Step 1: log collection.
In this step, original user logs are collected from multiple different channels, and author logs are collected at the same time from the back-end database of the information flow platform.
In this embodiment, channels 1 to 4 shown in Fig. 3 may be, respectively, a mobile APP client based on the Android system, a mobile APP client based on the iOS system, a PC APP client based on the Windows operating system, and a PC APP client based on the Linux operating system. Note that the number and names of the channels from which user logs are collected in Fig. 3 are only illustrative, and the present invention is not limited to them.
Step 2: log parsing.
Because user data sources are diverse, a rule parsing engine needs to be built according to the log formats and log parsing rules of the different channels. In this step, the constructed rule parsing engine parses the collected original user logs according to the parsing rules of each log source, achieving multi-source data fusion and ensuring the accuracy of the parsed data.
Step 3: log storage.
After the original user logs are parsed, the parsed user logs and the collected author logs are stored in HDFS, which provides highly fault-tolerant, read-efficient processing of massive data and thus ensures the stability of the data. During log storage, the user logs and author logs are associated through the news URL. Note that "news" here is to be understood as information published by authors in a broad sense, such as trending news, entertainment information, and social information, and not only event news disseminated on television or the Internet.
Step 4: log statistics.
In this step, the author evaluation system model constructed by the present invention is used to compute, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period.
In the author evaluation system model of the present invention, the evaluation indicators of an author's output performance cover five dimensions: quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C. Each is calculated as described above.
Step 5: score calculation.
In this step, the author's evaluation score is calculated as a weighted sum of quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C.
Further, the evaluation indicators of the author's output performance obtained from the log statistics and the calculated evaluation score of the author are saved to a MySQL data table, which facilitates later reading and use of the evaluation data and ensures the availability of the data.
Further, after the authors' evaluation result data are stored, a corresponding application interface can be developed on the platform side. The score data in the MySQL data table are read through this application interface and used to recommend and distribute content to users, or to provide incentives to authors through suitable operational measures.
By building a rule parsing engine, the embodiment of the present invention achieves multi-source data fusion and effectively addresses the large data volume, dirty data, and inconsistent rules encountered during data parsing, ensuring the accuracy, stability, and availability of the data. At the same time, it provides a general evaluation system for information flow platform authors that can ensure fair, objective, and accurate author evaluation, effectively supports algorithmic content distribution, and can increase authors' enthusiasm for creating content through suitable operational measures.
Fig. 4 further shows a flow diagram of the data flow in the scoring method for information flow platform authors shown in Fig. 3. The data flow in the scoring method of the embodiment of the present invention is described below with reference to Fig. 4.
As shown in Fig. 4, first, after the user logs and author logs are collected, the platform side parses the collected user logs with the constructed rule parsing engine and stores the parsed user logs and the collected author logs in HDFS as periodic raw data.
Then, the data analysis part obtains the periodic raw data (that is, the parsed user logs and the author logs) from HDFS, performs the model calculation on these data with the author evaluation system model to obtain score data (including the evaluation indicators of the author's output performance and the author's final evaluation score), and saves the score data to a MySQL data table as evaluation model data.
Finally, a corresponding application interface is developed on the platform side. The score data in the MySQL data table are read through this application interface and used to recommend and distribute content to users, or to provide incentives to authors through suitable operational measures.
Fig. 5 shows a flow diagram of calculating an author's evaluation indicators and final evaluation score in a scoring method for information flow platform authors according to another embodiment of the present invention. The process of calculating the author's evaluation indicators and final evaluation score in this embodiment is described below with reference to Fig. 5.
As shown in Fig. 5, after the original user logs and author logs are obtained and the original user logs are parsed to obtain the parsed user logs, the evaluation indicators of each author's article output performance and of each author's video output performance within the specific time period are first computed separately from the parsed user logs and the author logs. Specifically, the evaluation indicators of the author's article output performance within the specific time period, including quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, are computed from the parsed user logs and author logs corresponding to the articles published by the author. Likewise, the evaluation indicators of the author's video output performance within the specific time period, including quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, are computed from the parsed user logs and author logs corresponding to the videos published by the author.
Then, the author's overall article evaluation score and overall video evaluation score are calculated from the evaluation indicators of the author's article output performance and of the author's video output performance, respectively.
Finally, a weighted sum of the author's overall article evaluation score and overall video evaluation score gives the author's comprehensive evaluation score. The weights of the overall article evaluation score and the overall video evaluation score can be set according to actual needs, and the present invention places no limitation on them.
More preferably, still referring to Fig. 5, computing the evaluation indicators of each author's article output performance within the specific time period may be implemented as: computing, from the parsed user logs and author logs corresponding to the articles published by the author, the evaluation indicators of the author's article output performance in each field within the specific time period. Calculating each author's overall article evaluation score may then be implemented as: calculating the author's article evaluation score in each field from the evaluation indicators of the author's article output performance in that field, and then taking a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score. The fields mentioned here may include current affairs, sports, entertainment, technology, and so on. The weight of the article evaluation score in each field can be set according to actual needs, and the present invention places no limitation on it.
Similarly, computing the evaluation indicators of each author's video output performance within the specific time period may be implemented as: computing, from the parsed user logs and author logs corresponding to the videos published by the author, the evaluation indicators of the author's video output performance in each field within the specific time period. Calculating each author's overall video evaluation score may then be implemented as: calculating the author's video evaluation score in each field from the evaluation indicators of the author's video output performance in that field, and then taking a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score. The fields mentioned here may include current affairs, sports, entertainment, technology, and so on. The weight of the video evaluation score in each field can be set according to actual needs, and the present invention places no limitation on it.
Evaluating an author's article output and video output separately and then combining the results yields a more objective and accurate evaluation of the author. Further, computing the evaluation indicators and overall evaluation scores of the author's article and video output separately for each field further simplifies the calculation.
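The two-level aggregation described with reference to Fig. 5 can be sketched as nested weighted sums. The dictionary layouts and the weight names `article` and `video` are assumptions for illustration; the source leaves all weights to be set according to actual needs.

```python
def weighted_sum(scores, weights):
    """Sum scores[k] * weights[k] over the keys of scores."""
    return sum(weights[key] * value for key, value in scores.items())


def comprehensive_score(article_by_field, video_by_field, field_weights, media_weights):
    """Combine per-field article and video scores into one author score."""
    # Level 1: per-field scores -> overall article score and overall video score.
    article_total = weighted_sum(article_by_field, field_weights)
    video_total = weighted_sum(video_by_field, field_weights)
    # Level 2: overall article and video scores -> comprehensive score.
    return media_weights["article"] * article_total + media_weights["video"] * video_total
```

For example, with field weights of 0.5 each for sports and technology, article scores of 2.0 and 4.0, video scores of 1.0 and 3.0, and media weights of 0.5 each, the overall article score is 3.0, the overall video score is 2.0, and the comprehensive score is 2.5.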
Based on the same inventive concept, an embodiment of the present invention also provides a scoring apparatus for information flow platform authors, which supports the scoring method for information flow platform authors provided by any one of the above embodiments or a combination thereof. Fig. 6 shows a structural diagram of the scoring apparatus for information flow platform authors according to an embodiment of the present invention. Referring to Fig. 6, the apparatus may include at least: a user log acquisition module 610, a user log parsing module 620, an author log acquisition module 630, and an author scoring statistics module 640.
The functions of the components of the scoring apparatus for information flow platform authors of the embodiment of the present invention, and the connections between them, are now described:
The user log acquisition module 610 is adapted to obtain the original user logs of the information flow platform from multiple different channels.
The user log parsing module 620 is connected to the user log acquisition module 610 and is adapted to parse the original user logs with a rule parsing engine to obtain parsed user logs, where the rule parsing engine is built according to the log parsing rules of the respective channels.
The author log acquisition module 630 is adapted to obtain author logs from the back-end database of the information flow platform.
The author scoring statistics module 640 is connected to the user log parsing module 620 and the author log acquisition module 630, respectively, and is adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
In an optional embodiment, the different channels mentioned above include mobile device application (APP) clients and/or PC APP clients.
In an optional embodiment, as shown in Fig. 7, the scoring apparatus for information flow platform authors shown in Fig. 6 may further include a log data saving module 750. The log data saving module 750 is connected to the user log parsing module 620, the author log acquisition module 630, and the author scoring statistics module 640, respectively, and is adapted to save the parsed user logs and the obtained author logs to the Hadoop Distributed File System. The author scoring statistics module 640 then obtains the parsed user logs and the author logs from the Hadoop Distributed File System to perform the author scoring statistics.
In an optional embodiment, when the parsed user logs and the author logs are stored in HDFS, the parsed user logs and the author logs are associated through the uniform resource locator (URL) of the article/video published by the author.
In an optional embodiment, still referring to Fig. 7, the scoring apparatus for information flow platform authors may further include a scoring result saving module 760. The scoring result saving module 760 is connected to the author scoring statistics module 640 and is adapted to save the scoring results for the authors of the information flow platform to a MySQL data table after the author scoring statistics module 640 has scored the authors.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculate the author's evaluation score from the evaluation indicators.
In an optional embodiment, the evaluation indicators of an author's output performance include quality, production capacity, popularity, professionalism, and credit rating.
Correspondingly, the author scoring statistics module 640 is further adapted to:
take a weighted sum of each author's quality, production capacity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute each author's quality from the user evaluation parameters of the articles/videos the author has published, where the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute the quality Q(X) of each author according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average click count + average share count + average comment count + average favorite count + average like count,
best performance = highest click count + highest share count + highest comment count + highest favorite count + highest like count,
and the conversion rate, reading/viewing duration, log(average performance), and log(best performance) are all normalized.
In an optional embodiment, production capacity characterizes the author's output efficiency. Correspondingly, the author scoring statistics module 640 is further adapted to:
compute each author's production capacity from the number of articles/videos the author publishes and the author's publication efficiency.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute the production capacity P1 of each author according to the following formula:
P1 = log(publication count) + publication efficiency;
where the publication count is the total number of articles or videos published by the author,
the publication efficiency is the sum of two ratios: the number of days in a specified time period on which the author published an article or video divided by the total number of days in that period, and the number of weeks in the period in which the author published an article or video divided by the total number of weeks in that period,
and log(publication count) and the publication efficiency are both normalized.
In an optional embodiment, the specified time period mentioned above is one month.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute each author's popularity from the user follow count, user view count, and user share count of the articles/videos the author has published.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute the popularity P2 of each author according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
where log(user follow count), log(user view count), and log(user share count) are all normalized.
In an optional embodiment, professionalism characterizes the author's influence in different fields. Correspondingly, the author scoring statistics module 640 is further adapted to:
compute each author's professionalism from the author's quality and production capacity in the different fields.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute each author's professionalism P3 in each field according to the following formula:
P3 = (sum of quality and production capacity in a given field) / (sum of quality and production capacity across all fields).
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute the credit rating C of each author according to the following formula:
C = 100 - audit deductions - user complaint deductions;
where the criteria for audit deductions include at least one of the following:
violating laws and regulations, violating social ethics, and containing harmful information.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute separately, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and of each author's video output performance within a specific time period;
calculate the author's overall article evaluation score and overall video evaluation score from the evaluation indicators of the author's article output performance and of the author's video output performance, respectively;
take a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
In an optional embodiment, the author scoring statistics module 640 is further adapted to:
compute separately, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and of each author's video output performance in each field within a specific time period;
calculate the author's article evaluation score and video evaluation score in each field from the evaluation indicators of the author's article output performance and of the author's video output performance in that field, respectively;
take a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score;
take a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
Based on the same inventive concept, an embodiment of the present invention also provides a computer storage medium. The computer storage medium stores computer program code which, when run on a computing device, causes the computing device to perform the scoring method for information flow platform authors according to any one of the above embodiments or a combination thereof.
Based on the same inventive concept, an embodiment of the present invention also provides a computing device. The computing device may include:
a processor; and
a memory storing computer program code;
where the computer program code, when run by the processor, causes the computing device to perform the scoring method for information flow platform authors according to any one of the above embodiments or a combination thereof.
According to any one of the above alternative embodiments, or a combination of several of them, the embodiments of the present invention can achieve the following beneficial effects:
The scoring method and device for information flow platform authors proposed by the embodiments of the present invention obtain the raw user logs of the information flow platform from multiple different channels, and first parse these multi-source raw user logs with a rule parsing engine built from the log formats and parsing rules of the respective channels, yielding parsed user logs; the authors are then scored according to the parsed user logs and the author logs obtained from the backend database. Using a rule parsing engine to parse user logs of different origins and formats addresses the problems that arise when user logs come from multiple channels, namely large data volume, dirty data, and disorderly parsing rules; it achieves multi-source data fusion and ensures the accuracy, stability, and availability of the data.
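The per-channel parsing described above can be illustrated with a minimal sketch. The channel names, the two log layouts (JSON lines and tab-separated fields), and the output keys `url`, `event`, and `user` are all assumptions made for illustration; the source does not specify any concrete log format or engine interface.

```python
import json
from typing import Callable, Dict, Optional


def parse_mobile(line: str) -> dict:
    # Assumed layout: one JSON object per line (hypothetical field names).
    rec = json.loads(line)
    return {"url": rec["url"], "event": rec["evt"], "user": rec["uid"]}


def parse_pc(line: str) -> dict:
    # Assumed layout: tab-separated "user<TAB>event<TAB>url".
    user, event, url = line.rstrip("\n").split("\t")
    return {"url": url, "event": event, "user": user}


class RuleParsingEngine:
    """Dispatches each raw log line to the rule registered for its channel."""

    def __init__(self) -> None:
        self._rules: Dict[str, Callable[[str], dict]] = {}

    def register(self, channel: str, rule: Callable[[str], dict]) -> None:
        self._rules[channel] = rule

    def parse(self, channel: str, line: str) -> Optional[dict]:
        rule = self._rules.get(channel)
        if rule is None:
            return None  # no rule for this channel: skip the record
        try:
            return rule(line)
        except (ValueError, KeyError, TypeError):
            return None  # dirty record: drop it instead of failing the batch


engine = RuleParsingEngine()
engine.register("mobile", parse_mobile)
engine.register("pc", parse_pc)
```

Keeping one rule per channel behind a single `parse` entry point means a new channel only needs a new rule function, and every channel produces the same record shape for the later scoring steps.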
Further, the evaluation indicators of each author's output performance within a specific time period are computed from the parsed user logs and the author logs; the indicators include quality, productivity, popularity, professionalism, and credit, and the author's evaluation score is calculated from them. Evaluating an author's output performance with these five indicators provides a fair, objective, and accurate evaluation system that copes effectively with different information sources and is broadly applicable.
Those skilled in the art will clearly understand that, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; for brevity, they are not repeated here.
In addition, the functional units in the embodiments of the present invention may be physically independent of one another, two or more functional units may be integrated together, or all functional units may be integrated in a single processing unit. The integrated functional units may be implemented in hardware, or in software or firmware.
Those of ordinary skill in the art will understand that, if the integrated functional units are implemented in software and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computing device (such as a personal computer, a server, or a network device) to perform all or some of the steps of the methods of the embodiments of the present invention when the instructions are run. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, all or some of the steps of the foregoing method embodiments may be carried out by hardware under the control of program instructions (for example, a computing device such as a personal computer, a server, or a network device). The program instructions may be stored in a computer-readable storage medium; when they are executed by the processor of the computing device, the computing device performs all or some of the steps of the methods of the embodiments of the present invention.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that, within the spirit and principles of the present invention, the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the corresponding technical solutions to depart from the protection scope of the present invention.
According to one aspect of the embodiments of the present invention, there is provided A1. a scoring method for information flow platform authors, comprising:
Obtaining raw user logs of the information flow platform from multiple different channels;
Parsing the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the respective log parsing rules of the different channels;
Obtaining author logs from the backend database of the information flow platform;
Scoring the authors of the information flow platform according to the parsed user logs and the author logs.
A2. The method according to A1, wherein the different channels include a mobile device application (APP) client and/or a personal computer APP client.
A3. The method according to A1 or A2, further comprising, after obtaining the author logs from the backend database of the information flow platform:
Saving the parsed user logs and the author logs into a Hadoop distributed file system.
A4. The method according to A3, wherein the parsed user logs and the author logs are associated with each other through the uniform resource locator (URL) of the article/video published by the author.
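The URL-based association in A4 can be sketched as a simple join. The record shapes (a `url` key on both sides and an `author_id` key in the author log) are assumptions for illustration; the source states only that the two kinds of log are linked through the article/video URL.

```python
from collections import defaultdict


def join_logs_by_url(user_logs, author_logs):
    """Group parsed user events under the author who published each URL."""
    # Author log: one record per published article/video (assumed shape).
    url_to_author = {rec["url"]: rec["author_id"] for rec in author_logs}
    events_by_author = defaultdict(list)
    for event in user_logs:
        author = url_to_author.get(event["url"])
        if author is not None:  # ignore events for URLs with no known author
            events_by_author[author].append(event)
    return dict(events_by_author)
```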
A5. The method according to any one of A1-A4, further comprising, after scoring the authors of the information flow platform:
Saving the scoring results for the authors of the information flow platform into a MySQL data table.
A6. The method according to any one of A1-A5, wherein scoring the authors of the information flow platform according to the parsed user logs and the author logs comprises:
Computing, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indicators.
A7. The method according to A6, wherein the evaluation indicators include quality, productivity, popularity, professionalism, and credit;
Calculating the author's evaluation score from the evaluation indicators comprises:
Taking a weighted sum of the quality, productivity, popularity, professionalism, and credit to obtain the author's evaluation score.
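The weighted sum in A7 can be sketched in a few lines. The source does not give the weights, so the values used in the usage note are purely hypothetical, as is the assumption that all five indicators have already been brought to a comparable scale.

```python
INDICATORS = ("quality", "productivity", "popularity", "professionalism", "credit")


def evaluation_score(indicators, weights):
    """Weighted sum of the five evaluation indicators for one author."""
    return sum(weights[name] * indicators[name] for name in INDICATORS)
```

For example, with hypothetical weights `{"quality": 0.3, "productivity": 0.2, "popularity": 0.2, "professionalism": 0.15, "credit": 0.15}`, an author whose indicators are all 0.5 receives a score of 0.5.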
A8. The method according to A7, wherein the author's quality is computed from user evaluation parameters of the articles/videos published by each author, the user evaluation parameters including one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
A9. The method according to A8, wherein the author's quality Q(X) is computed according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
Wherein conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
Average performance = average clicks + average shares + average comments + average favorites + average likes,
Best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
And the conversion rate, the reading/viewing duration, log(average performance), and log(best performance) are each normalized.
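The quality formula in A9 can be sketched as follows. Several points are assumptions, since the source does not fix them: normalization is taken to be min-max scaling across the authors being compared, the logarithm is natural, and `log1p` (that is, log(1 + x)) replaces a plain log so that zero counts remain defined. The dictionary keys are hypothetical names.

```python
import math


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(authors):
    """authors: list of dicts with keys conversion, duration, avg_perf, best_perf."""
    terms = [
        [a["conversion"] for a in authors],
        [a["duration"] for a in authors],
        [math.log1p(a["avg_perf"]) for a in authors],   # assumed log(1 + x)
        [math.log1p(a["best_perf"]) for a in authors],  # assumed log(1 + x)
    ]
    normalized = [min_max(column) for column in terms]
    # Q(X) is the sum of the four normalized terms for each author.
    return [sum(per_author) for per_author in zip(*normalized)]
```

Under these assumptions each term contributes between 0 and 1, so Q(X) lies between 0 and 4.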
A10. The method according to A7, wherein the productivity characterizes the author's output efficiency;
The author's productivity is computed from the number of articles/videos published by each author and the author's publishing efficiency.
A11. The method according to A10, wherein the author's productivity P1 is computed according to the following formula:
P1 = log(publication count) + publishing efficiency;
Wherein the publication count is the total number of articles or videos published by the author,
The publishing efficiency is the sum of two ratios: the number of days in a specified time period on which the author published an article or video divided by the total number of days in the period, and the number of weeks in the period in which the author published an article or video divided by the total number of weeks in the period,
And log(publication count) and the publishing efficiency are both normalized.
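The publishing efficiency term in A11 can be sketched as follows. How weeks are counted is not stated in the source; this sketch assumes ISO calendar weeks and counts every week that overlaps the period. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def publishing_efficiency(publish_dates, start, end):
    """Active-day ratio plus active-week ratio over the period [start, end]."""
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # Assumption: a "week" is an ISO (year, week) pair overlapping the period.
    all_weeks = {d.isocalendar()[:2] for d in days}
    active_days = {d for d in publish_dates if start <= d <= end}
    active_weeks = {d.isocalendar()[:2] for d in active_days}
    return len(active_days) / len(days) + len(active_weeks) / len(all_weeks)
```

The raw value lies between 0 and 2; per A11 it would then be normalized, together with log(publication count), before the two are added to give P1.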
A12. The method according to A11, wherein the specified time period is one month.
A13. The method according to A7, wherein the author's popularity is computed from the user follow count, user view count, and user share count of the articles/videos published by each author.
A14. The method according to A13, wherein the author's popularity P2 is computed according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
Wherein log(user follow count), log(user view count), and log(user share count) are each normalized.
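A minimal sketch of the popularity formula in A14, under stated assumptions: min-max normalization across the compared authors, natural logarithm, and `log1p` in place of a plain log so that zero counts are defined. None of these choices is fixed by the source, and the keys `follows`, `views`, and `shares` are hypothetical.

```python
import math


def popularity_scores(authors):
    """authors: list of dicts with integer counts follows, views, shares."""
    scores = [0.0] * len(authors)
    for key in ("follows", "views", "shares"):
        logs = [math.log1p(a[key]) for a in authors]  # assumed log(1 + x)
        lo, hi = min(logs), max(logs)
        span = hi - lo
        for i, value in enumerate(logs):
            # Min-max normalize each log term before adding it to P2.
            scores[i] += (value - lo) / span if span else 0.0
    return scores
```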
A15. The method according to any one of A7-A12, wherein the professionalism characterizes the author's influence in different fields;
The author's professionalism is computed from each author's quality and productivity in the different fields.
A16. The method according to A15, wherein the author's professionalism P3 in each field is computed according to the following formula:
P3 = (sum of quality and productivity in a given field) / (sum of quality and productivity across all fields).
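The ratio in A16 can be sketched directly. The input shape, a mapping from field name to a (quality, productivity) pair for a single author, and the zero-total fallback are assumptions for illustration.

```python
def professionalism(per_field):
    """per_field: {field: (quality, productivity)} for one author."""
    total = sum(q + p for q, p in per_field.values())
    if total == 0:
        # Assumption: an author with no measurable output gets 0 everywhere.
        return {field: 0.0 for field in per_field}
    return {field: (q + p) / total for field, (q, p) in per_field.items()}
```

Because every field is divided by the same total, the P3 values of one author sum to 1 across fields, so P3 reads as the share of the author's output concentrated in each field.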
A17. The method according to A7, wherein the credit C of each author is computed according to the following formula:
C = 100 - audit deductions - user complaint deductions;
Wherein the criteria for audit deductions include at least one of the following:
Violating laws and regulations, violating social ethics, and containing objectionable information.
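The credit formula in A17 amounts to subtracting accumulated deductions from a starting score of 100. In this sketch the deduction amounts are inputs; the source gives the criteria for audit deductions but not their point values, and it does not say whether the result is clamped, so no lower bound is applied here.

```python
def credit_score(audit_deductions, complaint_deductions):
    """C = 100 - audit deductions - user complaint deductions."""
    return 100 - sum(audit_deductions) - sum(complaint_deductions)
```

For instance, `credit_score([5, 10], [3])` gives 82 when two audit findings cost 5 and 10 points and one upheld complaint costs 3 (all hypothetical values).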
A18. The method according to any one of A6-A17, wherein computing, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indicators, comprises:
Separately computing, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and of the author's video output performance within the specific time period;
Calculating the author's overall article evaluation score and overall video evaluation score from the evaluation indicators of the article output performance and of the video output performance, respectively;
Taking a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
A19. The method according to A18, wherein separately computing, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and video output performance within the specific time period comprises:
Separately computing, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and video output performance in each field within the specific time period;
And wherein calculating the author's overall article evaluation score and overall video evaluation score from the evaluation indicators of the article output performance and of the video output performance, respectively, comprises:
Calculating the author's article evaluation score and video evaluation score in each field from the author's article indicators and video indicators in that field, respectively;
Taking a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score;
Taking a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
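The two-level aggregation in A18 and A19 can be sketched as nested weighted sums: per-field scores roll up into an overall article score and an overall video score, which are then combined. The field weights and the article/video weights are not given in the source; the names and the values in the usage note are hypothetical.

```python
def weighted_sum(scores, weights):
    """Sum of weights[key] * scores[key] over the keys present in scores."""
    return sum(weights[key] * value for key, value in scores.items())


def comprehensive_score(article_by_field, video_by_field, field_weights, type_weights):
    # Level 1: combine per-field scores into overall article and video scores.
    article_total = weighted_sum(article_by_field, field_weights)
    video_total = weighted_sum(video_by_field, field_weights)
    # Level 2: combine the two overall scores into the comprehensive score.
    return weighted_sum({"article": article_total, "video": video_total}, type_weights)
```

With equal field weights of 0.5 for two fields, article scores of 80 and 60, video scores of 50 and 70, and equal article/video weights of 0.5, the overall article score is 70, the overall video score is 60, and the comprehensive score is 65.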
According to another aspect of the embodiments of the present invention, there is further provided B20. a scoring device for information flow platform authors, comprising:
A user log acquisition module, adapted to obtain raw user logs of the information flow platform from multiple different channels;
A user log parsing module, adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the respective log parsing rules of the different channels;
An author log acquisition module, adapted to obtain author logs from the backend database of the information flow platform; and
An author scoring statistics module, adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
B21. The device according to B20, wherein the different channels include a mobile device application (APP) client and/or a personal computer APP client.
B22. The device according to B20 or B21, further comprising:
A log data saving module, adapted to save the parsed user logs and the obtained author logs into a Hadoop distributed file system.
B23. The device according to B22, wherein the parsed user logs and the author logs are associated with each other through the uniform resource locator (URL) of the article/video published by the author.
B24. The device according to any one of B20-B23, further comprising:
A scoring result saving module, adapted to save the scoring results for the authors of the information flow platform into a MySQL data table after the author scoring statistics module has scored the authors.
B25. The device according to any one of B20-B24, wherein the author scoring statistics module is further adapted to:
Compute, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculate the author's evaluation score from the evaluation indicators.
B26. The device according to B25, wherein the evaluation indicators include quality, productivity, popularity, professionalism, and credit;
The author scoring statistics module is further adapted to:
Take a weighted sum of the quality, productivity, popularity, professionalism, and credit to obtain the author's evaluation score.
B27. The device according to B26, wherein the author scoring statistics module is further adapted to:
Compute the author's quality from user evaluation parameters of the articles/videos published by each author, the user evaluation parameters including one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
B28. The device according to B27, wherein the author scoring statistics module is further adapted to:
Compute the author's quality Q(X) according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
Wherein conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
Average performance = average clicks + average shares + average comments + average favorites + average likes,
Best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
And the conversion rate, the reading/viewing duration, log(average performance), and log(best performance) are each normalized.
B29. The device according to B26, wherein the productivity characterizes the author's output efficiency;
The author scoring statistics module is further adapted to:
Compute the author's productivity from the number of articles/videos published by each author and the author's publishing efficiency.
B30. The device according to B29, wherein the author scoring statistics module is further adapted to:
Compute the author's productivity P1 according to the following formula:
P1 = log(publication count) + publishing efficiency;
Wherein the publication count is the total number of articles or videos published by the author,
The publishing efficiency is the sum of two ratios: the number of days in a specified time period on which the author published an article or video divided by the total number of days in the period, and the number of weeks in the period in which the author published an article or video divided by the total number of weeks in the period,
And log(publication count) and the publishing efficiency are both normalized.
B31. The device according to B30, wherein the specified time period is one month.
B32. The device according to B26, wherein the author scoring statistics module is further adapted to:
Compute the author's popularity from the user follow count, user view count, and user share count of the articles/videos published by each author.
B33. The device according to B32, wherein the author scoring statistics module is further adapted to:
Compute the author's popularity P2 according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
Wherein log(user follow count), log(user view count), and log(user share count) are each normalized.
B34. The device according to any one of B26-B31, wherein the professionalism characterizes the author's influence in different fields;
The author scoring statistics module is further adapted to:
Compute the author's professionalism from each author's quality and productivity in the different fields.
B35. The device according to B34, wherein the author scoring statistics module is further adapted to:
Compute the author's professionalism P3 in each field according to the following formula:
P3 = (sum of quality and productivity in a given field) / (sum of quality and productivity across all fields).
B36. The device according to B26, wherein the author scoring statistics module is further adapted to:
Compute the credit C of each author according to the following formula:
C = 100 - audit deductions - user complaint deductions;
Wherein the criteria for audit deductions include at least one of the following:
Violating laws and regulations, violating social ethics, and containing objectionable information.
B37. The device according to any one of B25-B36, wherein the author scoring statistics module is further adapted to:
Separately compute, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and of the author's video output performance within the specific time period;
Calculate the author's overall article evaluation score and overall video evaluation score from the evaluation indicators of the article output performance and of the video output performance, respectively;
Take a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
B38. The device according to B37, wherein the author scoring statistics module is further adapted to:
Separately compute, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and video output performance in each field within the specific time period;
Calculate the author's article evaluation score and video evaluation score in each field from the author's article indicators and video indicators in that field, respectively;
Take a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score;
Take a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
According to yet another aspect of the embodiments of the present invention, there is further provided C39. a computer storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the scoring method for information flow platform authors according to any one of A1-A19.
According to still another aspect of the embodiments of the present invention, there is further provided D40. a computing device, comprising:
A processor; and
A memory storing computer program code;
When the computer program code is run by the processor, the computing device is caused to perform the scoring method for information flow platform authors according to any one of A1-A19.

Claims (10)

Translated from Chinese
1. A scoring method for authors of an information flow platform, comprising:
obtaining raw user logs of the information flow platform from multiple different channels;
parsing the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the respective log parsing rules of the different channels;
obtaining author logs from the backend database of the information flow platform;
scoring the authors of the information flow platform according to the parsed user logs and the author logs.
2. The method according to claim 1, wherein the different channels include a mobile device application (APP) client and/or a personal computer APP client.
3. The method according to claim 1 or 2, further comprising, after obtaining the author logs from the backend database of the information flow platform:
saving the parsed user logs and the author logs into a Hadoop distributed file system.
4. The method according to claim 3, wherein the parsed user logs and the author logs are associated with each other through the uniform resource locator (URL) of the article/video published by the author.
5. The method according to any one of claims 1-4, further comprising, after scoring the authors of the information flow platform:
saving the scoring results for the authors of the information flow platform into a MySQL data table.
6. The method according to any one of claims 1-5, wherein scoring the authors of the information flow platform according to the parsed user logs and the author logs comprises:
computing, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indicators.
7. The method according to claim 6, wherein the evaluation indicators include quality, productivity, popularity, professionalism, and credit;
calculating the author's evaluation score from the evaluation indicators comprises:
taking a weighted sum of the quality, productivity, popularity, professionalism, and credit to obtain the author's evaluation score.
8. A scoring device for authors of an information flow platform, comprising:
a user log acquisition module, adapted to obtain raw user logs of the information flow platform from multiple different channels;
a user log parsing module, adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is constructed according to the respective log parsing rules of the different channels;
an author log acquisition module, adapted to obtain author logs from the backend database of the information flow platform; and
an author scoring statistics module, adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
9. A computer storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the scoring method for authors of an information flow platform according to any one of claims 1-7.
10. A computing device, comprising:
a processor; and
a memory storing computer program code;
wherein the computer program code, when run by the processor, causes the computing device to perform the scoring method for authors of an information flow platform according to any one of claims 1-7.
Application number: CN201811299493.8A
Title: The methods of marking and device of information flow platform author
Priority date: 2018-11-02
Filing date: 2018-11-02
Status: Pending
Publication: CN109670855A (en), published 2019-04-23
Family ID: 66141771
Family applications: 1 (country: CN)

Legal Events

Date | Code | Title | Description
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 
 | RJ01 | Rejection of invention patent application after publication | 

Application publication date: 2019-04-23

