CN106383917A - Data processing method based on user logs - Google Patents

Data processing method based on user logs

Info

Publication number: CN106383917A
Application number: CN201610997324.6A
Authority: CN (China)
Prior art keywords: user, data set, user log, log, data
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Other languages: Chinese (zh)
Inventor: 许伟刚
Current Assignee: SUZHOU TIANPING ADVANCED DIGITAL TECHNOLOGIES Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: SUZHOU TIANPING ADVANCED DIGITAL TECHNOLOGIES Co Ltd
Priority date: 2016-11-11 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2016-11-11
Publication date: 2017-02-08

Abstract

The invention discloses a data processing method based on user logs. The method comprises the following steps: collecting user logs in real time and building a first data set; marking the logs and building a second data set from the marked user logs; performing real-time precomputation and building a dynamic precomputed data set; matching the user's query field against the data in the dynamic precomputed data set and pushing successfully matched data to the user as the query result; if matching fails, continuing with the following steps: extracting from the second data set the user log data similar to the user's query field and building a third data set; classifying that data and building a fourth data set; establishing a linear regression model and calculating the degree of association of each query field; building a fifth data set; and finally determining N results as the query results and pushing them to the user. Because results are computed in real time, query results can be retrieved quickly.

Description

A data processing method based on user logs
Technical field
The present invention relates to the field of information technology, and in particular to a data processing method based on user logs.
Background
Log files are produced while a system runs. They record the operating state of the system and the operations performed by users, so when the system becomes slow or behaves abnormally, the problem can be diagnosed by inspecting the log files and normal operation restored. User logs are also an important source of information: on social networking or commercial websites, mining the user logs can reveal users' latent access patterns and inform the design of pages that are more convenient to use.
In the search field, log-based query recommendation falls into three categories: association-rule recommendation, clustering-based recommendation, and time-distribution recommendation. In the association-rule approach, query phrases are treated as the items of association rules and the query log is treated as a set of sessions, so that high-frequency terms within a session are recommended. The clustering approach clusters query strings to find related queries; it needs a large amount of rich log data to support it. Time-distribution recommendation considers whether the search frequencies of similar queries are similarly distributed over time; particular points in time usually have particular queries and recommendations, and this approach can serve as a supplement to the other methods.
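As an illustration of the association-rule idea described above, the following minimal sketch counts how often two queries occur in the same session and recommends the most frequent companions. It is not taken from the patent: the function names, the session layout (a list of query strings per session), and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_counts(sessions):
    """Count, for each query, how often every other query shares a session with it."""
    counts = {}
    for session in sessions:
        # Use a set so a query repeated inside one session is counted once.
        for a, b in permutations(set(session), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(counts, query, k=3):
    """Return up to k queries that most often co-occur with `query`."""
    companions = counts.get(query, Counter())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(companions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [q for q, _ in ranked[:k]]


# Hypothetical sessions, invented for illustration only.
sessions = [
    ["python", "pandas", "numpy"],
    ["python", "pandas"],
    ["python", "flask"],
]
counts = cooccurrence_counts(sessions)
print(recommend(counts, "python", k=2))
```

A full association-rule miner would also compute support and confidence thresholds; this sketch keeps only the raw co-occurrence count.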
In the traditional approach, the server computes the relevant query fields only when the user issues a query. Real-time computation cannot be achieved, the computational load is heavy, queries are relatively slow, and the demands on the database are high, so the approach no longer meets the development needs of current retrieval systems.
Summary of the invention
The object of the present invention is to overcome the above problems of the prior art by providing a data processing method based on user logs. The invention computes in real time and can retrieve query results more quickly.
To achieve the above technical object and technical effect, the invention is realized through the following technical solution:
A data processing method based on user logs, comprising the following steps:
S101: collect user logs in real time, filter the collected logs to obtain valid user logs, and build a first data set;
S102: mark the user logs in the first data set, and build a second data set from the marked user logs;
S103: perform real-time precomputation on the second data set, and build a dynamic precomputed data set;
S104: match the user's query field against the data in the dynamic precomputed data set; successfully matched data is pushed to the user as the query result, and if matching fails, proceed to the next step;
S105: extract from the second data set the user log data that is similar to the user's query field, and build a third data set;
S106: classify the user log data in the third data set: logs with identical or similar query fields are grouped by query string, or logs with the same cluster mark are grouped together, or logs with the same query frequency and time are grouped together; the classification module builds a fourth data set;
S107: establish a linear regression model according to the query rules, feed the user logs that match the query field into the linear regression model to obtain a processed composite model, and calculate the degree of association of each query field;
S108: in the fourth data set, find the user logs that match the query field entered by the user as the query set, and build a fifth data set;
S109: in the fifth data set, sort according to the degree of association obtained by the first data processing module, and finally determine N results as the query results and push them to the user.
Preferably, the user logs in S101 are collected in real time by a user log collection end.
Preferably, the user log collection end allows user logs to be customized, and selectively collects user logs according to a custom log format, log type, log content, and log key features.
Preferably, the user logs collected by the user log collection end are stored temporarily.
Preferably, the marks in S102 include: historical query field, query string, time, and cluster name.
Preferably, the precomputation in S103 calculates the probability that the user will query again from the frequency of historical query fields and the source of the user logs, and sorts by probability.
Preferably, 1 ≤ N ≤ 10, where N is an integer.
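Steps S103 and S104 can be pictured with a short sketch. The patent states only that the re-query probability is derived from the frequency of historical query fields and the source of the logs; it gives no formula. The source-weighted, normalized frequency below is therefore an assumption, as are the names `SOURCE_WEIGHTS`, `precompute`, and `match` and the log record layout.

```python
from collections import defaultdict

# Hypothetical per-source weights; the patent does not specify any values.
SOURCE_WEIGHTS = {"search_box": 1.0, "suggestion_click": 0.5}


def precompute(marked_logs, source_weights=SOURCE_WEIGHTS):
    """S103 sketch: map each query field to an estimated re-query probability.

    The returned dict is ordered from most to least probable.
    """
    scores = defaultdict(float)
    for log in marked_logs:
        # Unknown sources fall back to a neutral weight of 1.0 (an assumption).
        scores[log["query_field"]] += source_weights.get(log["source"], 1.0)
    total = sum(scores.values())
    if total == 0:
        return {}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return {field: score / total for field, score in ranked}


def match(query_field, precomputed):
    """S104 sketch: return the precomputed entry, or None to fall through to S105."""
    if query_field in precomputed:
        return query_field, precomputed[query_field]
    return None


# Hypothetical marked logs, invented for illustration only.
marked_logs = [
    {"query_field": "weather", "source": "search_box"},
    {"query_field": "weather", "source": "suggestion_click"},
    {"query_field": "news", "source": "search_box"},
]
precomputed = precompute(marked_logs)
print(list(precomputed))             # query fields, most probable first
print(match("sports", precomputed))  # a miss, so S105 to S109 would run next
```

Keeping the precomputed set as a plain dictionary makes the S104 lookup a constant-time operation, which is the point of computing ahead of the query.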
The beneficial effects of the invention are:
The invention is based on user logs and computes in real time, so recommendation results can be retrieved quickly. The precomputation module of the system computes results in advance, and the matching module then matches against them. If the match succeeds, the result is pushed directly to the user; precomputing results improves the efficiency of pushing them. If no result was precomputed, it is calculated on demand.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, preferred embodiments of the invention are described in detail below with reference to the accompanying drawing. The specific embodiments of the invention are given in detail by the following embodiments and the drawing.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the invention more clearly, the drawing needed for the description of the embodiments is briefly introduced below. Obviously, the drawing described below shows only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from it without creative effort.
Fig. 1 is the flow chart of the present invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawing in the embodiments. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
With reference to Fig. 1, a data processing method based on user logs relies on a user log collection end and a data processing end. The user log collection end collects the operation logs of the user side in real time and transmits the collected user logs to the data processing end.
The user log collection end allows user logs to be customized. It selectively collects user logs according to a custom log format, log type, log content, and log key features, and the collected user logs are stored temporarily.
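A rough sketch of such a collection end follows. The patent does not define a log format, so the line layout (`<time> <type> <content>`), the `CollectionRule` class, and the in-memory list used as temporary storage are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class CollectionRule:
    """Hypothetical user-defined rule describing which logs to keep."""
    pattern: re.Pattern   # custom log format
    log_types: frozenset  # accepted log types
    keywords: tuple = ()  # key features; empty means "accept any content"


# Assumed line layout: "<time> <type> <content>".
LINE_FORMAT = re.compile(r"^(?P<time>\S+) (?P<type>\w+) (?P<content>.+)$")


def collect(lines, rule):
    """Keep only the lines that satisfy the rule; return them as a temporary buffer."""
    buffer = []
    for line in lines:
        m = rule.pattern.match(line)
        if m is None:
            continue  # does not follow the custom format
        record = m.groupdict()
        if record["type"] not in rule.log_types:
            continue  # unwanted log type
        if rule.keywords and not any(k in record["content"] for k in rule.keywords):
            continue  # none of the key features present
        buffer.append(record)
    return buffer


# Hypothetical raw lines, invented for illustration only.
raw_lines = [
    "2016-11-11T10:00:00 QUERY weather suzhou",
    "2016-11-11T10:00:01 DEBUG heartbeat",
    "malformed line",
    "2016-11-11T10:00:02 QUERY stock prices",
]
rule = CollectionRule(LINE_FORMAT, frozenset({"QUERY"}), keywords=("weather",))
collected = collect(raw_lines, rule)
print(collected)
```

In a real deployment the buffer would be flushed to the data processing end; the list here only stands in for the temporary storage the text mentions.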
The data processing end computes on the user logs collected in real time. Results that were precomputed can be pushed as query results very quickly; if a result was not precomputed, it is calculated at query time.
Specifically, the method comprises the following steps:
S101: collect user logs in real time, filter the collected logs to obtain valid user logs, and build a first data set.
S102: mark the user logs in the first data set, and build a second data set from the marked user logs. The marks include: historical query field, query string, time, and cluster name.
S103: perform real-time precomputation on the second data set, and build a dynamic precomputed data set. The precomputation calculates the probability that the user will query again from the frequency of historical query fields and the source of the user logs, and sorts by probability.
S104: match the user's query field against the data in the dynamic precomputed data set. Successfully matched data is pushed to the user as the query result; if matching fails, proceed to the next step.
S105: extract from the second data set the user log data that is similar to the user's query field, and build a third data set.
S106: classify the user log data in the third data set: logs with identical or similar query fields are grouped by query string, or logs with the same cluster mark are grouped together, or logs with the same query frequency and time are grouped together. The classification module builds a fourth data set.
S107: establish a linear regression model according to the query rules, feed the user logs that match the query field into the linear regression model to obtain a processed composite model, and calculate the degree of association of each query field.
S108: in the fourth data set, find the user logs that match the query field entered by the user as the query set, and build a fifth data set.
S109: in the fifth data set, sort according to the degree of association obtained by the first data processing module, and finally determine N results as the query results and push them to the user.
In the above, 1 ≤ N ≤ 10, where N is an integer.
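The fallback path (S105 to S109) can be sketched as follows. The patent names the techniques but not their details, so this example makes several labeled assumptions: similarity is measured with `difflib.SequenceMatcher`, classification uses only the "group by query string" option, the linear regression predicts a hypothetical `click_rate` field from a hypothetical `frequency` field, and the predicted value stands in for the degree of association.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def similar_logs(query, marked_logs, threshold=0.5):
    """S105 sketch: keep logs whose query field resembles the user's query."""
    return [log for log in marked_logs
            if SequenceMatcher(None, query, log["query_field"]).ratio() >= threshold]


def classify(logs):
    """S106 sketch: group logs by query string (one of the three listed options)."""
    groups = defaultdict(list)
    for log in logs:
        groups[log["query_field"]].append(log)
    return dict(groups)


def fit_line(xs, ys):
    """S107 sketch: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # all x equal: fall back to the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x


def top_n(query, marked_logs, n=3):
    """S105 to S109 sketch: return the n query fields with the highest score."""
    if not 1 <= n <= 10:
        raise ValueError("n must satisfy 1 <= n <= 10")
    third = similar_logs(query, marked_logs)
    if not third:
        return []
    fourth = classify(third)
    slope, intercept = fit_line([log["frequency"] for log in third],
                                [log["click_rate"] for log in third])
    # S108: one candidate per group, scored by the fitted line.
    fifth = []
    for field, group in fourth.items():
        mean_freq = sum(log["frequency"] for log in group) / len(group)
        fifth.append((field, slope * mean_freq + intercept))
    # S109: sort by descending score, ties broken alphabetically.
    fifth.sort(key=lambda item: (-item[1], item[0]))
    return [field for field, _ in fifth[:n]]


# Hypothetical marked logs, invented for illustration only.
marked_logs = [
    {"query_field": "weather suzhou", "frequency": 10, "click_rate": 0.5},
    {"query_field": "weather shanghai", "frequency": 20, "click_rate": 0.9},
    {"query_field": "weather suzhou today", "frequency": 4, "click_rate": 0.3},
    {"query_field": "stock prices", "frequency": 50, "click_rate": 0.1},
]
print(top_n("weather", marked_logs, n=2))
```

The bound check on `n` mirrors the stated constraint 1 ≤ N ≤ 10; everything else about the scoring is a placeholder for whatever features and targets a real system would use.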
The above method is based on user logs and computes in real time, so recommendation results can be retrieved quickly. The precomputation module of the system computes results in advance, and the matching module then matches against them. If the match succeeds, the result is pushed directly to the user; precomputing results improves the efficiency of pushing them. If no result was precomputed, it is calculated on demand.
The above description of the disclosed embodiments enables a person skilled in the art to make or use the invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201610997324.6A / CN106383917A (en) | 2016-11-11 | 2016-11-11 | Data processing method based on user logs

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201610997324.6A / CN106383917A (en) | 2016-11-11 | 2016-11-11 | Data processing method based on user logs

Publications (1)

Publication Number | Publication Date
CN106383917A (en) | 2017-02-08

Family

ID=57958502

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201610997324.6A | Pending | CN106383917A (en) | 2016-11-11 | 2016-11-11 | Data processing method based on user logs

Country Status (1)

Country | Link
CN (1) | CN106383917A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2003141410A (en) * | 2001-10-30 | 2003-05-16 | Hitachi Ltd | Advertisement management system and method for internet site
CN102609433A (en) * | 2011-12-16 | 2012-07-25 | 北京大学 | Method and system for recommending query based on user log
CN103838885A (en) * | 2014-03-31 | 2014-06-04 | 苏州大学 | Advertisement-putting-oriented potential user searching and user model ordering method
CN103838867A (en) * | 2014-03-20 | 2014-06-04 | 网宿科技股份有限公司 | Log processing method and device
CN105550264A (en) * | 2015-12-09 | 2016-05-04 | 苏州天平先进数字科技有限公司 | User journal collecting and processing system and method
CN105808685A (en) * | 2016-03-02 | 2016-07-27 | 腾讯科技(深圳)有限公司 | Promotion information pushing method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110019333A (en) * | 2017-09-30 | 2019-07-16 | 北京国双科技有限公司 | The display methods and device of data field
CN108459921A (en) * | 2018-02-23 | 2018-08-28 | 北京奇艺世纪科技有限公司 | Collapse file memory method, device and electronic equipment
CN109032909A (en) * | 2018-07-18 | 2018-12-18 | 携程旅游信息技术(上海)有限公司 | Processing method, system, equipment and the storage medium of application crash log
CN114662822A (en) * | 2020-12-23 | 2022-06-24 | 中国移动通信有限公司研究院 | Audit model determination method and device and electronic equipment
CN116010600A (en) * | 2023-01-09 | 2023-04-25 | 北京天融信网络安全技术有限公司 | Log classification method, device, equipment and medium
CN116010600B (en) * | 2023-01-09 | 2023-09-26 | 北京天融信网络安全技术有限公司 | Log classification method, device, equipment and medium

Similar Documents

Publication | Title
CN106383917A (en) | Data processing method based on user logs
CN107220237A (en) | A kind of method of business entity's Relation extraction based on convolutional neural networks
CN102542061B (en) | Intelligent product classification method
CN102722709A (en) | Method and device for identifying garbage pictures
CN105718585B (en) | Document and tag word semantic association method and device
CN102567494A (en) | Website classification method and device
CN112149422B (en) | Dynamic enterprise news monitoring method based on natural language
CN109145180A (en) | A kind of enterprise hot spots event method for digging based on increment cluster
CN106844782B (en) | Network-oriented multi-channel big data acquisition system and method
CN116881430A (en) | Industrial chain identification method and device, electronic equipment and readable storage medium
CN101339560A (en) | Method and device for searching series data, device and search engine system
CN104615734A (en) | Community management service big data processing system and processing method thereof
CN105808262A (en) | Json format data-based naming matching method
CN101673262B (en) | Method for searching audio content
CN116361367A (en) | Content identification system and method for efficiently publishing recruitment information
CN110968596A (en) | Data processing method based on label system
CN104199947A (en) | Important person speech supervision and incidence relation excavating method
CN109740147A (en) | A kind of big quantity personnel resume duplicate removal Match Analysis
CN108021657A (en) | A kind of similar author's searching method based on document title semantic information
CN107943937A (en) | A kind of debtors assets monitoring method and system based on trial open information analysis
CN106528798A (en) | Data processing system based on user logs
CN106919686A (en) | A kind of electric model searching method
KR101487871B1 (en) | Manual Auto-generating device for Crisis Management Response of Online-based.
CN106844539A (en) | Real-time data analysis method and system
CN116956930A (en) | Short text information extraction method and system integrating rules and learning models

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication

Application publication date: 2017-02-08
