CN103488656A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number: CN103488656A
Authority: CN (China)
Prior art keywords: classification, taxon, data, reception data, result
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201210196534.7A
Other languages: Chinese (zh)
Other versions: CN103488656B (en)
Inventor: 罗景
Current Assignee: Shenzhen Shiji Guangsu Information Technology Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Shenzhen Shiji Guangsu Information Technology Co Ltd
Priority date: 2012-06-14 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2012-06-14
Publication date: 2014-01-01

Application filed by Shenzhen Shiji Guangsu Information Technology Co Ltd
Priority to CN201210196534.7A
Publication of CN103488656A
Application granted
Publication of CN103488656B
Legal status: Active
Anticipated expiration


Abstract

The invention relates to the field of computer technology and provides a data processing method and device. The method comprises: receiving data and sending the received data to at least one classification unit; obtaining the processing result of the at least one classification unit; and determining the category of the received data according to the processing result, wherein each classification unit classifies the data it receives according to a determined classification method. Because the processing results of multiple classification units are combined flexibly, data processing during classification remains simple and convenient even when the classification system and the data being classified change dynamically.

Description

Data processing method and device
Technical field
The invention belongs to the technical field of data processing, and in particular relates to a data processing method and device.
Background
Classification means sorting objects into categories and identifying the directory each one belongs to, so that the objects are easier to use and store. Examples include classifying documents and querying data; classifying information makes it easier for users to browse it and to analyze the data further. The goal of classification is to learn from data and then automatically assign data to known categories. Commonly used classification methods include the support vector machine (SVM) classification algorithm, the k-nearest neighbor (KNN) classification algorithm and Bayesian classification algorithms. These methods all basically learn from some known data to form a classification model, and then use the model to predict the category of unknown data.
In the prior art, classification is usually implemented by having different classifiers classify different objects. A classifier is a computer program whose goal is to learn and then automatically assign data to known categories. Classifiers can be applied in search engines and various search programs, and are also widely used in data analysis and prediction.
Prior-art classification methods achieve good results when the classification system is fixed and the data are relatively stable. However, when the classification system and the data to be classified change dynamically, earlier learning results are difficult to reuse directly: the training data must be relabeled and a new classification model trained, which makes data processing during classification complex.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a data processing method that addresses a problem in prior-art classification: data processing during classification becomes complex when the classification system and the data to be classified change dynamically.
To achieve this purpose, the embodiments of the present invention provide the following technical solutions:
An embodiment of the present invention provides a data processing method, the method comprising:
Receiving data and sending the data to at least one classification unit;
Obtaining the processing result of the at least one classification unit;
Determining the category of the received data according to the processing result;
Wherein the classification unit is configured to classify the data it receives according to a determined classification method.
An embodiment of the present invention also provides a classification device, the device comprising:
A receiving unit, configured to receive data;
A sending unit, configured to send the data to at least one classification unit;
An acquiring unit, configured to obtain the processing result of the at least one classification unit;
A determining unit, configured to determine the category of the received data according to the processing result;
Wherein the classification unit is configured to classify the data it receives according to a determined classification method.
Compared with the prior art, the embodiments of the present invention have the following beneficial effect: data is received and sent to at least one classification unit, the processing result of the at least one classification unit is obtained, and the category of the received data is determined according to the processing result. Because the processing results of multiple classification units are combined flexibly, data processing during classification is simple and convenient even when the classification system and the data to be classified change dynamically.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the data processing method provided by embodiment one of the present invention;
Fig. 2 is a structural diagram of the classification unit provided by embodiment one of the present invention;
Fig. 3 is a flowchart of the data processing method provided by embodiment two of the present invention;
Fig. 4 is a structural diagram of the data processing device provided by embodiment three of the present invention;
Fig. 5 is a structural diagram of the data processing device provided by embodiment four of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here only explain the present invention and are not intended to limit it.
An embodiment of the present invention provides a classification method, the method comprising:
Receiving data and sending the data to at least one classification unit;
Obtaining the processing result of the at least one classification unit;
Determining the category of the received data according to the processing result;
Wherein the classification unit is configured to classify the data it receives according to a determined classification method.
An embodiment of the present invention also provides a classification device, the device comprising:
A receiving unit, configured to receive data;
A sending unit, configured to send the data to at least one classification unit;
An acquiring unit, configured to obtain the processing result of the at least one classification unit;
A determining unit, configured to determine the category of the received data according to the processing result;
Wherein the classification unit is configured to classify the data it receives according to a determined classification method.
The implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment one
Fig. 1 shows the flowchart of the data processing method provided by embodiment one of the present invention, described in detail as follows:
In S101, data is received and sent to at least one classification unit;
In this embodiment, the data can be sent to different classification units according to the user's needs. Specifically, the user's requirement is set in advance by programming; for example, depending on what the user needs, the data may be classified by its structure or by its content.
In this embodiment, each classification unit receives the data and classifies it, and each classification unit processes the received data with a different classification algorithm.
In this embodiment, the at least one classification unit can be treated as one classification body that corresponds to one classification system, for ease of use; that is, the at least one classification unit corresponds to one classification system. When there are multiple classification units, they correspond one-to-one with multiple classification algorithms, so the received data can be evaluated by different algorithms.
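The relationship between one classification system and its classification units can be sketched as follows. This is only an illustration under stated assumptions: the `Result` shape, the two toy units and the `dispatch` helper are hypothetical names that do not appear in the patent, and the toy rules stand in for real algorithms such as SVM or KNN.

```python
from typing import Callable, NamedTuple


class Result(NamedTuple):
    """Hypothetical result shape: a category label plus a confidence in [0, 1]."""
    category: str
    confidence: float


# A classification unit is modeled as any callable from data to a Result.
ClassificationUnit = Callable[[str], Result]


def keyword_unit(text: str) -> Result:
    """Toy content-based unit: looks for a sports keyword."""
    hit = "football" in text.lower()
    return Result("sports", 0.9) if hit else Result("other", 0.4)


def length_unit(text: str) -> Result:
    """Toy structure-based unit: very short texts are treated as queries."""
    return Result("query", 0.7) if len(text.split()) <= 3 else Result("document", 0.6)


# One classification system: each entry pairs one unit with one algorithm.
classification_system: dict[str, ClassificationUnit] = {
    "keyword": keyword_unit,
    "length": length_unit,
}


def dispatch(data: str) -> dict[str, Result]:
    """Send the data to every unit and collect each unit's result."""
    return {name: unit(data) for name, unit in classification_system.items()}
```

Calling `dispatch("Football tonight")` returns one result per unit, which the later steps then combine into a single category.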
In S102, the processing result of the at least one classification unit is obtained;
In S103, the category of the received data is determined according to the processing result;
In this embodiment, the classification unit is configured to classify the data it receives according to a determined classification method. A classification unit can be an online classifier (Fig. 2 gives an example of a classification unit), or it can be high-quality offline classified data, for example data that has already been classified.
Optionally, new classification units can be introduced according to the user's actual needs, which gives the overall classification method good extensibility. For example, when the data to be classified changes dynamically, a new classification unit can be added promptly to adapt to the change.
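A minimal sketch of this extensibility, assuming units are kept in a simple registry. `UnitRegistry` and `offline_lookup_unit` are hypothetical names, and modeling offline classified data as an exact-match lookup is an assumption, not something the patent specifies.

```python
from typing import Callable, Dict, Tuple

Unit = Callable[[str], Tuple[str, float]]  # returns (category, confidence)


class UnitRegistry:
    """Hypothetical registry of the units that make up one classification system."""

    def __init__(self) -> None:
        self._units: Dict[str, Unit] = {}

    def register(self, name: str, unit: Unit) -> None:
        # Adding or replacing one unit leaves every other unit untouched.
        self._units[name] = unit

    def classify_all(self, data: str) -> Dict[str, Tuple[str, float]]:
        return {name: unit(data) for name, unit in self._units.items()}


def offline_lookup_unit(labeled: Dict[str, str]) -> Unit:
    """Wrap already-classified offline data as a unit (exact-match lookup)."""
    def unit(data: str) -> Tuple[str, float]:
        if data in labeled:
            return labeled[data], 1.0
        return "unknown", 0.0
    return unit


registry = UnitRegistry()
registry.register("offline", offline_lookup_unit({"world cup final": "sports"}))
# Later, when a new kind of data appears, a new unit is added without retraining.
registry.register("finance", lambda d: ("finance", 0.8) if "stock" in d else ("unknown", 0.0))
```

Because each unit is registered independently, adding the `finance` unit does not require touching the `offline` unit.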
In this embodiment, the processing result is the category that the at least one classification unit assigns to the received data, together with a classification confidence value. In this case, S103 can be implemented in either of the following ways:
Among the categories assigned to the received data by the at least one classification unit, selecting the category assigned by a classification unit whose classification confidence value exceeds a preset value as the category of the received data; or
Among the categories assigned to the received data by the at least one classification unit, selecting the category assigned by the classification unit with the highest classification confidence value as the category of the received data.
The classification confidence value of an assigned category can be expressed as a score. Specifically, each classification unit uses its own algorithm to compute a category for the data and a score for that category. Based on the categories and scores given by all classification units in the classification system, different strategies can be used to determine the final classification result for the received data. S103 is then specifically:
Among the categories assigned to the received data by the at least one classification unit, selecting the category assigned by a classification unit whose category score exceeds a preset value as the category of the received data; or
Among the categories assigned to the received data by the at least one classification unit, selecting the category assigned by the classification unit with the highest category score as the category of the received data.
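The two selection strategies for S103 can be sketched as below. The function names are hypothetical, and two details are assumptions the text leaves open: when several units exceed the preset value the first one wins, and when none does the function returns `None`.

```python
from typing import List, Optional, Tuple

Result = Tuple[str, float]  # (category, confidence value or score)


def select_over_threshold(results: List[Result], preset: float) -> Optional[str]:
    """Return the category of the first result whose score exceeds the preset value."""
    for category, score in results:
        if score > preset:
            return category
    return None  # assumption: no unit was confident enough


def select_highest(results: List[Result]) -> str:
    """Return the category whose unit reported the highest score."""
    category, _ = max(results, key=lambda r: r[1])
    return category
```

The threshold strategy can answer before every result has been inspected, while the highest-score strategy always needs all of them.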
Optionally, the classification units classify the received data serially or in parallel, to meet the performance requirements of different scenarios. In serial classification, the received data is sent to the classification units one after another; once the classification confidence value computed by a classification unit exceeds a preset threshold, the data is not distributed to the remaining classification units, which improves classification efficiency. In parallel classification, the received data is sent to all classification units at the same time.
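A sketch of the two dispatch modes, assuming each unit is a plain callable that returns a (category, confidence) pair. Using a thread pool for the parallel case is one possible choice and is not prescribed by the patent; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Result = Tuple[str, float]
Unit = Callable[[str], Result]


def classify_serial(data: str, units: List[Unit], threshold: float) -> List[Result]:
    """Query units one at a time; stop once a unit's confidence exceeds the threshold."""
    results: List[Result] = []
    for unit in units:
        result = unit(data)
        results.append(result)
        if result[1] > threshold:
            break  # the remaining units never receive the data
    return results


def classify_parallel(data: str, units: List[Unit]) -> List[Result]:
    """Send the data to all units at once and wait for every result."""
    with ThreadPoolExecutor(max_workers=max(1, len(units))) as pool:
        return list(pool.map(lambda unit: unit(data), units))
```

Serial mode saves work when an early unit is already confident; parallel mode trades that saving for lower latency when all results are needed.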
In this embodiment, data is received and sent to at least one classification unit, the processing result of the at least one classification unit is obtained, and the category of the received data is determined according to the processing result. Because the processing results of multiple classification units are combined flexibly, data processing during classification is simple and convenient even when the classification system and the data to be classified change dynamically.
In addition, because the classification units are independent of one another, when one classification model changes there is no need to retrain an entire new classification model; only the changed classification needs to be trained. Existing classified data can therefore be fully reused, and the method adapts well to changes in the classification system and in the data distribution, and thus to changes in actual requirements.
Moreover, combining the classification results of multiple classification units to classify the data improves both classification quality and efficiency, and improves the user experience.
Embodiment two
Fig. 3 shows the flowchart of the data processing method provided by embodiment two of the present invention, described in detail as follows:
In S301, a quality factor is configured for each classification unit in advance; the quality factor is used to adjust the processing result of the at least one classification unit;
In this embodiment, the value of the quality factor can be set according to actual needs; for example, the quality factor can be set to any value greater than 0 and less than 1.
In actual data processing, the same classification unit may recognize data of some categories well but recognize data of other categories poorly. For this reason, a quality factor Q is configured for each classification unit and used to adjust the candidate category scores that the unit provides. The larger Q is, the larger the role the unit's classification confidence value plays in determining the category of the data object; the smaller Q is, the smaller that role.
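One way to read the adjustment is as a multiplicative weight on each unit's score. The patent says only that Q adjusts the score, so the multiplication below is an assumption, and `adjust` and `decide` are hypothetical names.

```python
from typing import Dict, Tuple


def adjust(results: Dict[str, Tuple[str, float]],
           quality: Dict[str, float]) -> Dict[str, Tuple[str, float]]:
    """Scale each unit's score by that unit's quality factor Q (assumed multiplicative)."""
    return {name: (category, score * quality[name])
            for name, (category, score) in results.items()}


def decide(adjusted: Dict[str, Tuple[str, float]]) -> str:
    """Pick the category with the highest adjusted score."""
    category, _ = max(adjusted.values(), key=lambda r: r[1])
    return category
```

With raw scores of 0.9 and 0.8 and quality factors of 0.5 and 0.9, the adjusted scores become 0.45 and 0.72, so the unit with the lower raw score but higher quality factor decides the category.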
Optionally, S301 can be implemented as follows:
A quality factor is configured for each classification unit in advance according to the recall and/or precision of that unit's classification of data. Specifically, the higher the recall and/or precision, the larger the quality factor configured for the classification unit, where:
recall = number of data items correctly assigned to a class / total number of data items in the test set that belong to that class;
precision = number of data items correctly assigned to a class / total number of data items in the test set that were assigned to that class.
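The two ratios can be computed from a labeled test set as sketched below. The patent requires only that a higher recall and/or precision yields a larger quality factor; the specific mapping in `quality_factor` (the mean of the two, clamped into the open interval from 0 to 1) is a hypothetical choice made for illustration.

```python
from typing import List, Tuple


def precision_recall(predicted: List[str], actual: List[str], cls: str) -> Tuple[float, float]:
    """Compute (precision, recall) of one unit for class `cls` on a labeled test set."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == cls and a == cls)
    assigned = sum(1 for p in predicted if p == cls)   # items the unit put in cls
    belonging = sum(1 for a in actual if a == cls)     # items that truly are cls
    precision = correct / assigned if assigned else 0.0
    recall = correct / belonging if belonging else 0.0
    return precision, recall


def quality_factor(precision: float, recall: float) -> float:
    """Hypothetical mapping: the mean of precision and recall, clamped into (0, 1)."""
    q = (precision + recall) / 2
    return min(max(q, 0.01), 0.99)
```

Any monotonically increasing mapping would satisfy the description equally well; the clamp only keeps Q inside the example range given for the quality factor.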
In S302, data is received and sent to at least one classification unit;
In S303, the processing result of the at least one classification unit is obtained;
In S304, the category of the received data is determined according to the adjusted processing result;
Wherein the classification unit is configured to classify the data it receives according to a determined classification method.
In this embodiment, different quality factors can be set for different classification units according to user requirements and the application scenario. This makes it possible to flexibly adjust each classification unit's contribution to the classification result, so that high-quality classifiers keep their effect on the outcome.
Embodiment three
Fig. 4 shows the structural diagram of the data processing device provided by embodiment three of the present invention; for ease of description, only the parts relevant to the embodiment are shown.
The data processing device serves one classification system and classifies data under that classification system. The data processing device comprises a receiving unit 41, a sending unit 42, an acquiring unit 43 and a determining unit 44.
Receiving unit 41, configured to receive data;
Sending unit 42, configured to send the data to at least one classification unit;
Acquiring unit 43, configured to obtain the processing result of the at least one classification unit;
Determining unit 44, configured to determine the category of the received data according to the processing result;
Wherein the classification unit is configured to classify the data it receives according to a determined classification method.
Optionally, the processing result is the category that the at least one classification unit assigns to the received data, together with a classification confidence value. In this case, the determining unit 44 is specifically configured to select, among the categories assigned to the received data by the at least one classification unit, the category assigned by a classification unit whose classification confidence value exceeds a preset value as the category of the received data; or the determining unit 44 is specifically configured to select, among those categories, the category assigned by the classification unit with the highest classification confidence value as the category of the received data.
In this embodiment, the classification units classify the received data serially or in parallel.
The data processing device provided by this embodiment can be used in the corresponding method embodiment one above; for details, see the description of embodiment one, which is not repeated here.
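The division into units 41 to 44 can be mirrored in code as four methods on one class. This is a sketch under assumptions: the class and method names are hypothetical, units are plain callables, sending is modeled as submitting work to a thread pool, and the determining step uses the highest-confidence strategy.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Tuple

Result = Tuple[str, float]
Unit = Callable[[str], Result]


class DataProcessingDevice:
    """Sketch of the device in Fig. 4; each method mirrors one of units 41 to 44."""

    def __init__(self, units: List[Unit]) -> None:
        self.units = units
        self._pool = ThreadPoolExecutor(max_workers=max(1, len(units)))

    def receive(self, data: str) -> str:
        # Receiving unit 41: accept the incoming data.
        return data

    def send(self, data: str) -> List[Future]:
        # Sending unit 42: hand the data to every classification unit.
        return [self._pool.submit(unit, data) for unit in self.units]

    def acquire(self, pending: List[Future]) -> List[Result]:
        # Acquiring unit 43: collect each unit's processing result.
        return [future.result() for future in pending]

    def determine(self, results: List[Result]) -> str:
        # Determining unit 44: choose the highest-confidence category.
        return max(results, key=lambda r: r[1])[0]

    def process(self, data: str) -> str:
        return self.determine(self.acquire(self.send(self.receive(data))))

    def close(self) -> None:
        self._pool.shutdown()
```

As the description itself notes, this split follows functional logic only; the same behavior could be grouped into fewer or more units.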
Embodiment four
Fig. 5 shows the structural diagram of the data processing device provided by embodiment four of the present invention; for ease of description, only the parts relevant to the embodiment are shown.
In this embodiment, the data processing device comprises a configuring unit 51, a receiving unit 52, a sending unit 53, at least one classification unit 54, an acquiring unit 55 and a determining unit 56.
The difference between this embodiment and embodiment three is:
Configuring unit 51, configured to configure a quality factor for each classification unit in advance, the quality factor being used to adjust the processing result of the at least one classification unit;
The value of the quality factor can be set according to actual needs; for example, the quality factor can be set to any value greater than 0 and less than 1.
The determining unit 56 is specifically configured to determine the category of the received data according to the adjusted processing result.
Optionally, the determining unit 56 is specifically configured to configure a quality factor for each classification unit in advance according to the recall and/or precision of that unit's classification of data. Specifically, the higher the recall and/or precision, the larger the quality factor configured for the classification unit.
The data processing device provided by this embodiment can be used in the corresponding method embodiment two above; for details, see the description of embodiment two, which is not repeated here.
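A compact sketch of how a configuring step might sit in front of the flow of embodiment three. All names are hypothetical. Three choices are assumptions rather than requirements of the patent: rejecting values outside the example range of 0 to 1, weighting scores by multiplication, and falling back to 0.5 for units with no configured factor.

```python
from typing import Callable, Dict, Tuple

Result = Tuple[str, float]
Unit = Callable[[str], Result]


class ConfigurableDevice:
    """Sketch of Fig. 5: a configuring step (unit 51) precedes classification."""

    def __init__(self, units: Dict[str, Unit]) -> None:
        self.units = units
        self.quality: Dict[str, float] = {}

    def configure(self, name: str, q: float) -> None:
        # Configuring unit 51: store one quality factor per classification unit.
        if not 0.0 < q < 1.0:
            raise ValueError("quality factor is expected to lie in (0, 1)")
        self.quality[name] = q

    def process(self, data: str) -> str:
        # Units 52 to 55: receive the data, send it, classify it, acquire results.
        raw = {name: unit(data) for name, unit in self.units.items()}
        # Unit 56: decide on the adjusted results (multiplication is an assumption;
        # unconfigured units fall back to a hypothetical default factor of 0.5).
        adjusted = {n: (c, s * self.quality.get(n, 0.5)) for n, (c, s) in raw.items()}
        return max(adjusted.values(), key=lambda r: r[1])[0]
```

Keeping configuration separate from processing lets the factors be recomputed from fresh test data without changing the classification units themselves.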
It should be noted that the units in the device embodiments above are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units only serve to distinguish them from one another and do not limit the protection scope of the present invention.
In addition, a person of ordinary skill in the art will understand that all or some of the steps of the methods in the embodiments above can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims (10)

Application CN201210196534.7A, "Data processing method and device": priority date 2012-06-14, filing date 2012-06-14, status Active, published as CN103488656B (en).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201210196534.7A (CN103488656B, en) | 2012-06-14 | 2012-06-14 | Data processing method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201210196534.7A (CN103488656B, en) | 2012-06-14 | 2012-06-14 | Data processing method and device

Publications (2)

Publication Number | Publication Date
CN103488656A (en) | 2014-01-01
CN103488656B (en) | 2018-11-13

Family

ID=49828894

Family Applications (1)

Application Number | Status | Priority Date | Filing Date | Title
CN201210196534.7A (CN103488656B, en) | Active | 2012-06-14 | 2012-06-14 | Data processing method and device

Country Status (1)

Country | Link
CN (1) | CN103488656B (en)


Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
GR01 | Patent grant
