Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention, not to limit it.
An embodiment of the present invention provides a classification method, the method comprising:
receiving data, and sending the data to at least one classification unit;
obtaining the processing result of the at least one classification unit;
determining the category of the received data according to the processing result;
wherein the classification unit is configured to classify the received data according to a determined classification method.
An embodiment of the present invention further provides a classification apparatus, the apparatus comprising:
a receiving unit, configured to receive data;
a sending unit, configured to send the data to at least one classification unit;
an acquiring unit, configured to obtain the processing result of the at least one classification unit;
a determining unit, configured to determine the category of the received data according to the processing result;
wherein the classification unit is configured to classify the received data according to a determined classification method.
The implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment One
Fig. 1 shows the flowchart of the data processing method provided by Embodiment One of the present invention, detailed as follows:
In S101, data is received and sent to at least one classification unit.
In this embodiment, the data can be sent to different classification units according to the user's requirements. Specifically, the user's requirements are configured in advance by programming; for example, depending on the user's needs, the data can be classified by its structure or by its content.
In this embodiment, each classification unit receives the data and classifies it, and each classification unit processes the received data with a different classification algorithm.
In this embodiment, the at least one classification unit can be treated as a single classification entity that corresponds to one classification system, which is convenient for the user. When there are multiple classification units, they correspond one-to-one to multiple classification algorithms, so that the received data can be classified and evaluated by different algorithms.
In S102, the processing result of the at least one classification unit is obtained.
In S103, the category of the received data is determined according to the processing result.
In this embodiment, the classification unit is configured to classify the received data according to a determined classification method. The classification unit can be an online classifier (Fig. 2 shows an example of a classification unit), or it can be high-quality offline classified data, for example, data that has already been classified.
Optionally, new classification units can be introduced according to the user's actual needs, which gives the overall classification method good extensibility. For example, when the data to be classified changes dynamically, a new classification unit can be added promptly to adapt to the change.
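The extensibility described above can be illustrated with a minimal sketch. It assumes that a classification unit can be modelled as a function from data to a (category, confidence) pair; the names `ClassificationSystem`, `add_unit`, `structure_unit`, and `content_unit` are hypothetical and do not appear in the embodiment.

```python
from typing import Callable, Dict, Tuple

# Assumption: a classification unit maps data to (category, confidence).
ClassificationUnit = Callable[[str], Tuple[str, float]]


class ClassificationSystem:
    """Hypothetical container holding independent classification units."""

    def __init__(self) -> None:
        self._units: Dict[str, ClassificationUnit] = {}

    def add_unit(self, name: str, unit: ClassificationUnit) -> None:
        # Adding a unit does not modify or retrain the existing ones.
        self._units[name] = unit

    def classify(self, data: str) -> Dict[str, Tuple[str, float]]:
        # Every registered unit receives the same data.
        return {name: unit(data) for name, unit in self._units.items()}


def structure_unit(data: str) -> Tuple[str, float]:
    # Toy rule: classify by structure (here, simply by length).
    return ("long", 0.7) if len(data) > 20 else ("short", 0.7)


def content_unit(data: str) -> Tuple[str, float]:
    # Toy rule: classify by content (keyword match).
    return ("sports", 0.9) if "match" in data else ("other", 0.4)


system = ClassificationSystem()
system.add_unit("structure", structure_unit)
before = system.classify("match report")

# A new unit is introduced later; the existing unit is left untouched.
system.add_unit("content", content_unit)
after = system.classify("match report")
```

In this sketch the result of the original unit is identical before and after the new unit is added, which mirrors the independence between classification units.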
In this embodiment, the processing result is the category that the at least one classification unit assigns to the received data, together with a classification confidence value. In this case, S103 can be implemented in either of the following ways:
Among the categories assigned to the received data by the at least one classification unit, the category assigned by a classification unit whose classification confidence value exceeds a preset value is selected as the category of the received data; or
Among the categories assigned to the received data by the at least one classification unit, the category assigned by the classification unit with the highest classification confidence value is selected as the category of the received data.
The classification confidence value of a category assigned by a classification unit can be expressed as a score. Specifically, each classification unit uses its own algorithm to compute a category for the data and a score for that category. Based on the categories and scores given by all classification units in the classification system, different strategies can be used to determine the final classification result of the received data. S103 is then specifically:
Among the categories assigned to the received data by the at least one classification unit, the category assigned by a classification unit whose category score exceeds a preset value is selected as the category of the received data; or
Among the categories assigned to the received data by the at least one classification unit, the category assigned by the classification unit with the highest category score is selected as the category of the received data.
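The two selection strategies for S103 can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's implementation: the `UnitResult` type and both function names are hypothetical, and returning the first result that exceeds the preset value is an assumption, because the text does not say how to choose when several units exceed it.

```python
from typing import NamedTuple, Optional, Sequence


class UnitResult(NamedTuple):
    unit_name: str     # hypothetical identifier of the classification unit
    category: str      # category the unit assigned to the received data
    confidence: float  # classification confidence value (score)


def select_above_preset(results: Sequence[UnitResult],
                        preset: float) -> Optional[str]:
    """Strategy 1: category from a unit whose confidence exceeds the preset value.

    Assumption: the first such unit wins; None is returned if no unit qualifies.
    """
    for result in results:
        if result.confidence > preset:
            return result.category
    return None


def select_highest(results: Sequence[UnitResult]) -> Optional[str]:
    """Strategy 2: category from the unit with the highest confidence."""
    if not results:
        return None
    return max(results, key=lambda r: r.confidence).category


results = [
    UnitResult("unit_a", "sports", 0.55),
    UnitResult("unit_b", "finance", 0.92),
    UnitResult("unit_c", "sports", 0.81),
]
by_preset = select_above_preset(results, preset=0.8)
by_highest = select_highest(results)
```

With these sample values both strategies select "finance". Raising the preset value above every confidence makes the first strategy return no category, which a caller would need to handle.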
Optionally, the classification units classify the received data serially or in parallel, to meet the performance requirements of different scenarios. In serial classification, after the data is received it is sent to the classification units one after another; once the classification confidence value computed by a classification unit exceeds a preset threshold, the data is not distributed to the remaining classification units, which improves classification efficiency. In parallel classification, after the data is received it is sent to all classification units at the same time for classification.
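A minimal sketch of the serial and parallel modes follows. It assumes that each classification unit is a function returning a (category, confidence) pair and that a thread pool is an acceptable way to run units concurrently; `classify_serial`, `classify_parallel`, and the two toy units are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

# Assumption: a classification unit maps data to (category, confidence).
ClassificationUnit = Callable[[str], Tuple[str, float]]


def classify_serial(data: str, units: Sequence[ClassificationUnit],
                    threshold: float) -> List[Tuple[str, float]]:
    """Send data to the units one by one; stop once a confidence exceeds the threshold."""
    results = []
    for unit in units:
        category, confidence = unit(data)
        results.append((category, confidence))
        if confidence > threshold:
            break  # remaining units never receive the data
    return results


def classify_parallel(data: str,
                      units: Sequence[ClassificationUnit]) -> List[Tuple[str, float]]:
    """Send data to all units at the same time; results keep the order of the units."""
    with ThreadPoolExecutor(max_workers=max(1, len(units))) as pool:
        return list(pool.map(lambda unit: unit(data), units))


def keyword_unit(data: str) -> Tuple[str, float]:
    # Toy unit: confident only when a keyword is present.
    return ("sports", 0.9) if "match" in data else ("other", 0.3)


def length_unit(data: str) -> Tuple[str, float]:
    # Toy unit: always returns a moderately confident answer.
    return ("short", 0.6)


units = [keyword_unit, length_unit]
serial_hit = classify_serial("match report", units, threshold=0.8)
serial_miss = classify_serial("hello", units, threshold=0.8)
parallel_all = classify_parallel("match report", units)
```

In the serial case the second unit is skipped when the first one is confident enough, whereas the parallel case always collects one result per unit. Which mode suits a deployment depends on how costly each unit is to run.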
In this embodiment, data is received and sent to at least one classification unit, the processing result of the at least one classification unit is obtained, and the category of the received data is determined according to the processing result. Because the processing results of multiple classification units are used flexibly, data processing during classification remains simple and convenient even when the classification system and the data to be classified change frequently.
In addition, because the classification units are independent of one another, when one classification model changes there is no need to retrain an entire new classification model; only the category that changed needs to be trained. Existing classified data can therefore be fully reused, and the method adapts well to changes in the classification system and in the data distribution, and thus to changes in actual requirements.
Moreover, combining the classification results of multiple classification units to classify the data improves both classification quality and efficiency, which improves the user experience.
Embodiment Two
Fig. 3 shows the flowchart of the data processing method provided by Embodiment Two of the present invention, detailed as follows:
In S301, a quality factor is configured in advance for each classification unit; the quality factor is used to adjust the processing result of the at least one classification unit.
In this embodiment, the value of the quality factor can be set according to actual needs; for example, the quality factor can be set to any value greater than 0 and less than 1.
In actual data processing, the same classification unit may recognize data of some categories well but recognize data of other categories poorly. For this reason, a quality factor Q is configured for each classification unit to adjust the candidate category scores that the unit provides. The larger the value of Q, the greater the role that the unit's classification confidence value plays in determining the category of the data object; the smaller the value of Q, the smaller that role.
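The effect of the quality factor can be sketched as follows. The text does not state how Q is applied to a score, so multiplying the score by Q is an assumption made here for illustration; `adjust_results`, `determine_category`, and the unit names are likewise hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: each result is (unit_name, category, score).
UnitResult = Tuple[str, str, float]


def adjust_results(results: List[UnitResult],
                   quality: Dict[str, float]) -> List[UnitResult]:
    """Scale each unit's score by that unit's quality factor Q (assumed: score * Q)."""
    return [(name, category, score * quality[name])
            for name, category, score in results]


def determine_category(results: List[UnitResult],
                       quality: Dict[str, float]) -> str:
    """Pick the category with the highest adjusted score (needs at least one result)."""
    adjusted = adjust_results(results, quality)
    return max(adjusted, key=lambda r: r[2])[1]


results = [
    ("unit_a", "sports", 0.9),   # highest raw score
    ("unit_b", "finance", 0.8),
]
quality = {"unit_a": 0.5, "unit_b": 0.9}  # Q values in (0, 1)

raw_choice = max(results, key=lambda r: r[2])[1]
adjusted_choice = determine_category(results, quality)
```

With these sample values the raw scores favour "sports", but after adjustment the scores are roughly 0.45 and 0.72, so "finance" is chosen. A larger Q therefore gives a unit more influence on the final category, as described above.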
Optionally, S301 can be implemented as follows:
A quality factor is configured in advance for each classification unit according to the recall and/or precision of that unit's analysis of data. Specifically, the higher the recall and/or precision, the larger the quality factor configured for the classification unit, where:
recall = number of data items correctly assigned to a category / total number of data items in the test set that belong to that category;
precision = number of data items correctly assigned to a category / total number of data items in the test set that were assigned to that category.
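The recall and precision definitions above can be computed as in the sketch below. The mapping from recall and precision to a quality factor is not specified in the text beyond "higher gives larger", so averaging the two and clamping the result to the open interval (0, 1) is an assumption for illustration; the function names are hypothetical.

```python
from typing import Sequence, Tuple


def recall_precision(true_labels: Sequence[str],
                     predicted_labels: Sequence[str],
                     category: str) -> Tuple[float, float]:
    """Recall and precision of one classification unit for one category."""
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(1 for t, p in pairs if t == category and p == category)
    belong = sum(1 for t, _ in pairs if t == category)     # truly in the category
    assigned = sum(1 for _, p in pairs if p == category)   # assigned to the category
    recall = correct / belong if belong else 0.0
    precision = correct / assigned if assigned else 0.0
    return recall, precision


def quality_factor(recall: float, precision: float, eps: float = 1e-6) -> float:
    """Assumed mapping: mean of recall and precision, kept strictly inside (0, 1)."""
    q = (recall + precision) / 2
    return min(max(q, eps), 1 - eps)


# Toy test set: true categories and the categories one unit assigned.
true_labels = ["a", "a", "b", "b", "a"]
predicted_labels = ["a", "b", "b", "a", "a"]

recall_a, precision_a = recall_precision(true_labels, predicted_labels, "a")
q_a = quality_factor(recall_a, precision_a)
```

For category "a" in this toy test set, two items are correctly assigned, three items belong to the category, and three items were assigned to it, so recall and precision are both 2/3. Any other monotonically increasing mapping would equally satisfy the rule that higher recall or precision yields a larger quality factor.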
In S302, data is received and sent to at least one classification unit.
In S303, the processing result of the at least one classification unit is obtained.
In S304, the category of the received data is determined according to the adjusted processing result.
The classification unit is configured to classify the received data according to a determined classification method.
In this embodiment, different quality factors can be set for different classification units according to user requirements and the actual application scenario, so that the contribution of each classification unit to the classification result can be adjusted flexibly and the effect of high-quality classifiers is preserved.
Embodiment Three
Fig. 4 shows the structure of the data processing apparatus provided by Embodiment Three of the present invention. For ease of description, only the parts relevant to this embodiment are shown.
The data processing apparatus is used in a classification system to classify data under that classification system, and comprises a receiving unit 41, a sending unit 42, an acquiring unit 43, and a determining unit 44.
The receiving unit 41 is configured to receive data;
The sending unit 42 is configured to send the data to at least one classification unit;
The acquiring unit 43 is configured to obtain the processing result of the at least one classification unit;
The determining unit 44 is configured to determine the category of the received data according to the processing result;
The classification unit is configured to classify the received data according to a determined classification method.
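The division into units 41 to 44 can be pictured with the sketch below. It is only one possible software rendering, assuming that each unit becomes a method and that the determining unit uses the highest-confidence strategy; the class name, method names, and toy classification units are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a classification unit maps data to (category, confidence).
ClassificationUnit = Callable[[str], Tuple[str, float]]


class DataProcessingApparatus:
    """Hypothetical rendering of the apparatus; each method stands for one unit."""

    def __init__(self, units: Sequence[ClassificationUnit]) -> None:
        self._units = list(units)
        self._pending: List[Tuple[str, float]] = []

    def receive(self, data: str) -> str:
        # Receiving unit 41: accept the incoming data.
        return data

    def send(self, data: str) -> None:
        # Sending unit 42: pass the data to every classification unit.
        self._pending = [unit(data) for unit in self._units]

    def acquire(self) -> List[Tuple[str, float]]:
        # Acquiring unit 43: collect the processing results.
        return list(self._pending)

    def determine(self, results: List[Tuple[str, float]]) -> str:
        # Determining unit 44: here, the highest confidence wins (assumed strategy).
        return max(results, key=lambda r: r[1])[0]

    def process(self, data: str) -> str:
        self.send(self.receive(data))
        return self.determine(self.acquire())


def unit_a(data: str) -> Tuple[str, float]:
    return ("sports", 0.6)   # toy unit with a fixed answer


def unit_b(data: str) -> Tuple[str, float]:
    return ("finance", 0.8)  # toy unit with a fixed answer


apparatus = DataProcessingApparatus([unit_a, unit_b])
category = apparatus.process("quarterly earnings")
```

Because the units are divided by function only, the same behaviour could be split across different classes or processes without changing the result.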
Optionally, the processing result is the category that the at least one classification unit assigns to the received data, together with a classification confidence value. In this case, the determining unit 44 is specifically configured to select, among the categories assigned to the received data by the at least one classification unit, the category assigned by a classification unit whose classification confidence value exceeds a preset value as the category of the received data; or the determining unit 44 is specifically configured to select, among the categories assigned to the received data by the at least one classification unit, the category assigned by the classification unit with the highest classification confidence value as the category of the received data.
In this embodiment, the classification units classify the received data serially or in parallel.
The data processing apparatus provided by this embodiment can be used in the corresponding method Embodiment One described above; for details, see the description of Embodiment One, which is not repeated here.
Embodiment Four
Fig. 5 shows the structure of the data processing apparatus provided by Embodiment Four of the present invention. For ease of description, only the parts relevant to this embodiment are shown.
In this embodiment, the data processing apparatus comprises a configuring unit 51, a receiving unit 52, a sending unit 53, at least one classification unit 54, an acquiring unit 55, and a determining unit 56.
This embodiment differs from Embodiment Three as follows:
The configuring unit 51 is configured to configure a quality factor in advance for each classification unit; the quality factor is used to adjust the processing result of the at least one classification unit.
The value of the quality factor can be set according to actual needs; for example, the quality factor can be set to any value greater than 0 and less than 1.
The determining unit 56 is specifically configured to determine the category of the received data according to the adjusted processing result.
Optionally, the determining unit 56 is specifically configured to configure a quality factor in advance for each classification unit according to the recall and/or precision of that unit's analysis of data. Specifically, the higher the recall and/or precision, the larger the quality factor configured for the classification unit.
The data processing apparatus provided by this embodiment can be used in the corresponding method Embodiment Two described above; for details, see the description of Embodiment Two, which is not repeated here.
It should be noted that the units in the above apparatus embodiments are divided only according to functional logic; the division is not limited to the above, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for distinguishing them from one another and are not intended to limit the protection scope of the present invention.
In addition, a person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.