Disclosure of Invention
To achieve the purpose of the invention, the following technical solution is adopted:
A natural language classifier system comprising a classification proxy server, a classifier, and a manual classification platform, wherein: the classification proxy server receives a classifier number and a classification source corpus and sends the classification source corpus to the classifier with the corresponding number; the classifier classifies the classification source corpus and sends the classification result to the classification proxy server; after receiving the classification result, the classification proxy server also checks whether the classifier is automatically enabled; if the classifier is automatically enabled, the classification proxy server outputs the classification result; if the classifier is not automatically enabled, the classification proxy server sends the classification source corpus to the manual classification platform and outputs the classification result returned by the manual classification platform.
The classifier system of, wherein: the system further comprises a classifier configuration library for storing information about the background classifiers, including the classifier number, whether the classifier is automatically enabled, the classifier address, the address of the manual processing queue, and accuracy index information; the classification proxy server queries the classifier configuration library by classifier number to obtain the classifier address and sends the classification source corpus to the classifier at that address.
The classifier system of, wherein: after obtaining the manual classification result, the classification proxy server also compares the manual classification result with the automatic classification result, calculates the classification accuracy of the classifier, and stores the classification accuracy in a classifier configuration library.
The classifier system of, wherein: when the classification accuracy of the classifier reaches a preset threshold, the classification proxy server sets the classifier as an automatic classifier and updates the automatic enabling information of the classifier in the classifier configuration library to automatically enabled.
The classifier system of, wherein: the system further comprises a classification result library, and the classification proxy server also stores the classification result in the classification result library.
The classifier system of, wherein: the classifier is also used for constructing a training set from historical classification results and for continuous training.
The classifier system of, wherein: after automatic classification of the classifier is enabled, the classification proxy server verifies the accuracy of the classifier and, when the accuracy falls below a preset threshold, re-enables manual classification, that is, updates the automatic enabling information of the classifier in the classifier configuration library to not automatically enabled.
The classifier system of, wherein: the classification proxy server comprises a trigger used for setting the accuracy evaluation index thresholds of the classifier and for automatically enabling or disabling the classifier; when the accuracy of the classifier satisfies the accuracy evaluation index thresholds, the classification proxy server sets the classifier as an automatic classifier and updates the automatic enabling information of the classifier stored in the classifier configuration library to automatically enabled.
The classifier system of, wherein: the accuracy evaluation indexes of the classifier comprise: the average accuracy acu of the classifier, the lower limit difference dev of the accuracy of the classifier, the fluctuation level sd of the accuracy of the classifier and the variation trend trd of the accuracy of the classifier.
The classifier system of, wherein:
a. classification accuracy is measured over batches of t classifications, where t is called a statistical period; if c of the t classifications are correct and w are wrong, then c + w = t and the accuracy of one statistical period is c/t;
b. the most recent p statistical periods form one performance evaluation period, that is, every t × p classification operations constitute one performance evaluation period;
c. the average accuracy of the classifier is acu = (c1/t1 + c2/t2 + … + cp/tp)/p, where t1, t2, …, tp denote the statistical periods and c1, c2, …, cp denote the numbers of correct classifications in those periods;
d. the lowest accuracy lower limit difference of the classifier is calculated as:
dev = acu - min(c1/t1, c2/t2, …, cp/tp)
e. the accuracy fluctuation level of the classifier, namely the variance of the accuracy of the classifier, is calculated;
f. a least-squares linear fit is applied to c1/t1, c2/t2, …, cp/tp, and the slope of the fitted straight line is the accuracy change trend trd of the classifier.
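For reference, the four indexes in items a to f can be written compactly as below. This is a sketch under two assumptions that the text does not state: the variance is normalized by p (a population variance), and the least-squares fit uses the period index i = 1, …, p as the horizontal coordinate. The symbol a_i is introduced here only as shorthand for the per-period accuracy.

```latex
\begin{aligned}
a_i &= \frac{c_i}{t_i}, \qquad i = 1, \dots, p \\
\mathrm{acu} &= \frac{1}{p} \sum_{i=1}^{p} a_i \\
\mathrm{dev} &= \mathrm{acu} - \min_{1 \le i \le p} a_i \\
\mathrm{sd} &= \frac{1}{p} \sum_{i=1}^{p} \left( a_i - \mathrm{acu} \right)^2 \\
\mathrm{trd} &= \frac{\sum_{i=1}^{p} (i - \bar{\imath})\,(a_i - \mathrm{acu})}{\sum_{i=1}^{p} (i - \bar{\imath})^2},
\qquad \bar{\imath} = \frac{p + 1}{2}
\end{aligned}
```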
The classifier system of, wherein: the evaluation process of the trigger to decide whether the classifier is automatically enabled is as follows:
after an evaluation period ends, when the average accuracy of the classifier is greater than the average accuracy lower limit, the lowest accuracy lower limit difference is smaller than its maximum value, the accuracy fluctuation level is within its upper limit, and the accuracy change trend is greater than or equal to zero, automatic classification of the classifier is enabled and the evaluation period is set to 1.2 times the previous evaluation period, up to the preset evaluation period upper limit;
when the average accuracy of the classifier is greater than the average accuracy lower limit, the lowest accuracy lower limit difference is smaller than its maximum value, and the accuracy fluctuation level is within its upper limit, but the accuracy change trend is smaller than zero, the classifier remains disabled and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit;
when the average accuracy of the classifier is greater than the average accuracy lower limit and the lowest accuracy lower limit difference is smaller than its maximum value, but the accuracy fluctuation level exceeds its upper limit while the accuracy change trend is greater than or equal to zero, the classifier remains disabled and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit;
otherwise, the classifier remains disabled and the evaluation period is set to the default evaluation period value.
The classifier system of, wherein: the trigger decides whether to stop the automatic classification of the classifier as follows:
at the end of an evaluation period, when the average accuracy of the classifier is greater than the average accuracy lower limit, the lowest accuracy lower limit difference is smaller than its maximum value, the accuracy fluctuation level is within its upper limit, and the accuracy change trend is greater than or equal to zero, automatic classification of the classifier is maintained and the evaluation period is set to 1.2 times the previous evaluation period, up to the preset evaluation period upper limit;
when the average accuracy of the classifier is greater than the average accuracy lower limit, the lowest accuracy lower limit difference is smaller than its maximum value, and the accuracy fluctuation level is within its upper limit, but the accuracy change trend is smaller than zero, automatic classification of the classifier is maintained and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit;
when the average accuracy of the classifier is greater than the average accuracy lower limit and the lowest accuracy lower limit difference is smaller than its maximum value, but the accuracy fluctuation level exceeds its upper limit while the accuracy change trend is greater than or equal to zero, automatic classification of the classifier is maintained and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit;
when the average accuracy of the classifier is greater than the average accuracy lower limit and the lowest accuracy lower limit difference is smaller than its maximum value, but the accuracy fluctuation level exceeds its upper limit and the accuracy change trend is smaller than zero, automatic classification of the classifier is maintained and the evaluation period is set to 0.5 times the previous evaluation period, down to the lower limit;
in all other cases, automatic classification by the classifier is stopped and the evaluation period is set to its default value.
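The stop decision above can be sketched as a small function. This is an illustrative sketch, not the patented implementation: the function name `decide_keep_auto`, its parameter names, and the rounding of the period to a whole number are assumptions. The default thresholds (85%, 5%, 3%) and the period bounds (5 and 50, default 10) are the defaults given in the detailed description, and "within the upper limit" is read here as sd <= sd0.

```python
def decide_keep_auto(acu, dev, sd, trd, period,
                     acu1=0.85, dev0=0.05, sd0=0.03,
                     period_min=5, period_max=50, period_default=10):
    """Return (keep_auto, next_period) for a classifier that is currently automatic.

    acu, dev, sd, trd are the four accuracy indexes of the finished
    evaluation period; period is the length of that period (the number p
    of statistical periods it contained).
    """
    # Average accuracy and lower limit difference are mandatory conditions.
    if not (acu > acu1 and dev < dev0):
        return False, period_default      # stop automatic classification

    stable = sd <= sd0                     # fluctuation within its upper limit
    rising = trd >= 0                      # trend not decreasing

    if stable and rising:
        factor = 1.2                       # healthy: evaluate less often
    elif stable != rising:
        factor = 0.8                       # exactly one warning sign
    else:
        factor = 0.5                       # unstable and falling: evaluate sooner

    next_period = round(period * factor)   # rounding is an assumption
    return True, max(period_min, min(period_max, next_period))
```

For example, `decide_keep_auto(0.9, 0.02, 0.05, -0.01, 10)` keeps automatic classification but halves the evaluation period, while an average accuracy of 0.8 stops automatic classification and resets the period to its default.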
The classifier system of, wherein: after the classifier is automatically started, for each automatic classification, whether the classification is correct or not is judged according to the following verification mechanism:
a. Front-check mechanism
The front-check mechanism takes place before the classification proxy server returns the classification result to the classifier calling unit, and judges whether the classification result is correct in two ways: manual spot check and multi-classifier mutual inspection. In the manual spot check, the classification proxy server randomly selects a classification request and sends it to a human for checking; when the manual classification result is the same as the classifier's prediction, 1 correct classification is counted, otherwise 0. In the multi-classifier mutual inspection, two or more equivalent detection classifiers are prepared for each classifier; for each classification request, when the result of the main classifier is the same as the majority result of all the classifiers, 1 correct classification is counted; otherwise, 0.25 correct classifications are counted.
The classifier system of, wherein: the equivalent detection classifier uses the same corpus data as the corresponding classifier but a different classification algorithm; or the detection classifier uses the same algorithm as the corresponding classifier but a different type of corpus.
The classifier system of, wherein: after the classifier is automatically enabled, whether each automatic classification is correct is judged according to the following verification mechanism:
b. Posterior mechanism
The posterior mechanism takes place after the classification proxy server returns the classification result to the classifier calling unit: the several most probable classification results retained in the classification result are sent to customer service staff or clients to choose from; once the customer service staff or client has selected a result, the selection represents their choice for the classification, that is, it provides feedback on the classification, and the classification proxy server judges whether the automatic classification result of the classifier is correct by collecting and using this feedback.
The classifier system of, wherein: the classification proxy server records feedback received within a period of time after the classification occurs to the classifier configuration library; if no feedback is received within the agreed time, no feedback result is recorded, that is, the posterior mechanism has no result; feedback from customer service staff is called customer service feedback: when the customer service feedback result is the same as the classification result with the highest probability, 1 correct classification is counted, otherwise 0; feedback from the client is called client feedback: when the client feedback result is consistent with the automatic classification result of the classifier, 1 correct classification is counted; when it is inconsistent with the classification result with the highest probability, 0.4 correct classifications are counted; and when no client feedback is received, 0.5 correct classifications are counted.
The classifier system of, wherein: for each automatic classification result, the correct count of the classification is calculated according to the following flow:
if a manual spot check exists, the correct count from the manual spot check is used;
if there is no manual spot check but a customer service feedback result exists, the correct count from the customer service feedback is used as the correct count of the classification;
if there is no customer service feedback result either, the correct count of the classification is calculated as follows: if the client feedback has a value, the correct count is (correct count from the multi-classifier mutual inspection + correct count from the client feedback)/2;
if the client feedback has no value, the correct count is the correct count from the multi-classifier mutual inspection.
Detailed Description
As shown in fig. 1, the high-accuracy natural language classifier system of the present invention includes a classification proxy server, a classifier, a manual classification platform, a classification result library, a classifier configuration library, and a classifier calling unit.
The classification proxy server is configured to:
a. receive the natural language classification source corpus and the classifier number parameter from the classifier calling unit, and send a classification request to the classifier identified by the classifier number parameter;
b. determine, from the automatic enabling information of the classifier in classifier configuration library B, whether manual classification is needed; if so, send the source corpus to the manual classification platform for classification; otherwise, return the classifier's result to the classifier calling unit to complete the classification;
c. compare the manual and automatic classification results, compile classifier accuracy statistics, automatically determine whether the classifier is to be automatically enabled, and store this information in classifier configuration library B;
d. store the classification result in classification result library A, where it can serve as a training set for the classifier;
e. return the automatic or manual classification result to the classifier calling unit;
f. after fully automatic classification is enabled, verify through random manual spot checks, comparison of multi-classifier classification results, user feedback, and similar means whether the accuracy of automatic classification stays within a commercially acceptable range, and adjust the automatic/manual classification strategy accordingly to ensure the accuracy of the classifier;
g. create one or more classifiers and provide functions for editing and deleting them;
h. provide one or more training sets and test sets for each classifier; the training set is used to train the language classifier, and the test set is used to verify the correctness of the initial classifier.
The classifier configuration library is used for storing information about the background classifiers, including the classifier number, whether the classifier is automatically enabled, the classifier address, the address of the manual processing queue, accuracy index information, and the like.
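One possible shape for a record in the classifier configuration library is sketched below. The class name `ClassifierConfig`, the field names, the in-memory dictionary, and the example addresses are all hypothetical; the text only lists which pieces of information are stored, not how.

```python
from dataclasses import dataclass, field


@dataclass
class ClassifierConfig:
    """One entry of the classifier configuration library (illustrative only)."""
    classifier_number: str        # key used by the classifier calling unit
    auto_enabled: bool            # whether the classifier is automatically enabled
    classifier_address: str       # where the proxy sends the source corpus
    manual_queue_address: str     # queue on the manual classification platform
    accuracy_indexes: dict = field(default_factory=dict)  # e.g. acu, dev, sd, trd


# A plain dict stands in for the library; a real system would use a database.
config_library = {}


def register(config):
    """Store or overwrite the entry for config.classifier_number."""
    config_library[config.classifier_number] = config


def lookup_address(classifier_number):
    """Return the classifier address for a classifier number."""
    return config_library[classifier_number].classifier_address


register(ClassifierConfig("c-001", False,
                          "http://classifier.example/c-001",   # hypothetical address
                          "queue://manual/general"))           # hypothetical queue
```

New classifiers start with `auto_enabled=False` in this sketch, which matches the flow in which a classifier is only enabled after its accuracy indexes satisfy the thresholds.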
The classifier is used for: a. performing natural language classification on the natural language classification source corpus; b. training and retraining the classifier, for which the information in classification result library A can be read.
The manual classification platform is an instant messaging platform on which human classifiers classify incoming classification requests in real time. The manual classification platform has the following functions: user management for human classifiers, including user creation, editing, modification, deletion, login, and authorization; grouping of human classifiers according to their professional fields; a multi-queue function that sends messages to a group of users according to a specified queue; and association of message queues with groups of human classifiers, so that a specific manual classification request is sent to human classifiers with the relevant professional knowledge for processing.
FIG. 2 is a schematic workflow diagram of the classifier system of the present invention. As shown in fig. 2:
1. The classifier calling unit sends the classifier number and the classification source corpus to the classification proxy server and expects a classification result in return.
2. The classification proxy server queries the classifier configuration library by classifier number to obtain the classifier address.
3. The classification proxy server sends the classification source corpus to the classifier at the obtained address.
4. The classifier classifies the corpus automatically and returns the classification result to the classification proxy server.
5. The classification proxy server decides how to handle the returned result according to the automatic enabling information of the classifier in the classifier configuration library.
6A. If the automatic enabling information is true, the result of the classifier meets the commercial requirement, and the classification proxy server returns the result directly to the classifier calling unit to finish the classification task.
6B. If the automatic enabling information is false, the result of the classifier does not yet meet the commercial requirement; the classification proxy server sends the classification source corpus to the manual classification platform for classification and, after obtaining the manual classification result, returns it to the classifier calling unit to finish the classification task.
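Steps 1 to 6B can be sketched as a single routing function. This is a minimal illustration under stated assumptions: the function name `handle_request`, the dictionary keys, and the use of plain Python callables in place of networked classifiers and the manual classification platform are all hypothetical.

```python
def handle_request(classifier_number, corpus, config_library,
                   classifiers, manual_platform):
    """Route one classification request through the proxy.

    config_library maps classifier numbers to {"address", "auto_enabled"};
    classifiers maps addresses to callables; manual_platform is a callable
    standing in for the manual classification platform.
    """
    config = config_library[classifier_number]            # step 2: look up address
    auto_result = classifiers[config["address"]](corpus)  # steps 3-4: auto classify

    if config["auto_enabled"]:                            # step 5: check the flag
        # Step 6A: the automatic result is returned directly.
        return {"result": auto_result, "source": "automatic",
                "auto_result": auto_result}

    # Step 6B: the corpus goes to the manual platform; the manual result is
    # returned, and the automatic result is kept for the comparison in step 7.
    manual_result = manual_platform(corpus)
    return {"result": manual_result, "source": "manual",
            "auto_result": auto_result}
```

Keeping `auto_result` in the manual branch is a design choice of this sketch: it lets the caller compare the two results afterwards without invoking the classifier again.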
7. After obtaining the manual classification result, the classification proxy server also compares it with the automatic classification result to evaluate the accumulated classification accuracy, which is stored in the classifier configuration library. The classification proxy server comprises a trigger that allows a user to set a group of accuracy evaluation index thresholds; when the accuracy of the classifier satisfies these thresholds, the classification proxy server sets the classifier as an automatic classifier and updates the automatic enabling information of the classifier stored in the classifier configuration library to automatically enabled.
The accuracy evaluation indexes of the classifier comprise: the average accuracy acu of the classifier, the accuracy lower limit difference dev of the classifier, the accuracy fluctuation level sd of the classifier, and the accuracy change trend trd of the classifier. The corresponding thresholds are: the average accuracy lower limit acu1, 85% by default; the maximum value dev0 of the lowest accuracy lower limit difference, 5% by default (that is, the lowest accuracy acu0 = acu1 - dev0, 80% by default); the accuracy fluctuation level upper limit sd0, 3% by default; and the accuracy change trend trd0, which may be positive, zero, or negative.
The calculation method of these indices is as follows:
a. Classification accuracy is measured over batches of t classifications, where t is called a statistical period; the default is t = 10. Of the t classifications, c are correct and w are wrong, with c + w = t. The accuracy of one statistical period is c/t × 100%.
b. The most recent p statistical periods form one performance evaluation period, also called a monitoring period. The default is p = 10, that is, every t × p classification operations (10 × 10 = 100 by default) constitute one performance evaluation period. The classification proxy server dynamically shortens or lengthens the evaluation period according to the accuracy evaluation result of the classifier, but the evaluation period has a lower limit and an upper limit. The default lower limit is 5 and the default upper limit is 50.
c. The average accuracy of the classifier is acu = (c1/t1 + c2/t2 + … + cp/tp)/p, where t1, t2, …, tp denote the statistical periods and c1, c2, …, cp denote the numbers of correct classifications in those periods.
d. The lowest accuracy lower limit difference of the classifier is dev = acu - min(c1/t1, c2/t2, …, cp/tp).
e. The accuracy fluctuation level of the classifier is the variance sd of the per-period accuracies ci/ti about the average accuracy acu, where ti denotes the i-th statistical period, ci denotes the number of correct classifications in that period, and p denotes the number of statistical periods.
f. A least-squares linear fit is applied to c1/t1, c2/t2, …, cp/tp, and the slope of the fitted straight line is the accuracy change trend trd of the classifier.
The fit is applied to the vector of per-period accuracies:
y = [c1/t1, c2/t2, ..., cp/tp]^T
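The four indexes in items a to f can be computed as in the sketch below. Two points are assumptions, since the text does not fix them: the variance is normalized by p (a population variance), and the least-squares fit uses the period index 1, …, p as the horizontal coordinate. The function name `accuracy_indexes` is hypothetical.

```python
def accuracy_indexes(correct, totals):
    """Compute acu, dev, sd and trd from per-period counts.

    correct[i] is c_i (correct classifications in statistical period i);
    totals[i] is t_i (classifications in that period).
    """
    rates = [c / t for c, t in zip(correct, totals)]   # per-period accuracy c_i/t_i
    p = len(rates)

    acu = sum(rates) / p                               # average accuracy
    dev = acu - min(rates)                             # lower limit difference
    sd = sum((r - acu) ** 2 for r in rates) / p        # variance (1/p assumed)

    # Least-squares slope of rates against the period index 1..p (assumed).
    xs = list(range(1, p + 1))
    x_mean = sum(xs) / p
    denom = sum((x - x_mean) ** 2 for x in xs)
    if denom == 0:                                     # single period: no trend
        trd = 0.0
    else:
        trd = sum((x - x_mean) * (r - acu) for x, r in zip(xs, rates)) / denom

    return {"acu": acu, "dev": dev, "sd": sd, "trd": trd}
```

With three statistical periods of 10 classifications each and 8, 9, and 10 correct results, the per-period accuracies are 0.8, 0.9, and 1.0, giving acu = 0.9, dev = 0.1, and a positive trend of 0.1 per period.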
The detection and triggering process of the trigger follows the evaluation process for automatically enabling the classifier shown in fig. 3.
After an evaluation period ends, when the average accuracy of the classifier is greater than the average accuracy lower limit, the lowest accuracy lower limit difference is smaller than its maximum value, the accuracy fluctuation level is within its upper limit, and the accuracy change trend is greater than or equal to zero, automatic classification of the classifier is enabled and the evaluation period is set to 1.2 times the previous evaluation period, up to the preset evaluation period upper limit (for example, the upper limit may be 2 times the evaluation period).
When the average accuracy of the classifier is greater than the average accuracy lower limit, the lowest accuracy lower limit difference is smaller than its maximum value, and the accuracy fluctuation level is within its upper limit, but the accuracy change trend is smaller than zero, the classifier remains disabled and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit (for example, the lower limit may be 0.3 times the evaluation period).
When the average accuracy of the classifier is greater than the average accuracy lower limit and the lowest accuracy lower limit difference is smaller than its maximum value, but the accuracy fluctuation level exceeds its upper limit while the accuracy change trend is greater than or equal to zero, the classifier remains disabled and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit (for example, the lower limit may be 0.3 times the evaluation period).
Otherwise, the classifier remains disabled and the evaluation period is set to the default evaluation period value.
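The enabling decision of fig. 3, as described in the four cases above, can be sketched as follows. The function name `decide_enable` and its parameters are hypothetical; the period bounds use the default lower limit of 5 and upper limit of 50 stated in item b rather than the "2 times" and "0.3 times" examples; rounding the period to a whole number and reading "within the upper limit" as sd <= sd0 are assumptions.

```python
def decide_enable(acu, dev, sd, trd, period,
                  acu1=0.85, dev0=0.05, sd0=0.03,
                  period_min=5, period_max=50, period_default=10):
    """Return (enable_auto, next_period) for a classifier that is still disabled."""
    acu_ok = acu > acu1        # average accuracy above its lower limit
    dev_ok = dev < dev0        # lower limit difference below its maximum
    sd_ok = sd <= sd0          # fluctuation within its upper limit
    trd_ok = trd >= 0          # trend not decreasing

    if acu_ok and dev_ok and sd_ok and trd_ok:
        # All four conditions hold: enable and lengthen the evaluation period.
        return True, min(period_max, round(period * 1.2))

    if acu_ok and dev_ok and (sd_ok != trd_ok):
        # Exactly one of fluctuation/trend fails: stay disabled, evaluate sooner.
        return False, max(period_min, round(period * 0.8))

    # Any other combination: stay disabled and reset the period.
    return False, period_default
```

Unlike the stop decision, this flow has no 0.5 factor: a disabled classifier whose fluctuation and trend both fail simply falls back to the default evaluation period.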
8. The classification proxy server also stores the classification result in the classification result library. The historical classification results accumulated in the classification result library provide labeled training corpora from real user scenarios for training the classifier, so no manual labeling of training corpora is needed. The classification proxy server can use the retained historical manual classification results to keep training the existing classifier, thereby improving classification accuracy.
9. The classifier uses the (natural language text, classification result) corpus data pairs in the historical manual classification results to expand its current training set, and then rebuilds an updated version of the classifier model, either with a statistical method (such as an SVM or a naive Bayes classifier) or with a neural network algorithm (such as an RNN/LSTM) that exploits the sequential order of natural language. Because the updated training set contains richer and more accurate corpus information, the rebuilt classifier is more accurate than the old version.
10. After automatic classification of the classifier is enabled, the classification proxy server can verify the accuracy of the classifier in several ways; when the accuracy falls below the commercial threshold, manual classification is re-enabled to ensure classification accuracy, that is, the automatic enabling information of the classifier stored in the classifier configuration library is updated to not automatically enabled.
After the classifier is automatically enabled, one or more of the following mechanisms may be used to determine whether each automatic classification is correct.
a. Front-check mechanism
The front-check mechanism takes place before the classification proxy server returns the classification result to the classifier calling unit, and the classification proxy server promptly updates the result of the front-check mechanism in the classifier configuration library. The classification result is judged in two ways: manual spot check and multi-classifier mutual inspection. In the manual spot check, the classification proxy server randomly selects a classification request and sends it to a human for checking; when the manual classification result is the same as the classifier's prediction, 1 correct classification is counted, otherwise 0. In the multi-classifier mutual inspection, N (N >= 2) equivalent detection classifiers are prepared for each classifier; they use the same corpus data as the corresponding classifier but a different classification algorithm. For example, the main classifier uses an LSTM algorithm with an attention model, while the detection classifier uses a Bayesian algorithm with the same training corpus based on Chinese vocabulary. Alternatively, the detection classifier may use the same algorithm as the corresponding classifier but a different type of corpus; for example, the main classifier uses a corpus based on Chinese vocabulary and the detection classifier uses a corresponding corpus of Chinese Pinyin symbols and tones (the classification request being checked may be converted to this corpus format manually). For each classification request, 1 correct classification is counted when the result of the main classifier (the classifier whose result is to be verified) is the same as the majority result of all classifiers (that is, the result obtained by most classifiers, including the main classifier). Otherwise, 0.25 correct classifications are counted.
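The multi-classifier mutual inspection count can be sketched as a vote over the main classifier and its detection classifiers. One reading is assumed here and is not stated in the text: "most classifiers" is taken to mean a strict majority of all classifiers, the main classifier included. The function name `mutual_inspection_count` is hypothetical.

```python
from collections import Counter


def mutual_inspection_count(main_result, detection_results):
    """Return the correct count (1 or 0.25) for one classification request.

    main_result is the label from the main classifier; detection_results
    is the list of labels from the N >= 2 equivalent detection classifiers.
    """
    votes = Counter([main_result, *detection_results])
    total = sum(votes.values())
    # Assumed rule: the main result must be shared by more than half of
    # all classifiers (main classifier included) to count as correct.
    if votes[main_result] * 2 > total:
        return 1.0
    return 0.25


# The main classifier agrees with one of two detection classifiers: 2 of 3 votes.
agree = mutual_inspection_count("billing", ["billing", "refund"])
# Both detection classifiers disagree with the main classifier: 1 of 3 votes.
disagree = mutual_inspection_count("billing", ["refund", "refund"])
```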
b. Mechanism of posterior
The posterior mechanism runs after the classification proxy server returns the classification result to the classifier calling unit. The classification proxy server provides the classifier calling unit with a unique number that identifies the classification. The classifier calling unit embeds the unique number in the answer offered for selection; usually the 2-3 most probable classification results are retained in the answer, which is sent to the customer service staff or the client to choose from. Once the customer service staff or the client has selected a result, that selection represents their choice of classification, i.e., it provides feedback on the classification. The classification proxy server collects this feedback and uses it to judge whether the automatic classification result of the classifier is correct. The classification proxy server records in the classifier configuration library any feedback received within a period of time (typically 30-60 seconds) after the classification occurs; if no feedback is received within the agreed time, no feedback result is recorded, i.e., the posterior mechanism has no result. Feedback from the customer service staff is called customer service feedback, and its result has high credibility. When the customer service feedback result is the same as the classification result with the highest probability, 1 correct classification is counted; otherwise, 0 correct classifications are counted. Feedback from the client is called client feedback, and its result has low credibility. When the client feedback result is consistent with the automatic classification result of the classifier, the classifier counts 1 correct classification. When the client feedback result is inconsistent with the classification result with the highest probability among the automatic classification results of the classifier, the classifier counts 0.4 correct classifications. When the client gives no feedback, the classifier counts 0.5 correct classifications.
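The feedback counts above can be written out as a small function. This is only a sketch under stated assumptions: the name `posterior_count`, the `source` values `"service"` and `"client"`, and the use of `None` for "no feedback received within the time window" are all hypothetical choices made for illustration.

```python
from typing import Optional


def posterior_count(top_label: str,
                    feedback_label: Optional[str],
                    source: str) -> Optional[float]:
    """Return the correct-classification count given posterior feedback.

    top_label:      highest-probability result of the automatic classifier.
    feedback_label: label selected by the person, or None if no feedback
                    arrived within the time window.
    source:         "service" (customer service staff) or "client".
    """
    if source == "service":
        if feedback_label is None:
            return None  # the posterior mechanism has no result
        # High-credibility feedback: full count or nothing.
        return 1.0 if feedback_label == top_label else 0.0

    if source == "client":
        if feedback_label is None:
            return 0.5  # no client feedback
        # Low-credibility feedback: a mismatch is only a partial penalty.
        return 1.0 if feedback_label == top_label else 0.4

    raise ValueError("source must be 'service' or 'client'")
```

Because client feedback is less credible, a disagreeing client lowers the count to 0.4 rather than to 0, whereas a disagreeing customer service agent lowers it to 0.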
For each automatic classification result, the correct count of the classification is calculated as follows.
● If there is a manual spot-check result, the correct count of the manual spot check is used.
● If there is no manual spot-check result but there is a customer service feedback result, the correct count of the customer service feedback is used.
● If there is neither a manual spot-check result nor a customer service feedback result, the correct count of the classification is calculated as follows:
if the client feedback has a value, the correct count of the classification is
(correct count of the multi-classifier mutual inspection + correct count of the client feedback) / 2;
if the client feedback has no value, the correct count of the classification is the correct count of the multi-classifier mutual inspection.
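The priority order above can be illustrated with a short sketch. The function name `correct_count` and its parameter names are hypothetical, and `None` is assumed to mean that the corresponding check produced no result for this classification.

```python
from typing import Optional


def correct_count(manual: Optional[float],
                  service: Optional[float],
                  mutual: float,
                  client: Optional[float]) -> float:
    """Combine the available checks into one correct count.

    manual:  count from the manual spot check, or None if not sampled.
    service: count from customer service feedback, or None if absent.
    mutual:  count from multi-classifier mutual inspection.
    client:  count from client feedback, or None if it has no value.
    """
    if manual is not None:
        return manual            # manual spot check takes precedence
    if service is not None:
        return service           # then customer service feedback
    if client is not None:
        return (mutual + client) / 2  # average the two weaker signals
    return mutual                # only mutual inspection is available
```

For example, a mutual-inspection count of 0.25 combined with a disagreeing client (0.4) gives (0.25 + 0.4) / 2 = 0.325.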
After the correct count of each automatic classification has been evaluated, the accuracy indices within an evaluation period are computed using the same method as in step 7. The flow illustrated in Fig. 4 is then used to determine whether to stop automatic classification by the classifier:
At the end of an evaluation period, when the average accuracy of the classifier is greater than its average accuracy lower limit, its lowest accuracy lower limit difference is smaller than the maximum value of the lowest accuracy lower limit difference, its accuracy fluctuation level is within the upper limit of the accuracy fluctuation level, and its accuracy change trend is greater than or equal to zero, automatic classification by the classifier is maintained and the evaluation period is set to 1.2 times the previous evaluation period, up to the preset upper limit of the evaluation period (for example, the upper limit may be 2 times the evaluation period).
When the average accuracy of the classifier is greater than its average accuracy lower limit, its lowest accuracy lower limit difference is smaller than the maximum value of the lowest accuracy lower limit difference, and its accuracy fluctuation level is within the upper limit of the accuracy fluctuation level, but its accuracy change trend is less than zero, automatic classification by the classifier is maintained and the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit (for example, the lower limit may be 0.3 times the evaluation period).
When the average accuracy of the classifier is greater than its average accuracy lower limit and its lowest accuracy lower limit difference is smaller than the maximum value of the lowest accuracy lower limit difference, but its accuracy fluctuation level exceeds the upper limit of the accuracy fluctuation level while its accuracy change trend is greater than or equal to zero, automatic classification by the classifier is maintained, but the evaluation period is set to 0.8 times the previous evaluation period, down to the lower limit (for example, the lower limit may be 0.3 times the evaluation period).
When the average accuracy of the classifier is greater than its average accuracy lower limit and its lowest accuracy lower limit difference is smaller than the maximum value of the lowest accuracy lower limit difference, but its accuracy fluctuation level exceeds the upper limit of the accuracy fluctuation level and its accuracy change trend is less than zero, automatic classification by the classifier is maintained, but the evaluation period is set to 0.5 times the previous evaluation period, down to the lower limit (for example, the lower limit may be 0.3 times the evaluation period).
In other cases, automatic classification by the classifier is disabled and the evaluation period is set to its default value.
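The decision rules above can be summarized in a short sketch. It is illustrative only: the names `Metrics`, `Limits` and `next_period` are hypothetical, "within the upper limit" is assumed to mean less than or equal to it, and the example bounds of 2 times and 0.3 times are assumed to be multiples of a default evaluation period.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    avg_accuracy: float      # average accuracy in the evaluation period
    min_accuracy_gap: float  # lowest accuracy lower limit difference
    fluctuation: float       # accuracy fluctuation level
    trend: float             # accuracy change trend


@dataclass
class Limits:
    avg_accuracy_min: float      # average accuracy lower limit
    min_accuracy_gap_max: float  # maximum lowest accuracy lower limit difference
    fluctuation_max: float       # accuracy fluctuation level upper limit


def next_period(m: Metrics, lim: Limits, period: float,
                default_period: float,
                lower: float = 0.3, upper: float = 2.0):
    """Return (keep_automatic, new_evaluation_period)."""
    base_ok = (m.avg_accuracy > lim.avg_accuracy_min
               and m.min_accuracy_gap < lim.min_accuracy_gap_max)
    if not base_ok:
        # "Other cases": disable automatic classification, reset the period.
        return False, default_period

    stable = m.fluctuation <= lim.fluctuation_max
    rising = m.trend >= 0
    if stable and rising:
        factor = 1.2   # healthy: evaluate less often
    elif stable or rising:
        factor = 0.8   # one warning sign: evaluate more often
    else:
        factor = 0.5   # two warning signs: evaluate much more often

    # Assumption: the bounds are multiples of the default period.
    new_period = period * factor
    new_period = max(lower * default_period,
                     min(new_period, upper * default_period))
    return True, new_period
```

Under these assumptions, a classifier that stays healthy has its evaluation period lengthened step by step until it reaches the upper bound, while a classifier showing instability or a downward trend is evaluated more frequently before automatic classification is finally disabled.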
The natural language classifier system of the invention implements steps 5, 6A and 6B, so that a natural language classification requirement with high accuracy can be met quickly by introducing a manual platform while the accuracy of the classifier has not yet reached the commercial standard. Meanwhile, through step 8, a large, correct training data set of real corpus is generated while the manual classification work on the classification tasks is being completed, which greatly reduces the cost of manually producing training sets. Through step 9, the system continuously improves the accuracy of the classifier. Through step 7, when the classifier reaches the commercial standard, the system is allowed to start automatic classification either automatically or manually, so that the construction of a fully automatic natural language classifier is completed quickly and at low cost. Step 10 ensures that the accuracy of the classifier is kept at a high level.