The present application is a divisional application of Chinese patent application No. 201810801678.8, entitled "Ambiguity detection method for customer service robot knowledge base", filed with the Chinese Patent Office on July 19, 2018.
Disclosure of Invention
In order to overcome the problems existing in the related art to at least a certain extent, the application provides a method, a device and related equipment for detecting ambiguity of a customer service robot knowledge base.
In a first aspect, the present application provides a method for detecting ambiguity in a knowledge base of a customer service robot, including:
constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ is provided with at least one similar question, and each FAQ is of a category;
dividing the knowledge base into a test set and a training set of a deep learning model;
training a deep learning model on a training set, and performing ambiguity detection by using the learned deep learning model;
updating the knowledge base according to the ambiguity detection result;
repeating the steps until the learning effect is not improved, and obtaining the disambiguated knowledge base.
Further, the dividing the knowledge base into a test set and a training set of the deep learning model includes: randomly extracting a preset number of similar questions corresponding to each FAQ as test data of the corresponding FAQ category, and taking the rest similar questions as training data of the corresponding FAQ category. All kinds of test data constitute a test set, and all kinds of training data constitute a training set.
Further, the deep learning model includes a feature extractor and a shallow classifier, and the training of the deep learning model on the training set includes:
inputting questions in the training set FAQ as input parts to the deep learning model;
converting the question in the input part into a feature vector by using a feature extractor in the deep learning model;
calculating a prediction result according to the characteristic vector by using a shallow classifier in the deep learning model, wherein the prediction result is a category corresponding to a question in an input part;
optimizing the model by using an optimizer so as to minimize the average difference between the actual labeled categories of the questions in the training set and the prediction results of the deep learning model;
and evaluating the trained model by using the test set, and calculating the consistency ratio of the model prediction result and the actual category marked by the question in the test set to be used as the evaluation of the model learning effect.
Further, the ambiguity detection includes: category ambiguity detection, annotation error detection and annotation ambiguity detection, wherein the ambiguity detection by using the learned deep learning model comprises the following steps:
detecting ambiguity by using a feature extractor in the deep learning model;
and detecting ambiguity by using a shallow classifier in the deep learning model.
Further, the detecting ambiguity by using the feature extractor in the deep learning model includes:
converting similar questions in a data set into feature vectors by using a feature extractor in the deep learning model, wherein the data set comprises a training set or/and a testing set;
combining the feature vectors corresponding to the questions into question feature vector pairs (x, y), wherein the questions corresponding to the feature vectors x and the questions corresponding to the feature vectors y are respectively from different categories;
calculating the vector similarity cos(x, y) of each question feature vector pair, wherein $\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$;
sorting all question feature vector pairs by vector similarity from high to low, selecting the question feature vector pairs with the highest vector similarity, and judging whether ambiguity exists according to the selected question feature vector pairs.
Further, the determining whether there is ambiguity according to the question feature vector pair with the top rank of the vector similarity includes:
judging whether a labeling ambiguity or a labeling error exists: extracting a first preset number of question feature vector pairs with the highest similarity, and manually checking whether the corresponding question pairs have labeling ambiguity or labeling errors;
judging whether category ambiguity exists: for the first preset number of question feature vector pairs, counting the number of times each corresponding category pair occurs, sorting the category pairs by number of occurrences from high to low, and taking a second preset number of category pairs to manually check whether category ambiguity exists.
Further, the detecting ambiguity by using the deep learning model shallow classifier includes:
counting the classification results of the deep learning model and forming a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, the element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and the element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i;
calculating the number of samples labeled as category i in the data set, wherein the number of samples of category i is $\sum_k x_{ik}$, where k is any category;
calculating the number of samples labeled as category j in the data set, wherein the number of samples of category j is $\sum_k x_{jk}$, where k is any category;
calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that are predicted as category j by the deep learning model, and the proportion $P_{ji}$ of samples labeled as category j that are predicted as category i, wherein $P_{ij}$ and $P_{ji}$ are calculated as:
$$P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises a training set or/and a testing set;
calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$:
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging whether the category i and the category j have ambiguity according to the confusion degree.
Further, the determining whether the category i and the category j have ambiguity according to the confusion degree includes:
sorting the calculated confusion;
and extracting a third preset number of category pairs with the highest confusion degree, and manually checking whether category ambiguity exists.
Further, the detecting ambiguity by using the deep learning model shallow classifier further includes: finding data for which the actual category labeled in the data set is inconsistent with the category predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises a training set or/and a testing set.
Further, the updating the knowledge base according to the ambiguity detection result includes:
manually rewriting and manually re-labeling the detected ambiguous questions, and deleting the original labels;
and regrouping and reassigning the similar questions of the detected ambiguous categories, and deleting the original ambiguous categories.
In a second aspect, the present application provides a customer service robot knowledge base ambiguity detection apparatus, including:
the construction module is used for constructing a knowledge base, the knowledge base is divided according to FAQs, each FAQ is provided with at least one similar question sentence, and each FAQ is of a category;
the division module is used for dividing the knowledge base into a test set and a training set of the deep learning model;
the training module is used for training a deep learning model on the training set and carrying out ambiguity detection by utilizing the learned deep learning model; the ambiguity detection includes: category ambiguity detection, annotation error detection and annotation ambiguity detection, wherein the ambiguity detection by using the learned deep learning model comprises: detecting ambiguity using a shallow classifier in the deep learning model, comprising: counting the classification results of the deep learning model and forming a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, the element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and the element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i; calculating the number of samples labeled as category i in the data set, wherein the number of samples of category i is $\sum_k x_{ik}$, where k is any category; calculating the number of samples labeled as category j in the data set, wherein the number of samples of category j is $\sum_k x_{jk}$, where k is any category; calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that are predicted as category j by the deep learning model, and the proportion $P_{ji}$ of samples labeled as category j that are predicted as category i, wherein $P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}$ and $P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$, the category i and the category j are different categories, and the data set comprises a training set or/and a testing set; calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$, with $S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$; and judging whether ambiguity exists between the category i and the category j according to the confusion degree;
the updating module is used for updating the knowledge base according to the ambiguity detection result;
and the repeating module is used for repeating the steps until the learning effect is not improved any more, and obtaining a disambiguated knowledge base.
In a third aspect, the present application provides an electronic device, comprising:
at least one memory for storing a program;
at least one processor for loading the program to perform the customer service robot knowledge base ambiguity detection method of any one of the first aspects.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein a program executable by a processor, comprising:
the processor executable program when executed by a processor is for performing the customer service robot knowledge base ambiguity detection method of any one of the first aspects.
The technical scheme provided by the embodiment of the application can comprise the following beneficial effects:
according to the method, the knowledge base is updated according to the ambiguity detection result, the training steps are repeated until the learning effect reaches the expected standard, the ambiguity of the knowledge base can be found and corrected manually, the disambiguated knowledge base is obtained, data are extracted from the disambiguated knowledge base to serve as a training set and a testing set of the deep learning model, and the learning effect of the deep learning model is further improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings and examples.
Fig. 1 is a flowchart of a method for detecting ambiguity in a knowledge base of a customer service robot according to an embodiment of the present application.
As shown in fig. 1, the method of the present embodiment includes:
S11: constructing a knowledge base, wherein the knowledge base is divided according to FAQs, each FAQ has a variable number of similar questions, and each FAQ is a category.
The knowledge base is developed on the basis of large-scale knowledge processing and is applied in technical fields such as large-scale knowledge processing, natural language understanding, knowledge management, automatic question-answering systems and reasoning. Intelligent customer service not only provides enterprises with fine-grained knowledge management technology, but also establishes a fast and effective natural-language-based means of communication between enterprises and massive numbers of users. Taking the customer service robot knowledge base of an e-commerce enterprise as an example, the knowledge base includes a plurality of FAQs, such as a "return process" and a "refund process". Taking the "return process" as an example, the FAQ may include the following similar questions: "How do I return what I bought yesterday?" and "What should I do to return goods?".
S12: the knowledge base is divided into a test set and a training set of the deep learning model.
And selecting N FAQs needing to detect ambiguity from the knowledge base as N categories. For each FAQ, randomly extracting a preset number of similar questions as test data of the category, and taking the rest similar questions as training data of the category. All kinds of test data constitute a test set, and all kinds of training data constitute a training set.
For example, the knowledge base includes 10 FAQs, each FAQ includes 20 similar questions, a preset amount, for example, 3 similar questions, are randomly extracted from each category of the knowledge base as a test set of the deep learning model, then 30 similar questions are included in the test set, and the remaining 170 similar questions are included in a training set of the deep learning model.
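The split in step S12 can be sketched as follows. This is a minimal illustration only: the dictionary layout of the knowledge base, the function name `split_knowledge_base` and the fixed random seed are assumptions made for the example and are not part of the application.

```python
import random

def split_knowledge_base(knowledge_base, test_per_faq=3, seed=0):
    """Split {faq_name: [similar questions]} into train and test lists.

    Each returned item is a (question, faq_name) pair; the FAQ name is the category.
    The dict layout and the seed are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    train, test = [], []
    for faq, questions in knowledge_base.items():
        shuffled = questions[:]          # copy so the knowledge base is not modified
        rng.shuffle(shuffled)            # random extraction within each FAQ
        test += [(q, faq) for q in shuffled[:test_per_faq]]
        train += [(q, faq) for q in shuffled[test_per_faq:]]
    return train, test

# Toy knowledge base matching the example: 10 FAQs with 20 similar questions each.
kb = {f"faq_{i}": [f"question {i}-{j}" for j in range(20)] for i in range(10)}
train_set, test_set = split_knowledge_base(kb, test_per_faq=3)
print(len(train_set), len(test_set))  # 170 30
```

Because the extraction is done per FAQ, every category is represented in both sets, which the later per-category statistics rely on.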
It should be noted that, the number of categories included in the knowledge base and the number of similar questions included in each category are not limited to the examples in the embodiments, and are not repeated here.
S13: training a deep learning model on the training set, and performing ambiguity detection by using the learned deep learning model.
The deep learning model includes: feature extractor, shallow classifier.
The ambiguity detection includes: category ambiguity detection, annotation error detection and annotation ambiguity detection;
the ambiguity includes:
category ambiguity: that is, the meanings of two categories are very similar; for example, category 1 is an "order problem" and category 2 is a "product change or cancellation problem", and the semantics of category 1 and category 2 overlap because category 1 basically covers category 2;
labeling ambiguity: that is, a question could be labeled with more than one category at the same time; for example, category 1 is "return problem of product" and category 2 is "price problem of product", and if the question is "this thing is too expensive, I want to return it", the question has labeling ambiguity because it contains the meanings of both categories;
labeling errors: that is, a question is assigned to the wrong category; for example, category 1 is "return problem of product" and category 2 is "price problem of product", and if the question is "I do not want it anymore" but is labeled as category 2, a labeling error has occurred.
The ambiguity detection is for a test set or/and training set.
The performing ambiguity detection by using the learned deep learning model comprises:
detecting ambiguity by using a feature extractor in the deep learning model;
detecting ambiguity by using a shallow classifier in the deep learning model;
the detecting ambiguity using the feature extractor in the deep learning model includes:
converting similar questions in a data set into feature vectors by using a feature extractor in the deep learning model, wherein the data set comprises a training set or/and a testing set;
combining the feature vectors corresponding to the questions into question feature vector pairs (x, y), wherein the questions corresponding to the feature vectors x and the questions corresponding to the feature vectors y are respectively from different categories;
calculating the vector similarity cos(x, y) of each question feature vector pair, wherein $\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$;
sorting all question feature vector pairs by vector similarity from high to low, selecting the question feature vector pairs with the highest vector similarity, and judging whether ambiguity exists according to the selected question feature vector pairs.
The judging whether ambiguity exists according to the question feature vector pair with the top rank of the vector similarity comprises the following steps:
judging whether a labeling ambiguity or a labeling error exists: extracting a first preset number, for example 30, of question feature vector pairs with the highest similarity, and manually checking whether the corresponding question pairs have labeling ambiguity or labeling errors;
judging whether category ambiguity exists: for the first preset number of question feature vector pairs, counting the number of times each corresponding category pair occurs, sorting the category pairs by number of occurrences from high to low, taking a second preset number, for example 20, of category pairs, and manually checking whether category ambiguity exists.
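The feature-extractor-based detection above can be sketched as follows. The sketch assumes that cos(x, y) is the usual cosine similarity and that feature vectors are already available; the toy vectors, the function names and the small preset numbers are hypothetical and chosen only so that the example is easy to follow.

```python
import math
from collections import Counter
from itertools import combinations

def cosine(x, y):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def top_cross_category_pairs(items, first_n=30):
    """items: list of (question, category, feature_vector).

    Returns the first_n most similar pairs whose questions come from different categories.
    """
    pairs = []
    for (q1, c1, v1), (q2, c2, v2) in combinations(items, 2):
        if c1 != c2:                      # only pair questions from different categories
            pairs.append((cosine(v1, v2), q1, c1, q2, c2))
    pairs.sort(key=lambda p: p[0], reverse=True)
    return pairs[:first_n]

def top_category_pairs(top_pairs, second_n=20):
    """Count how often each category pair occurs among the top question pairs."""
    counts = Counter(tuple(sorted((c1, c2))) for _, _, c1, _, c2 in top_pairs)
    return counts.most_common(second_n)

# Hand-made 3-dimensional vectors standing in for feature extractor output.
items = [
    ("how do I return this", "return", [1.0, 0.1, 0.0]),
    ("too expensive, I want to return it", "price", [0.9, 0.3, 0.0]),
    ("why is it so expensive", "price", [0.1, 1.0, 0.0]),
    ("where is my parcel", "shipping", [0.0, 0.1, 1.0]),
]
top = top_cross_category_pairs(items, first_n=3)
candidate_categories = top_category_pairs(top, second_n=2)
print(top[0][1], "|", top[0][3])
print(candidate_categories[0])
```

The question pairs in `top` are the ones handed to a human reviewer for labeling ambiguity or labeling errors, and the category pairs in `candidate_categories` are the ones reviewed for category ambiguity. Comparing all pairs is quadratic in the number of questions, so a large knowledge base would likely need batching or an approximate nearest-neighbour search.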
The detecting ambiguity using the deep learning model shallow classifier includes:
counting the classification results of the deep learning model and forming a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, the element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and the element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i;
calculating the number of samples labeled as category i in the data set, wherein the number of samples of category i is $\sum_k x_{ik}$, where k is any category;
calculating the number of samples labeled as category j in the data set, wherein the number of samples of category j is $\sum_k x_{jk}$, where k is any category;
calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that are predicted as category j by the deep learning model, and the proportion $P_{ji}$ of samples labeled as category j that are predicted as category i, wherein $P_{ij}$ and $P_{ji}$ are calculated as:
$$P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}, \qquad P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$$
wherein category i and category j are different categories, and the data set comprises a training set or/and a testing set;
calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$:
$$S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$$
and judging whether the category i and the category j have ambiguity according to the confusion degree.
Judging whether the category i and the category j have ambiguity according to the confusion degree comprises the following steps:
sorting the calculated confusion;
extracting a third preset number, for example 5, of category pairs with the highest confusion degree, and manually checking whether category ambiguity exists.
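The confusion-degree computation and ranking can be sketched as follows. The sketch stores the confusion matrix as a counter keyed by (labeled category, predicted category) and treats the confusion degree as 0 when both proportions are 0; that convention, the function name and the toy labels are assumptions made for the example.

```python
from collections import Counter

def confusion_degrees(labels, predictions, third_n=5):
    """Rank category pairs by the harmonic mean of their mutual misclassification rates."""
    x = Counter(zip(labels, predictions))   # x[(i, j)]: labeled i, predicted j
    n = Counter(labels)                      # n[i]: number of samples labeled i (row sum)
    categories = sorted(n)
    scores = {}
    for a in range(len(categories)):
        for b in range(a + 1, len(categories)):
            i, j = categories[a], categories[b]
            p_ij = x[(i, j)] / n[i]
            p_ji = x[(j, i)] / n[j]
            # Harmonic mean; defined as 0 here when neither category is confused with the other.
            s_ij = 2 * p_ij * p_ji / (p_ij + p_ji) if p_ij + p_ji > 0 else 0.0
            scores[(i, j)] = s_ij
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:third_n]

labels = ["return"] * 4 + ["refund"] * 4 + ["price"] * 2
predictions = ["return", "return", "refund", "refund",   # 2 of 4 "return" predicted as "refund"
               "refund", "return", "refund", "refund",   # 1 of 4 "refund" predicted as "return"
               "price", "price"]
ranked = confusion_degrees(labels, predictions)
print(ranked[0])  # the ("refund", "return") pair, with confusion degree 1/3
```

Using the harmonic mean means a pair scores high only when the confusion runs in both directions: here the proportions 0.25 and 0.5 combine to 1/3, while a pair confused in one direction only scores 0.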
Detecting ambiguity by using the shallow classifier of the deep learning model further comprises: finding data for which the actual category labeled in the data set is inconsistent with the category predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises a training set or/and a testing set.
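Collecting the inconsistent samples for manual review can be sketched in a few lines. The function name and the toy data are hypothetical; the predictions would come from the trained model.

```python
def find_label_mismatches(questions, labels, predictions):
    """Return (question, labeled category, predicted category) where the two categories differ."""
    return [(q, y, p) for q, y, p in zip(questions, labels, predictions) if y != p]

questions = ["I do not want it anymore", "why is it so expensive", "how do I return this"]
labels = ["price", "price", "return"]
predictions = ["return", "price", "return"]
suspects = find_label_mismatches(questions, labels, predictions)
print(suspects)  # [('I do not want it anymore', 'price', 'return')]
```

A mismatch is only a candidate: the reviewer still decides whether the label or the model is wrong.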
S14: updating the knowledge base according to the ambiguity detection result, including:
manually rewriting and manually re-labeling the detected ambiguous questions, and deleting the original labels;
and regrouping and reassigning the similar questions of the detected ambiguous categories, and deleting the original ambiguous categories.
S15: repeating the steps until the learning effect is not improved, and obtaining the disambiguated knowledge base.
The learning effect is the consistency rate of the model prediction result and the actual category marked by the question in the test set, and the consistency rate is, for example, the prediction accuracy rate, that is, the number of questions with consistent prediction result divided by the total number of questions. The learning effect is not improved any more, for example, the prediction accuracy is improved by less than 0.5%.
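The stopping rule of step S15 can be sketched as follows. The sketch reads "improved by less than 0.5%" as an absolute gain of less than 0.005 in accuracy, which is an interpretation; the callback `evaluate_round`, the cap on the number of rounds and the accuracy values are hypothetical.

```python
def disambiguation_loop(evaluate_round, min_gain=0.005, max_rounds=20):
    """Repeat the split/train/detect/update round until accuracy stops improving.

    evaluate_round(r) is a hypothetical callback that runs round r (steps S12 to S14)
    and returns the test-set accuracy measured in that round.
    """
    history = []
    for r in range(max_rounds):
        accuracy = evaluate_round(r)
        stalled = bool(history) and accuracy - history[-1] < min_gain
        history.append(accuracy)
        if stalled:                  # gain below the threshold: stop iterating
            break
    return history

# Made-up accuracies for five rounds; the fourth round gains only 0.002.
fake_accuracies = [0.80, 0.86, 0.89, 0.892, 0.95]
history = disambiguation_loop(lambda r: fake_accuracies[r])
print(history)  # [0.8, 0.86, 0.89, 0.892]
```

The cap `max_rounds` is a safety limit added for the sketch so that the loop always terminates.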
When the model learning effect is not improved any more, the model performance degradation caused by knowledge base ambiguity is eliminated, and the knowledge base can be used for training the model and deployed into a production environment for use.
In this embodiment, the knowledge base is updated according to the ambiguity detection result, and the training steps are repeated until the learning effect reaches the expected standard, so that the knowledge base ambiguity can be found and corrected manually, a disambiguated knowledge base is obtained, and data is extracted from the disambiguated knowledge base as a training set and a testing set of the deep learning model, so that the learning effect of the deep learning model is further improved.
Fig. 2 is a flowchart of a method for detecting ambiguity of a customer service robot knowledge base according to another embodiment of the present application.
As shown in fig. 2, the training of the deep learning model on the training set includes:
the deep learning concept is derived from the research of an artificial neural network and comprises a multi-layer perceptron with multiple hidden layers. Deep learning forms more abstract high-level representation attribute categories or features by combining low-level features to discover distributed feature representations of data. The deep learning model includes a feature extractor and a shallow classifier.
S21: inputting questions in the training set as input parts to the deep learning model;
S22: converting the question in the input part into a feature vector by using a feature extractor in the deep learning model;
the feature extractor is, for example, a recurrent neural network. The model reads in each word of the question sequentially and outputs a feature vector of a fixed dimension. It should be noted that the feature extractor is not limited to the recurrent neural network given as an example, and any method that can convert a question into a feature vector of a fixed dimension may be used as the feature extractor.
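How a recurrent network turns questions of different lengths into a vector of fixed dimension can be sketched as follows. This is a plain Elman-style recurrence with random, untrained weights; the toy vocabulary, the dimensions and the function names are assumptions made for the example, and the application does not specify the network at this level of detail.

```python
import math
import random

def make_rnn(vocab_size, embed_dim=4, hidden_dim=3, seed=0):
    """Create random (untrained) parameters for a tiny Elman-style recurrent network."""
    rng = random.Random(seed)
    def rand(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "embed": rand(vocab_size, embed_dim),   # one embedding row per word id
        "w_in": rand(hidden_dim, embed_dim),    # input-to-hidden weights
        "w_rec": rand(hidden_dim, hidden_dim),  # hidden-to-hidden weights
    }

def extract_features(params, token_ids):
    """Read the word ids in order and return the final hidden state as the feature vector."""
    h = [0.0] * len(params["w_rec"])
    for t in token_ids:
        e = params["embed"][t]
        # New hidden state from the current word and the previous hidden state.
        h = [
            math.tanh(sum(w * x for w, x in zip(w_in_row, e))
                      + sum(w * x for w, x in zip(w_rec_row, h)))
            for w_in_row, w_rec_row in zip(params["w_in"], params["w_rec"])
        ]
    return h

vocab = {"how": 0, "do": 1, "i": 2, "return": 3, "this": 4}   # hypothetical toy vocabulary
params = make_rnn(len(vocab))
vec_short = extract_features(params, [vocab[w] for w in "i return this".split()])
vec_long = extract_features(params, [vocab[w] for w in "how do i return this".split()])
print(len(vec_short), len(vec_long))  # 3 3
```

Both questions yield a 3-dimensional vector regardless of their length, which is the property the shallow classifier depends on.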
S23: calculating a prediction result according to the characteristic vector by using a shallow classifier in the deep learning model, wherein the prediction result is a category corresponding to a question in an input part;
the shallow classifier is, for example, a linear classifier. The classifier reads in a feature vector with fixed dimension, calculates the linear combination of vector elements to obtain the scoring of each category, and takes the category with the highest scoring as the prediction result. It should be noted that the shallow classifier is not limited to the linear classifier, and any method that can convert the feature vector with a fixed dimension into the score of each class may be used as the shallow classifier.
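A linear shallow classifier of this kind can be sketched as follows. The weights and the bias terms are made-up values; the bias in particular is an addition for the example, since the description only mentions a linear combination of the vector elements.

```python
def predict(weights, biases, feature):
    """Score every category with a linear combination and return (best index, scores)."""
    scores = [sum(w * x for w, x in zip(row, feature)) + b
              for row, b in zip(weights, biases)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

weights = [[1.0, 0.0],     # category 0
           [0.0, 1.0],     # category 1
           [-1.0, -1.0]]   # category 2
biases = [0.0, 0.0, 0.5]
best, scores = predict(weights, biases, [0.2, 0.9])
print(best)  # 1
```

The category with the highest score is the prediction; the scores themselves can be passed through a softmax when a loss such as cross entropy is computed.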
S24: optimizing the model by using an optimizer so as to minimize the average difference between the actual labeled categories of the questions in the training set and the prediction results of the deep learning model;
the average difference is for example a loss function. The loss function is, for example, cross entropy.
The optimizer is, for example, gradient descent. Gradient descent is an iterative method and one of the most commonly used methods for solving the model parameters of a machine learning algorithm, that is, an unconstrained optimization problem. When minimizing the loss function, gradient descent iterates step by step to obtain the minimized loss function and the corresponding model parameter values.
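Minimizing a cross-entropy loss by gradient descent can be sketched as follows for a linear classifier over fixed feature vectors. The toy data, the learning rate and the number of steps are assumptions made for the example; in the described method the feature extractor would be trained together with the classifier.

```python
import math

def softmax(scores):
    """Convert category scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mean_cross_entropy(weights, data):
    """Average of -log(probability assigned to the labeled category)."""
    return -sum(math.log(softmax(class_scores(weights, x))[y]) for x, y in data) / len(data)

def gradient_step(weights, data, lr=0.5):
    """One gradient descent update of the weights on the mean cross-entropy loss."""
    grads = [[0.0] * len(weights[0]) for _ in weights]
    for x, y in data:
        probs = softmax(class_scores(weights, x))
        for c, grad_row in enumerate(grads):
            err = probs[c] - (1.0 if c == y else 0.0)   # d(loss)/d(score of category c)
            for d, xi in enumerate(x):
                grad_row[d] += err * xi / len(data)
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grads)]

# (feature vector, labeled category index) pairs standing in for a training set.
data = [([1.0, 0.0], 0), ([0.9, 0.2], 0), ([0.1, 1.0], 1), ([0.0, 0.8], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]
losses = []
for _ in range(50):
    losses.append(mean_cross_entropy(weights, data))
    weights = gradient_step(weights, data)
print(losses[0] > losses[-1])  # True
```

With all-zero weights both categories are equally likely, so the first loss equals ln 2; each step then moves the weights against the gradient and the loss decreases.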
S25: and evaluating the trained model by using a test set, and calculating the consistency rate of the model prediction result and the actual category marked by the questions in the test set, wherein the consistency rate is used for evaluating the model learning effect, and is the prediction accuracy rate, i.e. the number of questions with consistent prediction result divided by the total number of questions.
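The consistency rate of step S25 can be computed as in the following sketch. The function name and the toy numbers are assumptions; the 30 test questions mirror the earlier split example.

```python
def consistency_rate(labels, predictions):
    """Fraction of questions whose predicted category matches the labeled category."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and of equal length")
    return sum(y == p for y, p in zip(labels, predictions)) / len(labels)

labels = ["return"] * 30
predictions = ["return"] * 27 + ["price"] * 3   # 27 of 30 test questions predicted correctly
rate = consistency_rate(labels, predictions)
print(rate)  # 0.9
```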
In this embodiment, the deep learning model is trained on the FAQs in the training set, and the optimizer continuously optimizes the model during training, so that the learning effect of the deep learning model is iteratively improved and the accuracy of ambiguity detection increases accordingly.
An embodiment of the present application provides a customer service robot knowledge base ambiguity detection device, including:
the construction module is used for constructing a knowledge base, the knowledge base is divided according to FAQs, each FAQ is provided with at least one similar question sentence, and each FAQ is of a category;
the division module is used for dividing the knowledge base into a test set and a training set of the deep learning model;
the training module is used for training a deep learning model on the training set and carrying out ambiguity detection by utilizing the learned deep learning model; the ambiguity detection includes: category ambiguity detection, annotation error detection and annotation ambiguity detection, wherein the ambiguity detection by using the learned deep learning model comprises: detecting ambiguity using a shallow classifier in the deep learning model, comprising: counting the classification results of the deep learning model and forming a confusion matrix, wherein each row i of the confusion matrix corresponds to a labeled category, each column j corresponds to a category predicted by the deep learning model, the element $x_{ij}$ is the number of questions labeled as category i and predicted by the model as category j, and the element $x_{ji}$ is the number of questions labeled as category j and predicted by the model as category i; calculating the number of samples labeled as category i in the data set, wherein the number of samples of category i is $\sum_k x_{ik}$, where k is any category; calculating the number of samples labeled as category j in the data set, wherein the number of samples of category j is $\sum_k x_{jk}$, where k is any category; calculating the proportion $P_{ij}$ of samples labeled as category i in the data set that are predicted as category j by the deep learning model, and the proportion $P_{ji}$ of samples labeled as category j that are predicted as category i, wherein $P_{ij} = \frac{x_{ij}}{\sum_k x_{ik}}$ and $P_{ji} = \frac{x_{ji}}{\sum_k x_{jk}}$, the category i and the category j are different categories, and the data set comprises a training set or/and a testing set; calculating the confusion degree of the category pair (category i, category j), wherein the confusion degree is the harmonic mean $S_{ij}$ of $P_{ij}$ and $P_{ji}$, with $S_{ij} = \frac{2 P_{ij} P_{ji}}{P_{ij} + P_{ji}}$; and judging whether ambiguity exists between the category i and the category j according to the confusion degree;
the updating module is used for updating the knowledge base according to the ambiguity detection result;
and the repeating module is used for repeating the steps until the learning effect is not improved any more, and obtaining a disambiguated knowledge base.
In some embodiments, further comprising:
the random extraction module is used for dividing the knowledge base into a test set and a training set of the deep learning model, and comprises the following steps: randomly extracting a preset number of similar questions corresponding to each FAQ as test data of the corresponding FAQ category, and taking the rest similar questions as training data of the corresponding FAQ category; all kinds of test data constitute a test set, and all kinds of training data constitute a training set.
The sorting module is used for sorting the calculated confusion degrees, extracting a third preset number of category pairs with the highest confusion degree, and manually checking whether category ambiguity exists.
The labeling module is used for further detecting ambiguity by using the shallow classifier of the deep learning model, including: finding data for which the actual category labeled in the data set is inconsistent with the category predicted by the deep learning model, and manually checking whether labeling errors exist, wherein the data set comprises a training set or/and a testing set.
One embodiment of the present application provides an electronic device, including:
at least one memory for storing a program;
at least one processor for loading the program to perform the customer service robot knowledge base ambiguity detection method as described in the above embodiments.
One embodiment of the present application provides a computer-readable storage medium having stored therein a program executable by a processor, comprising:
the processor executable program when executed by the processor is for performing the customer service robot knowledge base ambiguity detection method as described in the above embodiments.
It is to be understood that the same or similar parts in the above embodiments may be referred to each other, and that in some embodiments, the same or similar parts in other embodiments may be referred to.
It should be noted that in the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "plurality" means at least two.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.
It should be noted that the present invention is not limited to the above-mentioned preferred embodiments, and those skilled in the art can obtain other products in various forms without departing from the scope of the present invention, however, any changes in shape or structure of the present invention, and all technical solutions that are the same or similar to the present application, fall within the scope of the present invention.