Disclosure of Invention
The invention aims to provide an adaptive test case selection and optimization method based on a deep neural network (DNN). The method yields a more uniformly distributed test case set that raises the proportion of inputs mispredicted by the DNN model. Only the selected test cases are labeled, which saves labeling cost, improves DNN testing efficiency, and improves the robustness of the DNN model by guiding retraining.
The technical scheme is as follows: the adaptive test case optimization and selection method based on a deep neural network processes the test cases of a DNN model for an N-class classification task as follows:
S1, inputting the unlabeled set of test cases to be selected into the DNN model, dividing it into N sets according to the DNN prediction results and the image classification targets, and dividing each of these sets into a selection set, a candidate set and a removal set by prediction confidence variance;
S2, for the image data in the selection set and the candidate set, calculating the degree of similarity between any two images by an image similarity calculation method based on model uncertainty, which specifically comprises the following steps:
S21, predicting, namely selecting one picture from each of the selection set and the candidate set, denoted picture a and picture b, and inputting them into the DNN model for classification prediction, wherein the probability vectors P and Q output by the DNN model represent the prediction probabilities of picture a and picture b respectively;
S22, reorganizing and extracting, namely selecting any 3 prediction probability values from the probability vectors P and Q respectively, thereby reorganizing each of the vectors P and Q into C(n,3) three-dimensional sub-vectors;
S23, calculating the similarity of images, namely denoting any pair of corresponding three-dimensional sub-vectors of the recombined vectors P and Q as P'(pi,pj,pk) and Q'(qi,qj,qk), regarding the two vectors as coordinate points in three-dimensional space, and projecting them onto the plane corresponding to pi and qi, so as to obtain two-dimensional coordinate points P' and Q'; the points P' and Q' form an angle with the origin O, denoted α; sin(α) is then calculated from the coordinates of P' and Q', and the absolute value of the difference between pi and qi is obtained at the same time;
Similarly, the coordinate points are projected onto the planes corresponding to pj and qj, and to pk and qk, respectively, to obtain the angles β and γ, their sines, and the corresponding absolute differences;
On this basis, a formula for measuring the similarity of probability vectors P and Q is defined, which combines the angle and difference information obtained above:
similarity(i,j,k)=sin(α)*(|pi-qi|)+sin(β)*(|pj-qj|)+sin(γ)*(|pk-qk|)
S24, accumulating the similarity, namely accumulating the similarities of the C(n,3) sub-vector pairs obtained by recombination; for specific image inputs x1 and x2, when performing a classification task with n categories, the similarity between images x1 and x2 is calculated by the formula:
Similarity(x1,x2)=∑similarity(i,j,k);
S3, taking the selection set and the candidate set optimized in step S2 as the initial sets, selecting test cases by calculating the similarity between the images in the candidate set and the images in the selection set, and returning the final selection set.
Further, in step S1, the test case set C is divided into N subsets according to the prediction results output by the DNN model, so that the test cases in each subset Ci are predicted by the DNN to belong to the same category. A prediction confidence variance is then calculated for each test case in subset Ci, the test cases in Ci are ranked by their prediction confidence variance, and each subset is divided into an initial selection set, an initial candidate set and an exclusion set according to the required number of target test cases.
Further, the prediction confidence variance is calculated as follows:
In the image classification task, given an input test case x, the predicted output of the DNN model is a vector of n values, denoted P=<p1,p2,...,pn>, where each pi represents the probability that the input x is predicted to be the i-th class, and p1+p2+...+pn=1. Sorting the vector P in descending order gives a vector P'=<p'1,p'2,...,p'n>, where p'1≥p'2≥...≥p'n. The prediction confidence variance, which measures the probability difference among the n prediction probabilities, is defined as follows:
Confidence variance:
Further, step S3 uses the image similarity calculation method to evaluate the similarity between each test case x in the candidate set and all test cases in the selection set. For each x, the minimum of these similarity values is taken as the shortest distance from x to the selection set. Among the minimum values obtained for all candidates, the test case with the maximum value is selected, so as to ensure uniform distribution of the selected test cases in the set.
The method is based on the idea of balanced test case distribution and on the uncertainty of the model. It selects a test case set whose categories are evenly distributed and whose test cases are more likely to induce mispredictions in the DNN model; such a set is more likely to reveal defects of the DNN and contributes more to improving DNN robustness through retraining. The beneficial effects of the invention include the following three points:
(1) According to the method, the image similarity calculation method is adopted, so that the testing efficiency of the DNN model can be effectively improved;
(2) Through designing a test case selection method based on the idea of test case distribution balance and the uncertainty of the model, the defect detection rate of DNN can be improved, and meanwhile, only the test cases obtained through selection are marked, so that the marking cost can be saved, and the DNN model is optimized;
(3) The method preferentially selects the test cases with a higher probability of causing the DNN model to mispredict; labeling these test cases, adding them to the original training set and retraining the DNN can remarkably improve the robustness of the DNN.
Detailed Description
The main technologies used by the invention are a deep neural network (Deep Neural Network) and an adaptive test case selection technology respectively. The following presents a method flow and details of implementation and specific implementation steps of various techniques used in the present invention.
A deep neural network (DNN) is an artificial neural network comprising an input layer, a plurality of hidden layers and an output layer; here the DNN is applied to image recognition. The hidden layers comprise convolution layers, pooling layers, fully connected layers and the like. The main function of a convolution layer is to extract features from the input data, and different convolution kernels capture different feature information. A pooling layer is usually added after several convolution layers, which helps reduce the total number of parameters of the network and helps prevent overfitting. The fully connected layer maps the extracted feature information into the label space of the sample and passes the data on to the output layer. An activation function, such as ReLU or Sigmoid, is typically applied after the convolution and fully connected layers to enhance the representation capability of the network.
In general, a deep neural network maps input image data x (a test case) to an output result y. Taking an N-class classification task as an example, a given test input x is processed by the layers of the network, and an N-dimensional vector V={v1,v2,v3,...,vN} is formed at the output layer. This vector is then normalized with the softmax function to give a probability vector P={p1,p2,p3,...,pN}, where each pi represents the probability that the network predicts input x to belong to the i-th class. The final prediction result of the DNN is the class with the highest probability value in P.
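The mapping from the output-layer vector V to the probability vector P and the predicted class can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `softmax` and `predict_class` are hypothetical, and class indices are 0-based here, whereas the text counts classes from 1.

```python
import math

def softmax(logits):
    """Normalize output-layer values v_1..v_N into probabilities p_1..p_N."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(logits):
    """Return (index of the most probable class, probability vector P)."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs
```

For example, `predict_class([1.0, 3.0, 2.0])` returns class index 1 together with a probability vector whose entries sum to 1.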
Further, test case selection is an active research problem in the field of software testing, and how to select higher-quality test data from a large amount of data has become a key question in testing intelligent software. By labeling only the higher-quality test data obtained through selection, the misprediction behavior of the DNN model can be corrected, which improves the robustness and reliability of the model. Previous test case selection methods basically fall into two types: methods guided by neuron coverage and methods based on priority.
Test case selection guided by neuron coverage: these methods use the neuron coverage of the deep neural network model to guide the selection of test cases. They typically employ a greedy algorithm that uses the neuron activation values of previous inputs as feedback to guide the selection of the next input.
Test case selection based on priority: many efficient test case selection methods adopt a priority-based guiding strategy, in which a set of priority rules is defined, the data are sorted according to these rules, and the top-ranked data are selected. Most of these strategies are based on the uncertainty of the model and select high-quality test cases according to the prediction behavior of the DNN model.
Based on the above, we further introduce the test case selection method provided by the present invention.
In the adaptive test case optimization and selection method based on a deep neural network provided by the invention, the larger the proportion of DNN mispredictions in the selected set and the more misprediction categories it covers, the more effectively the robustness of the DNN model can be improved by retraining. For a trained DNN model for an N-class classification task, the method proceeds as follows. First, the unlabeled set of test cases to be selected is input into the DNN model and divided into N sets according to the DNN prediction results; a Prediction Confidence Variance (PCV) measure is proposed and used to optimize each set, dividing it into a selection set, a candidate set and a removal set. Second, the selection set and the candidate set are processed further: an image similarity calculation method based on model uncertainty is proposed, which calculates the similarity between two images. Finally, the optimized selection set and candidate set are taken as the initial sets, test cases are selected by calculating the similarity between the images in the candidate set and the images in the selection set so that the test cases in the selection set remain evenly distributed, and the final selection set is returned.
Specifically, the adaptive test case optimizing and selecting method based on the deep neural network comprises the following steps:
Step 1: optimizing the test case set.
First, the DNN model processes the unlabeled test set and, for the N-class image classification task, outputs N subsets. For each subset, the prediction confidence variance (PCV) is calculated, the test cases in the subset are sorted by their PCV coefficients, and the test cases are divided into three types: a selection set, a candidate set and a removal set. This process is a preliminary selection, i.e., the initial selection set is composed of a selection set and a candidate set. The initial selection set includes the test cases with a higher probability of causing model misprediction, and the exclusion set includes the test cases with a lower probability of causing model misprediction.
In this process, we introduce a new index, the Prediction Confidence Variance (PCV), which partitions the set of test cases according to the uncertainty of the model predictions. In the image classification task, given an input x, the predicted output of the DNN model is a vector P=<p1,p2,...,pn> with n values, where each pi represents the probability that the input x is predicted to be the i-th class, and p1+p2+...+pn=1. Sorting vector P in descending order gives the vector P'=<p'1,p'2,...,p'n>, where p'1≥p'2≥...≥p'n. The PCV measures the probability difference among the n predicted probabilities. The definition is as follows:
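The grouping and three-way split of step 1 can be sketched as below. The PCV formula itself is not reproduced in this text, so the sketch does not implement it: the `score` argument is a hypothetical callable supplied by the caller, assumed here to return a larger value for a test case that is more likely to be mispredicted. The function names and the cut sizes `n_select` and `n_candidate` are likewise assumptions made for illustration.

```python
from collections import defaultdict

def group_by_prediction(probs_by_id):
    """Group test-case ids by predicted class (argmax of the probability vector)."""
    groups = defaultdict(list)
    for case_id, probs in probs_by_id.items():
        label = max(range(len(probs)), key=probs.__getitem__)
        groups[label].append(case_id)
    return dict(groups)

def split_subset(case_ids, score, n_select, n_candidate):
    """Rank one subset C_i by `score` and cut it into three parts.

    `score` is a placeholder for the PCV-based ranking (assumption:
    higher score means more likely to be mispredicted).
    """
    ranked = sorted(case_ids, key=score, reverse=True)
    selection = ranked[:n_select]
    candidate = ranked[n_select:n_select + n_candidate]
    removal = ranked[n_select + n_candidate:]
    return selection, candidate, removal
```

Keeping the ranking function as a parameter lets the same split logic be reused once the actual PCV definition is plugged in.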
Step 2: calculating the similarity of the images.
This step aims to keep the selected test cases uniformly distributed, and for this purpose we have devised an image similarity evaluation method based on directionality and uncertainty. The image similarity is calculated over the images in the candidate set and the selection set obtained in step 1, i.e., the preliminary set of test cases. The specific method comprises the following steps:
2.1 Prediction.
One picture is selected from each of the selection set and the candidate set, denoted picture a and picture b, and both are input into the DNN model for classification prediction. If a data set containing 10 classes is used, the DNN model outputs for each picture a probability vector of 10 elements, representing the predicted probability that the picture belongs to each class. Let the two probability vectors be P=<p1,p2,...,p10> and Q=<q1,q2,...,q10>, where vector P is the classification prediction result of picture a and vector Q is that of picture b. The elements pi and qi represent the predicted probabilities that picture a and picture b, respectively, belong to the i-th category.
2.2 Recombination and extraction.
Vectors P and Q are each recombined into C(n,3) sub-vectors, each containing 3 prediction probability values; the recombined sub-vectors are denoted P' and Q'. Both P and Q thus yield C(n,3) sub-vectors, obtained by taking combinations of 3 elements (120 sub-vectors each in the 10-class example).
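The recombination of step 2.2 can be illustrated with the standard library's `itertools.combinations`. This is a sketch under the assumption that every index triple i<j<k is enumerated exactly once; the helper names `sub_vectors` and `count_sub_vectors` are hypothetical, and indices are 0-based.

```python
from itertools import combinations
from math import comb

def sub_vectors(probs):
    """Yield ((i, j, k), (p_i, p_j, p_k)) for every index triple i < j < k."""
    for i, j, k in combinations(range(len(probs)), 3):
        yield (i, j, k), (probs[i], probs[j], probs[k])

def count_sub_vectors(n):
    """Number of three-element sub-vectors of an n-element vector, C(n,3)."""
    return comb(n, 3)
```

Applying `sub_vectors` to P and to Q yields the triples in the same order, so the t-th sub-vector of P pairs with the t-th sub-vector of Q.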
2.3 Image similarity calculation.
For any pair of corresponding sub-vectors, denoted P'(pi,pj,pk) and Q'(qi,qj,qk), the two vectors are regarded as coordinate points in three-dimensional space and projected onto the plane corresponding to pi and qi, giving two-dimensional coordinate points P' and Q'. The points P' and Q' form an angle with the origin O, denoted α; sin(α) is then calculated from the coordinates of P' and Q', and the absolute value of the difference between pi and qi is obtained at the same time;
Similarly, the coordinate points are projected onto the planes corresponding to pj and qj, and to pk and qk, respectively, to obtain the angles β and γ, their sines, and the corresponding absolute differences;
On this basis, a formula for measuring the similarity of probability vectors P and Q is defined, which combines the angle and difference information obtained above:
similarity(i,j,k)=sin(α)*(|pi-qi|)+sin(β)*(|pj-qj|)+sin(γ)*(|pk-qk|)
As a further example, take one sub-vector of P and the corresponding sub-vector of Q as A(p1,p2,p3) and B(q1,q2,q3), respectively. The coordinates of A and B are projected onto the horizontal plane corresponding to p3 and q3, resulting in two-dimensional coordinate points A' and B'. The points A' and B' form an angle with the origin O, denoted α. Using the coordinates of A' and B', we can calculate sin(α). At the same time we obtain the absolute value of the difference between p3 and q3. Similarly, we can project A and B onto the horizontal planes corresponding to p2 and q2, and to p1 and q1, resulting in the angles β and γ. On this basis, we define a formula for measuring the similarity of vectors P and Q, which integrates the angle and difference information obtained above:
similarity(1,2,3)=sin(α)*(|p3-q3|)+sin(β)*(|p2-q2|)+sin(γ)*(|p1-q1|)
The calculation starts from the sub-vectors <p1,p2,p3> and <q1,q2,q3> and is repeated in the same way for the remaining sub-vectors.
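One possible reading of the per-triple formula is sketched below. The text does not fix the projection geometry precisely, so the sketch assumes that "the plane corresponding to p3 and q3" means dropping the third coordinate (treating it as the vertical axis), that the angle is measured at the origin between the two projected points, and that a projected point lying exactly at the origin contributes a sine of 0. The names `_sin_angle` and `triple_similarity` are hypothetical.

```python
import math

def _sin_angle(a, b):
    """Sine of the angle at the origin between 2-D points a and b."""
    norm = math.hypot(*a) * math.hypot(*b)
    if norm == 0.0:
        return 0.0                       # assumption: undefined angle counts as 0
    cross = a[0] * b[1] - a[1] * b[0]
    return min(1.0, abs(cross) / norm)   # clamp against rounding error

def triple_similarity(p3, q3):
    """sin(angle after dropping axis t) * |p_t - q_t|, summed over t = 0, 1, 2."""
    total = 0.0
    for axis in range(3):
        keep = [t for t in range(3) if t != axis]
        a = (p3[keep[0]], p3[keep[1]])   # projection of the first point
        b = (q3[keep[0]], q3[keep[1]])   # projection of the second point
        total += _sin_angle(a, b) * abs(p3[axis] - q3[axis])
    return total
```

Under these assumptions the value is 0 for two identical sub-vectors and grows as they diverge, so it behaves like a distance, which matches the later description of the minimum similarity as the "shortest distance" to the selection set.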
2.4 Similarity accumulation.
The similarities of the C(n,3) sub-vector pairs obtained by recombination, each computed as in the example of step 2.3, are accumulated. For specific image inputs x1 and x2, when performing a classification task with n categories, the similarity between images x1 and x2 is defined as:
Similarity(x1,x2)=∑similarity(i,j,k)=similarity(1,2,3)+similarity(1,2,4)+...+similarity(n-2,n-1,n)
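The accumulation over all index triples can be sketched as follows. The per-triple term is repeated here so that the block runs on its own; it rests on the same assumptions as before (projection by dropping one coordinate, angle measured at the origin, sine taken as 0 when a projected point is at the origin). The function names are hypothetical.

```python
import math
from itertools import combinations

def triple_similarity(p3, q3):
    """Per-triple term: sin(projected angle) * |coordinate difference|, summed."""
    total = 0.0
    for axis in range(3):
        u, v = [t for t in range(3) if t != axis]
        norm = math.hypot(p3[u], p3[v]) * math.hypot(q3[u], q3[v])
        cross = p3[u] * q3[v] - p3[v] * q3[u]
        sin_angle = min(1.0, abs(cross) / norm) if norm else 0.0
        total += sin_angle * abs(p3[axis] - q3[axis])
    return total

def image_similarity(p, q):
    """Sum the per-triple terms over all C(n,3) index triples i < j < k."""
    if len(p) != len(q):
        raise ValueError("probability vectors must have the same length")
    return sum(
        triple_similarity((p[i], p[j], p[k]), (q[i], q[j], q[k]))
        for i, j, k in combinations(range(len(p)), 3)
    )
```

The cost grows with C(n,3) per image pair, which is 120 terms for a 10-class task.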
Step 3: adaptive test case selection.
Step 1 yields an initial selection set and a candidate set. During test case selection, the image similarity calculation method is used to evaluate the similarity between each test case x in the candidate set and all test cases in the selection set. Specifically, the method considers the minimum similarity between x and any test case in the selection set, i.e., the shortest distance from x to the selection set. After these minimum similarity values have been obtained for all candidates, the test case with the largest of them is selected, to ensure even distribution of the selected test cases in the set.
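The max-min rule described above can be sketched in a few lines. The function name `pick_next` and the `dist` parameter are hypothetical; `dist` stands for the image similarity of step 2, and the sketch assumes the selection set is non-empty, as it is after step 1.

```python
def pick_next(candidates, selected, dist):
    """Return the candidate whose minimum distance to `selected` is largest.

    `dist(x, s)` is any pairwise measure; in this method it would be the
    image similarity of step 2 (passed in by the caller).
    """
    return max(candidates, key=lambda x: min(dist(x, s) for s in selected))
```

With one-dimensional points and `dist = lambda a, b: abs(a - b)`, the candidates 1.0, 5.0 and 9.0 measured against a selection set of 0.0 and 10.0 have minimum distances 1, 5 and 1, so 5.0 is picked.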
Step 4: evaluating the adaptive test case selection method on four common data sets and four DNN models, and retraining the DNN models with the selected test case set to improve the robustness of the models.
The following describes the specific implementation steps of the present invention by way of specific examples:
(1) Data set and model
Four classical data sets are selected: MNIST, CIFAR-10, Fashion-MNIST and SVHN. Four widely used DNN models of different scales for the picture classification task are also selected: LeNet-1, LeNet-5, ResNet-20 and VGG-16. Each data set is paired with two different DNN models in the experiments to ensure the stability of the experimental results, as shown in Table 1. The MNIST dataset is a large database of handwritten digits, typically used to train various image processing systems. It contains 70,000 images of handwritten digits (0 to 9), each a grayscale image of 28x28 pixels. CIFAR-10 is a benchmark dataset widely used in machine learning and computer vision research, containing 60,000 color images of 32x32 pixels, divided into 50,000 training images and 10,000 test images. The Fashion-MNIST dataset is intended to provide a dataset closer to real-world problems. It contains 70,000 grayscale images from 10 classes, of which 60,000 samples are used to train the model and the remaining 10,000 samples are used to evaluate its performance and generalization ability. The SVHN dataset is a large-scale dataset for digit recognition, sourced from Google Street View data. SVHN contains over 600,000 color images, divided into a training set of 73,257 images, a test set of 26,032 images, and an additional set of 531,131 images.
Table 1. Datasets and DNN models
(2) Test case optimization. The test set data are input into the trained DNN model and divided into 10 sets C={C1,C2,C3,...,C10} according to the DNN prediction results, i.e., the test cases in each set Ci (i∈[1,10]) are predicted by the DNN to belong to the same category. We then calculate the PCV coefficients of the test cases in set Ci, sort the test cases in the set by PCV coefficient, and divide them into three categories: an initial selection set, an initial candidate set and an exclusion set. The initial selection set includes the test cases with a higher probability of causing model misprediction, while the exclusion set includes those with a lower probability. The remaining test cases are placed in the candidate set.
(3) Test case selection process. The selection set and the candidate set obtained after the optimization in step (2) are kept as the initial selection set and the initial candidate set of the selection process. First, we set the termination condition of the selection flow: when the number of selected test cases reaches the target we set, the algorithm stops and returns the selected test case set. Next, within each predicted category, we calculate the minimum image similarity from each test case x in the candidate set to the selection set of that category, and store this value in the corresponding similarity queue. We then sort the similarity queue in descending order and reorder the candidate set accordingly. The first batch elements of the candidate set are then selected and added to the selection set. Once the number of test cases in the selection set reaches the target number, the test case set is returned.
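The selection loop described in (3) can be sketched for a single predicted category as follows. The function name `adaptive_select` and the parameters `dist`, `target` and `batch` are hypothetical; `dist` stands for the image similarity of step 2, and the sketch assumes the initial selection set is non-empty. Running it once per category and merging the results is left to the caller.

```python
def adaptive_select(selected, candidates, dist, target, batch=1):
    """Grow `selected` from `candidates` until it holds `target` test cases."""
    selected = list(selected)
    candidates = list(candidates)
    while len(selected) < target and candidates:
        # Similarity queue: minimum distance from each candidate to the selection set.
        queue = [min(dist(x, s) for s in selected) for x in candidates]
        # Candidate indices sorted by that value, largest first.
        order = sorted(range(len(candidates)), key=queue.__getitem__, reverse=True)
        take = min(batch, target - len(selected))
        chosen = set(order[:take])
        selected.extend(candidates[i] for i in order[:take])
        candidates = [c for i, c in enumerate(candidates) if i not in chosen]
    return selected
```

A larger `batch` recomputes the similarity queue less often, at the cost of adding several test cases per round without re-checking their mutual distances.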