CN113283514B

Info

Publication number: CN113283514B
Application number: CN202110603719.4A
Authority: CN
Inventors: 刘彪; 洪曙光; 林焕凯; 陈利军; 周谦; 刘双广
Original assignee: Gosuncn Technology Group Co Ltd
Current assignee: Gosuncn Technology Group Co Ltd
Priority date: 2021-05-31
Filing date: 2021-05-31
Publication date: 2024-05-21
Anticipated expiration: 2041-05-31
Also published as: CN113283514A

Abstract

The invention discloses an unknown class classification method based on deep learning, comprising the following steps: S1, acquiring a picture of a category to be identified; S2, inputting the picture of the category to be identified into a pre-trained deep-learning-based unknown class classification model to obtain a prediction result matrix of the picture; the model comprises a backbone network, a first fully connected layer, a second fully connected layer and an activation function layer, wherein the picture passes through the backbone network to obtain a feature map, the feature map passes through the first fully connected layer to obtain a first feature matrix, the first feature matrix passes through the second fully connected layer to obtain a third feature matrix, and the third feature matrix passes through the activation function layer to obtain the prediction result matrix; S3, voting on the prediction result matrix […] to obtain a prediction result […]: compare the magnitudes of […] and […], where […]; if […], let […]; otherwise […]; sort […] in descending order, whereupon the category corresponding to the maximum probability value […] is the prediction result. The method can distinguish known classes from unknown classes to obtain a prediction result; it encodes the labels of known and unknown classes, which helps improve the prediction result; and its voting mechanism makes the model more reliable and stable.

Description

Unknown class classification method, device and medium based on deep learning

Technical Field

The invention relates to the field of computer vision, in particular to an unknown class classification method based on deep learning.

Background

With the continuous development of and breakthroughs in deep learning, computer vision technology based on deep learning is receiving more and more attention. However, classifying unknown classes remains a difficulty in computer vision tasks. A known class is a class that appears in the training set; an unknown class is a class that does not. For example, if the training set of a classification task contains the categories "cat", "dog" and "bird", these are known classes; a category such as "car", which is absent from the training set, is an unknown class.

In real life, it is necessary to classify known classes well, and also to exclude unknown classes and prevent them from being misclassified into known classes. By excluding unknown classes, the stability and security of the overall computer vision algorithm can be improved.

Currently, a Bayesian probabilistic neural network is typically used to classify unknown classes. During prediction, the weights of the neurons are randomly sampled, so that several different prediction results can be obtained for the same input picture. This process is repeated multiple times, the results are collected, and a weighting operation yields the final prediction result. Because the method must run repeatedly on the same input picture, it is inefficient and unsuitable for engineering applications that demand high efficiency.

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.

Disclosure of Invention

To overcome the above defects, the invention provides an unknown class classification method based on deep learning. The method can distinguish known classes from unknown classes to obtain a prediction result; it encodes the labels of known and unknown classes, which helps improve the prediction result; and its voting mechanism makes the model more reliable and stable.

In a first aspect, the present embodiment provides a deep learning-based unknown class classification method, which includes the following steps:

S1, acquiring a picture of a category to be identified;

S2, inputting the picture of the category to be identified into a pre-trained deep-learning-based unknown class classification model to obtain a prediction result matrix of the picture; the model comprises a backbone network, a first fully connected layer, a second fully connected layer and an activation function layer, wherein the picture passes through the backbone network to obtain a feature map, the feature map passes through the first fully connected layer to obtain a first feature matrix, the first feature matrix passes through the second fully connected layer to obtain a third feature matrix, and the third feature matrix passes through the activation function layer to obtain the prediction result matrix;

S3, voting on the prediction result matrix […] to obtain the prediction result […]. The voting proceeds as follows: compare […] and […] in the prediction result matrix; if […], let […]; otherwise […]; sort […] in descending order, whereupon the category corresponding to the maximum probability value […] is the prediction result; if […], the prediction result is the unknown class, otherwise it is the known category corresponding to […]; where […] and […] denote the probability values of the one-hot code and the inverse one-hot code of the i-th category, obtained by passing the third feature matrix through the sigmoid function, the dimension being 2N; and […] is the prediction result obtained after the picture to be classified is voted on through the prediction result matrix […];

Preferably, the backbone network of the pre-trained deep-learning-based unknown class classification model uses the weights of a training network. The training network has the following structure: the feature map is split into two branches; the first branch passes through a fully connected layer to obtain a first feature matrix; the second branch passes through a fully connected layer to obtain a first uncertainty matrix; the first uncertainty matrix passes through an activation function to obtain a second uncertainty matrix; a number is randomly sampled from a Gaussian distribution, multiplied element-wise with the second uncertainty matrix, and the product is added element-wise to the first feature matrix to obtain a second feature matrix.

Preferably, the label code of a category consists of its one-hot code and its inverse one-hot code, so the label has dimension 2N.

Preferably, the loss function of the training network is composed of sigmoid cross entropy and KL divergence, as shown in the following formula:

[…]

where […] denotes the label after encoding; […] denotes the prediction result; […] denotes the second uncertainty matrix; […] denotes the first feature matrix; and […] denotes the weight balance factor.

In a second aspect, the present embodiment provides an unknown class classification device based on deep learning, which includes the following units:

an image acquisition unit for acquiring a picture of the category to be identified;

a feature extraction unit for inputting the picture of the category to be identified into a pre-trained deep-learning-based unknown class classification model to obtain a prediction result matrix of the picture; the model comprises a backbone network, a first fully connected layer, a second fully connected layer and an activation function layer, wherein the picture passes through the backbone network to obtain a feature map, the feature map passes through the first fully connected layer to obtain a first feature matrix, the first feature matrix passes through the second fully connected layer to obtain a third feature matrix, and the third feature matrix passes through the activation function layer to obtain the prediction result matrix;

a category recognition unit for voting on the prediction result matrix […] to obtain the prediction result […]. The voting proceeds as follows: compare […] and […] in the prediction result matrix; if […], let […]; otherwise […]; sort […] in descending order, whereupon the category corresponding to the maximum probability value […] is the prediction result; if […], the prediction result is the unknown class, otherwise it is the known category corresponding to […]; where […] and […] denote the probability values of the one-hot code and the inverse one-hot code of the i-th category, obtained by passing the third feature matrix through the sigmoid function, the dimension being 2N; and […] is the prediction result obtained after the picture to be classified is voted on through the prediction result matrix […];

Preferably, the backbone network of the pre-trained deep-learning-based unknown class classification model uses the weights of a training network. The training network has the following structure: the feature map is split into two branches; the first branch passes through a fully connected layer to obtain a first feature matrix; the second branch passes through a fully connected layer to obtain a first uncertainty matrix; the first uncertainty matrix passes through an activation function to obtain a second uncertainty matrix; a number is randomly sampled from a Gaussian distribution, multiplied element-wise with the second uncertainty matrix, and the product is added element-wise to the first feature matrix to obtain a second feature matrix; and the second feature matrix passes through a fully connected layer to obtain a third feature matrix.

In a third aspect, the present embodiment provides a non-volatile storage medium containing instructions that, when executed, implement the method described above.

The invention has the following advantages:

1. The unknown class classification method based on deep learning can distinguish the known class from the unknown class to obtain a prediction result;

2. The invention encodes the known and unknown class labels, which is helpful for improving the prediction result;

3. The invention makes the model more reliable and stable through the voting mechanism.

Drawings

FIG. 1 is a flow chart of a deep learning-based unknown class classification method of the present invention;

FIG. 2 is a schematic diagram of a training model based on a deep learning unknown class classification method of the present invention;

FIG. 3 is a schematic diagram of a predictive model based on a deep learning unknown class classification method of the present invention;

FIG. 4 is a schematic diagram of tag coding based on a deep learning unknown class classification method of the present invention;

FIG. 5 is a flow chart of the method for classifying unknown classes based on deep learning according to the present invention;

FIG. 6 is a schematic diagram of a deep learning-based unknown class classification device in accordance with the present invention;

FIG. 7 is a schematic diagram of a deep learning-based unknown class classification device in accordance with the present invention.

Detailed Description

The invention is further described below with reference to the drawings and examples, which should not be used to limit the scope of the invention.

Example 1

Referring to fig. 1, the embodiment provides an unknown class classification method based on deep learning, which includes the following steps:

S1: confirming N categories required by a task, collecting N categories of data, selecting pictures under different scenes, illumination and angles, and classifying the pictures to form a training set of known categories;

S2: collecting pictures in a public data set as a data set of an unknown class to form a training set of the unknown class;

S3: encoding the class labels of the known-class training set from S1 and the unknown-class training set from S2, training a deep-learning-based unknown class classification training model, and optimizing the training model with a loss function;

S4: transforming the training model obtained in S3 into a deep-learning-based unknown class classification prediction model;

S5: inputting a picture into the prediction model obtained in S4 to obtain a prediction result matrix of the picture, and obtaining the prediction result by voting.

Preferably, the backbone network is constructed based on ResNet, as shown in fig. 2. The input image of the backbone network has size 224×224×3, and the feature map output by the backbone network has size 1×1×512, where 1×1 is the height and width of the feature map and 512 is its number of channels.

In the training model of S3, the feature map is then split into two branches. The first branch passes through a fully connected layer to obtain a first feature matrix. The second branch passes through a fully connected layer to obtain a first uncertainty matrix. The first uncertainty matrix passes through a softplus activation function to obtain a second uncertainty matrix. A number is randomly sampled from a Gaussian distribution, multiplied element-wise with the second uncertainty matrix, and the product is added element-wise to the first feature matrix to obtain a second feature matrix.

Preferably, the feature dimensions of the first feature matrix, the first uncertainty matrix, the second uncertainty matrix, and the second feature matrix are each 512.

Preferably, the third feature matrix is obtained after the second feature matrix passes through the full connection layer, and the feature dimension is 2N, where N is the number of known categories.

Preferably, the third feature matrix is subjected to a sigmoid activation function to obtain a prediction result matrix.
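The sampling step of the training model can be sketched as follows. This is a minimal pure-Python illustration rather than the patent's implementation: the function names `softplus` and `sample_second_feature` are hypothetical, and the sketch assumes the Gaussian is a standard normal (mean 0, standard deviation 1), which the text does not specify.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)); the result is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sample_second_feature(first_feature, first_uncertainty, rng=random):
    """Combine the two branches of the training model into the second feature matrix.

    first_feature and first_uncertainty are flat lists of equal length
    (512 entries each in the described embodiment).
    """
    # Second uncertainty matrix: softplus applied to the first uncertainty matrix.
    second_uncertainty = [softplus(u) for u in first_uncertainty]
    # Assumption: one scalar drawn from a standard normal distribution.
    eps = rng.gauss(0.0, 1.0)
    # Element-wise multiply by the uncertainty, then element-wise add the feature.
    second_feature = [f + eps * s for f, s in zip(first_feature, second_uncertainty)]
    return second_feature, second_uncertainty, eps
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible, which is convenient when checking the arithmetic by hand.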

Preferably, in S4 the prediction model uses the weights of the training model. As shown in fig. 3, when a picture of size 224×224×3 is input into the backbone network of the prediction model, a feature map of size 1×1×512 is obtained. The feature map passes through a fully connected layer to obtain the first feature matrix. The first feature matrix passes through a fully connected layer to obtain the third feature matrix, whose feature dimension is 2N. The third feature matrix passes through a sigmoid activation function to obtain the prediction result matrix.
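The prediction path after the backbone can be sketched in a few lines. The sketch below is an assumption-laden illustration: the backbone is replaced by a random 512-dimensional vector, the layer weights are random stand-ins for trained weights, the width of the first fully connected layer is taken as 512, and the helper names `dense`, `sigmoid`, `predict_matrix` and `random_layer` are hypothetical.

```python
import math
import random


def dense(x, weights, bias):
    # Fully connected layer: weights holds one row of input weights per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_matrix(feature_map, fc1, fc2):
    """feature map -> first feature matrix -> third feature matrix -> sigmoid."""
    first_feature = dense(feature_map, *fc1)    # first fully connected layer
    third_feature = dense(first_feature, *fc2)  # second fully connected layer, 2N outputs
    return [sigmoid(z) for z in third_feature]  # prediction result matrix


def random_layer(n_in, n_out, rng):
    # Stand-in for trained weights (assumption: small uniform values, zero bias).
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
num_known = 3                                          # N known classes
feature_map = [rng.gauss(0.0, 1.0) for _ in range(512)]  # flattened 1x1x512 feature map
fc1 = random_layer(512, 512, rng)
fc2 = random_layer(512, 2 * num_known, rng)
prediction = predict_matrix(feature_map, fc1, fc2)     # 2N probabilities in (0, 1)
```

Unlike the training model, this path contains no uncertainty branch and no random sampling, so a single forward pass per picture suffices.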

Preferably, the label code of a category consists of its one-hot code and its inverse one-hot code, so the label has dimension 2N, as shown in fig. 4. Suppose the number of known classes is 3. The one-hot code of a certain known class is (0, 1, 0), which encodes that the sample belongs to that class; its inverse one-hot code is (1, 0, 1), which encodes the classes the sample does not belong to. Concatenating the two gives a label code of dimension 6: (0, 1, 0, 1, 0, 1). Similarly, the code of an unknown class is (0, 0, 0, 1, 1, 1).
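The label encoding can be illustrated with a short sketch. The function name `encode_label` and the convention of passing `None` for an unknown-class sample are hypothetical choices made here for illustration.

```python
def encode_label(class_index, num_known):
    """Return the 2N-dimensional label code: one-hot code followed by its inverse.

    class_index is the 0-based index of a known class, or None for an
    unknown-class sample (a convention assumed for this sketch).
    """
    one_hot = [0] * num_known
    if class_index is not None:
        if not 0 <= class_index < num_known:
            raise ValueError("class_index out of range")
        one_hot[class_index] = 1
    inverse_one_hot = [1 - bit for bit in one_hot]
    return one_hot + inverse_one_hot


print(encode_label(1, 3))     # [0, 1, 0, 1, 0, 1]
print(encode_label(None, 3))  # [0, 0, 0, 1, 1, 1]
```

An unknown-class sample thus gets an all-zero one-hot half and an all-one inverse half, so every output of the network has a defined target even for pictures outside the N known classes.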

Preferably, the loss function used in the training model consists of sigmoid cross entropy and KL divergence together, as shown in the following formula:

[…]
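As an illustration of how a sigmoid cross entropy term and a KL divergence term are commonly combined, one plausible form is sketched below. It is not taken from the patent text. The symbols are notation chosen here: $y$ for the encoded label, $p$ for the prediction result, $\mu$ for the first feature matrix, $\sigma$ for the second uncertainty matrix, and $\lambda$ for the weight balance factor. The sketch further assumes that $\sigma$ acts as a standard deviation and that the KL term is measured against a standard normal distribution.

```latex
\mathcal{L} = \mathcal{L}_{\mathrm{ce}} + \lambda\,\mathcal{L}_{\mathrm{kl}}

\mathcal{L}_{\mathrm{ce}} = -\sum_{j=1}^{2N} \bigl[\, y_j \log p_j + (1 - y_j)\log(1 - p_j) \,\bigr]

\mathcal{L}_{\mathrm{kl}} = \mathrm{KL}\bigl(\mathcal{N}(\mu, \sigma^2)\,\|\,\mathcal{N}(0, 1)\bigr)
  = -\frac{1}{2}\sum_{k=1}^{512} \bigl( 1 + \log\sigma_k^2 - \mu_k^2 - \sigma_k^2 \bigr)
```

Under these assumptions the cross entropy term fits the 2N sigmoid outputs to the label code, while the KL term keeps the sampled features close to a unit Gaussian, with $\lambda$ trading off the two.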

Preferably, once the prediction result matrix […] is obtained, the prediction result […] is obtained by voting, as follows:

Step one: compare the magnitudes of […] and […], where […]. If […], let […]; otherwise […];

Step two: sort […] in descending order; the category corresponding to the maximum probability value […] is the prediction result.
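The two voting steps can be sketched as follows. The exact expressions are not available in this text, so the sketch fills them with assumptions that should be read as one possible interpretation: a category's vote is its one-hot probability when that probability exceeds its inverse one-hot probability and 0 otherwise, and the picture is reported as unknown when no category wins its comparison. The function name `vote` and the `None` return value for the unknown class are hypothetical.

```python
def vote(prediction, num_known):
    """Vote on a 2N-dimensional prediction result matrix (given as a flat list).

    Returns (class_index, probability); class_index is None for the unknown class.
    """
    if len(prediction) != 2 * num_known:
        raise ValueError("prediction must have 2 * num_known entries")
    votes = []
    for i in range(num_known):
        p_one_hot = prediction[i]              # probability of "belongs to class i"
        p_inverse = prediction[i + num_known]  # probability of "does not belong to class i"
        # Assumption: keep the one-hot probability only when it wins the comparison.
        votes.append(p_one_hot if p_one_hot > p_inverse else 0.0)
    # Step two: sort the categories by vote in descending order and take the top one.
    order = sorted(range(num_known), key=lambda i: votes[i], reverse=True)
    best = order[0]
    # Assumption: if even the top vote is zero, no known class won, so report unknown.
    if votes[best] == 0.0:
        return None, 0.0
    return best, votes[best]


print(vote([0.1, 0.9, 0.2, 0.8, 0.1, 0.7], 3))  # (1, 0.9)
print(vote([0.1, 0.2, 0.3, 0.9, 0.8, 0.7], 3))  # (None, 0.0)
```

Because the decision needs only one forward pass and a comparison per category, it avoids the repeated sampling that the background section attributes to Bayesian approaches.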

The deep-learning-based unknown class classification method of this embodiment can distinguish known classes from unknown classes to obtain a prediction result; it encodes the labels of known and unknown classes, which helps improve the prediction result; and its voting mechanism makes the model more reliable and stable.

Example 2

Referring to fig. 5, the embodiment provides an unknown class classification method based on deep learning, which includes the following steps:

S1, acquiring a picture of a category to be identified;

The label code of a category consists of its one-hot code and its inverse one-hot code, so the label has dimension 2N, as shown in fig. 4. Suppose the number of known classes is 3. The one-hot code of a certain known class is (0, 1, 0), which encodes that the sample belongs to that class; its inverse one-hot code is (1, 0, 1), which encodes the classes the sample does not belong to. Concatenating the two gives a label code of dimension 6: (0, 1, 0, 1, 0, 1). Similarly, the code of an unknown class is (0, 0, 0, 1, 1, 1).

Specifically, the weights of the backbone network in the pre-trained unknown class classification model based on deep learning use the weights of the training network.

The training network has the following structure: a backbone network whose output feature map has size 1×1×512, where 1×1 is the height and width of the feature map and 512 is its number of channels.

In the training model, the feature map is then split into two branches. The first branch passes through a fully connected layer to obtain a first feature matrix. The second branch passes through a fully connected layer to obtain a first uncertainty matrix. The first uncertainty matrix passes through a softplus activation function to obtain a second uncertainty matrix. A number is randomly sampled from a Gaussian distribution, multiplied element-wise with the second uncertainty matrix, and the product is added element-wise to the first feature matrix to obtain a second feature matrix.

Example 4

Referring to fig. 6, the present embodiment provides an unknown class classification device based on deep learning, which includes the following units:

；

Example 5

Referring to fig. 7, one embodiment provides a schematic structural diagram of an unknown class identification device 70 for deep learning. The deep learning unknown class identification device 70 of this embodiment comprises a processor 71, a memory 72 and a computer program stored in said memory 72 and executable on said processor 71. The processor 71, when executing the computer program, implements the steps in the unknown class identification method embodiment of deep learning described above, such as step S1 shown in fig. 2. Or the processor 71, when executing the computer program, performs the functions of the modules/units in the above-described device embodiments.

Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 72 and executed by the processor 71 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program in the deep learning unknown class identification device 70.

The deep learning unknown class identification device 70 may include, but is not limited to, a processor 71, a memory 72. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a deep-learned unknown class identification device 70 and does not constitute a limitation of the deep-learned unknown class identification device 70, and may include more or fewer components than illustrated, or may combine certain components, or different components, e.g., the deep-learned unknown class identification device 70 may also include input-output devices, network access devices, buses, etc.

The processor 71 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 71 is the control center of the deep learning unknown class identification device 70 and connects the parts of the entire device using various interfaces and lines.

The memory 72 may be used to store the computer program and/or modules, and the processor 71 implements the various functions of the deep learning unknown class identification device 70 by running or executing the computer program and/or modules stored in the memory 72 and invoking data stored in the memory 72. The memory 72 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory 72 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, internal memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

If the modules/units integrated in the deep learning unknown class identification device 70 are implemented as software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the flow of the above method embodiments by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium and, when executed by the processor 71, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

It should be noted that the above-described apparatus embodiments are merely illustrative, and the units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the invention, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

The foregoing embodiments have been provided for the purpose of illustrating the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the foregoing embodiments are merely illustrative of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made without departing from the spirit and scope of the present invention also fall within the scope of the present invention.

Claims

1. An unknown class classification method based on deep learning comprises the following steps:

S1, acquiring a picture of a category to be identified;

voting on the prediction result matrix […] to obtain the prediction result […], the voting proceeding as follows: compare […] and […] in the prediction result matrix; if […], let […]; otherwise […]; sort […] in descending order, whereupon the category corresponding to the maximum probability value […] is the prediction result; if […], the prediction result is the unknown class, otherwise it is the known category corresponding to […]; where […] and […] denote the probability values of the one-hot code and the inverse one-hot code of the i-th category, obtained by passing the third feature matrix through the sigmoid function, the dimension being 2N; and […] is the prediction result obtained after the picture to be classified is voted on through the prediction result matrix […];

the label code of a category consists of its one-hot code and its inverse one-hot code, and the label has dimension 2N;

the backbone network of the pre-trained deep-learning-based unknown class classification model uses the weights of a training network; the training network has the following structure: the feature map is split into two branches; the first branch passes through a fully connected layer to obtain a first feature matrix; the second branch passes through a fully connected layer to obtain a first uncertainty matrix; the first uncertainty matrix passes through an activation function to obtain a second uncertainty matrix; a number is randomly sampled from a Gaussian distribution, multiplied element-wise with the second uncertainty matrix, and the product is added element-wise to the first feature matrix to obtain a second feature matrix;

and the second feature matrix passes through a fully connected layer to obtain a third feature matrix.

2. The unknown class classification method based on deep learning according to claim 1, wherein the loss function of the training network is composed of sigmoid cross entropy and KL divergence, as shown in the following formula:

[…]

where […] denotes the label after encoding; […] denotes the prediction result; […] denotes the second uncertainty matrix; […] denotes the first feature matrix; and […] denotes the weight balance factor.

3. An unknown class classification device based on deep learning, comprising the following units:

a category recognition unit for voting on the prediction result matrix […] to obtain the prediction result […], the voting proceeding as follows: compare […] and […] in the prediction result matrix; if […], let […]; otherwise […]; sort […] in descending order, whereupon the category corresponding to the maximum probability value […] is the prediction result; if […], the prediction result is the unknown class, otherwise it is the known category corresponding to […]; where […] and […] denote the probability values of the one-hot code and the inverse one-hot code of the i-th category, obtained by passing the third feature matrix through the sigmoid function, the dimension being 2N; and […] is the prediction result obtained after the picture to be classified is voted on through the prediction result matrix […];

4. The unknown class classification device based on deep learning according to claim 3, wherein the loss function of the training network is composed of sigmoid cross entropy and KL divergence, as shown in the following formula:

[…]

5. A non-volatile storage medium containing instructions that, when executed, are to implement the method of any of claims 1-2.