CN112990432A - Target recognition model training method and device and electronic equipment

Target recognition model training method and device and electronic equipment

Info

Publication number: CN112990432A
Application number: CN202110242083.5A
Authority: CN (China)
Prior art keywords: training, sample, current, loss function, image
Legal status: Granted; currently active
Other languages: Chinese (zh)
Other versions: CN112990432B (en)
Inventor: 张梦琴
Current assignee: Beijing Kingsoft Cloud Network Technology Co Ltd
Original assignee: Beijing Kingsoft Cloud Network Technology Co Ltd
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd; priority to CN202110242083.5A; granted and published as CN112990432B.


Abstract

The application provides a target recognition model training method and device and an electronic device. A training sample set and a fitting image set are obtained; samples in the current training sample subset are input into an initial image classification model to obtain a first feature vector and a prediction label of each sample; feature extraction is performed on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image; a feature loss function value is calculated according to the first feature vectors corresponding to the positive samples and the second feature vector corresponding to the current fitting image; a cross entropy loss function value is calculated according to the prediction label and the real label corresponding to each sample; and back propagation training is performed based on the feature loss function value and the cross entropy loss function value to obtain a target recognition model. With the method and device, a target recognition model capable of recognizing whether an image contains a target can be trained, and the recognition accuracy and recall rate of the target recognition model are improved.

Description

Target recognition model training method and device and electronic equipment
Technical Field
The present application relates to the field of image recognition technologies, and in particular, to a method and an apparatus for training a target recognition model, and an electronic device.
Background
Current image classification tasks are mainly divided into traditional image classification tasks and fine-grained image classification tasks. In an image recognition scenario that only needs to recognize whether a certain target exists in an image, without recognizing detailed information such as the target's type and position, model training with a traditional image classification task easily ignores the features of key small targets, leaving the model with poor recognition capability; model training with a fine-grained classification task makes the training process and the resulting model overly complex, which hurts recognition efficiency.
Disclosure of Invention
The application aims to provide a target recognition model training method and device and an electronic device, in which a feature loss function value can be calculated through feature extraction on a fitting image, and inverse gradient propagation training is performed on the initial image classification model using the feature loss function value and a cross entropy loss function value. A target recognition model capable of recognizing whether an image contains a target can thereby be trained, and the recognition accuracy and recall rate of the target recognition model are improved.
In a first aspect, an embodiment of the present application provides a target recognition model training method, where the method is applied to an electronic device, and an initial image classification model is prestored in the electronic device; the method comprises the following steps: acquiring a training sample set and a fitting image set; the samples in the training sample set comprise positive samples and negative samples, and the images in the fitting image set are images of which the area occupation ratio of the target is greater than a set threshold value; determining a training sample subset and a current fitting image corresponding to each training round based on a training sample set and a fitting image set, and executing the following operations for each training round: inputting samples in a current training sample subset into an initial image classification model to obtain a first feature vector and a prediction label of each sample; the first feature vector is a vector output by a first middle layer of the initial image classification model; performing feature extraction on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image; calculating a characteristic loss function value of the training of the current round according to a first characteristic vector corresponding to the positive samples in the current training sample subset and a second characteristic vector corresponding to the current fitting image; calculating a cross entropy loss function value of the training of the current round according to a prediction label and a real label corresponding to each sample in the current training sample subset; and determining a total loss value of the training of the current round based on the characteristic loss function value and the cross entropy loss function value of the training of the current round, performing reverse gradient propagation training on the initial image classification model according to the total loss value of the training of the current round, and stopping the training until the training round reaches a preset number of times or the total loss value converges to a preset convergence threshold value to obtain the target recognition model.
Further, the initial image classification model comprises a convolutional neural network, an attention structure, a fusion module and a classifier which are connected in sequence; the fusion module is a first middle layer; inputting samples in a current training sample subset into an initial image classification model to obtain a first feature vector and a prediction label of each sample, wherein the method comprises the following steps: inputting samples in the current training sample subset into a convolutional neural network to obtain an original characteristic diagram corresponding to each sample; inputting the original characteristic diagram corresponding to each sample into an attention structure to obtain an attention diagram corresponding to each sample; inputting the original characteristic diagram and the attention diagram corresponding to each sample into a fusion module to obtain a first characteristic vector corresponding to each sample; and inputting the first feature vector corresponding to each sample into a classifier to obtain a prediction label corresponding to each sample.
Further, the step of inputting the original feature map and the attention map corresponding to each sample into the fusion module to obtain the first feature vector corresponding to each sample includes: for each sample corresponding original feature map and attention map, the following operations are performed: carrying out spatial standardization on the attention diagram corresponding to the sample through a softmax function to obtain a value corresponding to each pixel in the attention diagram; and taking the value corresponding to each pixel in the attention map as a weight value, and carrying out weighted summation on the original feature map corresponding to the sample to obtain a first feature vector corresponding to the sample.
Further, the second intermediate layer is a convolutional neural network; the method comprises the following steps of performing feature extraction on a current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image, wherein the step comprises the following steps: and inputting the current fitting image into the convolutional neural network to obtain a second feature vector corresponding to the current fitting image.
Further, the step of calculating the feature loss function value of the training in this round according to the first feature vector corresponding to the positive sample in the current training sample subset and the second feature vector corresponding to the current fitting image includes: calculating a first characteristic loss function value corresponding to each positive sample according to a first characteristic vector corresponding to each positive sample in the current training sample subset and a second characteristic vector corresponding to the current fitting image; and carrying out mean value calculation on the first characteristic loss function values corresponding to the positive samples to obtain the characteristic loss function values of the training of the round.
Further, the step of calculating the first feature loss function value corresponding to each positive sample according to the first feature vector corresponding to each positive sample in the current training sample subset and the second feature vector corresponding to the current fitting image includes: calculating a first characteristic loss function value for the positive sample by the following equation:
L2 = MSE(v1, v2)

where L2 represents the first feature loss function value of the positive sample; MSE() represents the mean square error function; v1 represents the first feature vector corresponding to the positive sample; and v2 represents the second feature vector corresponding to the current fitting image.
Further, the step of calculating the cross entropy loss function value of the current training according to the prediction label and the real label corresponding to each sample in the current training sample subset includes: calculating a first cross entropy loss function value corresponding to each sample according to a prediction label, a real label and a cross entropy loss function corresponding to each sample in the current training sample subset; and carrying out mean value calculation on the first cross entropy loss function values corresponding to the samples to obtain cross entropy loss function values of the training of the current round.
Further, the step of determining the total loss value of the current round of training based on the characteristic loss function value and the cross entropy loss function value of the current round of training includes: and summing the characteristic loss function value and the cross entropy loss function value of the training of the current round to obtain the total loss value of the training of the current round.
Further, the attention structure includes three convolutional layers; each convolutional layer is followed by a BN layer and a rectified linear unit.
Further, the method further comprises: every preset number of training rounds, predicting a specified image by using the target recognition model obtained from the current training, the specified image being an unlabeled target-related image; and if the confidence of the prediction result exceeds a preset threshold, adding the specified image to the training sample set for model training.
Further, the method further comprises: acquiring an image to be identified; and inputting the image to be recognized into the target recognition model to obtain a recognition result corresponding to the image to be recognized.
In a second aspect, an embodiment of the present application further provides a target recognition model training apparatus, where the apparatus is applied to an electronic device, and an initial image classification model is prestored in the electronic device; the device comprises: the image set acquisition module is used for acquiring a training sample set and a fitting image set; the samples in the training sample set comprise positive samples and negative samples, and the images in the fitting image set are images of which the area occupation ratio of the target is greater than a set threshold value; the model training module is used for determining a training sample subset and a current fitting image corresponding to each training round based on the training sample set and the fitting image set, and executing the following operations for each training round: inputting samples in a current training sample subset into an initial image classification model to obtain a first feature vector and a prediction label of each sample; the first feature vector is a vector output by a first middle layer of the initial image classification model; performing feature extraction on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image; calculating a characteristic loss function value of the training of the current round according to a first characteristic vector corresponding to the positive samples in the current training sample subset and a second characteristic vector corresponding to the current fitting image; calculating a cross entropy loss function value of the training of the current round according to a prediction label and a real label corresponding to each sample in the current training sample subset; and determining a total loss value of the training of the current round based on the characteristic loss function value and the cross entropy loss function value of the training of the current round, performing reverse gradient propagation training on the initial image classification model according to the total loss value of the training of the current round, and stopping the training until the training round reaches a preset number of times or the total loss value converges to a preset convergence threshold value to obtain the target recognition model.
In a third aspect, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the method in the first aspect.
In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium storing computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement the method of the first aspect.
In the target recognition model training method provided by the embodiment of the application, a training sample set and a fitting image set are obtained firstly; the samples in the training sample set comprise positive samples and negative samples, and the images in the fitting image set are images of which the area occupation ratio of the target is greater than a set threshold value; determining a training sample subset and a current fitting image corresponding to each training round based on a training sample set and a fitting image set, and executing the following operations for each training round: inputting samples in a current training sample subset into an initial image classification model to obtain a first feature vector and a prediction label of each sample; the first feature vector is a vector output by a first middle layer of the initial image classification model; performing feature extraction on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image; calculating a characteristic loss function value of the training of the current round according to a first characteristic vector corresponding to the positive samples in the current training sample subset and a second characteristic vector corresponding to the current fitting image; calculating a cross entropy loss function value of the training of the current round according to a prediction label and a real label corresponding to each sample in the current training sample subset; and determining a total loss value of the training of the current round based on the characteristic loss function value and the cross entropy loss function value of the training of the current round, performing reverse gradient propagation training on the initial image classification model according to the total loss value of the training of the current round, and stopping the training until the training round reaches a preset number of times or the total loss value converges to a preset convergence threshold value to obtain the target recognition model. According to the method and the device, the characteristic loss function value can be calculated through characteristic extraction of the fitted image, the initial image classification model is subjected to reverse gradient propagation training through the characteristic loss function value and the cross entropy loss function value, a target identification model which can identify whether an image carries a target or not can be trained, and the identification accuracy rate and the recall rate of the target identification model are improved.
Drawings
In order to more clearly illustrate the detailed description of the present application or the technical solutions in the prior art, the drawings needed to be used in the detailed description of the present application or the prior art description will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a target recognition model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a target recognition model training process according to an embodiment of the present disclosure;
fig. 3 is a flowchart of a target identification method according to an embodiment of the present application;
fig. 4 is a block diagram illustrating a structure of a target recognition model training apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the present application will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are some, but not all embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Current image classification tasks are mainly divided into traditional image classification tasks and fine-grained image classification tasks. A traditional image classification task extracts features from the whole image in a single pass and then classifies, no matter how large a proportion of the image the key discriminative region occupies; in fine-grained image classification, the discriminable region of an image to be classified is usually only a small area of the image, so the region of the object of interest is usually obtained first, and the object is then finely classified among several classes with small differences.
Fine-grained image classification is in turn divided into strongly supervised and weakly supervised learning. Strongly supervised learning requires additional labeled bounding boxes so that the network can learn the position information of the target, similar to an object detection task. In weakly supervised learning, the network locates the discriminative region through unsupervised learning and then focuses on the feature differences of that region to identify the category of the target.
In an image recognition scenario that only needs to recognize whether a certain target exists in an image, without recognizing detailed information such as the target's type and position, model training with a traditional image classification task easily ignores the features of key small targets, leaving the model with poor recognition capability; model training with a fine-grained classification task makes the training process and the resulting model overly complex, which hurts recognition efficiency.
Based on this, the embodiment of the application provides a method and a device for training a target recognition model, and an electronic device, wherein a feature loss function value can be calculated through feature extraction of a fitted image, and a reverse gradient propagation training is performed on an initial image classification model through the feature loss function value and a cross entropy loss function value, so that the target recognition model capable of recognizing whether an image carries a target or not can be trained, and the recognition accuracy and the recall rate of the target recognition model are improved.
For the convenience of understanding the present embodiment, a method for training a target recognition model disclosed in the embodiments of the present application will be described in detail first.
Fig. 1 is a flowchart of a target recognition model training method according to an embodiment of the present disclosure, where the method is applied to an electronic device, and an initial image classification model is prestored in the electronic device; the initial image classification model may be implemented in various ways, and is not limited in any way. The target may be a gun, a knife, or other articles, the target recognition model trained by the target recognition model training method provided in this embodiment may quickly determine whether a certain image includes or carries a target, and the target recognition model training method specifically includes the following steps:
and step S11, acquiring a training sample set and a fitting image set.
The samples in the training sample set comprise positive samples and negative samples: positive samples are images containing the target, and negative samples are images that do not contain the target. The images in the fitting image set are images in which the area ratio of the target is greater than a set threshold, for example, images containing only the target, or images in which the target's area ratio exceeds a certain threshold such as 95%; the threshold can be adjusted according to the actual situation.
And step S12, determining a training sample subset and a current fitting image corresponding to each training round based on the training sample set and the fitting image set, and executing the following operations for each training round until the training round reaches a preset number of times or the total loss value converges to a preset convergence threshold value, so as to obtain a target recognition model.
During model training, a training sample subset and a current fitting image corresponding to current training need to be determined from a training sample set and a fitting image set, for example, 20 images are selected from the training sample set as samples in the training sample subset corresponding to the current training, and one fitting image is randomly selected from the fitting image set as the current fitting image. And then, executing a model training process of the following five steps, and stopping training until the training round reaches a preset number (such as 100 times) or the total loss value converges to a preset convergence threshold value to obtain the target recognition model.
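For illustration, the per-round selection can be sketched as follows (a minimal sketch; the helper name, the use of Python's random module, and the data structures are assumptions, not part of the disclosure):

```python
import random

def sample_round_data(training_sample_set, fitting_image_set, batch_size=20):
    # Hypothetical helper: draw the training sample subset (e.g. 20 images,
    # as in the example above) and one current fitting image for a round.
    subset = random.sample(training_sample_set, batch_size)
    current_fitting_image = random.choice(fitting_image_set)
    return subset, current_fitting_image
```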
The following five steps are performed for each round of training:
step S121, inputting samples in a current training sample subset into an initial image classification model to obtain a first feature vector and a prediction label of each sample; and the first feature vector is a vector output by the first intermediate layer of the initial image classification model.
The above-mentioned process of obtaining the first feature vector may include multiple ways, and the first intermediate layer for extracting the feature vector is also different for the initial image classification models of different structures. In this embodiment, the first intermediate layer may be a fusion module, which outputs a first feature vector of the sample after fusing the feature map extracted by the neural network and the attention map extracted by the attention structure.
On the basis of obtaining the first feature vector of the sample, the classifier may further output a classification result, that is, a prediction label of the sample, for example, the label includes Y and N, Y indicates that the sample is an image containing the target, and N indicates that the sample is an image not containing the target.
And step S122, performing feature extraction on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image.
The second intermediate layer occupies a different structural position in the initial image classification model from the first intermediate layer; when the current fitting image is input into the initial image classification model, the second feature vector is output through the second intermediate layer.
Step S123, calculating a feature loss function value of the training in the current round according to the first feature vector corresponding to the positive sample in the current training sample subset and the second feature vector corresponding to the current fitting image.
The feature loss function value can be calculated by substituting the two kinds of feature vectors into a preset feature loss function. If there is only one positive sample, the first feature vector corresponding to that positive sample and the second feature vector corresponding to the current fitting image are substituted directly into the preset feature loss function. In general there are multiple positive samples, so the feature loss function value of each positive sample is calculated separately, and the average of the feature loss function values corresponding to the multiple positive samples is taken as the feature loss function value of this round of training.
Step S124, calculating a cross entropy loss function value of the training in the current round according to the predicted label and the real label corresponding to each sample in the current training sample subset.
Similarly, the calculation of the cross entropy loss function value can also be realized by adopting a preset calculation formula, and the average value of the cross entropy loss function values corresponding to a plurality of samples can be taken as the cross entropy loss function value of the training in the current round.
And step S125, determining a total loss value of the training of the current round based on the characteristic loss function value and the cross entropy loss function value of the training of the current round, and performing reverse gradient propagation training on the initial image classification model according to the total loss value of the training of the current round.
In this step, the characteristic loss function value and the cross entropy loss function value of the current round of training are added to obtain a total loss value of the current round of training, and then the initial image classification model is subjected to inverse gradient propagation training through the total loss value.
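A minimal PyTorch-style sketch of one full training round (steps S121 to S125) follows; the model interface returning (v1, logits) and the extract_v2 method are hypothetical conveniences, not part of the disclosure:

```python
import torch

def train_one_round(model, samples, labels, fit_image, optimizer):
    # Hypothetical interface: model(samples) -> (v1, logits);
    # model.extract_v2(fit_image) -> second feature vector v2.
    v1, logits = model(samples)
    v2 = model.extract_v2(fit_image)
    pos = labels == 1
    # Feature loss (Loss2): mean MSE between positive-sample v1 and v2.
    feat_loss = torch.nn.functional.mse_loss(v1[pos], v2.expand_as(v1[pos]))
    # Cross entropy loss (Loss1), averaged over all samples in the subset.
    ce_loss = torch.nn.functional.cross_entropy(logits, labels)
    total = feat_loss + ce_loss   # total loss of the current round
    optimizer.zero_grad()
    total.backward()              # inverse gradient propagation
    optimizer.step()
    return total.item()
```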
Through a certain number of cyclic training rounds, an ideal target recognition model is finally obtained. The target recognition model training method provided by the embodiment of the application adds feature vector extraction of the fitting image so that a feature loss function value can be calculated; inverse gradient propagation training is then performed on the initial image classification model through the feature loss function value and the cross entropy loss function value. A target recognition model that can identify whether an image carries the target can thus be trained, and the recognition accuracy and recall rate of the target recognition model are improved.
In the following, a preferred embodiment is listed, in which the training process of the target recognition model is implemented by adding an attention mechanism, and as shown in fig. 2, in the embodiment of the present application, the initial image classification model includes a convolutional neural network, an attention structure, a fusion module, and a classifier, which are connected in sequence; the fusion module, i.e. the first intermediate layer, may output a first feature vector of the sample.
The specific model training process is as follows:
(1) The feature extraction steps are performed simultaneously on the current training sample subset and the corresponding current fitting image:
the feature extraction process for the current training sample subset is as follows:
A. and inputting the samples in the current training sample subset into a convolutional neural network to obtain an original characteristic diagram corresponding to each sample.
In the embodiment of the present application, ResNet50 (Residual Network) is used to extract the feature maps of the samples in the current training sample subset; other mainstream convolutional neural networks, such as VGG or ResNet152, may also be used. Model parameters trained on the ImageNet image database are adopted at initialization, and during training only the last fully connected layer needs to be modified for the binary classification problem of whether the current sample carries the target. The input size of all sample data is first scaled to 224 x 224, and in the embodiment of the present application, the feature map of the last convolutional layer of the ResNet50 model is taken as the original feature map Vs of the samples in the current training sample subset.
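A sketch of this backbone setup, assuming PyTorch/torchvision (the weights API shown requires torchvision >= 0.13):

```python
import torch.nn as nn
from torchvision import models

# ImageNet-initialized ResNet50 with the last fully connected layer
# replaced for the binary target / no-target classification problem.
backbone = models.resnet50(weights="IMAGENET1K_V1")
backbone.fc = nn.Linear(backbone.fc.in_features, 2)

# Original feature map Vs: output of the last convolutional stage
# (dropping the average pooling and fc layers); a 224x224 input
# yields a 2048-channel 7x7 map.
feature_layers = nn.Sequential(*list(backbone.children())[:-2])
```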
B. Inputting the original characteristic diagram corresponding to each sample into an attention structure to obtain an attention diagram corresponding to each sample; the attention structure comprises three convolution layers; a BN layer and a linear connection unit are connected behind each convolution layer.
After the feature map Vs is obtained from ResNet50, it is input into the Attention structure to learn the attention map Vatt. The Attention structure consists of three convolutional layers: the first layer uses 1024 convolution kernels of size 1 x 1, the second layer uses 512 convolution kernels of size 3 x 3, and the third layer uses 1 convolution kernel of size 1 x 1, while each convolution is followed by a BN layer and a rectified linear unit. The BN layer mainly serves three purposes: accelerating the training and convergence of the network; controlling gradient explosion and preventing vanishing gradients; and preventing overfitting.
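For illustration, a minimal PyTorch sketch of this attention structure follows (the 2048 input channels and the padding of the 3 x 3 layer are assumptions, chosen to match the ResNet50 feature map and preserve spatial size):

```python
import torch.nn as nn

class AttentionStructure(nn.Module):
    def __init__(self, in_channels=2048):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, 1024, kernel_size=1),     # 1024 kernels, 1x1
            nn.BatchNorm2d(1024),
            nn.ReLU(inplace=True),
            nn.Conv2d(1024, 512, kernel_size=3, padding=1),  # 512 kernels, 3x3
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 1, kernel_size=1),                # single 1x1 kernel
            nn.BatchNorm2d(1),
            nn.ReLU(inplace=True),
        )

    def forward(self, vs):
        # vs: original feature map, shape (B, C, H, W)
        return self.layers(vs)  # attention map Vatt, shape (B, 1, H, W)
```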
C. And inputting the original feature map and the attention map corresponding to each sample into a fusion module to obtain a first feature vector corresponding to each sample.
Specifically, the following operations are performed for the original feature map and the attention map corresponding to each sample: carrying out spatial standardization on the attention diagram corresponding to the sample through a softmax function to obtain a value corresponding to each pixel in the attention diagram; and taking the value corresponding to each pixel in the attention map as a weight value, and carrying out weighted summation on the original feature map corresponding to the sample to obtain a first feature vector corresponding to the sample.
The softmax function described above is as follows:

a_{i,j} = exp(Vatt(i,j)) / Σ_{i',j'} exp(Vatt(i',j'))

where a_{i,j} is the value at position (i, j) in the attention map Vatt after spatial normalization, that is, the weight value at position (i, j) of the original feature map, and Vatt(i,j) is the value at position (i, j) in the attention map Vatt.
The first feature vector is calculated as follows:

v1 = Σ_{i,j} x_{i,j} · a_{i,j}

where v1 represents the first feature vector corresponding to the sample; x_{i,j} represents the feature vector at position (i, j) in the original feature map Vs; and a_{i,j} is the value at position (i, j) in the attention map Vatt after spatial normalization, i.e., the weight value at position (i, j) of the original feature map.
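A compact sketch of this fusion step, assuming feature maps shaped (B, C, H, W) and attention maps shaped (B, 1, H, W):

```python
import torch

def fuse(vs, vatt):
    # Spatial softmax over all H*W positions of the attention map.
    b, c, h, w = vs.shape
    a = torch.softmax(vatt.view(b, -1), dim=1).view(b, 1, h, w)
    # Weighted sum of the original feature map:
    # v1 = sum_{i,j} x_{i,j} * a_{i,j}
    return (vs * a).sum(dim=(2, 3))  # first feature vectors, shape (B, C)
```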
The feature extraction process for the current fitted image is as follows:
A. and inputting the current fitting image into the convolutional neural network to obtain a second feature vector corresponding to the current fitting image. The convolutional neural network is the second intermediate layer of the initial image classification model.
The same deep convolutional neural network ResNet50 is used to extract features from the current fitting image. Here, the last fully connected layer of the network model is removed, and the features of the last convolutional layer are extracted as the feature vector, yielding the second feature vector v2 corresponding to the current fitting image.
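A sketch of this extraction; pooling the final map into a single vector with global average pooling is an assumption, since the disclosure only states that the last convolutional features are taken as the feature vector:

```python
import torch

def extract_v2(feature_layers, fit_image):
    # feature_layers: ResNet50 trunk without pooling/fc (see earlier sketch).
    fmap = feature_layers(fit_image)   # e.g. (1, 2048, 7, 7)
    return fmap.mean(dim=(2, 3))       # v2, shape (1, 2048)
```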
(2) And inputting the first feature vector corresponding to each sample into a classifier to obtain a prediction label corresponding to each sample.
The first feature vector v1 corresponding to each sample is used to learn a binary linear classifier for target recognition:

ŷ = softmax(W·v1 + b)

where W and b are the linear classifier parameters. Inputting the first feature vector v1 corresponding to each sample into the classifier yields the prediction label corresponding to each sample.
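In PyTorch terms, such a classifier is a single linear layer (the 2048-dimensional input is an assumption matching the ResNet50 features):

```python
import torch.nn as nn

# Minimal binary linear classifier over v1; W and b are its parameters.
classifier = nn.Linear(2048, 2)
# logits = classifier(v1); prediction label = logits.argmax(dim=1)
```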
(3) The corresponding feature Loss function value of this round of training is calculated, as shown in Loss2 in fig. 2.
In order to train the Attention structure, the embodiment of the application also calculates a feature fitting loss, namely the loss between the second feature vector v2 of the fitting image and the first feature vector v1 used for classification. The feature loss function value corresponding to the current round of training is calculated through the following steps.
A. And calculating a first characteristic loss function value corresponding to each positive sample according to the first characteristic vector corresponding to each positive sample in the current training sample subset and the second characteristic vector corresponding to the current fitting image.
Specifically, the first feature loss function value of the positive sample is calculated by the following formula:

L2 = MSE(v1, v2)

where L2 represents the first feature loss function value of the positive sample; MSE() represents the mean square error function; v1 represents the first feature vector corresponding to the positive sample; and v2 represents the second feature vector corresponding to the current fitting image.
B. And carrying out mean value calculation on the first characteristic loss function values corresponding to the positive samples to obtain the characteristic loss function values of the training of the round.
For example, the subset of training samples in the current round includes 20 images, where 7 of the images are positive samples, and then the average of the first characteristic loss function values corresponding to the 7 positive samples can be calculated to obtain the characteristic loss function value of the current round of training.
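A sketch of steps A and B together, assuming v1_pos stacks the first feature vectors of the round's positive samples:

```python
import torch
import torch.nn.functional as F

def round_feature_loss(v1_pos, v2):
    # Per-positive-sample L2 = MSE(v1, v2), then the mean over the
    # positive samples (e.g. the 7 positives in the example above).
    per_sample = F.mse_loss(v1_pos, v2.expand_as(v1_pos),
                            reduction='none').mean(dim=1)
    return per_sample.mean()
```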
(4) The cross entropy Loss function value corresponding to this round of training is calculated, as shown in Loss1 in fig. 2.
A. And calculating a first cross entropy loss function value corresponding to each sample according to the prediction label, the real label and the cross entropy loss function corresponding to each sample in the current training sample subset.
The cross entropy loss between the prediction label ŷ and the real label y is computed, i.e., the loss to be minimized is:

L1 = CrossEntropy(ŷ, y)

where CrossEntropy() is the cross entropy loss function. The first cross entropy loss function value corresponding to each sample is calculated through this function.
B. And carrying out mean value calculation on the first cross entropy loss function values corresponding to the samples to obtain cross entropy loss function values of the training of the current round.
Continuing the above example, if the training sample subset of the current round includes 20 images, the average of the first cross entropy loss function values corresponding to the 20 samples is calculated to obtain the cross entropy loss function value of the current round of training.
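In a PyTorch sketch, this averaging is the library's default reduction (an observation about the assumed framework, not the disclosure):

```python
import torch.nn.functional as F

def round_ce_loss(logits, labels):
    # cross_entropy averages per-sample values by default, directly
    # yielding the cross entropy loss function value of the round.
    return F.cross_entropy(logits, labels)
```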
(5) The total loss value corresponding to the current round of training is calculated, as shown by Loss total in fig. 2.
And summing the characteristic loss function value and the cross entropy loss function value of the training of the current round to obtain the total loss value of the training of the current round.
The final loss function of the model is:

Loss_total = L1 + L2

Therefore, summing the feature loss function value and the cross entropy loss function value of the current round of training yields the total loss value of the current round of training.
(6) And (4) carrying out back propagation training. And carrying out back propagation training based on the total loss value of the training of the current round obtained by the calculation.
And (5) repeating the steps (1) to (6) to train to obtain the target recognition model.
In addition, the samples in the training sample set need to be manually labeled before training, i.e., divided into positive samples and negative samples. Because data labeling is costly, little training data is available when training the initial image classification model. To improve the generalization capability of the model, the embodiment of the application further adopts semi-supervised training, adding a large amount of unlabeled target-related data into training.
Namely: in the model training process, predicting the specified image by using a target recognition model obtained by current training every preset training turn; designating the image as a target related image without labeling; and if the confidence of the prediction result exceeds a preset threshold, adding the specified image to the training sample set to perform model training.
In practical application, a certain threshold k can be set. First, the trained target recognition model is loaded to predict the unlabeled data, and images with confidence greater than the threshold k are automatically selected and added to training; the model automatically reselects from the unlabeled data once every n epochs of training, and the size of the threshold k is adjusted by observing the amount of selected data and the test results during model training. Through such model fine-tuning, the accuracy and generalization capability of the model can be improved.
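A sketch of this selection loop, assuming the hypothetical model interface from the earlier sketches and an illustrative value for k:

```python
import torch

def select_pseudo_labeled(model, unlabeled_images, k=0.95):
    # Predict unlabeled target-related images and keep those whose
    # confidence exceeds threshold k (k=0.95 is illustrative only);
    # intended to run once every n epochs during training.
    selected = []
    model.eval()
    with torch.no_grad():
        for img in unlabeled_images:
            _, logits = model(img.unsqueeze(0))
            probs = torch.softmax(logits, dim=1)
            conf, label = probs.max(dim=1)
            if conf.item() > k:
                selected.append((img, label.item()))
    return selected
```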
The target recognition model training method provided by the embodiment of the application calculates the cross entropy loss of the model's predictions while also calculating the fitting loss between the attention-weighted feature vector and the fitting image's feature vector to directly train the attention structure, thereby improving the accuracy of model recognition. Moreover, the semi-supervised approach of selecting unlabeled images during training can improve the generalization capability of the model without increasing labeling cost.
Further, an embodiment of the present application further provides a target identification method, as shown in fig. 3, the method includes the steps of:
step S302, acquiring an image to be identified;
and step S304, inputting the image to be recognized into the target recognition model to obtain a recognition result corresponding to the image to be recognized.
The target recognition model is obtained by training with the target recognition model training method of the previous embodiment. The image to be recognized is input into the target recognition model to obtain the recognition result corresponding to the image to be recognized; that is, the prediction label is obtained through the first-feature-vector extraction process of the previous embodiment and the prediction of the classifier, and the prediction label indicates whether the image to be recognized contains the target. For the specific recognition process, reference may be made to the above embodiment, which is not repeated here.
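A minimal inference sketch under the same assumed model interface (labels Y/N as in the earlier example):

```python
import torch

def recognize(model, image):
    # Returns "Y" if the image is predicted to contain the target.
    model.eval()
    with torch.no_grad():
        _, logits = model(image.unsqueeze(0))
    return "Y" if logits.argmax(dim=1).item() == 1 else "N"
```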
Based on the method embodiment, the embodiment of the application also provides a target recognition model training device, which is applied to electronic equipment, wherein an initial image classification model is prestored in the electronic equipment; referring to fig. 4, the apparatus includes:
an image set obtaining module 41, configured to obtain a training sample set and a fitting image set; the samples in the training sample set comprise positive samples and negative samples, and the images in the fitting image set are images of which the area occupation ratio of the target is greater than a set threshold value; and the model training module 42 is configured to determine a training sample subset and a current fitting image corresponding to each training round based on the training sample set and the fitting image set, and perform the following operations for each training round until the training round reaches a preset number of times or a total loss value converges to a preset convergence threshold value, so as to stop training and obtain a target recognition model.
The model training module 42 includes a feature extraction and identification module 421, a loss value calculation module 422, and a back propagation training module 423. The feature extraction and identification module 421 is used for inputting samples in the current training sample subset into the initial image classification model to obtain a first feature vector and a prediction label of each sample, where the first feature vector is a vector output by the first intermediate layer of the initial image classification model, and for performing feature extraction on the current fitting image through the second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image. The loss value calculation module 422 is configured to calculate the feature loss function value of the current round of training according to the first feature vector corresponding to each positive sample in the current training sample subset and the second feature vector corresponding to the current fitting image; calculate the cross entropy loss function value of the current round of training according to the prediction label and the real label corresponding to each sample in the current training sample subset; and determine the total loss value of the current round of training based on the feature loss function value and the cross entropy loss function value of the current round of training. The back propagation training module 423 is configured to perform inverse gradient propagation training on the initial image classification model according to the total loss value of the current round of training.
Further, the initial image classification model comprises a convolutional neural network, an attention structure, a fusion module and a classifier which are connected in sequence; the fusion module is a first middle layer; the feature extraction and identification module 421 is further configured to: inputting samples in the current training sample subset into a convolutional neural network to obtain an original characteristic diagram corresponding to each sample; inputting the original characteristic diagram corresponding to each sample into an attention structure to obtain an attention diagram corresponding to each sample; inputting the original characteristic diagram and the attention diagram corresponding to each sample into a fusion module to obtain a first characteristic vector corresponding to each sample; and inputting the first feature vector corresponding to each sample into a classifier to obtain a prediction label corresponding to each sample.
Further, the feature extraction and identification module 421 is further configured to: for each sample corresponding original feature map and attention map, the following operations are performed: carrying out spatial standardization on the attention diagram corresponding to the sample through a softmax function to obtain a value corresponding to each pixel in the attention diagram; and taking the value corresponding to each pixel in the attention map as a weight value, and carrying out weighted summation on the original feature map corresponding to the sample to obtain a first feature vector corresponding to the sample.
Further, the second intermediate layer is a convolutional neural network; the feature extraction and identification module 421 is further configured to: and inputting the current fitting image into the convolutional neural network to obtain a second feature vector corresponding to the current fitting image.
Further, the loss value calculation module 422 is further configured to: calculating a first characteristic loss function value corresponding to each positive sample according to a first characteristic vector corresponding to each positive sample in the current training sample subset and a second characteristic vector corresponding to the current fitting image; and carrying out mean value calculation on the first characteristic loss function values corresponding to the positive samples to obtain the characteristic loss function values of the training of the round.
Further, the loss value calculation module 422 is further configured to: calculating a first characteristic loss function value for the positive sample by the following equation:
L2 = MSE(v1, v2)

where L2 represents the first feature loss function value of the positive sample; MSE() represents the mean square error function; v1 represents the first feature vector corresponding to the positive sample; and v2 represents the second feature vector corresponding to the current fitting image.
Further, the loss value calculation module 422 is further configured to: calculating a first cross entropy loss function value corresponding to each sample according to a prediction label, a real label and a cross entropy loss function corresponding to each sample in the current training sample subset; and carrying out mean value calculation on the first cross entropy loss function values corresponding to the samples to obtain cross entropy loss function values of the training of the current round.
Further, the loss value calculation module 422 is further configured to: and summing the characteristic loss function value and the cross entropy loss function value of the training of the current round to obtain the total loss value of the training of the current round.
Further, the attention structure includes three convolutional layers; each convolutional layer is followed by a BN layer and a rectified linear unit.
Further, the model training module 42 is further configured to: in the model training process, every preset number of training rounds, predict specified images using the target recognition model obtained from the current training, the specified images being unlabeled target-related images; and if the confidence of the prediction result exceeds a preset threshold, add the specified image to the training sample set for model training.
Further, the above apparatus further comprises: the image recognition module is used for acquiring an image to be recognized; and inputting the image to be recognized into the target recognition model to obtain a recognition result corresponding to the image to be recognized.
The implementation principle and the generated technical effect of the target recognition model training device provided in the embodiment of the present application are the same as those of the target recognition model training method embodiment, and for brief description, the corresponding contents in the target recognition model training method embodiment may be referred to where the embodiment of the target recognition model training device is not mentioned.
An electronic device is further provided in the embodiments of the present application, as shown in fig. 5, which is a schematic structural diagram of the electronic device. The electronic device includes a processor 51 and a memory 50; the memory 50 stores computer-executable instructions capable of being executed by the processor 51, and the processor 51 executes the computer-executable instructions to implement the method.
In the embodiment shown in fig. 5, the electronic device further comprises a bus 52 and a communication interface 53, wherein the processor 51, the communication interface 53 and the memory 50 are connected by the bus 52.
The memory 50 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The communication connection between a network element of the system and at least one other network element is realized through at least one communication interface 53 (which may be wired or wireless), and the Internet, a wide area network, a local area network, a metropolitan area network, and the like can be used. The bus 52 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 52 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in fig. 5, but this does not indicate only one bus or one type of bus.
The processor 51 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 51. The processor 51 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor 51 reads the information in the memory and performs the steps of the method of the previous embodiment in combination with its hardware.
Embodiments of the present application further provide a computer-readable storage medium, where computer-executable instructions are stored, and when the computer-executable instructions are called and executed by a processor, the computer-executable instructions cause the processor to implement the method, and specific implementation may refer to the foregoing method embodiments, and is not described herein again.
The computer program product of the target recognition model training method and apparatus provided in the embodiments of the present application includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the method described in the foregoing method embodiments, and specific implementations may be found in the method embodiments, which are not repeated here.
Unless specifically stated otherwise, the relative steps, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the present application.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In the description of the present application, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present application. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. A target recognition model training method, applied to electronic equipment in which an initial image classification model is prestored, the method comprising:
acquiring a training sample set and a fitting image set, wherein the samples in the training sample set comprise positive samples and negative samples, and the images in the fitting image set are images in which the area ratio of the target exceeds a set threshold;
determining a training sample subset and a current fitting image corresponding to each training round based on the training sample set and the fitting image set, and executing the following operations for each training round:
inputting samples in a current training sample subset into the initial image classification model to obtain a first feature vector and a prediction label of each sample; wherein the first feature vector is a vector output by a first intermediate layer of the initial image classification model;
performing feature extraction on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image;
calculating a feature loss function value of the current round of training according to the first feature vectors corresponding to the positive samples in the current training sample subset and the second feature vector corresponding to the current fitting image;
calculating a cross entropy loss function value of the current round of training according to the prediction label and the real label corresponding to each sample in the current training sample subset;
and determining a total loss value of the current round of training based on the feature loss function value and the cross entropy loss function value, performing reverse gradient propagation training on the initial image classification model according to the total loss value, and stopping training when the number of training rounds reaches a preset number or the total loss value converges to a preset convergence threshold, to obtain a target recognition model.
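To make the training round of claim 1 concrete, the following is a minimal PyTorch sketch of one round. PyTorch and all names here (`model`, `train_one_round`, the pooling of the fitting-image feature map into a vector) are illustrative assumptions; the claim does not prescribe a framework, a pooling scheme, or a loss weighting beyond the sum in claim 8.

```python
import torch
import torch.nn.functional as F

def train_one_round(model, samples, labels, is_positive, fit_image, optimizer):
    """samples: (B, 3, H, W); labels: (B,) int64; is_positive: (B,) bool mask;
    fit_image: (1, 3, H, W) image whose target area ratio exceeds the threshold.
    Assumes the current subset contains at least one positive sample."""
    first_vecs, logits = model(samples)        # first intermediate layer output + predictions
    fit_map = model.backbone(fit_image)        # claim 4: the CNN is the second intermediate layer
    second_vec = fit_map.mean(dim=(2, 3))      # (1, C); global average pooling is an assumption
    pos_vecs = first_vecs[is_positive]         # first feature vectors of the positive samples
    feat_loss = F.mse_loss(pos_vecs, second_vec.expand_as(pos_vecs))  # claims 5-6
    ce_loss = F.cross_entropy(logits, labels)  # claim 7, over all samples in the subset
    total_loss = feat_loss + ce_loss           # claim 8: the total loss is their sum
    optimizer.zero_grad()
    total_loss.backward()                      # reverse gradient propagation
    optimizer.step()
    return total_loss.item()
```

Whether gradients should also flow through the fitting-image branch is left open by the claims; this sketch simply lets them.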
2. The method of claim 1, wherein the initial image classification model comprises a convolutional neural network, an attention structure, a fusion module, and a classifier connected in sequence; the fusion module is the first intermediate layer;
and the step of inputting the samples in the current training sample subset into the initial image classification model to obtain the first feature vector and the prediction label of each sample comprises:
inputting the samples in the current training sample subset into the convolutional neural network to obtain an original characteristic diagram corresponding to each sample;
inputting the original feature map corresponding to each sample into the attention structure to obtain an attention map corresponding to each sample;
inputting the original feature map and the attention map corresponding to each sample into the fusion module to obtain a first feature vector corresponding to each sample;
and inputting the first feature vector corresponding to each sample into the classifier to obtain a prediction label corresponding to each sample.
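A possible module layout for the claim 2 pipeline, sketched in PyTorch; the component names, `feat_dim`, and the binary classifier head are assumptions made for illustration, not fixed by the claim.

```python
import torch.nn as nn

class TargetRecognitionModel(nn.Module):
    """Claim 2 layout: CNN -> attention structure -> fusion module -> classifier."""
    def __init__(self, backbone, attention, fusion, feat_dim=512, num_classes=2):
        super().__init__()
        self.backbone = backbone      # convolutional neural network
        self.attention = attention    # attention structure (see the sketch after claim 9)
        self.fusion = fusion          # fusion module, i.e. the first intermediate layer
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        fmap = self.backbone(x)               # original feature map, (B, C, H, W)
        amap = self.attention(fmap)           # attention map, (B, 1, H, W)
        first_vec = self.fusion(fmap, amap)   # first feature vector, (B, C)
        logits = self.classifier(first_vec)   # prediction label scores
        return first_vec, logits
```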
3. The method according to claim 2, wherein the step of inputting the original feature map and the attention map corresponding to each sample into the fusion module to obtain the first feature vector corresponding to each sample comprises:
for the original feature map and the attention map corresponding to each sample, performing the following operations:
spatially normalizing the attention map corresponding to the sample through a softmax function to obtain a value corresponding to each pixel in the attention map;
and taking the value corresponding to each pixel in the attention map as a weight, and performing weighted summation over the original feature map corresponding to the sample to obtain the first feature vector corresponding to the sample.
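Claim 3 describes softmax-normalized attention pooling. A minimal sketch of that fusion step, assuming a (B, C, H, W) feature map and a single-channel (B, 1, H, W) attention map:

```python
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """Claim 3: spatial softmax over the attention map, then a weighted sum."""
    def forward(self, fmap, amap):
        b, c, h, w = fmap.shape
        weights = F.softmax(amap.view(b, 1, h * w), dim=-1)  # one value per pixel, summing to 1
        fmap = fmap.view(b, c, h * w)
        return (fmap * weights).sum(dim=-1)                  # (B, C) first feature vector
```

The spatial softmax guarantees the weights form a distribution over pixels, so the weighted sum collapses the feature map into one vector dominated by the attended region.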
4. The method of claim 2, wherein the second intermediate layer is the convolutional neural network;
and the step of performing feature extraction on the current fitting image through the second intermediate layer of the initial image classification model to obtain the second feature vector corresponding to the current fitting image comprises:
inputting the current fitting image into the convolutional neural network to obtain the second feature vector corresponding to the current fitting image.
5. The method of claim 1, wherein the step of calculating the feature loss function value of the current round of training according to the first feature vectors corresponding to the positive samples in the current training sample subset and the second feature vector corresponding to the current fitting image comprises:
calculating a first feature loss function value corresponding to each positive sample in the current training sample subset according to the first feature vector corresponding to the positive sample and the second feature vector corresponding to the current fitting image;
and performing mean value calculation on the first feature loss function values corresponding to the positive samples to obtain the feature loss function value of the current round of training.
6. The method of claim 5, wherein the step of calculating the first feature loss function value corresponding to each positive sample in the current training sample subset according to the first feature vector corresponding to the positive sample and the second feature vector corresponding to the current fitting image comprises:
calculating the first feature loss function value of the positive sample by the following equation:

L2 = MSE(v1, v2)

wherein L2 represents the first feature loss function value of the positive sample; MSE( ) represents the mean square error function; v1 represents the first feature vector corresponding to the positive sample; and v2 represents the second feature vector corresponding to the current fitting image.
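Claims 5 and 6 together amount to a mean of per-positive-sample MSE values. A small sketch, assuming the positive samples' first feature vectors are stacked into one tensor (function name illustrative):

```python
import torch
import torch.nn.functional as F

def feature_loss(pos_first_vecs, second_vec):
    """pos_first_vecs: (P, C) first feature vectors of the P positive samples;
    second_vec: (C,) second feature vector of the current fitting image."""
    per_sample = torch.stack([F.mse_loss(v, second_vec) for v in pos_first_vecs])
    return per_sample.mean()   # feature loss function value of the current round
```

Since all vectors have the same length, this equals a single `F.mse_loss` over the stacked tensor, which is how the round sketch after claim 1 computes it.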
7. The method of claim 1, wherein the step of calculating the cross entropy loss function value of the current round of training according to the prediction label and the real label corresponding to each sample in the current training sample subset comprises:
calculating a first cross entropy loss function value corresponding to each sample according to the prediction label and the real label corresponding to the sample and a cross entropy loss function;
and performing mean value calculation on the first cross entropy loss function values corresponding to the samples to obtain the cross entropy loss function value of the current round of training.
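Claim 7 likewise reduces per-sample cross entropy values by their mean; a sketch in PyTorch, where `reduction="none"` exposes the per-sample first values before averaging:

```python
import torch.nn.functional as F

def cross_entropy_loss(logits, labels):
    """logits: (B, num_classes); labels: (B,) int64 real labels."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")  # (B,) first values
    return per_sample.mean()   # cross entropy loss function value of the round
```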
8. The method of claim 1, wherein the step of determining a total loss value for the current round of training based on the feature loss function values and the cross-entropy loss function values for the current round of training comprises:
and summing the characteristic loss function values and the cross entropy loss function values of the training of the current round to obtain a total loss value of the training of the current round.
9. The method of claim 2, wherein the attention structure comprises three convolutional layers, and each convolutional layer is followed by a BN layer and a linear connection unit.
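One way the claim 9 attention structure could be built, assuming the "linear connection unit" denotes a rectified linear unit (ReLU) and that the channel schedule, kernel size, and the one-channel output are illustrative choices not fixed by the claim:

```python
import torch.nn as nn

def make_attention(in_channels):
    """Three conv layers, each followed by BN and (assumed) ReLU; outputs a
    one-channel attention map with the same spatial size as the input."""
    channels = [in_channels, in_channels // 2, in_channels // 4, 1]
    layers = []
    for c_in, c_out in zip(channels[:-1], channels[1:]):
        layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                   nn.BatchNorm2d(c_out),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)
```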
10. The method of claim 1, further comprising:
in the model training process, predicting a specified image by using the target recognition model obtained by the current training every preset number of training rounds, wherein the specified image is an unlabeled target-related image;
and if the confidence of the prediction result exceeds a preset threshold, adding the specified image to the training sample set for model training.
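Claim 10 is a pseudo-labeling loop. A hedged sketch, where the confidence threshold, the calling schedule, and the container for the training set are assumptions:

```python
import torch

@torch.no_grad()
def expand_training_set(model, unlabeled_images, training_set, conf_threshold=0.95):
    """Called every preset number of training rounds on unlabeled
    target-related images (threshold and schedule are illustrative)."""
    model.eval()
    for image in unlabeled_images:
        _, logits = model(image.unsqueeze(0))
        conf, label = torch.softmax(logits, dim=-1).max(dim=-1)
        if conf.item() > conf_threshold:
            training_set.append((image, label.item()))  # pseudo-labeled sample
    model.train()
```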
11. The method of claim 1, further comprising:
acquiring an image to be identified;
and inputting the image to be recognized into the target recognition model to obtain a recognition result corresponding to the image to be recognized.
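Claim 11 usage, assuming the model from the earlier sketches; the label convention is an assumption:

```python
import torch

@torch.no_grad()
def recognize(model, image):
    """image: (3, H, W) tensor of the image to be recognized."""
    _, logits = model(image.unsqueeze(0))
    return logits.argmax(dim=-1).item()  # assumed convention: 1 = target present
```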
12. A target recognition model training apparatus, applied to electronic equipment in which an initial image classification model is prestored, the apparatus comprising:
the image set acquisition module is used for acquiring a training sample set and a fitting image set; the samples in the training sample set comprise positive samples and negative samples, and the images in the fitting image set are images of which the area ratio of the target is greater than a set threshold value;
a model training module, configured to determine, based on the training sample set and the fitting image set, a training sample subset and a current fitting image corresponding to each round of training, and to perform the following operations for each round of training: inputting the samples in the current training sample subset into the initial image classification model to obtain a first feature vector and a prediction label of each sample, wherein the first feature vector is a vector output by a first intermediate layer of the initial image classification model; performing feature extraction on the current fitting image through a second intermediate layer of the initial image classification model to obtain a second feature vector corresponding to the current fitting image; calculating a feature loss function value of the current round of training according to the first feature vectors corresponding to the positive samples in the current training sample subset and the second feature vector corresponding to the current fitting image; calculating a cross entropy loss function value of the current round of training according to the prediction label and the real label corresponding to each sample in the current training sample subset; and determining a total loss value of the current round of training based on the feature loss function value and the cross entropy loss function value, performing reverse gradient propagation training on the initial image classification model according to the total loss value, and stopping training when the number of training rounds reaches a preset number or the total loss value converges to a preset convergence threshold, to obtain a target recognition model.
13. An electronic device comprising a processor and a memory, the memory storing computer-executable instructions executable by the processor, the processor executing the computer-executable instructions to implement the method of any of claims 1 to 11.
14. A computer-readable storage medium having stored thereon computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement the method of any of claims 1 to 11.
Application CN202110242083.5A, granted as CN112990432B (en); priority date: 2021-03-04; filing date: 2021-03-04; title: Target recognition model training method and device and electronic equipment; status: Active.

Priority Applications (1)

Application number: CN202110242083.5A; priority date: 2021-03-04; filing date: 2021-03-04; title: Target recognition model training method and device and electronic equipment

Publications (2)

CN112990432A (en): published 2021-06-18
CN112990432B (en): published 2023-10-27

Family ID: 76352849

Family Applications (1)

Application number: CN202110242083.5A; status: Active; priority date: 2021-03-04; filing date: 2021-03-04

Country Status (1)

CN: CN112990432B (en)


Patent Citations (4)

WO2019184124A1 (priority date 2018-03-30; published 2019-10-03), 平安科技(深圳)有限公司: Risk-control model training method, risk identification method and apparatus, and device and medium
CN108681774A (priority date 2018-05-11; published 2018-10-19), 电子科技大学: Human body target tracking method based on generative adversarial network negative-sample enhancement
US20200285896A1 (priority date 2019-03-09; published 2020-09-10), Tongji University: Method for person re-identification based on deep model with multi-loss fusion training strategy
CN111046959A (priority date 2019-12-12; published 2020-04-21), 上海眼控科技股份有限公司: Model training method, device, equipment and storage medium

All four references were cited by the examiner.

Non-Patent Citations (1)

Wang Bowei; Pan Zongxu; Hu Yuxin; Ma Wen: "SAR Target Recognition Based on Siamese CNN with Few Samples", Radar Science and Technology, No. 06 (cited by examiner)


Also Published As

CN112990432B (en): published 2023-10-27

Similar Documents

CN112990432B (en): Target recognition model training method and device and electronic equipment
CN110020592B (en): Object detection model training method, device, computer equipment and storage medium
CN113469088B (en): SAR image ship target detection method and system under passive interference scene
US11468266B2: Target identification in large image data
CN109886335B (en): Classification model training method and device
CN111563473A (en): Remote sensing ship identification method based on dense feature fusion and pixel level attention
CN111062413A (en): Road target detection method and device, electronic equipment and storage medium
CN111814902A (en): Target detection model training method, target recognition method, device and medium
CN112183153A (en): Object behavior detection method and device based on video analysis
CN112364974B (en): YOLOv3 algorithm based on activation function improvement
CN110135505B (en): Image classification method and device, computer equipment and computer readable storage medium
CN110096938A (en): A processing method and apparatus for action behavior in video
CN112001403A (en): An image contour detection method and system
CN114595352B (en): Image recognition method and device, electronic equipment and readable storage medium
CN113011532A (en): Classification model training method and device, computing equipment and storage medium
Tang et al.: An automatic fine-grained violence detection system for animation based on modified faster R-CNN
CN114821022A (en): Credible target detection method integrating subjective logic and uncertainty distribution modeling
CN111539456A (en): Target identification method and device
CN111815582A (en): A two-dimensional code region detection method with improved background prior and foreground prior
CN112597997A (en): Region-of-interest determining method, image content identifying method and device
CN113971737A (en): Object recognition methods, electronic devices, media and program products for use in robots
WO2023154986A1 (en): Method, system, and device using a generative model for image segmentation
CN110490058B (en): Training method, device and system of pedestrian detection model and computer readable medium
CN110135428A (en): Image segmentation processing method and device
CN113469176A (en): Target detection model training method, target detection method and related equipment thereof

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
