CN110516512B - Training method of pedestrian attribute analysis model, pedestrian attribute identification method and device


Info

Publication number
CN110516512B
CN110516512B
Authority
CN
China
Prior art keywords
pedestrian
attribute
training
image
representing
Prior art date
2018-05-21
Legal status
Active
Application number
CN201810488759.7A
Other languages
Chinese (zh)
Other versions
CN110516512A (en)
Inventor
王睿
Current Assignee
Beijing Authenmetric Data Technology Co ltd
Original Assignee
Beijing Authenmetric Data Technology Co ltd
Priority date
2018-05-21
Filing date
2018-05-21
Application filed by Beijing Authenmetric Data Technology Co ltd
Priority to CN201810488759.7A
Publication of CN110516512A
Application granted
Publication of CN110516512B
Status: Active

Abstract

The embodiment of the invention discloses a training method and a training device for a pedestrian attribute analysis model. The training method comprises the following steps: inputting a pedestrian image and a probability map corresponding to the pedestrian image into a convolutional neural network to obtain a predicted attribute, wherein the probability map characterizes a set of probability values that each pixel node belongs to each of at least one pedestrian component area partitioned from the pedestrian image; calculating a training loss using the real attribute corresponding to the pedestrian image and the predicted attribute; and if the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of the pedestrian attribute analysis model, to obtain the pedestrian attribute analysis model. When the pedestrian attribute analysis model trained by this method is used to identify pedestrian images of unknown attributes, it can accurately identify the pedestrian attributes even from pedestrian images that differ greatly in application scene, pedestrian pose, and camera angle.

Description

Training method of pedestrian attribute analysis model, pedestrian attribute identification method and device
Technical Field
The invention relates to the technical field of image processing, in particular to a training method and a training system of a pedestrian attribute analysis model, and a pedestrian attribute identification method and an identification device.
Background
Pedestrian attribute identification (Pedestrian attribute recognition) refers to a technique of processing and analyzing pictures to identify pedestrian attributes, where the pedestrian attributes include body appearance characteristics (e.g., height, weight, etc.), wearing characteristics (e.g., type and color of coat, pants, backpack, etc.), facial characteristics (e.g., age, gender, race, etc.).
Currently, pedestrian attribute recognition is mainly based on neural network models and generally comprises two stages: model training and model recognition. In the model training stage, labeled pictures are used as input data to train the neural network model until model parameters that meet the requirements are obtained; the neural network model with these parameters fixed is the trained model. In the recognition stage, the picture to be identified is used as input data of the trained neural network model, and the output data gives the pedestrian attributes identified from the picture.
However, the pedestrian attribute identification method based on a neural network model has some problems in practical application. One of them is that, owing to the diversity of pedestrian pictures, the pedestrian attributes cannot be accurately identified in all pictures. Specifically, in actual application scenes the collected pedestrian pictures are highly diverse because of differences in the monitoring scene, camera angle, pedestrian clothing, pose, and so on; a given neural-network-based method may identify some pictures well, yet fail to accurately identify pedestrian attributes from other pictures whose application scene, camera angle, etc. differ greatly.
Disclosure of Invention
In order to solve the technical problems, the application provides a neural network model training method for identifying pedestrian attributes and a method for identifying pedestrian attributes by using the trained neural network model, so that pedestrian attributes can be accurately identified from various pictures.
In a first aspect, a training method of a pedestrian attribute analysis model is provided, including:
inputting a pedestrian image and a probability map corresponding to the pedestrian image into a convolutional neural network to obtain a predicted attribute; the probability map characterizes a set of probability values that each pixel node belongs to each of at least one pedestrian component area partitioned from the pedestrian image;
calculating training loss by using the real attribute corresponding to the pedestrian image and the predicted attribute;
and if the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of the pedestrian attribute analysis model to obtain the pedestrian attribute analysis model.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the calculating of the probability map includes the following steps:
inputting a pedestrian image into a pedestrian analysis model to obtain a probability map corresponding to the pedestrian image; the pedestrian analysis model is a full convolution neural network trained by training images with real probability map labels.
With reference to the first implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the convolutional neural network includes a first sub-network and a second sub-network;
the step of inputting the pedestrian image and the probability map corresponding to the pedestrian image into a convolutional neural network to obtain the prediction attribute comprises the following steps:
extracting pedestrian characteristics from the pedestrian image by using a first subnetwork;
updating the probability map;
obtaining fusion characteristics according to the updated probability map and the pedestrian characteristics;
and inputting the fusion characteristics into a second sub-network to obtain the prediction attribute.
With reference to the first aspect and the foregoing possible implementation manners, in a third possible implementation manner of the first aspect, the step of updating the probability map specifically includes computing:

p̂_i^{s,c} = 1, if c = argmax_{c'} p_i^{s,c'}; otherwise p̂_i^{s,c} = p_i^{s,c}

wherein x_i is the i-th pedestrian image;
p_i^{s,c} represents the probability value that the s-th pixel node in the i-th pedestrian image belongs to the c-th class pedestrian component area;
p̂_i^{s,c} represents the updated probability value;
argmax_{c'} p_i^{s,c'} denotes taking, for the s-th pixel node, the maximum of p_i^{s,c'} over the C' pedestrian component area classes;
and/or,
the step of obtaining the fusion characteristic according to the updated probability map and the pedestrian characteristic specifically comprises the following steps:
convolving and fusing the updated probability map and the pedestrian feature to obtain a first feature:

φ(x_i)^c = p̃_i^c ⊗ f_b(x_i)

φ(x_i) = [φ(x_i)^1, φ(x_i)^2, …, φ(x_i)^{C'}]

wherein φ(x_i)^c represents the first feature of the c-th channel of the i-th pedestrian image;
p̂_i^c represents the set of updated probability values of the c-th channel;
p̃_i^c represents the set of probability values of the c-th channel after being copied so that its channel number matches that of the pedestrian feature f_b(x_i);
⊗ represents multiplication of pixels at corresponding positions;
f_b(x_i) represents the pedestrian feature of the i-th pedestrian image;
φ(x_i) represents the first feature of the i-th pedestrian image;
and obtaining a fusion characteristic by utilizing the first characteristic and the pedestrian characteristic.
With reference to the first aspect and the foregoing possible implementation manners, in a fourth possible implementation manner of the first aspect, the step of calculating a training loss by using a real attribute corresponding to the pedestrian image and the predicted attribute specifically includes:
J(θ) = Σ_j λ_j · J(θ_j), with λ_j = 1/(2σ_j²) = m·K_j / (2 · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖²)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task;
σ_j² represents the variance of the prediction uncertainty in the j-th task;
m represents the total number of pedestrian images used for training in the j-th task;
K_j represents the number of options for the value of the j-th attribute;
ȳ_ij represents the real attribute (in vector form) of the i-th pedestrian image in the j-th task;
P_ij represents the predicted attribute of the i-th pedestrian image in the j-th task.
In a second aspect, a training method of a pedestrian attribute analysis model is provided, including:
inputting the pedestrian image into a convolutional neural network to obtain a prediction attribute;
calculating a training loss using the real attribute corresponding to the pedestrian image and the predicted attribute:
J(θ) = Σ_j λ_j · J(θ_j), with λ_j = 1/(2σ_j²) = m·K_j / (2 · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖²)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task;
σ_j² represents the variance of the prediction uncertainty in the j-th task;
m represents the total number of pedestrian images used for training in the j-th task;
K_j represents the number of options for the value of the j-th attribute;
ȳ_ij represents the real attribute (in vector form) of the i-th pedestrian image in the j-th task;
P_ij represents the predicted attribute of the i-th pedestrian image in the j-th task;
and if the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of the pedestrian attribute analysis model to obtain the pedestrian attribute analysis model.
In a third aspect, a pedestrian attribute identification method is provided, including the steps of:
inputting the pedestrian image to be identified into the pedestrian attribute analysis model trained by the training method in any one of the first aspect or the second aspect to obtain the identified pedestrian attribute.
In a fourth aspect, a pedestrian attribute analysis model training system is provided, comprising:
the first training unit is used for inputting the pedestrian image and the probability map corresponding to the pedestrian image into the convolutional neural network to obtain the prediction attribute; calculating training loss by using the real attribute corresponding to the pedestrian image and the predicted attribute; under the condition that the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of a pedestrian attribute analysis model to obtain the pedestrian attribute analysis model;
wherein the probability map characterizes a set of probability values that each pixel node belongs to each of at least one pedestrian component area partitioned from the pedestrian image.
In a fifth aspect, a pedestrian attribute analysis model training system is provided, including:
the second training unit is used for inputting the pedestrian image into the convolutional neural network to obtain a prediction attribute; calculating training loss by using the real attribute corresponding to the pedestrian image and the predicted attribute; under the condition that the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of a pedestrian attribute analysis model to obtain the pedestrian attribute analysis model;
The second training unit includes:
the second weight self-updating unit is used for adjusting task weights corresponding to the tasks according to the following formula:
λ_j = 1/(2σ_j²) = m·K_j / (2 · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖²)

wherein λ_j represents the task weight of the j-th task;
σ_j² represents the variance of the prediction uncertainty in the j-th task;
m represents the total number of pedestrian images used for training in the j-th task;
K_j represents the number of options for the value of the j-th attribute;
ȳ_ij represents the real attribute of the i-th pedestrian image in the j-th task;
P_ij represents the predicted attribute of the i-th pedestrian image in the j-th task;

the second training unit is further configured to calculate the total training loss using the task weights:

J(θ) = Σ_j λ_j · J(θ_j)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task.
In a sixth aspect, there is provided a pedestrian attribute identification apparatus including:
the prediction unit is configured to input the pedestrian image to be identified into the neural network model trained by the training system according to any one of the fourth aspect or the fifth aspect, and output the identified pedestrian attribute.
In the training method of the pedestrian attribute analysis model of the first aspect, first, a pedestrian image and a probability map corresponding to the pedestrian image are input into a convolutional neural network to obtain a predicted attribute, wherein the probability map characterizes a set of probability values that each pixel node belongs to each of at least one pedestrian component area partitioned from the pedestrian image. Then, a training loss is calculated using the real attribute corresponding to the pedestrian image and the predicted attribute. If the training loss converges, the current model parameters of the convolutional neural network are determined as the model parameters of the pedestrian attribute analysis model, to obtain the pedestrian attribute analysis model. In this way, pedestrian component areas are divided at the pixel level and serve as indication information that guides the pedestrian attribute analysis network to learn more specific and robust features, so that the influence of the diversity of pedestrian poses, camera angles, and the like can be resisted to a certain extent. When the analysis model obtained by this training method is used to identify pedestrian images with unknown attributes, the pedestrian attributes can be accurately identified even from pedestrian images that differ greatly in application scene, pedestrian pose, and camera angle.
In addition, since the learning difficulty and convergence speed differ across attribute analysis tasks, the technical solution of the second aspect provides a training method for a pedestrian attribute analysis model that automatically updates the task weight corresponding to each task according to the differences among tasks. That is, during each round of training, the training condition of each attribute analysis task is used to adjust its task weight, so as to increase the contribution of simple tasks in model training, prevent the model from being dominated by difficult tasks, let the multiple tasks train in coordination, and help the feature learning and information exchange of each task. The analysis model obtained by this training method can accurately identify multiple pedestrian attributes from a pedestrian image.
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of one implementation of a first embodiment of a training method of a pedestrian attribute analysis model of the present application;
FIG. 2 is a flow chart of a second implementation in a first embodiment of the training method of the pedestrian attribute analysis model of the present application;
FIG. 3 is a flowchart of one implementation of the step S100 in a first embodiment of the training method of the pedestrian attribute analysis model of the present application;
FIG. 4 is a flowchart of a third implementation in the first embodiment of the training method of the pedestrian attribute analysis model of the present application;
FIG. 5 is a flow chart of one implementation of a second embodiment of the training method of the pedestrian attribute analysis model of the present application;
FIG. 6 is a flow chart of a second implementation of the training method of the pedestrian attribute analysis model of the present application;
FIG. 7 is a schematic diagram of a structure of one embodiment of a pedestrian attribute analysis model training system of the present application;
FIG. 8 is a schematic diagram of a second embodiment of a training system for pedestrian attribute analysis model in accordance with the present application;
FIG. 9 is a schematic diagram of a third embodiment of the training system for pedestrian attribute analysis model of the present application;
FIG. 10 is a schematic diagram of a structure of a training system for a pedestrian attribute analysis model in accordance with a fourth embodiment of the present application.
Detailed Description
The embodiments of the present application are described in detail below.
Convolutional neural networks (Convolutional Neural Networks, CNN) are multi-layer neural network models adept at machine learning problems involving images, particularly large images.
In order to solve the problem of accuracy of pedestrian attribute recognition caused by the diversity of pedestrian pictures, please refer to fig. 1, in a first embodiment of the present application, a training method of a pedestrian attribute analysis model is provided, which includes:
s100: and inputting the pedestrian image and the probability map corresponding to the pedestrian image into a convolutional neural network to obtain the prediction attribute.
The pedestrian image refers to an image including pedestrians, and may be an original picture, such as a picture from a monitoring video; or may be a preprocessed picture or the like. The pedestrian images in this step belong to a first training set, which is a set of training samples for training the pedestrian attribute analysis model, and each of the pedestrian images in the first training set is one training sample. Each pedestrian image is provided with a corresponding real attribute label for marking the real attribute of the pedestrian image. The real property tags herein may be manually annotated.
In one implementation manner, the pedestrian image is obtained by preprocessing an original picture, and specifically the method comprises the following steps:
detecting whether pedestrians are contained in the original picture;
if the pedestrian is contained, acquiring the pedestrian position of the pedestrian in the original picture;
and cutting out a pedestrian image from the original picture according to the pedestrian position.
Alternatively, the size of the clipped pedestrian image may be preset here, for example, the pixel size of the preset clipped pedestrian image is 224×224.
If the original picture contains no pedestrian, the original picture is discarded and the preprocessing step proceeds with the next original picture. If several pedestrians are detected in one original picture, several pedestrian images can be obtained by clipping.
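The following is a minimal sketch of this preprocessing step, assuming OpenCV is available; `detect_pedestrians` is a hypothetical detector standing in for any pedestrian detector that returns bounding boxes, and 224×224 is the preset crop size mentioned above.

```python
import cv2  # OpenCV, used here for image reading, cropping and resizing


def preprocess(picture_path, detect_pedestrians, size=(224, 224)):
    """Cut one pedestrian image per detected pedestrian; an empty list means
    the original picture contains no pedestrian and is discarded."""
    img = cv2.imread(picture_path)
    boxes = detect_pedestrians(img)  # hypothetical detector: [(x, y, w, h), ...]
    crops = []
    for x, y, w, h in boxes:
        crop = img[y:y + h, x:x + w]           # cut out the pedestrian region
        crops.append(cv2.resize(crop, size))   # rescale to the preset 224x224 size
    return crops
```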
A pedestrian image may include S' pixel nodes; e.g., if it is cut to size 224×224, then S' = 50176. The preset pedestrian component areas may include hair, face, upper body, lower body, etc. The probability map characterizes a set of probability values that each pixel node belongs to each of at least one preset pedestrian component area partitioned from the pedestrian image. Thus, the probability map may be represented as a matrix in which each probability value represents the probability that a pixel node belongs to a certain pedestrian component area.
The preset pedestrian component areas have C' classes, and the i-th pedestrian image includes S' pixel nodes. s (lower case) denotes the serial number of a pixel node, i.e., the s-th pixel node; c (lower case) denotes the serial number of a pedestrian component area, i.e., the c-th pedestrian component area. p_i^{s,c} denotes the probability value that the s-th pixel node in the i-th pedestrian image belongs to the c-th pedestrian component area, and p_i = {p_i^{s,c}} denotes the probability map of the i-th pedestrian image. Here, the capital C in C' denotes the total number of pedestrian component area classes divided in the i-th pedestrian image, while the lower-case c numbers one of the C' areas; the capital S in S' denotes the total number of pixel nodes in the i-th pedestrian image, while the lower-case s numbers one of the S' pixel nodes.
For example, for a pedestrian image in which the divided pedestrian component areas comprise 3 classes (hair, face, and upper body) and which includes 224×224 = 50176 pixel nodes, the probability map may be represented as a 3×224×224 three-dimensional matrix. Representing the probability map in two dimensions, with the pixel nodes numbered sequentially, gives the correspondence shown in Table 1.
Table 1. Correspondence of numerical meanings in the matrix

                  Hair (c=1)    Face (c=2)    Upper body (c=3)
  Pixel node 1    p_i^{1,1}     p_i^{1,2}     p_i^{1,3}
  Pixel node 2    p_i^{2,1}     p_i^{2,2}     p_i^{2,3}
  …               …             …             …
  Pixel node S'   p_i^{S',1}    p_i^{S',2}    p_i^{S',3}

wherein p_i^{1,1} represents the probability value that the 1st pixel node belongs to hair, p_i^{1,2} represents the probability value that the 1st pixel node belongs to the face, and the other entries have similar meanings.
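As a concrete illustration of this representation, the sketch below builds a toy probability map of the shape described above with NumPy; the values are random placeholders, whereas a real probability map comes from the pedestrian analysis model of step S400.

```python
import numpy as np

C, H, W = 3, 224, 224               # 3 component areas: hair, face, upper body
prob_map = np.random.rand(C, H, W)  # placeholder values standing in for model output
prob_map /= prob_map.sum(axis=0)    # each pixel's probabilities sum to 1 over areas

flat = prob_map.reshape(C, -1)      # number the S' = 50176 pixel nodes sequentially
p_1_hair = flat[0, 0]               # probability that the 1st pixel node is hair
```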
Referring to fig. 2, the probability map corresponding to the pedestrian image can be obtained by the following steps:
s400: and inputting the pedestrian image into a pedestrian analysis model to obtain a probability map corresponding to the pedestrian image. The pedestrian analysis model is a full convolution neural network trained by training images with real probability map labels.
In this step, the pedestrian analytical model is trained, i.e., a fully convolutional neural network (Fully Convolutional Networks, FCN) for which model parameters have been determined.
The training image here is also a pedestrian image, belonging to a second training set, which is a set of training samples for training the pedestrian analytical model. Each training image is provided with a corresponding real probability map label, and the real probability map label is used for labeling the real probability map of the training image.
For a certain training image, the real probability map label it carries characterizes the actual pedestrian component areas divided in the image and the probability value of each pixel node belonging to each actual pedestrian component area. For a certain pixel, the probability value of belonging to an actual pedestrian component area is usually 1 or 0: 1 indicates that the pixel belongs to that actual pedestrian component area, and 0 indicates that it does not.
The main process of training the pedestrian analysis model is as follows. The training images in the second training set are input into the fully convolutional neural network, which divides each training image into at least one predicted pedestrian component area and outputs, for each pixel node, the predicted probability value of belonging to each predicted pedestrian component area, yielding a predicted probability map. The training loss of the fully convolutional neural network is calculated using the predicted probability map and the real probability map of the training image. If the training loss converges, the current model parameters of the fully convolutional neural network are determined as the model parameters of the pedestrian analysis model. If the training loss does not converge, the model parameters of the fully convolutional neural network are updated and the foregoing training steps are repeated, until the calculated training loss converges, at which point the latest model parameters of the fully convolutional neural network are determined as the parameters of the pedestrian analysis model.
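A minimal sketch of this training loop follows, assuming PyTorch; `parsing_net` stands for any fully convolutional network emitting C' per-pixel class scores, `loader` yields (image, label map) pairs, and testing the change of the epoch loss against a tolerance is a simplification of the convergence check described above.

```python
import torch
import torch.nn as nn


def train_parsing_model(parsing_net, loader, lr=1e-3, tol=1e-4, max_epochs=100):
    opt = torch.optim.SGD(parsing_net.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()          # per-pixel classification loss
    prev_loss = float("inf")
    for _ in range(max_epochs):
        total = 0.0
        for image, true_map in loader:       # true_map: component area index per pixel
            pred = parsing_net(image)        # (N, C', H, W) scores -> predicted map
            loss = loss_fn(pred, true_map)   # compare with the real probability map label
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        if abs(prev_loss - total) < tol:     # training loss has converged
            break
        prev_loss = total
    return parsing_net                       # model parameters are now determined
```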
Pedestrian attributes may include body appearance characteristics (e.g., height, weight, etc.), wear characteristics (e.g., type and color of coat, pants, backpack, etc.), facial characteristics (e.g., age, gender, race), etc.
When the pedestrian attribute analysis model is trained, the pedestrian images input into the convolutional neural network come from the first training set, and each pedestrian image correspondingly yields a predicted pedestrian attribute, i.e., a predicted attribute. When there is only one pedestrian attribute to predict, such training may be called single-task attribute analysis training; when there are several pedestrian attributes to predict, it is called multi-task attribute analysis training.
Taking single-task attribute analysis training as an example: for a pedestrian image, suppose the only pedestrian attribute to predict is hair color, and the hair color has 6 possible values. Table 2 shows the probabilities, obtained by inputting the pedestrian image into the convolutional neural network, that the hair is each corresponding color.
Table 2. Prediction attribute example

              Yellow   Brown   White   Red    Green   Black
  Hair color  0.5      0.3     0.01    0.05   0.04    0.1
In particular, in one implementation, the attention mechanism may also be introduced simultaneously when introducing the probability map into the method of training the pedestrian attribute analysis model. Specifically, referring to fig. 3, the convolutional neural network includes a first sub-network and a second sub-network; the step of S100 may include:
s110: extracting pedestrian characteristics from the pedestrian image by using a first subnetwork;
s120: updating the probability map;
s130: obtaining fusion characteristics according to the updated probability map and the pedestrian characteristics;
s140: and inputting the fusion characteristics into a second sub-network to obtain the prediction attribute.
In step S110, the pedestrian feature may be denoted f_b(x_i). Extracting pedestrian features from the pedestrian image with the first sub-network may use existing implementations. More specifically, the first sub-network may comprise several convolution groups, each consisting of one convolution layer and one pooling layer. The pedestrian image is input into the first sub-network, features are extracted and downsampled by the successive convolution groups, and the pedestrian feature f_b(x_i) of the pedestrian image is finally obtained, as in the sketch below.
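A minimal sketch of such a first sub-network, assuming PyTorch; the two convolution groups and their channel sizes are illustrative choices, not the patent's actual configuration.

```python
import torch.nn as nn

# Two convolution groups, each one convolution layer plus one pooling layer.
first_subnetwork = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)
# A 224x224 pedestrian image x_i comes out as a feature f_b(x_i) of shape
# (128, 56, 56), matching the sizes used in the fusion example further below.
```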
In step S120, updating the probability values in the probability map specifically includes computing:

p̂_i^{s,c} = 1, if c = argmax_{c'} p_i^{s,c'}; otherwise p̂_i^{s,c} = p_i^{s,c}

wherein x_i is the i-th pedestrian image, and i is the index number of the pedestrian image;
p_i^{s,c} represents the probability value that the s-th pixel node in the i-th pedestrian image belongs to the c-th pedestrian component area;
p̂_i^{s,c} represents the updated probability value;
argmax_{c'} p_i^{s,c'} denotes taking the maximum of p_i^{s,c'} over the C' pedestrian component area classes for the s-th pixel node. That is, when the c-th area attains this maximum, the value of p̂_i^{s,c} is updated to 1; otherwise p̂_i^{s,c} keeps the original value p_i^{s,c}.
For example, for the s-th pixel node in Table 1, its probability values before the update are shown in Table 3, and its values after the update are shown in Table 4.
Table 3. Probability values of the s-th pixel node before the update

                       Hair (c=1)   Face (c=2)   Upper body (c=3)
  The s-th pixel node  0.7          0.2          0.1

Table 4. Probability values of the s-th pixel node after the update

                       Hair (c=1)   Face (c=2)   Upper body (c=3)
  The s-th pixel node  1            0.2          0.1
If the probability value that the s-th pixel node belongs to the c-th pedestrian component area is the largest, the c-th pedestrian component area is taken as the predicted class of the s-th pixel node, and the other (C' - 1) classes are its non-predicted classes. By updating the probability values in the probability map, the probability of the predicted class is set to 1, so that the information of the pixel node is kept as fully as possible in the subsequent feature fusion step. Meanwhile, the probabilities of the non-predicted classes are also kept, which prevents the information loss that a prediction error at the s-th pixel node would otherwise cause. A sketch of this update follows.
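A minimal NumPy sketch of this update rule; `p` stands for the probability map p_i of one pedestrian image, and only the array shapes are assumptions.

```python
import numpy as np


def update_probability_map(p):
    """p: (C', H, W) probability map. Set each pixel node's largest probability
    to 1 and keep all other probabilities unchanged."""
    p_hat = p.copy()
    winner = p.argmax(axis=0)             # predicted component area of every pixel node
    rows, cols = np.indices(winner.shape)
    p_hat[winner, rows, cols] = 1.0       # probability of the predicted class -> 1
    return p_hat


# A pixel node with probabilities [0.7, 0.2, 0.1] becomes [1.0, 0.2, 0.1],
# reproducing the change from Table 3 to Table 4.
```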
In step S130, obtaining a fusion feature according to the updated probability map and the pedestrian feature may specifically include:
s131: convolving and fusing the updated probability map and the pedestrian characteristic to obtain a first characteristic;
s132: abstracting the first feature to obtain a second feature;
s133: abstracting the pedestrian features to obtain third features;
s134: and adding and fusing the second feature and the third feature to obtain a fused feature.
In one implementation, the convolution fusion step of S131 may include computing:

φ(x_i)^c = p̃_i^c ⊗ f_b(x_i)

φ(x_i) = [φ(x_i)^1, φ(x_i)^2, …, φ(x_i)^{C'}]

wherein φ(x_i)^c represents the first feature of the c-th channel of the i-th pedestrian image;
p̂_i^c represents the set of updated probability values of the c-th channel, i.e., the set of (updated) probability values that each pixel node in the pedestrian image belongs to the c-th class pedestrian component area;
p̃_i^c represents the set of probability values of the c-th channel after being copied so that its channel number matches that of the pedestrian feature f_b(x_i);
⊗ represents multiplication of pixels at corresponding positions;
f_b(x_i) represents the pedestrian feature of the i-th pedestrian image;
φ(x_i) represents the first feature of the i-th pedestrian image.

For example, assume the input pedestrian feature f_b(x_i) has size (128, 56, 56), where 128 is the number of channels of the pedestrian feature and 56 and 56 are the height and width, respectively; and assume the updated probability map p̂_i has size (9, 56, 56), where 9 is the number of channels of the probability map, i.e., the total number of pedestrian component areas, and 56 and 56 are the height and width. The c-th channel p̂_i^c of the updated probability map is copied 128 times so that its channel number matches that of f_b(x_i), i.e., p̃_i^c has size (128, 56, 56). Then the corresponding elements of p̃_i^c and f_b(x_i) are multiplied to perform the convolution fusion, obtaining the first feature φ(x_i)^c.
Since the convolution fusion operation is performed separately for each channel of the probability map, i.e., for each pedestrian component area, the resulting first feature φ(x_i) = [φ(x_i)^1, φ(x_i)^2, …, φ(x_i)^{C'}] can be expressed as a matrix of size ((9×128) × 56 × 56).
After the convolution fusion of the updated probability map and the pedestrian features, the features of each semantic region of the pedestrian image are separated out individually, so that the convolution layers of the second sub-network can learn with the right emphasis in the subsequent steps: from the specific values in the first feature, they can learn which semantic regions deserve more attention and how to combine the features of the individual semantic regions.
In steps S132 to S134, the first feature is abstracted by several convolution layers to obtain a second feature whose number of channels equals that of the pedestrian feature. The pedestrian feature is abstracted by several convolution layers to obtain a third feature whose number of channels also equals that of the pedestrian feature. Finally, the second feature and the third feature are added and fused to obtain a more comprehensive fusion feature, so that the convolutional neural network can be trained more accurately and the prediction accuracy improves; see the sketch after this paragraph.
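The sketch below renders steps S131 to S134 in PyTorch under stated assumptions: `p_hat` is the updated probability map, `f_b` the pedestrian feature, and `conv_second` / `conv_third` stand for the small stacks of convolution layers that abstract the first feature and the pedestrian feature (their exact architecture is not specified by the text).

```python
import torch


def fuse(p_hat, f_b, conv_second, conv_third):
    """p_hat: (N, C', H, W) updated probability map; f_b: (N, D, H, W) pedestrian
    feature. Returns the fusion feature of steps S131-S134."""
    n, c_areas, h, w = p_hat.shape
    d = f_b.shape[1]
    # S131: replicate each probability channel D times to match f_b's channel
    # count, then multiply pixels at corresponding positions; the result holds
    # one first-feature slice per pedestrian component area: (N, C'*D, H, W).
    first = (p_hat.unsqueeze(2) * f_b.unsqueeze(1)).reshape(n, c_areas * d, h, w)
    second = conv_second(first)   # S132: abstract the first feature to D channels
    third = conv_third(f_b)       # S133: abstract the pedestrian feature to D channels
    return second + third         # S134: additive fusion of the two features
```

With D = 128 and C' = 9 as in the example above, `first` has shape (N, 1152, 56, 56), i.e., the ((9×128) × 56 × 56) matrix described in the text.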
In step S140, the second sub-network and the first sub-network may be regarded as part of a convolutional neural network, which together form a convolutional neural network. The second subnetwork may comprise a plurality of convolution groups and a plurality of fully-connected layers, each convolution group comprising a convolution layer and a pooling layer, the plurality of convolution groups being connected in sequence, the last convolution group being connected with the fully-connected layer. And inputting the fusion characteristics into a second sub-network, extracting and downsampling the characteristics of a plurality of convolution groups, and finishing prediction through a full-connection layer to obtain the prediction attribute.
For example, table 5 illustrates predicted attribute tags for a pedestrian image in a first training set. Wherein each numerical value represents a probability that the value of the attribute is the corresponding option. That is, 0.1 indicates that the probability value of the pedestrian image that the hair color is yellow is 0.1, the probability value of the pedestrian image that it is brown is 0.6, and the rest of the meanings are similar. The option with the highest probability is selected and can be used as the predicted value of the attribute which is finally output.
Table 5. Prediction attribute example

              Yellow   Brown   White   Red    Green   Black
  Hair color  0.1      0.6     0.05    0.15   0.05    0.05
S200: and calculating training loss by using the real attribute corresponding to the pedestrian image and the predicted attribute.
As mentioned above, each pedestrian image in the first training set is respectively provided with a real attribute label corresponding to the pedestrian image, and the real attribute labels are used for labeling the real attributes of the pedestrian images. The real property tags herein may be manually annotated.
For example, table 6 illustrates the true attribute tags for a pedestrian image in the first training set shown in table 5. Wherein, 1 indicates that the attribute is a corresponding value, and 0 indicates that the attribute is not the value. That is, the hair color of the pedestrian image is brown, not the other five colors.
Table 6. Real attribute example

              Yellow   Brown   White   Red    Green   Black
  Hair color  0        1       0       0      0       0
The training loss may be calculated using existing loss functions, such as a square error loss function, an SVM loss function, a softmax loss function, and the like.
As previously described, such training is called single-task attribute analysis training when there is only one pedestrian attribute to predict, and multi-task attribute analysis training when there are several. Assuming T attributes are to be predicted, there are correspondingly T training tasks in total, where the training task that predicts the j-th attribute is called the j-th task. The total training loss is:
J(θ) = Σ_j λ_j · J(θ_j)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task; generally, λ_j may be set to a fixed value.
The training loss J(θ_j) of the j-th task can be obtained by existing calculation means; for example, J(θ_j) may be calculated with a softmax loss function, as follows:

J(θ_j) = -(1/m) · Σ_{i=1}^{m} Σ_{k=1}^{K_j} w_jk · δ(y_ij = k) · log P_ij^k

wherein m represents the total number of pedestrian images used for training in the j-th task, i.e., the total number of training samples of the j-th task, and i is the index number of the i-th pedestrian image among the m pedestrian images;

K_j represents the number of options for the value of the j-th attribute, and k is the index number of an option among the K_j options; for example, for the pedestrian attribute "sex", the value may be one of the two options "male" and "female", so K_j equals 2;

y_ij represents the real attribute of the i-th pedestrian image in the j-th task;

δ(·) represents a Dirac function: δ(y_ij = k) = 1 if and only if y_ij takes the value of the k-th option, otherwise δ(y_ij = k) = 0;

w_jk represents a penalty coefficient against unbalanced data, a decreasing function of r_jk, wherein r_jk is the ratio of the number of training samples of the j-th task whose j-th attribute takes the k-th option to the total number of training samples of the j-th task. In multi-task attribute analysis training, the number of training samples for each value of each attribute is often unbalanced; for example, there may be significantly fewer pedestrians wearing sunglasses among the training samples than pedestrians not wearing them. To make training more effective in this situation, the penalty coefficient w_jk against unbalanced data is introduced: when the proportion of samples containing the k-th option of the j-th attribute becomes large, w_jk becomes small, thereby applying a penalty;

P_ij^k represents the predicted probability value that the j-th attribute of the i-th pedestrian image takes the k-th option; more specifically, P_ij = [P_ij^1, P_ij^2, …, P_ij^{K_j}].
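A minimal NumPy sketch of this per-task loss; the exponential form of the penalty coefficient w_jk is an assumption, since the text only requires that w_jk decrease as the sample ratio r_jk grows.

```python
import numpy as np


def task_loss(P, y, r):
    """P: (m, K_j) predicted probabilities; y: (m,) index of the true option per
    image (the Dirac term selects exactly this option); r: (K_j,) ratio of
    samples taking each option. Returns J(theta_j)."""
    m = P.shape[0]
    w = np.exp(-r)                            # assumed penalty: shrinks as r_jk grows
    log_p_true = np.log(P[np.arange(m), y])   # log P_ij^k at the true option k
    return -(w[y] * log_p_true).sum() / m
```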
S300: and if the training loss converges, determining the current model parameters of the convolutional neural network as parameters of a pedestrian attribute analysis model to obtain the pedestrian attribute analysis model.
Referring to fig. 4, if the training loss does not converge, S301 is performed: updating the model parameters of the convolutional neural network. Steps S100 to S200 are then repeated until the calculated training loss converges, and the model parameters of the latest convolutional neural network are determined as the parameters of the pedestrian attribute analysis model. Here, the parameters of the convolutional neural network may be updated with an existing algorithm such as stochastic gradient descent (Stochastic Gradient Descent, SGD).
In addition to the aforementioned diversity of pedestrian pictures, which affects the accuracy of pedestrian attribute recognition, the neural-network-based pedestrian attribute identification method has another problem in practical application: the incoordination of multi-task attribute analysis training. Specifically, in practical applications it is often necessary to identify multiple pedestrian attributes from one picture. However, when the neural network model is trained, the learning difficulty and convergence speed are not the same for different pedestrian attributes; for example, it is often harder to identify a person's age than the color of clothing. In conventional model training methods, the task weights of different tasks are usually set to fixed values, which ignores these differences in difficulty and convergence, so it is hard to form coordinated training for a model that must identify multiple pedestrian attributes. For the same reason, when the trained analysis model is used to identify multiple pedestrian attributes in a pedestrian image, it is difficult for it to identify all of the pedestrian attributes accurately.
For this purpose, a second embodiment of the application proposes a further training method for a pedestrian attribute analysis model, in which the task weight λ_j can be adjusted according to the training situation.
Specifically, referring to fig. 5, a training method of a pedestrian attribute analysis model is provided, which includes steps S500 to S700.
S500: and inputting the pedestrian image into a convolutional neural network to obtain the prediction attribute.
The convolutional neural network takes the pedestrian image as input data and outputs the predicted attribute. In one implementation, the convolutional neural network may comprise a first sub-network and a second sub-network. The first sub-network is used to extract the pedestrian feature f_b(x_i) from the pedestrian image. In particular, the first sub-network may comprise several convolution groups, each consisting of one convolution layer and one pooling layer: the pedestrian image is input into the first sub-network, features are extracted and downsampled by the successive convolution groups, and the pedestrian feature f_b(x_i) is finally obtained. The second sub-network takes the pedestrian feature f_b(x_i) as input data and produces the predicted attribute. In particular, the second sub-network may comprise several convolution groups and a number of fully connected layers, each convolution group consisting of one convolution layer and one pooling layer, with the convolution groups connected in sequence and the last one connected to the fully connected layers. The pedestrian feature is input into the second sub-network, features are extracted and downsampled by the convolution groups, and the prediction is completed by the fully connected layers, giving the predicted attribute.
S600: calculating a training loss using the real attribute corresponding to the pedestrian image and the predicted attribute:
J(θ) = Σ_j λ_j · J(θ_j)    (5)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task, computed by equation (4) below;
σ_j² represents the variance of the prediction uncertainty in the j-th task;
m represents the total number of pedestrian images used for training in the j-th task;
K_j represents the number of options for the value of the j-th attribute;
ȳ_ij represents the real attribute (in vector form) of the i-th pedestrian image in the j-th task;
P_ij represents the predicted attribute of the i-th pedestrian image in the j-th task.
The task weight λ_j is derived as follows.

For the pedestrian attribute identification and classification problem, the classification problem can first be viewed as a regression problem. That is, for the i-th pedestrian image, suppose its real attribute in the j-th task is y_ij, and convert the label marking the real attribute into vector form: ȳ_ij = [δ(y_ij = 1), δ(y_ij = 2), …, δ(y_ij = K_j)], wherein K_j represents the number of options for the value of the j-th attribute, k is the index number of an option among the K_j options, and δ(·) represents a Dirac function: δ(y_ij = k) = 1 if and only if y_ij takes the k-th option, otherwise δ(y_ij = k) = 0. For example, taking the real attribute of Table 6 as an example, K_j = 6, i.e., the j-th attribute, hair color, has 6 possible values; for the i-th pedestrian image, whose j-th attribute (hair color) has the real value brown, the vector form is ȳ_ij = [0, 1, 0, 0, 0, 0].

After the classification problem is viewed as a fitting/regression problem, its training goal remains consistent, i.e., make the predicted attribute P_ij approach the real attribute ȳ_ij. Considering homoscedastic uncertainty with a Gaussian distribution, the prediction process of the i-th pedestrian image in the j-th task can be modeled as:

p(ȳ_ij | P_ij) = N(ȳ_ij; P_ij, σ_j²)    (1)

wherein σ_j² is the variance of the uncertainty in the j-th task. Assuming the j-th task has m training samples and there are T prediction tasks in all, the overall process can be modeled as:

p(Ȳ | P) = Π_{j=1}^{T} Π_{i=1}^{m} N(ȳ_ij; P_ij, σ_j²)    (2)

The negative log-likelihood of equation (2) is written as:

-log p(Ȳ | P) ∝ Σ_{j=1}^{T} [ (1/(2σ_j²)) · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖² + m·K_j·log σ_j ]    (3)

In a common fitting/regression problem, Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖² is the training objective function, whereas in equation (3) the multi-attribute analysis fitting/regression problem learns the task weights adaptively, with 1/(2σ_j²) acting as the task weight of the j-th task. The latter term K_j·log σ_j in equation (3) is a regularization term that restricts σ_j from becoming too large or too small. Considering that the classification problem and the regression problem above are essentially one problem with a consistent goal, namely making the predicted attribute P_ij approach the real attribute ȳ_ij, the task weight of each task can be estimated in this way and applied to the classification problem.

In equation (3), the uncertainty σ_j² can be estimated by the maximum likelihood method: setting the derivative of equation (3) with respect to σ_j² to zero gives σ_j² = Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖² / (m·K_j). Thus the task weight of the j-th task may be set to:

λ_j = 1/(2σ_j²) = m·K_j / (2 · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖²)    (4)
from the above equation (4), in the process of automatically updating the task weights, the weight λ of each taskj As model training increases, but the speed of simple tasks increases relatively faster (simple tasksFast descent and corresponding loss is also smaller), while the difficult task increases relatively slowly. The method can increase the contribution degree of simple tasks in model training to a certain extent, prevent the model from being dominated by difficult tasks, and therefore enable training coordination of multi-task attribute analysis to be better.
S700: and if the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of a pedestrian attribute analysis model.
If the training loss is not converged, updating the model parameters in the convolutional neural network, repeating the steps from S500 to S600 for training until the calculated training loss is converged, and determining the model parameters in the latest convolutional neural network as the parameters of the pedestrian attribute analysis model.
It should be noted that, in the first embodiment of the present application, the method of introducing a probability map to perform feature fusion to train an analysis model and the method of automatically updating task weights in the second embodiment may be combined with each other.
Thus, alternatively, referring to FIG. 6, the step S500 in the second embodiment may be replaced with the step S800, i.e.:
S800: and inputting the pedestrian image and the probability map corresponding to the pedestrian image into a convolutional neural network to obtain the prediction attribute. Wherein the probability map characterizes a set of probability values for each pixel node belonging to a pedestrian component area in the pedestrian image partitioned out of at least one pedestrian component area.
The steps of S800 may refer to the descriptions related to S100 of the first embodiment, and the same specific implementation manner is adopted, which is not described herein.
In a third embodiment of the present application, there is provided a pedestrian attribute identification method including the steps of:
and (3) training the obtained pedestrian attribute analysis model by adopting the training method in the first embodiment or the second embodiment, and inputting the pedestrian image to be identified into the pedestrian attribute analysis model to obtain the identified pedestrian attribute.
Here, the pedestrian image to be recognized is also an image including a pedestrian, except that the pedestrian attribute of the pedestrian is unknown. The pedestrian image to be identified can also be an original picture, such as a picture from a surveillance video or the like; or may be a preprocessed picture or the like.
In one implementation manner, the pedestrian image to be identified is obtained by preprocessing an original picture, and specifically the method includes the following steps:
detecting whether pedestrians are contained in the original picture;
if the pedestrian is contained, acquiring the pedestrian position of the pedestrian in the original picture;
and cutting out the pedestrian image to be identified from the original picture according to the pedestrian position.
Alternatively, the size of the cut-out pedestrian image to be recognized may be preset here, for example, the pixel size is 224×224. In general, the size of the pedestrian image to be recognized may be identical to the size of the pedestrian image employed when training the pedestrian attribute analysis model.
And if the original picture does not contain pedestrians, discarding the original picture, and carrying out the step of preprocessing on the next original picture. If a plurality of pedestrians are detected in one original picture, a plurality of pedestrian images to be identified can be obtained by clipping.
The pedestrian attribute directly output by the pedestrian attribute analysis model can also be expressed as a matrix, similar to the way the predicted attribute is expressed during training. For example, the probability of the hair color being brown may be 0.95, while the probability of each of the other 5 colors (yellow, etc.) is 0.01. When the result is output to the user, the option with the highest probability is taken as the final predicted value of the attribute; continuing the example, the hair color of the pedestrian image to be recognized is output as brown.
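A minimal sketch of this final output step; the option list and probability row reuse the hair-color example from the text.

```python
import numpy as np

options = ["yellow", "brown", "white", "red", "green", "black"]
probs = np.array([0.01, 0.95, 0.01, 0.01, 0.01, 0.01])  # row output by the model
final_value = options[int(probs.argmax())]               # option with highest probability
print(final_value)                                       # -> "brown"
```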
In a fourth embodiment of the present application, corresponding to the training method in the first embodiment, please refer to fig. 7, there is provided a pedestrian attribute analysis model training system including:
the first training unit 1 is used for inputting a pedestrian image and a probability map corresponding to the pedestrian image into a convolutional neural network to obtain a prediction attribute; calculating training loss by using the real attribute corresponding to the pedestrian image and the predicted attribute; and under the condition that the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of the pedestrian attribute analysis model to obtain the pedestrian attribute analysis model. Wherein the probability map characterizes a set of probability values for each pixel node belonging to a pedestrian component area in the pedestrian image partitioned out of at least one pedestrian component area.
Optionally, referring to fig. 8, the training system may further include:
And the pedestrian analysis unit 2 is used for inputting the pedestrian image into the pedestrian analysis model to obtain a probability map corresponding to the pedestrian image. The pedestrian analysis model is a full convolution neural network trained by training images with real probability map labels.
Optionally, the first training unit 1 may further include:
a first pedestrian parsing auxiliary module 11 for updating the probability map; and obtaining a fusion characteristic according to the updated probability map and the pedestrian characteristic.
The first training unit 1 is further configured to extract pedestrian features from the pedestrian image by using a first sub-network; and inputting the fusion characteristic into a second sub-network to obtain the prediction attribute.
Optionally, the step in which the first pedestrian parsing auxiliary module 11 updates the probability map specifically includes computing:

p̂_i^{s,c} = 1, if c = argmax_{c'} p_i^{s,c'}; otherwise p̂_i^{s,c} = p_i^{s,c}

wherein x_i is the i-th pedestrian image;
p_i^{s,c} represents the probability value that the s-th pixel node in the i-th pedestrian image belongs to the c-th pedestrian component area;
p̂_i^{s,c} represents the updated probability value;
argmax_{c'} p_i^{s,c'} denotes taking the maximum of p_i^{s,c'} over the C' pedestrian component area classes for the s-th pixel node.
Optionally, the step in which the first pedestrian parsing auxiliary module 11 obtains the fusion feature from the updated probability map and the pedestrian feature specifically includes:

convolving and fusing the updated probability map and the pedestrian feature to obtain a first feature:

φ(x_i)^c = p̃_i^c ⊗ f_b(x_i)

φ(x_i) = [φ(x_i)^1, φ(x_i)^2, …, φ(x_i)^{C'}]

wherein φ(x_i)^c represents the first feature of the c-th channel of the i-th pedestrian image;
p̂_i^c represents the set of updated probability values of the c-th channel;
p̃_i^c represents the set of probability values of the c-th channel after being copied so that its channel number matches that of the pedestrian feature f_b(x_i);
⊗ represents multiplication of pixels at corresponding positions;
f_b(x_i) represents the pedestrian feature of the i-th pedestrian image;
φ(x_i) represents the first feature of the i-th pedestrian image;

and obtaining the fusion feature using the first feature and the pedestrian feature.
Optionally, the first training unit 1 further comprises:
a weight self-updating unit 12, configured to adjust a task weight corresponding to a task according to the following formula:
λ_j = 1/(2σ_j²) = m·K_j / (2 · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖²)

wherein λ_j represents the task weight of the j-th task;
σ_j² represents the variance of the prediction uncertainty in the j-th task;
m represents the total number of pedestrian images used for training in the j-th task;
K_j represents the number of options for the value of the j-th attribute;
ȳ_ij represents the real attribute of the i-th pedestrian image in the j-th task;
P_ij represents the predicted attribute of the i-th pedestrian image in the j-th task.

The first training unit 1 is further configured to calculate the total training loss using the task weights:

J(θ) = Σ_j λ_j · J(θ_j)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task.
Optionally, the training system further comprises:
a preprocessing unit 3, configured to detect whether the original picture contains a pedestrian; if the pedestrian is contained, acquiring the pedestrian position of the pedestrian in the original picture; and cutting out the pedestrian image from the original picture according to the pedestrian position.
In a fifth embodiment of the present application, referring to fig. 9, corresponding to the training method in the second embodiment, a pedestrian attribute analysis model training system is provided, including:
the second training unit 4 is used for inputting the pedestrian image into the convolutional neural network to obtain a prediction attribute; calculating training loss by using the real attribute corresponding to the pedestrian image and the predicted attribute; under the condition that the training loss converges, determining the current model parameters of the convolutional neural network as the model parameters of a pedestrian attribute analysis model to obtain the pedestrian attribute analysis model;
the second training unit 4 comprises:
a second weight self-updating unit 42, configured to adjust a task weight corresponding to the task according to the following formula:
λ_j = 1/(2σ_j²) = m·K_j / (2 · Σ_{i=1}^{m} ‖ȳ_ij - P_ij‖²)

wherein λ_j represents the task weight of the j-th task;
σ_j² represents the variance of the prediction uncertainty in the j-th task;
m represents the total number of pedestrian images used for training in the j-th task;
K_j represents the number of options for the value of the j-th attribute;
ȳ_ij represents the real attribute of the i-th pedestrian image in the j-th task;
P_ij represents the predicted attribute of the i-th pedestrian image in the j-th task.

The second training unit 4 is further configured to calculate the total training loss using the task weights:

J(θ) = Σ_j λ_j · J(θ_j)

wherein J(θ) is the total training loss;
J(θ_j) is the training loss of the j-th task;
λ_j represents the task weight of the j-th task.
Optionally, the second training unit 4 is further configured to input the pedestrian image and the probability map corresponding to the pedestrian image into the convolutional neural network to obtain the predicted attribute, wherein the probability map characterizes a set of probability values that each pixel node belongs to each of at least one pedestrian component area partitioned from the pedestrian image.
Optionally, referring to fig. 10, the training system may further include:
a pedestrian analysis unit 2, configured to input the pedestrian image into a pedestrian analysis model to obtain the probability map corresponding to the pedestrian image. The pedestrian analysis model is a fully convolutional neural network trained on training images labeled with real probability maps.
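A probability map of this kind could, for example, be read out of a trained fully convolutional network as in the following sketch; the softmax read-out and logit shape are assumptions about the network head.

```python
import torch

def probability_map(parsing_model, image):
    """Run a trained fully convolutional parsing network on one pedestrian
    image (C x H x W tensor) and turn its per-pixel logits into class
    probabilities with a softmax over the class dimension."""
    parsing_model.eval()
    with torch.no_grad():
        logits = parsing_model(image.unsqueeze(0))   # (1, num_classes, H, W)
        return torch.softmax(logits, dim=1)[0]       # (num_classes, H, W)
```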
Optionally, the second training unit 4 may further include:
a second pedestrian parsing auxiliary module 41, configured to update the probability map and to obtain the fusion feature from the updated probability map and the pedestrian feature.
The second training unit 4 is further configured to extract the pedestrian feature from the pedestrian image using a first sub-network, and to input the fusion feature into a second sub-network to obtain the predicted attribute.
Optionally, the second pedestrian parsing auxiliary module 41 updates the probability map according to:
M̃_C(x_i, s) = max_{c∈C} M_c(x_i, s)
wherein x_i is the i-th pedestrian image;
M_c(x_i, s) represents the probability value that the s-th pixel node in the i-th pedestrian image belongs to the c-th class;
M̃_C(x_i, s) represents the updated probability value, obtained by taking, for the s-th pixel node, the maximum of M_c(x_i, s) over the classes c belonging to the C-th pedestrian component area.
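A minimal sketch of this update in Python/PyTorch, assuming the grouping of fine-grained classes into component areas is supplied as index lists (`area_classes` is a hypothetical encoding, not part of the patent):

```python
import torch

def update_probability_map(prob, area_classes):
    """For every pixel node, the probability of a pedestrian component area
    is the maximum probability over the classes grouped into that area.

    prob:         (num_classes, H, W) tensor; prob[c] is the map for class c.
    area_classes: list of class-index lists, one list per component area.
    """
    return torch.stack([prob[idx].max(dim=0).values for idx in area_classes])

# e.g. merging a fine-grained parsing output into four component areas:
# area_maps = update_probability_map(prob, [[1, 2], [5, 6, 7], [9, 12], [8, 18]])
```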
Optionally, the second pedestrian parsing auxiliary module 41 obtains the fusion feature from the updated probability map and the pedestrian feature as follows.
The updated probability map is fused with the pedestrian feature by convolution to obtain the first feature:
φ(x_i) = [φ(x_i)_1, φ(x_i)_2, …, φ(x_i)_C']
φ(x_i)_c = rep(M̃_c(x_i)) ⊙ f_b(x_i)
wherein φ(x_i)_c represents the first feature of the c-th channel of the i-th pedestrian image;
M̃_c(x_i) represents the set of updated probability values of the c-th channel;
rep(M̃_c(x_i)) represents that set of probability values replicated so that its channel count matches that of the pedestrian feature f_b(x_i);
⊙ represents pixel-wise multiplication at corresponding positions;
f_b(x_i) represents the pedestrian feature of the i-th pedestrian image;
φ(x_i) represents the first feature of the i-th pedestrian image.
The fusion feature is then obtained using the first feature and the pedestrian feature.
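A minimal sketch of this fusion in Python/PyTorch follows; `fuse_features` and its interface are illustrative, and concatenating φ(x_i) with f_b is one plausible reading of "obtaining a fusion feature using the first feature and the pedestrian feature", not the patented implementation.

```python
import torch

def fuse_features(prob_updated, f_b):
    """Each updated single-channel area map is replicated to match the
    channel count of the pedestrian feature f_b, multiplied pixel-wise with
    f_b at corresponding positions, and the per-area results are
    concatenated into the first feature phi(x_i).

    prob_updated: (C', H, W) updated probability maps, one per component area.
    f_b:          (K, H, W) pedestrian feature from the first sub-network.
    """
    K = f_b.shape[0]
    parts = [prob_updated[c].expand(K, -1, -1) * f_b   # pixel multiplication
             for c in range(prob_updated.shape[0])]
    phi = torch.cat(parts, dim=0)                      # first feature phi(x_i)
    return torch.cat([phi, f_b], dim=0)                # fusion feature
```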
Optionally, the training system further comprises:
a preprocessing unit 3, configured to detect whether an original picture contains a pedestrian; if so, to acquire the position of the pedestrian in the original picture; and to crop the pedestrian image out of the original picture according to that position.
In a sixth embodiment of the present application, corresponding to the identification method of the third embodiment, a pedestrian attribute identification apparatus is provided, comprising:
a prediction unit, configured to input a pedestrian image to be identified into the neural network model trained by the training system of the fourth or fifth embodiment and to output the identified pedestrian attributes.
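The prediction unit's inference step can be sketched as follows; a model exposing one output head per attribute task is an assumed interface, not prescribed by the patent.

```python
import torch

def identify_attributes(model, image):
    """Feed one pedestrian image of unknown attributes through the trained
    model and pick, for each attribute task, the option with the highest
    score."""
    model.eval()
    with torch.no_grad():
        outputs = model(image.unsqueeze(0))    # list of (1, K_j) per-task logits
        return [logits.argmax(dim=1).item() for logits in outputs]
```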
For the same or similar parts among the various embodiments in this specification, reference may be made from one embodiment to another. The embodiments of the present application described above do not limit the scope of the present application.
