CN112434654A - Cross-modal pedestrian re-identification method based on symmetric convolutional neural network - Google Patents

Cross-modal pedestrian re-identification method based on symmetric convolutional neural network

Info

Publication number
CN112434654A
CN112434654A (application CN202011430914.3A)
Authority
CN
China
Prior art keywords
pedestrian
visible light
infrared light
feature vector
sample feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011430914.3A
Other languages
Chinese (zh)
Other versions
CN112434654B (en)
Inventor
张艳
相旭
唐俊
王年
屈磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN202011430914.3A
Publication of CN112434654A
Application granted
Publication of CN112434654B
Status: Active
Anticipated expiration


Abstract

The invention discloses a cross-modal pedestrian re-identification method based on a symmetric convolutional neural network, which comprises the following steps: 1, acquiring pedestrian photos under two different modalities, visible light and infrared light, constructing a cross-modal pedestrian re-identification data set, and constructing a search library; 2, establishing a cross-modal pedestrian re-identification model based on a symmetric convolutional neural network; 3, training the model with the data set; and 4, predicting with the trained model to achieve cross-modal pedestrian re-identification. The method greatly alleviates the inaccurate detection of existing pedestrian re-identification methods in the cross-modal setting, and retains high detection precision even under large modality differences.

Description

Cross-modal pedestrian re-identification method based on symmetric convolutional neural network
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a cross-modal pedestrian re-identification method based on a symmetric convolutional neural network.
Background
Cross-modal pedestrian re-identification is an important subject in the field of computer vision. It plays a crucial role in target tracking, video monitoring and public security, and has attracted increasing attention from scholars.
For the traditional single-modality pedestrian re-identification task, the data contain only visible light images, and the main difficulties are camera viewpoint changes, occlusion, pedestrian posture changes, illumination changes, complex backgrounds, and the like. Cross-modal pedestrian re-identification data include not only visible light images but also infrared light images, and image retrieval must be performed across the two modalities. At night, a visible light camera can hardly capture enough pedestrian appearance information under weak lighting, so pedestrian appearance information is mainly acquired by an infrared camera or a depth camera. Because the imaging mechanisms of the two cameras differ, two modalities are formed, and a huge modality gap exists between the two kinds of images. As shown in fig. 1, the visible light image differs from the infrared light image in that it contains more color information. Besides the intra-modal differences, the inter-modal differences are another significant problem that cross-modal pedestrian re-identification must solve.
The inter-modal differences between the visible light modality and the infrared light modality can be subdivided into feature differences and appearance differences. To reduce the impact of feature differences, some approaches align cross-modal features in a uniform embedding space, but this ignores the large appearance difference between the two modalities. Other methods use a generative adversarial network (GAN) to perform image conversion between visible and infrared light images so as to reduce the effect of appearance differences. Although the virtual image generated by a GAN is similar to the original image, there is no guarantee that identity-related detail information is generated, nor that the generated information is completely reliable.
Disclosure of Invention
The invention provides a cross-modal pedestrian re-identification method based on a symmetric convolutional neural network aiming at the problem of inter-modal and intra-modal differences, so as to reduce the inter-modal and intra-modal differences and improve the re-identification effect and precision.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a cross-modal pedestrian re-identification method based on a symmetric convolutional neural network, which is characterized by comprising the following steps:
Step 1, collecting a visible light image set V of N pedestrians, wherein the j visible light images of the i-th pedestrian are recorded as V_i with V_i = {V_i1, V_i2, ..., V_ij}; V_ij represents the j-th visible light picture of the i-th pedestrian, and the i-th pedestrian is given the i-th identity information y_i; i = 1, 2, ..., N;

Collecting an infrared light image set T of the N pedestrians with an infrared light camera or a depth camera, wherein the m infrared light images of the i-th pedestrian are recorded as T_i with T_i = {T_i1, T_i2, ..., T_im}; T_im represents the m-th infrared light image of the i-th pedestrian;
Constructing a search library from visible light pictures and infrared light images of other pedestrians with known identity information;
Step 2, constructing a symmetric convolutional neural network consisting of a generator and a discriminator;

The generator consists of two independent columns of ResNet50 networks, wherein each ResNet50 network consists of d residual sub-modules; a fully connected layer S_1 is added after the (d-1)-th residual sub-module, and a fully connected layer S_2 is added after the d-th residual sub-module;
The discriminator consists of a visible light image classifier and an infrared light image classifier;
initializing network weights for the ResNet50 network;
initializing parameters of the full connection layer and the discriminator by adopting a random initialization mode;
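For illustration, the following is a minimal PyTorch sketch of this symmetric architecture. It assumes torchvision's resnet50 as each column and 1024-unit fully connected layers for S_1 and S_2; the class and attribute names are illustrative assumptions, not the patent's reference implementation:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class SymmetricGenerator(nn.Module):
    """Two independent ResNet50 columns (d = 4 residual sub-modules each),
    with fully connected layer S1 after sub-module d-1 and S2 after sub-module d."""
    def __init__(self, feat_dim=1024):
        super().__init__()
        self.visible = resnet50(weights=None)    # visible-light column
        self.infrared = resnet50(weights=None)   # infrared-light column
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.s1 = nn.Linear(1024, feat_dim)      # S1: ResNet50 layer3 output is 1024-dim
        self.s2 = nn.Linear(2048, feat_dim)      # S2: ResNet50 layer4 output is 2048-dim

    def _column(self, net, x):
        x = net.maxpool(net.relu(net.bn1(net.conv1(x))))
        x = net.layer2(net.layer1(x))
        f_dm1 = net.layer3(x)                    # feature after sub-module d-1
        f_d = net.layer4(f_dm1)                  # feature after sub-module d
        f_dm1 = self.pool(f_dm1).flatten(1)
        f_d = self.pool(f_d).flatten(1)
        return self.s1(f_dm1), self.s2(f_d)      # v'_{d-1}/t'_{d-1} and v'_d/t'_d

    def forward(self, vis_imgs, ir_imgs):
        return self._column(self.visible, vis_imgs), self._column(self.infrared, ir_imgs)

class Discriminator(nn.Module):
    """A visible-light identity classifier and an infrared-light identity classifier."""
    def __init__(self, feat_dim=1024, num_ids=395):
        super().__init__()
        self.vis_cls = nn.Linear(feat_dim, num_ids)   # outputs GV logits
        self.ir_cls = nn.Linear(feat_dim, num_ids)    # outputs GT logits
```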
Step 3, inputting the visible light image set V and the infrared light image set T of the N pedestrians into the two independent ResNet50 networks respectively, outputting the (d-1)-th group of visible light feature information v_{d-1} and the (d-1)-th group of infrared light feature information t_{d-1} after the (d-1)-th residual sub-module, then inputting them into the d-th residual sub-module respectively, and outputting the d-th group of visible light feature information v_d and the d-th group of infrared light feature information t_d;
Step 4, constructing the (d-1)-th sample feature space X_{d-1}:

From all the feature information output by the (d-1)-th residual sub-module, selecting the visible light feature information and infrared light feature information of P pedestrians, and for each pedestrian selecting K pieces of the visible light feature information v_{i,d-1} and K pieces of the infrared light feature information t_{i,d-1}, so as to construct the (d-1)-th sample feature space X_{d-1};

Inputting the (d-1)-th sample feature space X_{d-1} into the subsequent fully connected layer S_1, and outputting the (d-1)-th group visible light feature vector v′_{d-1} and infrared light feature vector t′_{d-1};
Step 5, constructing the d-th sample feature space X_d:

From all the feature information output by the d-th residual sub-module, selecting the visible light feature information and infrared light feature information of P pedestrians, and for each pedestrian selecting K pieces of the visible light feature information v_{i,d} and K pieces of the infrared light feature information t_{i,d}, so as to construct the d-th sample feature space X_d;

Then inputting the d-th sample feature space X_d into the subsequent fully connected layer S_2, and outputting the d-th group visible light feature vector v′_d and infrared light feature vector t′_d;
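A hedged sketch of how each sample feature space can be assembled as a P x K mixed-modality batch follows; the helper name and the dictionary layout are assumptions:

```python
import random

def pk_batch(visible_by_id, infrared_by_id, P=16, K=4):
    """Select P pedestrians and, for each, K visible and K infrared samples,
    yielding the 2*P*K items of one sample feature space X.
    visible_by_id / infrared_by_id: dict mapping identity -> list of images."""
    pids = random.sample(sorted(visible_by_id), P)
    samples, labels = [], []
    for pid in pids:
        samples += random.sample(visible_by_id[pid], K)   # K visible-light items
        samples += random.sample(infrared_by_id[pid], K)  # K infrared-light items
        labels += [pid] * (2 * K)
    return samples, labels
```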
Step 6, inputting the (d-1)-th group visible light feature vector v′_{d-1} into the visible light image classifier and outputting the initial probability distribution GV of visible light, and inputting the (d-1)-th group infrared light feature vector t′_{d-1} into the infrared light image classifier and outputting the initial probability distribution GT of infrared light;

Constructing the identity loss function L_ID by equation (1):

L_ID = L_CE(GV, y) + L_CE(GT, y)   (1)

wherein L_CE(·, y) denotes the cross-entropy between a predicted probability distribution and the identity labels y;
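Read this way, equation (1) is the sum of two cross-entropy terms, one per classifier; a minimal sketch under that assumption (GV and GT taken as raw logits):

```python
import torch.nn.functional as F

def identity_loss(gv_logits, gt_logits, labels):
    """L_ID of equation (1): cross-entropy of the visible-light and
    infrared-light classifier outputs against the identity labels y."""
    return F.cross_entropy(gv_logits, labels) + F.cross_entropy(gt_logits, labels)
```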
Step 7, from the (d-1)-th sample feature space X_{d-1}, selecting the k-th feature information x_{a,k} of the a-th pedestrian and recording it as the anchor sample feature vector; recording the z-th feature information x_{a,z} of the a-th pedestrian, which has the same identity information as the anchor, as the z-th positive sample feature vector; recording the c-th feature information x_{f,c} of the f-th pedestrian, which has different identity information, as the c-th negative sample feature vector; then establishing the mixed ternary loss function L_TRI1(X_{d-1}) by equation (2):

L_TRI1(X_{d-1}) = Σ_{a,k} [ ρ_1 + max_z D(x_{a,k}, x_{a,z}) - min_{f≠a, c} D(x_{a,k}, x_{f,c}) ]_+   (2)

In equation (2), D(x_{a,k}, x_{a,z}) represents the Euclidean distance between the anchor sample feature vector x_{a,k} and the z-th positive sample feature vector x_{a,z}; D(x_{a,k}, x_{f,c}) represents the Euclidean distance between the anchor sample feature vector and the c-th negative sample feature vector x_{f,c} of the f-th pedestrian; [·]_+ denotes max(·, 0); and ρ_1 is the minimum margin predefined for the mixed ternary loss function L_TRI1(X_{d-1});
Step 8, from the d-th sample feature space X_d, selecting the s-th feature information x_{r,s} of the r-th pedestrian and recording it as the anchor sample feature vector; recording the b-th feature information x_{r,b} of the r-th pedestrian, which has the same identity information as the anchor, as the b-th positive sample feature vector; recording the q-th feature information x_{h,q} of the h-th pedestrian, which has different identity information, as the q-th negative sample feature vector; then establishing the mixed ternary loss function L_TRI2(X_d) by equation (3):

L_TRI2(X_d) = Σ_{r,s} [ ρ_2 + max_b D(x_{r,s}, x_{r,b}) - min_{h≠r, q} D(x_{r,s}, x_{h,q}) ]_+   (3)

In equation (3), D(x_{r,s}, x_{r,b}) represents the Euclidean distance between the anchor sample feature vector x_{r,s} and the b-th positive sample feature vector x_{r,b}; D(x_{r,s}, x_{h,q}) represents the Euclidean distance between the anchor sample feature vector and the q-th negative sample feature vector x_{h,q} of the h-th pedestrian; and ρ_2 is the minimum margin predefined for the mixed ternary loss function L_TRI2(X_d);
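Both mixed ternary losses can be computed by the same routine applied to X_{d-1} and X_d. Below is a hedged batch-hard sketch: positives and negatives are mined without distinguishing modality, which matches the mixed sampling described above, though the exact mining rule of the patent's reference implementation is an assumption:

```python
import torch

def mixed_triplet_loss(feats, labels, margin=0.5):
    """Mixed ternary loss over a modality-mixed sample feature space X
    (equations (2)/(3)): for each anchor, the farthest same-identity sample
    is the hardest positive and the nearest different-identity sample the
    hardest negative, regardless of modality."""
    dist = torch.cdist(feats, feats)                    # pairwise Euclidean distances
    same_id = labels.unsqueeze(0) == labels.unsqueeze(1)
    hardest_pos = (dist * same_id.float()).max(dim=1).values
    hardest_neg = dist.masked_fill(same_id, float('inf')).min(dim=1).values
    return torch.clamp(margin + hardest_pos - hardest_neg, min=0).mean()
```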
Step 9, establishing the mixed ternary loss function L_TRI by equation (4):

L_TRI = L_TRI1 + L_TRI2   (4)

Establishing the global loss function L_ALL by equation (5):

L_ALL = L_ID + β·L_TRI   (5)

In equation (5), β represents the coefficient of the mixed ternary loss function L_TRI;
Optimizing equation (5) by the stochastic gradient descent method with gradient back-propagation, and training each parameter of the symmetric convolutional neural network to obtain a preliminarily trained symmetric convolutional neural network model;
Step 10, inputting the (d-1)-th group visible light feature vector v′_{d-1} into the visible light image classifier of the preliminarily trained symmetric convolutional neural network model and outputting the probability distribution GV′ of visible light; inputting the (d-1)-th group infrared light feature vector t′_{d-1} into the infrared light image classifier of the preliminarily trained model and outputting the probability distribution GT′ of infrared light; inputting the (d-1)-th group visible light feature vector v′_{d-1} into the infrared light classifier of the preliminarily trained model to obtain the pseudo visible light probability distribution GV″;

Constructing the divergence loss function L_KL between the pseudo visible light probability distribution GV″ and the visible light probability distribution GV′ by equation (6):

L_KL = KL(GV″, GV′)   (6)

In equation (6), KL(·, ·) denotes the Kullback-Leibler divergence between the two probability distributions;
Establishing the discriminator loss function L_DIS by equation (7):

L_DIS = L_ID - α·L_KL   (7)

In equation (7), α represents the coefficient of L_KL;
Step 11, establishing the generator loss function L_GEN by equation (8):

L_GEN = α·L_KL + β·L_TRI   (8)
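A hedged sketch of equations (6)-(8) follows. The KL term is computed from classifier logits with PyTorch's kl_div, reading KL(GV″, GV′) as the divergence of GV′ from GV″; that direction, like the defaults α = 1 and β = 1.4 taken from the embodiment, is an assumption:

```python
import torch.nn.functional as F

def kl_confusion_loss(gv_pseudo_logits, gv_logits):
    """L_KL = KL(GV'', GV') of equation (6): divergence between the pseudo
    visible-light distribution (visible features through the infrared
    classifier) and the visible-light distribution."""
    log_gv_prime = F.log_softmax(gv_logits, dim=1)
    gv_pseudo = F.softmax(gv_pseudo_logits, dim=1)
    return F.kl_div(log_gv_prime, gv_pseudo, reduction='batchmean')

def discriminator_loss(l_id, l_kl, alpha=1.0):
    return l_id - alpha * l_kl            # eq. (7): discriminator keeps modality cues

def generator_loss(l_kl, l_tri, alpha=1.0, beta=1.4):
    return alpha * l_kl + beta * l_tri    # eq. (8): generator confuses modalities
```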
Step 12, optimizing equations (5), (7) and (8) in turn by the gradient descent method:

Firstly, optimizing equation (5) to train all parameters of the network;

Secondly, optimizing equation (7); during gradient back-propagation, only the gradient of the discriminator is back-propagated and the gradient of the generator is set to zero, thereby freezing the generator parameters and training the discriminator parameters;

Finally, optimizing equation (8); during gradient back-propagation, only the gradient of the generator is back-propagated and the gradient of the discriminator is set to zero, thereby freezing the discriminator parameters and training the generator parameters;
After training in turn, L_ALL, L_DIS and L_GEN converge to their optima in adversarial learning: when L_DIS reaches its optimum the discriminator is optimal, and when L_GEN reaches its optimum the generator is optimal, thereby yielding the final symmetric convolutional neural network cross-modal pedestrian re-identification model;
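One way to realize this alternating schedule in PyTorch is to toggle requires_grad so that the frozen side receives no gradient; the sketch below is illustrative (compute_losses, gen, disc and the three optimizers are hypothetical objects standing in for the pieces defined earlier):

```python
def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad_(flag)

def train_step(gen, disc, compute_losses, opt_all, opt_disc, opt_gen):
    """compute_losses() -> (L_ALL, L_DIS, L_GEN) on the current batch."""
    l_all, _, _ = compute_losses()        # eq. (5): train all parameters
    opt_all.zero_grad(); l_all.backward(); opt_all.step()

    set_requires_grad(gen, False)         # freeze generator, train discriminator
    _, l_dis, _ = compute_losses()        # eq. (7)
    opt_disc.zero_grad(); l_dis.backward(); opt_disc.step()
    set_requires_grad(gen, True)

    set_requires_grad(disc, False)        # freeze discriminator, train generator
    _, _, l_gen = compute_losses()        # eq. (8)
    opt_gen.zero_grad(); l_gen.backward(); opt_gen.step()
    set_requires_grad(disc, True)
```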
Step 13, using the final symmetric convolutional neural network model for query matching in cross-modal pedestrian re-identification;

Inputting the pedestrian image to be queried into the final symmetric convolutional neural network model to extract features, comparing their similarity with the features of pedestrians in the search library, and finding the corresponding pedestrian identity information from the ranking list in descending order of similarity, thereby obtaining the identification result.
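The retrieval step then reduces to ranking the search library by feature similarity. A hedged sketch follows; cosine similarity is chosen for illustration, since the patent only specifies a similarity comparison and descending-order ranking:

```python
import torch
import torch.nn.functional as F

def rank_gallery(query_feat, gallery_feats, gallery_ids):
    """Return gallery identities sorted by descending similarity to the query."""
    q = F.normalize(query_feat, dim=0)          # query feature vector
    g = F.normalize(gallery_feats, dim=1)       # search-library feature matrix
    sims = g @ q                                # cosine similarity per gallery item
    order = torch.argsort(sims, descending=True)
    return [gallery_ids[i] for i in order.tolist()]
```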
Compared with the prior art, the invention has the following beneficial effects:

1. For the inter-modal differences, the invention combines the idea of probability-distribution-based modality confusion with adversarial learning to construct a symmetric convolutional neural network composed of a generator and a discriminator. The network generates modality-invariant features by minimizing the difference between the output probability distributions of the classifiers in the discriminator, thereby achieving modality confusion and retaining high detection precision under occlusion, pedestrian posture changes, illumination changes and modality changes.

2. To address both the inter-modal and the intra-modal differences, the invention combines the ternary loss with adversarial learning and proposes a mixed ternary loss that reduces both kinds of difference. While modality confusion is achieved through adversarial learning, the positive and negative samples are selected without distinguishing modality, so features are aligned and the modality difference is reduced; high detection precision is therefore retained even under large modality differences, and the method adapts well.

3. Exploiting the ability of hidden-layer convolutional features to describe structural and spatial information, the invention uses the (d-1)-th layer hidden convolutional features (i.e., the features from the (d-1)-th residual sub-module of the ResNet50 network) as the input of the following fully connected layer S_1 and the subsequent discriminator, so that the network learns more spatial structure information, the influence of color differences is reduced, and the gap between the two modalities is narrowed; this improves the detection precision of the invention and strengthens its applicability in target tracking, video monitoring, public security and other fields.

4. The invention aligns features at different depths of the symmetric convolutional neural network, so that the network learns more deep information and its robustness is improved; this greatly alleviates the inaccurate detection of existing pedestrian re-identification methods in the cross-modal setting, and accurate detection is achieved even under appearance differences and other problems.
Drawings
FIG. 1 is a schematic view of pedestrian images in the two modalities in the prior art;
FIG. 2 is a network architecture proposed by the present invention;
FIG. 3 is a schematic view of inter-modal and intra-modal losses to which the present invention relates;
FIG. 4 is a graph of the results for the coefficient α on the RegDB data set in the present invention;
FIG. 5 is a graph of the results for the coefficient α on the SYSU-MM01 data set in the present invention;
FIG. 6 is a graph of the results for the coefficient β on the RegDB data set in the present invention;
FIG. 7 is a graph of the results for the coefficient β on the SYSU-MM01 data set in the present invention.
Detailed Description
In this embodiment, a cross-modal pedestrian re-identification method based on a symmetric convolutional neural network mainly reduces the inter-modal and intra-modal differences by means of the symmetric convolutional neural network and adversarial learning; the network is optimized at different depths, and the shallow features, which carry more spatial structure information, are used to reduce the appearance difference. Fig. 1 shows a schematic diagram of images in the two different modalities. The detailed steps are as follows:
Step 1, collecting a visible light image set V of N pedestrians, wherein the j visible light images of the i-th pedestrian are recorded as V_i with V_i = {V_i1, V_i2, ..., V_ij}; V_ij represents the j-th visible light picture of the i-th pedestrian, and the i-th pedestrian is given the i-th identity information y_i; i = 1, 2, ..., N;

Collecting an infrared light image set T of the N pedestrians with an infrared light camera or a depth camera, wherein the m infrared light images of the i-th pedestrian are recorded as T_i with T_i = {T_i1, T_i2, ..., T_im}; T_im represents the m-th infrared light image of the i-th pedestrian;

Constructing a search library from visible light pictures and infrared light images of other pedestrians with known identity information;
This embodiment uses the RegDB data set and the SYSU-MM01 data set. SYSU-MM01 is a large-scale cross-modal pedestrian re-identification data set collected by four visible light cameras and two infrared light cameras. The data set covers two different scenes, indoor and outdoor, and its training set contains 395 pedestrian identities with a total of 11909 infrared pedestrian images and 22258 visible light pedestrian images.

The RegDB data set contains 412 pedestrian identities, captured by a dual-camera system. Each pedestrian ID contains 10 visible light images and 10 infrared light images. The invention adopts the commonly recognized data set processing protocol: all data in the data set are randomly divided into two halves, and one half is randomly selected for training, as sketched below.
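A minimal sketch of such a split, assuming the division is done over identities as in the common RegDB evaluation protocol (the function name and seed handling are illustrative):

```python
import random

def regdb_split(all_ids, seed=0):
    """Randomly split identities into two halves; one half is used for training."""
    rng = random.Random(seed)
    ids = list(all_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]   # training identities, testing identities
```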
Step 2, constructing a symmetric convolutional neural network consisting of a generator and a discriminator;

The generator consists of two independent columns of ResNet50 networks, wherein each ResNet50 network consists of d residual sub-modules; a fully connected layer S_1 is added after the (d-1)-th residual sub-module, and a fully connected layer S_2 is added after the d-th residual sub-module; S_1 and S_2 are used for extracting modality-shared information; the ResNet50 network adopted by the invention consists of 4 residual sub-modules, so d = 4 and d-1 = 3; the number of neurons of the fully connected layers S_1 and S_2 is set to 1024;
The discriminator consists of a visible light image classifier and an infrared light image classifier, as shown in fig. 2;
initializing network weights for the ResNet50 network;
initializing parameters of the full connection layer and the discriminator by adopting a random initialization mode;
Step 3, inputting the visible light image set V and the infrared light image set T of the N pedestrians into the two independent ResNet50 networks respectively to extract the feature information of the pedestrians, outputting the (d-1)-th group of visible light feature information v_{d-1} and the (d-1)-th group of infrared light feature information t_{d-1} after the (d-1)-th residual sub-module, then inputting them into the d-th residual sub-module respectively, and outputting the d-th group of visible light feature information v_d and the d-th group of infrared light feature information t_d;
Step 4, constructing the (d-1)-th sample feature space X_{d-1}:

From all the feature information output by the (d-1)-th residual sub-module, selecting the visible light feature information and infrared light feature information of P pedestrians, and for each pedestrian selecting K pieces of the visible light feature information v_{i,d-1} and K pieces of the infrared light feature information t_{i,d-1}, so as to construct the (d-1)-th sample feature space X_{d-1}; in the invention, P = 16 and K = 4;

Inputting the (d-1)-th sample feature space X_{d-1} into the subsequent fully connected layer S_1, which extracts modality-shared information, and outputting the (d-1)-th group visible light feature vector v′_{d-1} and infrared light feature vector t′_{d-1};
Step 5, constructing the d-th sample feature space X_d:

From all the feature information output by the d-th residual sub-module, selecting the visible light feature information and infrared light feature information of P pedestrians, and for each pedestrian selecting K pieces of the visible light feature information v_{i,d} and K pieces of the infrared light feature information t_{i,d}, so as to construct the d-th sample feature space X_d; P = 16 and K = 4;

Then inputting the d-th sample feature space X_d into the subsequent fully connected layer S_2, which extracts modality-shared information, and outputting the d-th group visible light feature vector v′_d and infrared light feature vector t′_d;
Step 6, inputting the (d-1)-th group visible light feature vector v′_{d-1} into the visible light image classifier and outputting the initial probability distribution GV of visible light, and inputting the (d-1)-th group infrared light feature vector t′_{d-1} into the infrared light image classifier and outputting the initial probability distribution GT of infrared light;

Constructing the identity loss function L_ID by equation (1):

L_ID = L_CE(GV, y) + L_CE(GT, y)   (1)
Step 7, from the (d-1)-th sample feature space X_{d-1}, selecting the k-th feature information x_{a,k} of the a-th pedestrian and recording it as the anchor sample feature vector; recording the z-th feature information x_{a,z} of the a-th pedestrian, which has the same identity information as the anchor, as the z-th positive sample feature vector; recording the c-th feature information x_{f,c} of the f-th pedestrian, which has different identity information, as the c-th negative sample feature vector; then establishing the mixed ternary loss function L_TRI1(X_{d-1}) by equation (2):

L_TRI1(X_{d-1}) = Σ_{a,k} [ ρ_1 + max_z D(x_{a,k}, x_{a,z}) - min_{f≠a, c} D(x_{a,k}, x_{f,c}) ]_+   (2)

In equation (2), D(x_{a,k}, x_{a,z}) represents the Euclidean distance between the anchor sample feature vector x_{a,k} and the z-th positive sample feature vector x_{a,z}; D(x_{a,k}, x_{f,c}) represents the Euclidean distance between the anchor sample feature vector and the c-th negative sample feature vector x_{f,c} of the f-th pedestrian; [·]_+ denotes max(·, 0); and ρ_1 is the minimum margin predefined for the mixed ternary loss function L_TRI1(X_{d-1}), set to ρ_1 = 0.5. Optimizing equation (2) reduces the distance between the anchor sample feature vector and the positive sample feature vectors and increases the distance between the anchor sample feature vector and the negative sample feature vectors, as shown in fig. 3;
Step 8, from the d-th sample feature space X_d, selecting the s-th feature information x_{r,s} of the r-th pedestrian and recording it as the anchor sample feature vector; recording the b-th feature information x_{r,b} of the r-th pedestrian, which has the same identity information as the anchor, as the b-th positive sample feature vector; recording the q-th feature information x_{h,q} of the h-th pedestrian, which has different identity information, as the q-th negative sample feature vector; then establishing the mixed ternary loss function L_TRI2(X_d) by equation (3):

L_TRI2(X_d) = Σ_{r,s} [ ρ_2 + max_b D(x_{r,s}, x_{r,b}) - min_{h≠r, q} D(x_{r,s}, x_{h,q}) ]_+   (3)

In equation (3), D(x_{r,s}, x_{r,b}) represents the Euclidean distance between the anchor sample feature vector x_{r,s} and the b-th positive sample feature vector x_{r,b}; D(x_{r,s}, x_{h,q}) represents the Euclidean distance between the anchor sample feature vector and the q-th negative sample feature vector x_{h,q} of the h-th pedestrian; and ρ_2 is the minimum margin predefined for the mixed ternary loss function L_TRI2(X_d), set to ρ_2 = 0.5.
Step 9, establishing the mixed ternary loss function L_TRI by equation (4):

L_TRI = L_TRI1 + L_TRI2   (4)

Establishing the global loss function L_ALL by equation (5):

L_ALL = L_ID + β·L_TRI   (5)

In equation (5), β represents the coefficient of the mixed ternary loss function L_TRI and is set to β = 1.4.

Optimizing equation (5) by the stochastic gradient descent method with gradient back-propagation, and training each parameter of the symmetric convolutional neural network to obtain a preliminarily trained symmetric convolutional neural network model;
Step 10, inputting the (d-1)-th group visible light feature vector v′_{d-1} into the visible light image classifier of the preliminarily trained symmetric convolutional neural network model and outputting the probability distribution GV′ of visible light; inputting the (d-1)-th group infrared light feature vector t′_{d-1} into the infrared light image classifier of the preliminarily trained model and outputting the probability distribution GT′ of infrared light; inputting the (d-1)-th group visible light feature vector v′_{d-1} into the infrared light classifier of the preliminarily trained model to obtain the pseudo visible light probability distribution GV″;

Constructing the divergence loss function L_KL between the pseudo visible light probability distribution GV″ and the visible light probability distribution GV′ by equation (6):

L_KL = KL(GV″, GV′)   (6)

In equation (6), KL(GV″, GV′) denotes the Kullback-Leibler divergence between the two probability distributions;
Establishing the discriminator loss function L_DIS by equation (7):

L_DIS = L_ID - α·L_KL   (7)

In equation (7), α represents the coefficient of L_KL and is set to α = 1.
Step 11, establishing the generator loss function L_GEN by equation (8):

L_GEN = α·L_KL + β·L_TRI   (8)
Verification experiments were performed on the settings of α and β. FIG. 4 shows the effect of the coefficient α on the RegDB data set, and FIG. 5 shows its effect on the SYSU-MM01 data set; performance is best when α = 1.

FIG. 6 shows the effect of the coefficient β on the RegDB data set, and FIG. 7 shows its effect on the SYSU-MM01 data set. Performance is optimal when α = 1 and β = 1.4, and experiments show that good results are obtained over a wide range of α and β values, reflecting the superiority of the invention.
Step 12, optimizing equations (5), (7) and (8) in turn by the gradient descent method. The invention optimizes the network model with the adaptive moment estimation optimizer (Adam), set up as sketched below.
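A hedged sketch of this Adam setup with one optimizer per objective of step 12 (the learning rate and parameter grouping are illustrative assumptions):

```python
import torch

def build_optimizers(gen, disc, lr=3e-4):
    """Three Adam optimizers: eq. (5) over all parameters,
    eq. (7) over the discriminator, eq. (8) over the generator."""
    opt_all = torch.optim.Adam(list(gen.parameters()) + list(disc.parameters()), lr=lr)
    opt_disc = torch.optim.Adam(disc.parameters(), lr=lr)
    opt_gen = torch.optim.Adam(gen.parameters(), lr=lr)
    return opt_all, opt_disc, opt_gen
```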
Firstly, carrying out optimization solution on the formula (5) and training all parameters of the network;
secondly, carrying out optimization solution on the formula (7), in the gradient back propagation process, only carrying out back propagation on the gradient of the discriminator, and setting the gradient of the generator to zero, thereby freezing the generator parameters and training the discriminator parameters;
finally, carrying out optimization solution on the formula (8), in the gradient back propagation process, only carrying out back propagation on the gradient of the generator, and setting the gradient of the discriminator to zero, thereby freezing the parameters of the discriminator and training the parameters of the generator;
After training in turn, L_ALL, L_DIS and L_GEN converge to their optima in adversarial learning: when L_DIS reaches its optimum the discriminator is optimal, and when L_GEN reaches its optimum the generator is optimal, thereby yielding the final symmetric convolutional neural network cross-modal pedestrian re-identification model;
Step 13, using the final symmetric convolutional neural network model for query matching in cross-modal pedestrian re-identification;

Inputting the pedestrian image to be queried into the final symmetric convolutional neural network model to extract features, comparing their similarity with the features of pedestrians in the search library, and finding the corresponding pedestrian identity information from the ranking list in descending order of similarity, thereby obtaining the identification result.
Example:

To demonstrate the effectiveness of the invention, comparative tests against other methods were carried out. As shown in table 1, the effect of the invention is clearly better than that of the other prior-art methods, which proves its effectiveness. Ablation experiments were also performed on each module of the network; the results, shown in table 2, demonstrate the effectiveness of each module.
Table 1 compares the effectiveness of the invention with other methods.

Table 2 shows the related ablation experiments of the invention.
Experiments prove that the method greatly alleviates the inaccurate detection of existing pedestrian re-identification methods in the cross-modal setting and retains high detection precision even under large modality differences.

Claims (1)

Translated from Chinese

1. A cross-modal pedestrian re-identification method based on a symmetric convolutional neural network, characterized by being carried out according to the following steps:

Step 1. Collect a visible light image set V of N pedestrians, where the j visible light images of the i-th pedestrian are denoted V_i with V_i = {V_i1, V_i2, ..., V_ij}; V_ij represents the j-th visible light picture of the i-th pedestrian, and the i-th pedestrian is given the i-th identity information y_i; i = 1, 2, ..., N;

Use an infrared light camera or a depth camera to collect an infrared light image set T of the N pedestrians, where the m infrared light images of the i-th pedestrian are denoted T_i with T_i = {T_i1, T_i2, ..., T_im}; T_im represents the m-th infrared light image of the i-th pedestrian;

Build a search library from visible light pictures and infrared light images of other pedestrians with known identity information;

Step 2. Build a symmetric convolutional neural network consisting of a generator and a discriminator;

The generator is composed of two independent columns of ResNet50 networks, where each ResNet50 network is composed of d residual sub-modules; a fully connected layer S_1 is added after the (d-1)-th residual sub-module, and a fully connected layer S_2 is added after the d-th residual sub-module;

The discriminator is composed of a visible light image classifier and an infrared light image classifier;

Initialize the network weights of the ResNet50 networks;

Initialize the parameters of the fully connected layers and the discriminator by random initialization;

Step 3. Input the visible light image set V and the infrared light image set T of the N pedestrians into the two independent ResNet50 networks respectively; after the (d-1)-th residual sub-module, output the (d-1)-th group of visible light feature information v_{d-1} and the (d-1)-th group of infrared light feature information t_{d-1}; then input them into the d-th residual sub-module respectively and output the d-th group of visible light feature information v_d and the d-th group of infrared light feature information t_d;

Step 4. Construct the (d-1)-th sample feature space X_{d-1}:

From all the feature information output by the (d-1)-th residual sub-module, select the visible light feature information and infrared light feature information of P pedestrians; for each pedestrian, select K pieces of the visible light feature information v_{i,d-1} and K pieces of the infrared light feature information t_{i,d-1} to construct the (d-1)-th sample feature space X_{d-1};

Input the (d-1)-th sample feature space X_{d-1} into the subsequent fully connected layer S_1 and output the (d-1)-th group visible light feature vector v′_{d-1} and infrared light feature vector t′_{d-1};

Step 5. Construct the d-th sample feature space X_d:

From all the feature information output by the d-th residual sub-module, select the visible light feature information and infrared light feature information of P pedestrians; for each pedestrian, select K pieces of the visible light feature information v_{i,d} and K pieces of the infrared light feature information t_{i,d} to construct the d-th sample feature space X_d;

Then input the d-th sample feature space X_d into the subsequent fully connected layer S_2 and output the d-th group visible light feature vector v′_d and infrared light feature vector t′_d;

Step 6. Input the (d-1)-th group visible light feature vector v′_{d-1} into the visible light image classifier and output the initial probability distribution GV of visible light; input the (d-1)-th group infrared light feature vector t′_{d-1} into the infrared light image classifier and output the initial probability distribution GT of infrared light;

Construct the identity loss function L_ID by equation (1):

L_ID = L_CE(GV, y) + L_CE(GT, y)   (1)

where L_CE(·, y) denotes the cross-entropy between a predicted probability distribution and the identity labels y;

Step 7. From the (d-1)-th sample feature space X_{d-1}, select the k-th feature information x_{a,k} of the a-th pedestrian and denote it as the anchor sample feature vector; denote the z-th feature information x_{a,z} of the a-th pedestrian, which has the same identity information as the anchor, as the z-th positive sample feature vector; denote the c-th feature information x_{f,c} of the f-th pedestrian, which has different identity information, as the c-th negative sample feature vector; then establish the mixed ternary loss function L_TRI1(X_{d-1}) by equation (2):

L_TRI1(X_{d-1}) = Σ_{a,k} [ ρ_1 + max_z D(x_{a,k}, x_{a,z}) - min_{f≠a, c} D(x_{a,k}, x_{f,c}) ]_+   (2)

In equation (2), D(x_{a,k}, x_{a,z}) represents the Euclidean distance between the anchor sample feature vector x_{a,k} and the z-th positive sample feature vector x_{a,z}; D(x_{a,k}, x_{f,c}) represents the Euclidean distance between the anchor sample feature vector and the c-th negative sample feature vector x_{f,c} of the f-th pedestrian; [·]_+ denotes max(·, 0); and ρ_1 is the minimum margin predefined for the mixed ternary loss function L_TRI1(X_{d-1});

Step 8. From the d-th sample feature space X_d, select the s-th feature information x_{r,s} of the r-th pedestrian and denote it as the anchor sample feature vector; denote the b-th feature information x_{r,b} of the r-th pedestrian, which has the same identity information as the anchor, as the b-th positive sample feature vector; denote the q-th feature information x_{h,q} of the h-th pedestrian, which has different identity information, as the q-th negative sample feature vector; then establish the mixed ternary loss function L_TRI2(X_d) by equation (3):

L_TRI2(X_d) = Σ_{r,s} [ ρ_2 + max_b D(x_{r,s}, x_{r,b}) - min_{h≠r, q} D(x_{r,s}, x_{h,q}) ]_+   (3)

In equation (3), D(x_{r,s}, x_{r,b}) represents the Euclidean distance between the anchor sample feature vector x_{r,s} and the b-th positive sample feature vector x_{r,b}; D(x_{r,s}, x_{h,q}) represents the Euclidean distance between the anchor sample feature vector and the q-th negative sample feature vector x_{h,q} of the h-th pedestrian; and ρ_2 is the minimum margin predefined for the mixed ternary loss function L_TRI2(X_d);

Step 9. Establish the mixed ternary loss function L_TRI by equation (4):

L_TRI = L_TRI1 + L_TRI2   (4)

Establish the global loss function L_ALL by equation (5):

L_ALL = L_ID + β·L_TRI   (5)

In equation (5), β represents the coefficient of the mixed ternary loss function L_TRI;

Optimize equation (5) by the stochastic gradient descent method with gradient back-propagation, and train each parameter of the symmetric convolutional neural network to obtain a preliminarily trained symmetric convolutional neural network model;

Step 10. Input the (d-1)-th group visible light feature vector v′_{d-1} into the visible light image classifier of the preliminarily trained symmetric convolutional neural network model and output the probability distribution GV′ of visible light; input the (d-1)-th group infrared light feature vector t′_{d-1} into the infrared light image classifier of the preliminarily trained model and output the probability distribution GT′ of infrared light; input the (d-1)-th group visible light feature vector v′_{d-1} into the infrared light classifier of the preliminarily trained model to obtain the pseudo visible light probability distribution GV″;

Construct the divergence loss function L_KL between the pseudo visible light probability distribution GV″ and the visible light probability distribution GV′ by equation (6):

L_KL = KL(GV″, GV′)   (6)

In equation (6), KL(·, ·) denotes the Kullback-Leibler divergence between the two probability distributions;

Establish the discriminator loss function L_DIS by equation (7):

L_DIS = L_ID - α·L_KL   (7)

In equation (7), α represents the coefficient of L_KL;

Step 11. Establish the generator loss function L_GEN by equation (8):

L_GEN = α·L_KL + β·L_TRI   (8)

Step 12. Optimize equations (5), (7) and (8) in turn by the gradient descent method:

First, optimize equation (5) to train all parameters of the network;

Second, optimize equation (7); during gradient back-propagation, only the gradient of the discriminator is back-propagated and the gradient of the generator is set to zero, thereby freezing the generator parameters and training the discriminator parameters;

Finally, optimize equation (8); during gradient back-propagation, only the gradient of the generator is back-propagated and the gradient of the discriminator is set to zero, thereby freezing the discriminator parameters and training the generator parameters;

After training in turn, L_ALL, L_DIS and L_GEN converge to their optima in adversarial learning: when L_DIS reaches its optimum the discriminator is optimal, and when L_GEN reaches its optimum the generator is optimal, thereby yielding the final symmetric convolutional neural network cross-modal pedestrian re-identification model;

Step 13. Use the final symmetric convolutional neural network model for query matching in cross-modal pedestrian re-identification;

Input the pedestrian image to be queried into the final symmetric convolutional neural network model to extract features, compare their similarity with the features of pedestrians in the search library, sort by similarity, and find the corresponding pedestrian identity information from the sorted list, thereby obtaining the identification result.
CN202011430914.3A | priority 2020-12-07 | filed 2020-12-07 | Cross-modal pedestrian re-identification method based on symmetric convolutional neural network | Active | CN112434654B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011430914.3A | 2020-12-07 | 2020-12-07 | Cross-modal pedestrian re-identification method based on symmetric convolutional neural network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011430914.3A | 2020-12-07 | 2020-12-07 | Cross-modal pedestrian re-identification method based on symmetric convolutional neural network

Publications (2)

Publication Number | Publication Date
CN112434654A | 2021-03-02
CN112434654B | 2022-09-13

Family

ID=74692582

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011430914.3A (Active; granted as CN112434654B (en)) | Cross-modal pedestrian re-identification method based on symmetric convolutional neural network | 2020-12-07 | 2020-12-07

Country Status (1)

Country | Link
CN (1) | CN112434654B (en)



Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105608450A (en)* | 2016-03-01 | 2016-05-25 | Tianjin Zhongke Intelligent Identification Industry Technology Research Institute Co., Ltd. | Heterogeneous face identification method based on deep convolutional neural network
KR101908481B1 (en)* | 2017-07-24 | 2018-12-10 | Dongguk University Industry-Academic Cooperation Foundation | Device and method for pedestrian detection
US20200302176A1 (en)* | 2019-03-18 | 2020-09-24 | Nvidia Corporation | Image identification using neural networks
CN110580460A (en)* | 2019-08-28 | 2019-12-17 | Northwestern Polytechnical University | Pedestrian re-identification method based on joint identification and verification of pedestrian identity and attribute features
CN110956094A (en)* | 2019-11-09 | 2020-04-03 | Beijing University of Technology | An RGB-D multimodal fusion person detection method based on an asymmetric two-stream network
CN111325115A (en)* | 2020-02-05 | 2020-06-23 | Shandong Normal University | Adversarial cross-modal pedestrian re-identification method and system with triple constraint loss
CN111539255A (en)* | 2020-03-27 | 2020-08-14 | China University of Mining and Technology | Cross-modal pedestrian re-identification method based on multi-modal image style conversion
CN111597876A (en)* | 2020-04-01 | 2020-08-28 | Zhejiang University of Technology | A cross-modal person re-identification method based on difficult quintuples
CN111767882A (en)* | 2020-07-06 | 2020-10-13 | Jiangnan University | A multimodal pedestrian detection method based on an improved YOLO model
CN111985313A (en)* | 2020-07-09 | 2020-11-24 | Shanghai Jiao Tong University | Multi-style pedestrian re-identification method, system and terminal based on adversarial learning
CN111898510A (en)* | 2020-07-23 | 2020-11-06 | Hefei University of Technology | A cross-modal pedestrian re-identification method based on progressive neural networks
CN111931637A (en)* | 2020-08-07 | 2020-11-13 | South China University of Technology | Cross-modal pedestrian re-identification method and system based on a two-stream convolutional neural network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BO LI et al.: "Visible Infrared Cross-Modality Person Re-Identification Network Based on Adaptive Pedestrian Alignment", IEEE Access *
JIN KYU KANG et al.: "Person Re-Identification Between Visible and Thermal Camera Images Based on Deep Residual CNN Using Single Input", IEEE Access *
WANG Haibin: "Research on cross-modal pedestrian re-identification technology based on deep features", China Master's Theses Full-text Database, Information Science and Technology *
ZHENG Aihua et al.: "Cross-modal pedestrian re-identification based on a local heterogeneous collaborative dual-path network", Pattern Recognition and Artificial Intelligence *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113033438A (en)* | 2021-03-31 | 2021-06-25 | Sichuan University | Data feature learning method for modal imperfect alignment
CN113033438B (en) | 2021-03-31 | 2022-07-01 | Sichuan University | Data feature learning method for modal imperfect alignment
CN113112534A (en)* | 2021-04-20 | 2021-07-13 | Anhui University | Three-dimensional biomedical image registration method based on iterative self-supervision
CN113112534B (en) | 2021-04-20 | 2022-10-18 | Anhui University | Three-dimensional biomedical image registration method based on iterative self-supervision
CN113627272A (en)* | 2021-07-19 | 2021-11-09 | Shanghai Jiao Tong University | Severely misaligned pedestrian re-identification method and system based on a normalization network
CN113627272B (en) | 2021-07-19 | 2023-11-28 | Shanghai Jiao Tong University | Severely misaligned pedestrian re-identification method and system based on a normalization network
CN113610180A (en)* | 2021-08-17 | 2021-11-05 | Hunan Institute of Technology | Ship classification method and device based on deep learning fusion of visible light and infrared images
CN114550210A (en)* | 2022-02-21 | 2022-05-27 | University of Science and Technology of China | Person re-identification method based on modality-adaptive mixing and invariant convolution decomposition
CN114550210B (en) | 2022-02-21 | 2024-04-02 | University of Science and Technology of China | Person re-identification method based on modality-adaptive mixing and invariant convolution decomposition
CN114817655A (en)* | 2022-03-17 | 2022-07-29 | Beijing Dajia Internet Information Technology Co., Ltd. | Cross-modal retrieval method, network training method, device, equipment and medium

Also Published As

Publication number | Publication date
CN112434654B (en) | 2022-09-13

Similar Documents

Publication | Title
CN112434654B (en) | Cross-modal pedestrian re-identification method based on symmetric convolutional neural network
CN111539370B (en) | Image pedestrian re-identification method and system based on multi-attention joint learning
Wang et al. | Mancs: A multi-task attentional network with curriculum sampling for person re-identification
Zhang et al. | Deep-IRTarget: An automatic target detector in infrared imagery using dual-domain feature extraction and allocation
Zhang et al. | Differential feature awareness network within antagonistic learning for infrared-visible object detection
Hao et al. | HSME: Hypersphere manifold embedding for visible thermal person re-identification
CN111325115B (en) | Adversarial cross-modal person re-identification method and system with triple constraint loss
CN107330396B (en) | A pedestrian re-identification method based on multi-attribute and multi-strategy fusion learning
CN113283362A (en) | Cross-modal pedestrian re-identification method
CN110070066A (en) | A video pedestrian re-identification method and system based on posture key frames
CN112784768A (en) | Pedestrian re-identification method guiding multiple adversarial attention based on visual angle
CN107315795B (en) | Video instance search method and system combining particular persons and scenes
CN119648999A (en) | A target detection method, system, device and medium based on cross-modal fusion and a guided attention mechanism
CN112507853B (en) | Cross-modal pedestrian re-identification method based on a mutual attention mechanism
CN110309770A (en) | A vehicle re-identification method based on quadruple loss metric learning
CN113269099B (en) | Vehicle re-identification method under heterogeneous unmanned systems based on graph matching
CN113887353 (en) | Visible light-infrared pedestrian re-identification method and system
CN113761995A (en) | A cross-modal pedestrian re-identification method based on double transform alignment and blocking
CN117333908A (en) | Cross-modal pedestrian re-identification method based on posture feature alignment
CN113869151B (en) | Cross-view gait recognition method and system based on feature fusion
CN116597177B (en) | A multi-source image block matching method based on dual-branch parallel deep interactive collaboration
CN114743162A (en) | Cross-modal pedestrian re-identification method based on generative adversarial networks
CN119649399A (en) | Cross-modal person re-identification method based on multi-scale cross-attention Transformer
Zhang et al. | Two-stage domain adaptation for infrared ship target segmentation
Zhang et al. | Attention-aware scoring learning for person re-identification

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
