Detailed Description
As can be seen from the background art, the existing pixel alignment methods for multiple images involve complex calculation, long processing time, and low pixel alignment efficiency. How to efficiently and quickly align the pixels of an infrared image, a depth image, and a color image therefore becomes an urgent problem to be solved.
In order to achieve the purpose of further improving the pixel alignment efficiency of the infrared image, the depth image and the color image, an embodiment of the present invention provides an image acquisition method, including: acquiring a depth image, an infrared image and a color image of a target area; acquiring a transmission transformation matrix of aligning the depth image and the infrared image to the color image; and performing transmission transformation on the depth image and the infrared image according to the transmission transformation matrix to obtain the depth image and the infrared image which are aligned with the color image pixels.
The image acquisition method provided by the embodiment of the invention acquires, after the depth image, the infrared image and the color image that need to be pixel-aligned are obtained, a transmission transformation matrix for aligning the depth image and the infrared image to the color image, and performs transmission transformation on the depth image and the infrared image according to the transmission transformation matrix to obtain the depth image and the infrared image aligned with the color image pixels. Because the pixel alignment of the depth image, the infrared image and the color image is realized by performing a single transmission transformation on the depth image and the infrared image, the coordinates of each pixel after alignment do not need to be calculated pixel by pixel; the calculation amount is greatly reduced, the pixel alignment process is simplified, and the efficiency of aligning the pixels of the depth image, the infrared image and the color image is improved on the basis of ensuring the accuracy of the pixel alignment.
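The advantage described above, replacing per-pixel reprojection with a single matrix operation, can be illustrated with a minimal sketch; the matrix values below are hypothetical, not taken from the embodiment:

```python
# Once a 3x3 perspective (transmission) transformation matrix H is known,
# every pixel (u, v) of the depth/infrared image maps to the color image
# by one matrix product, so no per-pixel camera reprojection is needed.

def apply_perspective(H, u, v):
    """Map pixel (u, v) through the 3x3 matrix H in homogeneous coordinates."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w   # divide out the homogeneous scale

# Identity-like example matrix with a small shift (hypothetical values).
H = [[1.0, 0.0, 5.0],
     [0.0, 1.0, -3.0],
     [0.0, 0.0, 1.0]]

print(apply_perspective(H, 100, 200))  # -> (105.0, 197.0)
```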
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It will be appreciated by those of ordinary skill in the art that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application; however, the technical solutions claimed in the present application can be implemented without these technical details and with various changes and modifications based on the following embodiments. The division into the following embodiments is for convenience of description and should not constitute any limitation on the specific implementation of the present invention, and the embodiments may be combined with and refer to each other where there is no contradiction.
The following embodiments describe the implementation details of the image acquisition method; they are provided only to facilitate understanding and are not necessary for implementing the present invention.
In a specific application, the image acquisition method is applied to an electronic device capable of acquiring multiple images, such as a camera, a terminal or a robot. This embodiment is described by taking a camera that acquires multiple images as an example. The image acquisition process, described with reference to fig. 1, includes the following steps:
step 101, obtaining a depth image, an infrared image and a color image of a target area.
Specifically, when a camera for acquiring a face image in a scene of face recognition or security authentication receives a service request, a control processing module of the camera synchronously controls an RGB camera (color camera) and a depth camera to perform image shooting on a specified target area according to preset working states and parameters, and acquires a depth image, an infrared image and a color image of the target area, wherein the depth image and the infrared image are shot by the depth camera, and the color image is shot by the RGB camera.
In addition, after the depth image, the infrared image and the color image of the target area are acquired, the control processing module of the camera also adjusts the color image in advance for a better alignment effect, adjusting its size to match the depth image or the infrared image. Fig. 2 shows the image adjustment of the color image: the depth image and the infrared image are acquired by the depth camera, and the color image is acquired by the RGB camera. Because the field of view (FOV) of the raw image of the color image acquired by the RGB camera far exceeds the field of view of the raw images of the infrared image and the depth image acquired by the depth camera, the color image needs to be adjusted to the area it shares with the field of view of the infrared image or the depth image. For example, the resolution of the infrared and depth images is 400px × 640px with a field of view of V50.3 × H73.9, and the resolution of the color image is 540px × 960px with a field of view of V56.8 × H88.16; the aim of pixel alignment is to align the color image, the infrared image and the depth image into images with a resolution of 480px × 768px. The control processing module first crops and scales the color image according to preset adjustment rules and the internal parameters of the camera, adjusting it to an image with a resolution of 466px × 746px and a field of view of approximately V50 × H75, so that the color image matches the fields of view of the depth image and the infrared image and the three images can conveniently be aligned into images with a resolution of 480px × 768px. By adjusting the color image to match the depth image and the infrared image, this preprocessing ensures the effect and efficiency of the subsequent pixel alignment.
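Assuming a pinhole camera model, the crop in the example above can be sketched as follows. The target field of view is taken here as the depth camera's V50/H73.9 (an assumption, since the embodiment quotes an approximate V50 × H75), which reproduces the 466px and 746px figures:

```python
import math

def crop_size(src_px, src_fov_deg, dst_fov_deg):
    """Pixels to keep when cropping from src_fov to dst_fov. Under a pinhole
    model the pixel extent is proportional to tan(FOV/2) at a fixed focal
    length, so the crop is a simple ratio of tangents."""
    return round(src_px * math.tan(math.radians(dst_fov_deg) / 2)
                        / math.tan(math.radians(src_fov_deg) / 2))

# Color image 960x540 (H x V), FOV H88.16 x V56.8, cropped toward the
# depth camera's field of view (~H73.9 x V50).
print(crop_size(540, 56.8, 50.0))   # vertical   -> 466
print(crop_size(960, 88.16, 73.9))  # horizontal -> 746
```

The result is then scaled to the common 480px × 768px target resolution.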
Step 102, acquiring a transmission transformation matrix for aligning the depth image and the infrared image to the color image.
Specifically, after the depth image, the infrared image and the color image are acquired, a control processing module of the camera acquires a transmission transformation matrix for aligning the depth image and the infrared image to the color image by reading or calculating.
In one example, the control processing module of the camera acquires the transmission transformation matrix for aligning the depth image and the infrared image to the color image as follows: selecting a plurality of feature points in the depth image, and acquiring a first pixel coordinate of each feature point in the depth image; acquiring a coordinate transformation matrix, and obtaining a second pixel coordinate of each feature point in the color image according to the coordinate transformation matrix and the first pixel coordinate of each feature point; and acquiring the transmission transformation matrix according to the first pixel coordinates and the second pixel coordinates. The coordinate transformation matrix is obtained from the internal reference matrix of the depth camera that acquires the depth image and the infrared image, the internal reference matrix of the color camera that acquires the color image, and the rotation-translation matrix between the depth camera and the color camera. Calculating the second pixel coordinate of each feature point in the color image from its first pixel coordinate in the depth image and a coordinate transformation matrix containing the internal and external parameters of the depth camera and the color camera guarantees the accuracy of the coordinate transformation between the first and second pixel coordinates, and thus the accuracy of the generated transmission transformation matrix.
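Given four point correspondences, the transmission transformation matrix can be obtained by solving the standard eight-equation linear system of a projective transform. A pure-Python sketch follows; the correspondences at the bottom are made-up illustrative values, not calibration data from the embodiment:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def perspective_matrix(src, dst):
    """3x3 perspective matrix H (with h33 = 1) mapping 4 src points to dst."""
    A, b = [], []
    for (u, v), (x, y) in zip(src, dst):
        A.append([u, v, 1, 0, 0, 0, -u * x, -v * x]); b.append(x)
        A.append([0, 0, 0, u, v, 1, -u * y, -v * y]); b.append(y)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

# Hypothetical correspondences: first pixel coordinates in the depth image
# and second pixel coordinates in the color image (a pure translation here).
src = [(100, 100), (200, 100), (200, 200), (100, 200)]
dst = [(110, 95), (210, 95), (210, 195), (110, 195)]
H = perspective_matrix(src, dst)
```

For this translational example the recovered matrix is (up to rounding) the identity with a shift of +10 in x and -5 in y.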
Further, the control processing module of the camera selects a plurality of feature points in the depth image as follows: taking the center of the depth image as the center of a polygon frame, and searching for the polygon frame in the depth image according to a preset initial side length and a side-length increase step until a target polygon frame satisfying the condition is found, the condition being that the depth values of all the vertices of the currently searched polygon frame are non-zero; and taking the vertices of the target polygon frame as the feature points of the depth image. For example, when selecting the feature points, the control processing module takes the center of the depth image as the center of a square frame, takes 10px as the initial side length, and searches square frames in the depth image with a side-length increase step of 10px. When the depth values of the 4 vertices of the initial square frame with a side length of 10px are all non-zero, the initial square frame is taken as the target square frame, its 4 vertices are taken as the 4 feature points, and the polygon frame search stops. When a vertex with a depth value of 0 exists among the 4 vertices of the initial square frame with a side length of 10px, the depth values of the vertices of a square frame with a side length of 20px centered on the center of the depth image are detected; if the depth values of the 4 vertices of this square frame are all non-zero, it is taken as the target square frame, its 4 vertices are taken as the 4 feature points, and the search stops; if a vertex with a depth value of 0 still exists among its 4 vertices, a square frame with a side length of 30px centered on the center of the depth image is detected, and so on, until a target square frame satisfying the condition is detected, whose vertices are used as the feature points. By searching outward for the target polygon frame from the center of the depth image, the feature points are obtained efficiently while feature points that cannot be used for subsequent coordinate calculation are avoided, ensuring the accuracy and success rate of image alignment.
When feature points are selected in the depth image, the searched polygon frame has no fewer than 4 vertices, and the specific type of polygon frame is not limited; likewise, the side-length increase step can be adjusted as needed and is not specifically limited in this embodiment.
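The expanding-square search described above can be sketched as follows; the toy depth map and the zero-valued hole in its middle are hypothetical:

```python
def find_feature_points(depth, start=10, step=10):
    """Search squares centred on the image centre, growing the side length by
    `step`, until all 4 corner vertices have a non-zero depth value."""
    h, w = len(depth), len(depth[0])
    cy, cx = h // 2, w // 2
    side = start
    while (cy - side // 2 >= 0 and cx - side // 2 >= 0
           and cy + side // 2 < h and cx + side // 2 < w):
        half = side // 2
        corners = [(cx - half, cy - half), (cx + half, cy - half),
                   (cx + half, cy + half), (cx - half, cy + half)]
        if all(depth[y][x] != 0 for x, y in corners):
            return corners          # target square frame found
        side += step                # grow the square and try again
    return None                     # no valid square fits inside the image

# Toy 40x40 depth map: zero (invalid) in the middle, non-zero elsewhere.
depth = [[0 if 15 <= x <= 24 and 15 <= y <= 24 else 1000
          for x in range(40)] for y in range(40)]
pts = find_feature_points(depth)   # 10px square hits the hole; 20px succeeds
```

Here the initial 10px square has corners inside the zero region, so the search grows to a 20px square whose corners (10, 10), (30, 10), (30, 30), (10, 30) all have valid depth.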
Further, the control processing module in the camera calculates the second pixel coordinates of each feature point in the color image according to the coordinate transformation matrix and the first pixel coordinates of each feature point. The method comprises the following steps: calculating the second pixel coordinates (U_R, V_R) of each feature point according to the following formula:

    U_R = (r11·U_L + r12·V_L + r13) / (r31·U_L + r32·V_L + r33)
    V_R = (r21·U_L + r22·V_L + r23) / (r31·U_L + r32·V_L + r33)

wherein r_ij is the element in the i-th row and j-th column of the coordinate transformation matrix, Z_L is the depth value of the feature point, U_L is the abscissa of the first pixel coordinate, and V_L is the ordinate of the first pixel coordinate. Taking the searched target polygon as a square as an example, the first pixel coordinates of the four feature points acquired in the depth image are (U_L1, V_L1), (U_L2, V_L2), (U_L3, V_L3) and (U_L4, V_L4) respectively, and the depth values corresponding to the feature points are Z_L1, Z_L2, Z_L3 and Z_L4. The coordinate transformation matrix R obtained from the internal reference matrix of the depth camera, the internal reference matrix of the color camera and the rotation-translation matrix between the depth camera and the color camera is:

    R = [ r11  r12  r13
          r21  r22  r23
          r31  r32  r33 ]

wherein each element r_ij of the coordinate transformation matrix is calculated according to the internal reference matrix of the depth camera, the internal reference matrix of the color camera and the rotation-translation matrix between the depth camera and the color camera. According to the initial calculation formula of coordinate conversion,

    Z_R · [U_R, V_R, 1]^T = R · [Z_L·U_L, Z_L·V_L, Z_L]^T

wherein Z_L is the depth value of the feature point in the depth image and Z_R is the depth value of the feature point in the color image, the second pixel coordinates (U_R, V_R) of each feature point in the color image can be calculated from the first pixel coordinates (U_L, V_L). Dividing the first two rows by the third cancels the common factor Z_L, and the initial calculation formula is simplified to the formula given above. Therefore, the control processing module can directly calculate the second pixel coordinates (U_R1, V_R1), (U_R2, V_R2), (U_R3, V_R3) and (U_R4, V_R4) of the four feature points in the color image according to the simplified formula. Using the simplified formula directly avoids redundant intermediate calculation, simplifies the calculation process, and improves the efficiency of acquiring the second pixel coordinates on the basis of ensuring the accuracy of the coordinate conversion.
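A sketch of the simplified coordinate-conversion formula; the coordinate transformation matrix R below is a hypothetical example, not one derived from real camera parameters:

```python
def to_color_coords(R, uL, vL):
    """Simplified reprojection: map the first pixel coordinates (U_L, V_L) in
    the depth image to the second pixel coordinates (U_R, V_R) in the color
    image by dividing the first two rows of R·[uL, vL, 1] by the third."""
    denom = R[2][0] * uL + R[2][1] * vL + R[2][2]
    uR = (R[0][0] * uL + R[0][1] * vL + R[0][2]) / denom
    vR = (R[1][0] * uL + R[1][1] * vL + R[1][2]) / denom
    return uR, vR

# Hypothetical coordinate transformation matrix (not a real calibration).
R = [[1.02, 0.00, 8.0],
     [0.01, 0.99, -4.0],
     [0.00, 0.00, 1.0]]
print(to_color_coords(R, 120, 240))  # approximately (130.4, 234.8)
```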
In addition, when the control processing module calculates the second pixel coordinate, the calculation formula of the second pixel coordinate may be directly read from its own storage module for calculation, or the calculation formula of the second pixel coordinate is obtained from a storage device connected to the control processing module through communication interaction for calculation, or the calculation formula of the second pixel coordinate is calculated by the control processing module in real time according to the internal and external parameters of the camera, which is not limited in this embodiment.
Step 103, performing transmission transformation on the depth image and the infrared image according to the transmission transformation matrix to obtain the depth image and the infrared image aligned with the color image pixels.
Specifically, after acquiring the transmission transformation matrix for aligning the infrared image and the depth image to the color image, the control processing module of the camera performs transmission transformation on the depth image and the infrared image according to the transmission transformation matrix to obtain the depth image and the infrared image aligned with the color image pixels. That is, the control processing module rotates and translates the depth image and the infrared image according to the transmission transformation matrix so that each pixel point of the depth image and the infrared image is aligned with the corresponding pixel point of the color image, thereby obtaining a color image, a depth image and an infrared image with consistent resolution and angle of view.
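Step 103 can be sketched as an inverse-mapping warp: each output pixel is mapped back through the inverse of the alignment matrix and sampled from the source image. The nearest-neighbour sampling and the example matrices below are illustrative assumptions, not details fixed by the embodiment:

```python
def warp_perspective(img, H_inv, out_h, out_w, fill=0):
    """Inverse-map each output pixel through H_inv (the inverse of the
    alignment matrix) and sample the source with nearest-neighbour lookup."""
    out = [[fill] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            sx = H_inv[0][0] * x + H_inv[0][1] * y + H_inv[0][2]
            sy = H_inv[1][0] * x + H_inv[1][1] * y + H_inv[1][2]
            sw = H_inv[2][0] * x + H_inv[2][1] * y + H_inv[2][2]
            u, v = round(sx / sw), round(sy / sw)
            if 0 <= u < len(img[0]) and 0 <= v < len(img):
                out[y][x] = img[v][u]   # copy the source pixel
    return out

# Shift a 4x4 depth tile right by 1 pixel: the inverse matrix shifts left.
depth = [[r * 4 + c for c in range(4)] for r in range(4)]
H_inv = [[1, 0, -1], [0, 1, 0], [0, 0, 1]]
aligned = warp_perspective(depth, H_inv, 4, 4)
```

Inverse mapping guarantees that every output pixel receives a value (or the fill value), avoiding holes that forward mapping would leave.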
In another example, after obtaining the depth image and the infrared image aligned with the color image pixels, the control processing module of the camera further performs face detection on the pixel-aligned color image and infrared image, and adjusts the exposure and gain in the process of acquiring the infrared image or the color image according to the result of the face detection. By performing face detection on the pixel-aligned infrared image and color image respectively and adjusting the exposure and gain of the image acquisition process according to the face detection result, the quality of the face region in the acquired images can be improved in a targeted manner.
Further, the control processing module of the camera adjusts the exposure and gain in the process of acquiring the infrared image or the color image according to the result of the face detection as follows: in the case where a human face can be detected in both the infrared image and the color image but the overlapping degree of the face frame of the infrared image and the face frame of the color image does not meet a preset threshold, adjusting the exposure and gain in the process of acquiring the infrared image or the color image according to the detection result of the face confidence; and in the case where one of the infrared image and the color image cannot detect a human face, calculating the target brightness for the other image according to the coordinate information of the face frame of the image in which the face can be detected, and adjusting the exposure and gain in the process of acquiring the other image according to the target brightness.
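The face-frame overlap check can be sketched with the standard intersection-over-union computation; the frame coordinates and the threshold value of 0.5 below are hypothetical:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two face frames given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

THRESHOLD = 0.5  # hypothetical preset threshold
overlap = iou((10, 10, 50, 50), (30, 10, 70, 50))  # roughly 0.33
print(overlap >= THRESHOLD)  # below threshold -> confidence check needed
```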
Specifically, after face detection is performed on the color image and the infrared image, if a face can be detected in both images, the overlapping degree (IOU) of the face frames in the two images is detected. When the IOU meets the preset threshold, the environment is judged to be normal and the quality of both images to be high, and the exposure and gain are not adjusted. When the IOU does not meet the preset threshold, it is judged that one of the images has poor quality; face confidence detection is then performed on the faces in the two images, and the exposure and gain in the infrared image or color image acquisition process are adjusted according to the face confidence detection result. When a face is detected in only one of the infrared image and the color image, it is judged that the image in which no face can be detected has low image quality. The brightness of the face region is calculated according to the face frame coordinate information of the image in which the face can be detected, the calculated brightness value is taken as the target brightness of the face region of the image with the lower quality, and the exposure and gain during acquisition of that image are adjusted according to the target brightness. For example, if the color image can detect a face but the infrared image cannot, then, because the pixels are aligned, the pixel values of all pixel points inside the face frame are accumulated according to the coordinate information of the face frame in the color image to obtain the face region brightness, the obtained face region brightness is used as the target brightness of the face region of the infrared image, and the exposure and gain in the infrared image acquisition process are adjusted according to the target brightness. Conversely, when the color image cannot detect a face, the pixel values of all pixel points inside the face frame are accumulated according to the coordinate information of the face frame in the infrared image to obtain the face region brightness, which is used as the target brightness of the face region of the color image, and the exposure and gain in the color image acquisition process are adjusted according to the target brightness. By accurately adjusting the exposure and gain of the low-quality image acquisition process according to the detection result of the face region, the quality of the face region in the acquired images can be effectively improved, which facilitates accurate face recognition.
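The target-brightness computation described above, reusing the aligned face frame from one image to average the pixel values of the other, can be sketched as follows (toy images with made-up values):

```python
def face_region_brightness(img, box):
    """Accumulate the pixel values inside the face frame (x1, y1, x2, y2)
    and average them to obtain the face-region brightness."""
    x1, y1, x2, y2 = box
    total = count = 0
    for y in range(y1, y2):
        for x in range(x1, x2):
            total += img[y][x]
            count += 1
    return total / count

# Because the images are pixel-aligned, a face frame found in the color
# image can be reused directly on the infrared image (toy 6x6 image).
ir = [[40] * 6 for _ in range(6)]
target = face_region_brightness(ir, (1, 1, 4, 4))  # 3x3 region -> 40.0
```

The returned value would then drive the exposure/gain adjustment loop of the camera that produced the low-quality image.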
Furthermore, the control processing module of the camera adjusts the exposure and gain in the process of acquiring the infrared image or the color image according to the detection result of the face confidence as follows: taking the image containing the face with the higher confidence as the first image and the image containing the face with the lower confidence as the second image; calculating a target brightness value for the second image according to the coordinate information of the face frame in the first image; and adjusting the exposure and gain in the second image acquisition process according to the target brightness value. For example, if the confidence of the face detected in the infrared image is lower than that of the face detected in the color image, the color image is taken as the first image and the infrared image as the second image; the brightness of the face region in the color image is calculated according to the coordinate information of the face frame in the color image, the calculated brightness value is taken as the target brightness value of the face region in the infrared image, and the exposure and gain in the infrared image acquisition process are adjusted accordingly. By treating the face with high confidence as the correct face and adjusting the exposure and gain of the acquisition process of the image whose detected face has low confidence, the face quality of the acquired image is improved in a targeted manner, which facilitates accurate face recognition.
In addition, when no human face is detected in either the color image or the infrared image, the control processing module judges that no human face exists within the field of view, and does not adjust the exposure and gain, so as to avoid affecting the normal use of the camera.
Another aspect of the embodiments of the present invention provides an image capturing apparatus, with reference to fig. 3, including:
an acquiring module 301, configured to acquire a depth image, an infrared image, and a color image of a target area;
a determining module 302, configured to acquire a transmission transformation matrix for aligning the depth image and the infrared image to the color image; and
an alignment module 303, configured to perform transmission transformation on the depth image and the infrared image according to the transmission transformation matrix, and acquire the depth image and the infrared image aligned with the color image pixels.
It should be understood that the present embodiment is an apparatus embodiment corresponding to the method embodiment, and the present embodiment can be implemented in cooperation with the method embodiment. The related technical details mentioned in the method embodiment are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related art details mentioned in the present embodiment can also be applied in the method embodiment.
It should be noted that, all the modules involved in this embodiment are logic modules, and in practical application, one logic unit may be one physical unit, may also be a part of one physical unit, and may also be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, a unit which is not so closely related to solve the technical problem proposed by the present invention is not introduced in the present embodiment, but this does not indicate that there is no other unit in the present embodiment.
Another aspect of the embodiments of the present application further provides an electronic device, with reference to fig. 4, including: at least one processor 401; and a memory 402 communicatively coupled to the at least one processor 401. The memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 to enable the at least one processor 401 to perform the image acquisition method described in any of the above method embodiments.
Where the memory 402 and the processor 401 are coupled by a bus, which may include any number of interconnected buses and bridges that couple one or more of the various circuits of the processor 401 and the memory 402 together. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor 401 may be transmitted over a wireless medium via an antenna; the antenna may also receive data and transmit the data to the processor 401.
The processor 401 is responsible for managing the bus and general processing, and may provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 402 may be used to store data used by the processor 401 in performing operations.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the present application, and that various changes in form and details may be made therein without departing from the spirit and scope of the present application in practice.