Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application. Embodiments of the application and features of the embodiments may be combined with one another arbitrarily without conflict. Also, while a logical order is depicted in the flowchart, in some cases, the steps depicted or described may be performed in a different order than presented herein.
In order to facilitate understanding of the technical solution provided by the embodiments of the present application, some key terms used in the embodiments of the present application are explained here:
 The key points of a human face refer to the pixel points capable of describing facial features in a face image, and generally include the pixel points of face contour features and the pixel points of the facial features (eyes, eyebrows, nose, mouth, and ears). In general, face key point detection may be performed based on a face key point detection model; common face key point models include a face key point detection model based on 106 key points, a face key point detection model based on 256 key points, a face key point detection model based on 68 key points, and the like.
Scaling mapping, or liquefying deformation, refers to applying a liquefaction-type deformation to the eyebrow region and its surrounding region, so that the eyebrow region presents a shrinking effect in the face.
The standard face is an average face obtained by statistics over a large amount of face data, and can represent the characteristics of most faces.
The technical solution of the embodiment of the application relates to artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is a theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use the knowledge to obtain an optimal result. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level technologies and software-level technologies. Artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes directions such as computer vision technology, speech processing technology, natural language processing technology, machine learning/deep learning, autonomous driving, and intelligent transportation. With the development and progress of artificial intelligence, it has been developed and applied in many fields, such as smart home, smart customer service, virtual assistants, smart speakers, smart marketing, smart wearable devices, autonomous driving, unmanned aerial vehicles, robots, smart medical care, the Internet of vehicles and smart transportation, and it is believed that with further technological development, artificial intelligence will be applied in more fields and play an increasingly important role.
Machine learning (Machine Learning, ML) is the core of artificial intelligence and the fundamental way to endow computers with intelligence; it is applied in all fields of artificial intelligence, and deep learning is one technique for implementing machine learning. Machine learning typically includes deep learning, reinforcement learning, transfer learning, inductive learning, artificial neural networks, learning from instruction, and the like, and deep learning includes convolutional neural networks (Convolutional Neural Networks, CNN), deep belief networks, recurrent neural networks, autoencoders, generative adversarial networks, and the like.
Computer Vision (CV) is a comprehensive discipline that integrates multiple disciplines such as computer science, signal processing, physics, applied mathematics, statistics and neurophysiology, and is a challenging and important research direction in science. Computer vision studies how to make a machine "see"; more specifically, it uses various imaging systems such as cameras and computers to replace human visual organs, performs machine vision processing such as recognition, tracking and measurement on targets, and further processes the collected images into images that are more suitable for human eyes to observe or that can be transmitted to instruments for detection.
As a scientific discipline, computer vision studies related theories and technologies in an attempt to give computers the ability to observe and understand the world through visual organs like human beings, and to establish artificial intelligence systems capable of acquiring information from images or multidimensional data. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, optical character recognition (Optical Character Recognition, OCR), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, autonomous driving and intelligent transportation, in addition to common biometric technologies such as face recognition and fingerprint recognition.
Augmented reality (Augmented Reality, AR) technology uses scientific and technological means such as computers to simulate entity information (such as visual information, sound information, taste information and touch information) that is otherwise difficult to experience within a certain time and space range of the real world, and then superimposes the virtual information onto the real world, so that the real-world environment and virtual-world objects are displayed to the user in the same picture or space, achieving a sensory experience beyond reality.
The embodiments of the application mainly relate to deep learning, computer vision and other technologies in artificial intelligence. Namely, a face detection model is obtained by deep learning, and the key points of the eyebrows in a face are detected by using the trained face detection model, so that subsequent eyebrow-related processing can be performed in combination with computer vision technology; the specific process is described in the subsequent embodiments and is not repeated here.
The following briefly describes the design concept of the embodiment of the present application.
At present, in video processing scenes, such as short video scenes or live video scenes, there is often a demand for automatic makeup of human faces, and eyebrows are an indispensable part of the makeup: attractive eyebrows can effectively modify the face, enhance its stereoscopic impression, and make facial expressions richer and more lively, so the quality of the eyebrows directly affects the makeup effect. In actual life, users often shape their eyebrows manually: the condition of the eyebrows first needs to be judged to determine the eyebrow shape, and then an eyebrow shaping tool is used to trim the eyebrows. This requires the user to have some shaping experience, consumes time and materials, and makes it difficult to switch between different eyebrow shapes in a short time.
In the related art, there is a scheme that can automatically apply makeup to the eyebrows in an augmented reality manner: the eyebrows are first identified by a detection or segmentation method to obtain the eyebrow area, then affine transformation is performed on a pure eyebrow template to obtain a deformed template, and the deformed template is overlaid on the original eyebrows to obtain the made-up eyebrows; alternatively, after the original eyebrows are removed, the area is filled with the color of the skin around the eyebrows, and the texture of the template is then attached.
However, these schemes usually superimpose the eyebrow template directly on the eyebrow area of the face, and the eyebrow template may not match the original eyebrows, so that the eyebrow template cannot completely cover the original eyebrows and the makeup effect is poor. As a result, the user can only select an eyebrow template that suits his or her own eyebrows and can cover the original eyebrows, so when the user wants to try different eyebrow shapes, the related technical schemes still have certain limitations and cannot present a better makeup effect. The method of removing the original eyebrows also has challenges in filling the surrounding skin: if the skin is improperly filled, abnormal display effects may appear around the eyebrows, for example uneven skin color, and a better makeup effect still cannot be presented.
In view of the above, the embodiment of the application provides an image rendering processing method. In this method, when virtual makeup is applied to the eyebrows of an image to be processed, a deformation area to be processed is determined based on the key points of the eyebrow area, and scaling mapping processing is performed on the deformation area to be processed so that the eyebrow area is reduced; when the eyebrows with the makeup effect are subsequently added, the eyebrow area can be well covered, so that the makeup effect is better. In addition, when material rendering is performed, independent parallel rendering can be carried out among pixels by establishing a pixel mapping relation, thereby improving the image rendering efficiency and correspondingly improving the real-time performance of video makeup.
After the design idea of the embodiment of the present application is introduced, some simple descriptions are made below for application scenarios applicable to the technical solution of the embodiment of the present application, and it should be noted that the application scenarios described below are only used for illustrating the embodiment of the present application and are not limiting. In the specific implementation process, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
The scheme provided by the embodiment of the application can be applied to most scenes involving image rendering processing, such as scenes with image beautification requirements like live broadcast platforms and content sharing platforms, and can also be applied to daily scenes such as video calls and simulated makeup try-on at offline beauty counters or on online e-commerce platforms. Fig. 1 shows an application scenario diagram provided in an embodiment of the present application, which may include a terminal device 101 and a background server 102.
The terminal device 101 may be, for example, a mobile phone, a tablet computer (PAD), a notebook computer, a desktop computer, a smart television, a smart wearable device, and the like. The terminal device 101 may be installed with an application that can make image beautification, such as a social application, a beauty application, or a photographing application.
The background server 102 may be a background server corresponding to an application installed on the terminal device 101. The background server 102 may be, for example, an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), big data and artificial intelligence platforms, but is not limited thereto.
The background server 102 may include one or more processors 1021, memory 1022, and I/O interfaces 1023 for interaction with terminals, etc. In addition, the background server 102 may also configure a database 1024, and the database 1024 may be used to store model data, image rendering processing parameter data, and the like. The memory 1022 of the background server 102 may further store program instructions of the image rendering method provided by the embodiment of the present application, where the program instructions, when executed by the processor 1021, may be used to implement the steps of the image rendering method provided by the embodiment of the present application, so as to perform image rendering processing on an image to be processed, and obtain a target image with eyebrow makeup effect.
In a possible implementation manner, the user provides the image to be processed in the application installed on the terminal device 101, for example, selects the image from the album, or shoots the image again, so that the terminal device 101 may upload the image to be processed to the background server 102, and the background server 102 processes the image to obtain the target image with virtual makeup effect according to the image rendering processing method provided by the embodiment of the present application, and returns the target image to the terminal device 101 for display.
In another possible implementation manner, the user performs video acquisition with an application installed on the terminal device 101 and transmits the video stream to the background server 102; the background server 102 processes the video frames containing faces in the video stream based on the eyebrow template selected by the user and returns the processed video frames to the terminal device 101 in real time, so that the terminal device 101 can present the video with the makeup effect in real time during video acquisition. In addition, in some scenarios, such as a live video scenario or a video call scenario, the background server 102 may also send the processed video frames to the terminal devices 101 of other users, so that the terminal devices 101 of the other users can also present the video with the makeup effect.
In specific implementation, the image rendering process may also be performed on the terminal device 101 when the processing capability of the terminal device 101 allows, which is not limited by the embodiment of the present application.
The terminal device 101 and the background server 102 may be directly or indirectly connected through one or more networks 103. The network 103 may be a wired network or a wireless network, for example a mobile cellular network or a wireless fidelity (Wi-Fi) network, or may be another possible network, which is not limited in this embodiment of the present application.
It should be noted that, in the embodiment of the present application, the number of terminal devices 101 may be one or plural, that is, the number of terminal devices 101 is not limited.
Of course, the method provided by the embodiment of the present application is not limited to the application scenario shown in fig. 1, but may be used in other possible application scenarios, and the embodiment of the present application is not limited. The functions that can be implemented by each device in the application scenario shown in fig. 1 will be described together in the following method embodiments, which are not described in detail herein.
Referring to fig. 2, a flowchart of an image rendering processing method according to an embodiment of the present application may be implemented by the background server 102 or the terminal device 101 in fig. 1, and the flowchart of the method is described below.
Step 201, face key point recognition is performed on a first to-be-processed image containing a face, and a plurality of first key points corresponding to a first actual eyebrow area of the face are recognized from the first to-be-processed image.
In the embodiment of the application, the first image to be processed may be a single image or a video frame in a video. When the first image to be processed is a single image, the image may be a previously shot image selected by the user from the album or an image newly shot by the user; when the first image to be processed is a video frame in a video, the video may likewise be a previously shot video selected by the user from the album or a video recorded by the user in real time.
In practical application, when a user requests to add a virtual makeup effect to a video, face recognition can be performed on the video to determine whether a to-be-processed image containing the face exists, that is, the to-be-processed image needing to be subjected to image rendering, so that subsequent rendering processing is performed on the to-be-processed image with the face in the video.
In order to process the eyebrow area in a targeted manner, the area where the eyebrows are actually located first needs to be identified from the image to be processed. Here, for distinction, the original eyebrow area in the image to be processed is referred to as the first actual eyebrow area, and the key points in the image to be processed are referred to as first key points. Identifying the area where the eyebrows are located therefore means identifying a plurality of first key points from the image to be processed, and the area enclosed by the plurality of first key points is the first actual eyebrow area.
Specifically, each first key point can be identified by a pre-trained face key point model, and any possible face key point model may be adopted, such as a face key point model based on 106 points, a face key point model based on 68 points, or a face key point model based on 256 points. Taking the 106-point face key point model as an example, referring to fig. 3, the first key points of the first actual eyebrow area obtained with this model are shown as key points 1 to 19 in fig. 3, where key points 1 to 5 and 15 to 18 are key points of the left eyebrow, and key points 6 to 14 are key points of the right eyebrow.
In specific implementation, the face key point model needs to be trained in advance before being put into practical use. During training, a large number of training samples with face key points marked in advance can be used, and the constructed face key point model is trained over multiple iterations until the number of iterations reaches a set condition or the accuracy of the face key point model reaches a set condition. The training samples can be labeled training data obtained from an open-source training data set, or can be formed by labeling collected face images. The face key point model may employ any possible deep neural network, which is not limited in this embodiment of the present application.
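As an illustration of this training stage, a minimal sketch in Python (PyTorch) is given below. The backbone network, the dataset layout and the hyperparameters are illustrative assumptions only; the embodiment does not prescribe a particular architecture or framework.

```python
# Hypothetical training sketch for a face key point regression model.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

NUM_KEYPOINTS = 106  # e.g. a 106-point face key point model

class FaceKeypointDataset(Dataset):
    """Assumed dataset: face crops resized to 128x128 with normalized (x, y) labels."""
    def __init__(self, images, keypoints):
        self.images = images        # tensor [N, 3, 128, 128]
        self.keypoints = keypoints  # tensor [N, NUM_KEYPOINTS, 2], values in [0, 1]
    def __len__(self):
        return len(self.images)
    def __getitem__(self, idx):
        return self.images[idx], self.keypoints[idx].reshape(-1)

# A small convolutional regressor standing in for the real backbone.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, NUM_KEYPOINTS * 2),
)

def train(dataset, epochs=10, lr=1e-3):
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()  # regress key point coordinates
    for _ in range(epochs):
        for images, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()
```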
Step 202, determining a deformation area to be processed containing a first actual eyebrow area in a first image to be processed based on a plurality of first key points, and performing scaling mapping processing on each pixel point in the deformation area to be processed to obtain a second image to be processed containing a scaled second actual eyebrow area.
In the embodiment of the application, in order to make the eyebrow portion in the rendered target image more realistic and better fitted, the native eyebrow portion, namely the first actual eyebrow area, is subjected to liquefying deformation, so that the first actual eyebrow area is scaled and a second image to be processed including a scaled second actual eyebrow area is obtained.
Referring to fig. 4, which shows a comparison of the images before and after the scaling mapping process, the first actual eyebrow area in the first to-be-processed image is consistent with the native eyebrow size of the human face before the scaling mapping process; after the scaling mapping process, the first actual eyebrow area is reduced in the human face, and the second to-be-processed image is obtained. As shown in fig. 4, the parts of the second to-be-processed image other than the eyebrow area are unchanged, while the second actual eyebrow area is obviously smaller than the first actual eyebrow area and the eyebrows in it are obviously thinner, so that when the eyebrows in the eyebrow template are subsequently added, the reduced eyebrows can be easily covered, thereby improving the makeup effect after the eyebrow template is added.
In general image processing, when a single area is processed, it is difficult to process only the eyebrow area itself without affecting the area near it; moreover, in order to make the processed effect more natural, the area near the eyebrow area needs to be processed at the same time. Therefore, before the scaling mapping processing, the area to be processed, namely the deformation area to be processed, needs to be determined.
In the embodiment of the application, when determining the deformation area to be processed, the reference point of the deformation area to be processed can be determined in the first image to be processed based on the coordinate value of the first key point corresponding to the first actual eyebrow area in the constructed reference coordinate system, and then the deformation area to be processed is determined in the first image to be processed based on the set shape model and with the reference point as the center.
Wherein the reference coordinate system is generally constructed based on the first image to be processed.
In one possible implementation manner, the reference coordinate system may be constructed by taking the center point of the first image to be processed as the origin and taking the horizontal direction and the vertical direction as the horizontal axis and the vertical axis, respectively.
In another possible implementation manner, the reference coordinate system is constructed by taking one corner point of the first image to be processed as the origin and taking the horizontal direction and the vertical direction as the horizontal axis and the vertical axis, respectively.
It should be noted that, the above-mentioned deformation areas to be processed are specific to single-sided eyebrows, that is, for a face, two deformation areas to be processed are determined, one deformation area to be processed corresponds to one eyebrow, and then the eyebrows on both sides are processed respectively.
In the embodiment of the application, the pixel point corresponding to the average value of the coordinate values of the first key points can be determined as the reference point of the deformation area to be processed, that is, the reference point of the deformation area to be processed can be expressed as follows:

$$x_0=\frac{1}{N}\sum_{i=1}^{N}x_i,\qquad y_0=\frac{1}{N}\sum_{i=1}^{N}y_i$$

where (x_0, y_0) is the coordinate value of the reference point, (x_i, y_i) is the coordinate value of the i-th first key point, and N is the total number of first key points corresponding to a single eyebrow.
Of course, other possible manners may be used to determine the reference point; for example, after shape fitting is performed based on the first key points, the centroid of the fitted shape may be determined as the reference point, which is not limited in this embodiment of the present application.
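For illustration, the averaging of the key point coordinates described above could be written as the following short sketch (function and variable names are illustrative):

```python
import numpy as np

def eyebrow_reference_point(keypoints):
    """Reference point of the deformation area: mean of the eyebrow key points.

    keypoints: array of shape (N, 2) holding the (x, y) coordinates of the
    first key points of a single eyebrow in the reference coordinate system.
    """
    keypoints = np.asarray(keypoints, dtype=np.float64)
    x0, y0 = keypoints.mean(axis=0)
    return x0, y0
```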
In the embodiment of the present application, the shape model may include, for example, an ellipse model, a circle model or a triangle model; taking the ellipse model as an example, this means that the outline of the deformation area to be processed is elliptical, and other shape models are analogous.
The manner of determining the deformation region to be processed is different based on different shape models, and is described below.
(1) Elliptical model
When the shape model is an ellipse model, the distances between the reference point and each first key point can be calculated one by one, a first distance with the maximum value is determined from the obtained distances, and the semi-major axis of the ellipse is determined based on the first distance.
Taking the reference point as the average value of the first key points as an example, in general, the first key point at the tail of the eyebrow is the farthest from the reference point. Referring to fig. 5, the reference point is the pixel point 0, and the first key point 1 is generally the farthest from the reference point 0, so the semi-major axis of the ellipse can be determined based on the distance between the reference point 0 and the first key point 1, and the line connecting the reference point 0 and the first key point 1 is the line on which the major axis of the ellipse lies.
In addition, a second distance with the larger value can be selected from the distances between the reference point and the two nearest first key points, and the semi-minor axis of the ellipse is determined based on the second distance, so as to obtain the deformation area to be processed. Referring to fig. 5, which is a schematic view of a constructed ellipse model, the two first key points closest to the reference point 0 are key points 3 and 17, so the distances between the reference point 0 and the first key points 3 and 17 can be calculated respectively, and the larger value is selected as the reference distance for the semi-minor axis.
In one possible embodiment, the first distance may be taken as the semi-major axis of the ellipse and the second distance as the semi-minor axis. Based on the reference point, the semi-major axis and the semi-minor axis, the following ellipse equation (in the coordinate system aligned with the major axis of the ellipse) can then be determined:

$$\left(\frac{x-x_0}{d_1}\right)^2+\left(\frac{y-y_0}{d_2}\right)^2=1$$

where d_1 is the length of the semi-major axis of the ellipse, i.e., the first distance described above, and d_2 is the length of the semi-minor axis of the ellipse, i.e., the second distance described above.
In another possible embodiment, in order to avoid the case that the ellipse cannot enclose the whole eyebrow area, a certain magnification may be set; that is, after the first distance and the second distance are magnified, the elliptical area is constructed with the magnified first distance and second distance as the semi-major axis and semi-minor axis, respectively, as illustrated in fig. 5. Further, the ellipse equation may be expressed as follows:

$$\left(\frac{x-x_0}{\alpha d_1}\right)^2+\left(\frac{y-y_0}{\beta d_2}\right)^2=1$$

where α and β are magnification factors. The values of α and β can be the same or different, can be set according to empirical values, and can be adjusted according to actual eyebrow conditions.
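Combining the two embodiments above, a possible sketch of constructing the elliptical deformation area is given below; the default magnification factors are assumptions, not values prescribed by this embodiment.

```python
import numpy as np

def fit_deformation_ellipse(keypoints, alpha=1.2, beta=1.2):
    """Build the elliptical deformation area for one eyebrow.

    Returns the reference point (x0, y0), the (magnified) semi-major axis,
    the (magnified) semi-minor axis, and the rotation angle theta of the
    major axis. alpha/beta are assumed empirical magnification factors.
    """
    pts = np.asarray(keypoints, dtype=np.float64)
    center = pts.mean(axis=0)
    dists = np.linalg.norm(pts - center, axis=1)

    far_idx = int(np.argmax(dists))            # farthest key point (eyebrow tail)
    d1 = dists[far_idx]                        # first distance -> semi-major axis
    theta = np.arctan2(*(pts[far_idx] - center)[::-1])  # angle of the major axis

    near_two = np.argsort(dists)[:2]           # two nearest key points
    d2 = dists[near_two].max()                 # second distance -> semi-minor axis

    return center, alpha * d1, beta * d2, theta
```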
In the embodiment of the present application, after the elliptical model is constructed, scaling mapping processing may be performed by using the elliptical region as the deformation region to be processed, that is, scaling mapping processing may be performed by using the region enclosed by the ellipse shown in fig. 5 as the deformation region to be processed.
Specifically, in order to know which pixels are subjected to scaling mapping, it is necessary to determine which pixels are located in the elliptical region based on the constructed elliptical region. Since the determination process for each pixel is similar, only pixel a is shown here as an example. Referring to fig. 6, a flow chart for determining whether the pixel point a is an ellipse inner point is shown.
And S1, determining an included angle value between a connecting line between the datum point and the first key point corresponding to the first distance and a set coordinate axis in a datum coordinate system.
Since the coordinates of each reference point, pixel point or first key point are the coordinate values in the reference coordinate system, it is necessary to convert them into the coordinate system in which the ellipse is located for distance calculation.
And S2, taking the datum point as the center, rotating the pixel point A by an included angle value, and determining the distance between the position of the rotated pixel point A and the datum point.
Referring to fig. 5, since there is a certain angle θ between the horizontal axis and the line connecting the reference point and the first key point 1, the pixel point needs to be rotated when the coordinate system is converted. If the coordinate value of the pixel point A in the reference coordinate system is (x_a, y_a) and the coordinate value after rotating by the angle θ is (x_a′, y_a′), the distance between the pixel point A and the reference point 0 may be defined as follows (here, the first embodiment is taken as an example):

$$r=\sqrt{\left(\frac{x_a'-x_0}{d_1}\right)^2+\left(\frac{y_a'-y_0}{d_2}\right)^2}$$

where r is the normalized distance from the pixel point A to the center of the ellipse.
And S3, determining whether the distance between the position of the rotated pixel point A and the datum point is smaller than a set distance threshold value.
And S4, if the determination result in the step S3 is negative, determining the pixel point A as the pixel point which does not belong to the deformation area to be processed.
And S5, if the determination result in the step S3 is yes, determining the pixel point A as the pixel point positioned in the deformation area to be processed.
Specifically, for the distance definition described above, the set distance threshold may be set to 1, that is, when r <1, the pixel point a belongs to an intra-ellipse point, that is, a pixel point in the deformation area to be processed that needs scaling mapping processing, and when r is equal to or greater than 1, the pixel point a belongs to an extra-ellipse point or a point on an elliptic line, that is, a pixel point not belonging to the deformation area to be processed that needs scaling mapping processing.
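The point-in-ellipse decision of steps S1 to S5 could be sketched as follows, assuming the ellipse parameters produced by the previous sketch:

```python
import numpy as np

def is_inside_deformation_ellipse(px, py, center, a, b, theta):
    """Decide whether pixel (px, py) lies inside the elliptical deformation area.

    The pixel offset is rotated by -theta about the reference point so that the
    ellipse axes align with the coordinate axes, then the normalized elliptical
    distance r is compared with the threshold 1.
    """
    x0, y0 = center
    dx, dy = px - x0, py - y0
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # rotate the offset into the ellipse-aligned frame
    xr = dx * cos_t + dy * sin_t
    yr = -dx * sin_t + dy * cos_t
    r = np.sqrt((xr / a) ** 2 + (yr / b) ** 2)
    return r < 1.0
```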
(2) Round model
When the shape model is a circular model, the distances between the reference points and the first key points can be calculated one by one, the first distance with the maximum value is determined from the obtained distances, and the radius of the circle is determined based on the first distance.
Taking the reference point as the average value of the first key points as an example, in general, the first key point at the tail of the eyebrow is the farthest from the reference point. Referring to fig. 7, the reference point is the pixel point 0, and the first key point 1 is generally the farthest from the reference point 0, so the radius of the circle can be determined based on the distance between the reference point 0 and the first key point 1.
In one possible embodiment, the first distance may be taken as the circle radius, as shown in fig. 7, and the following circle equation may be determined based on the reference point and the circle radius:

$$(x-x_0)^2+(y-y_0)^2=d_1^2$$

where d_1 is the circle radius, i.e., the first distance described above.
In another possible embodiment, a certain magnification may also be set, i.e., the first distance is magnified and then the circle is constructed with the magnified first distance as the radius. Further, the circle equation may be expressed as follows:

$$(x-x_0)^2+(y-y_0)^2=(\alpha d_1)^2$$

Also, the manner of determining whether each pixel point is a point in the circular region is similar to that of the elliptical model described above, and a detailed description thereof is omitted here.
It should be noted that any other possible shape model may be used in the embodiments of the present application, which is not limited thereto.
In the embodiment of the present application, for each pixel point in the determined deformation area to be processed, scaling mapping processing is performed in the following manner, and here, a first key point 1 in the deformation area to be processed is shown as an example.
For the first key point 1, the coordinate value of the first key point 1 after scaling can be determined based on the coordinate value of the first key point 1 and the set scaling mapping relation, and then, the pixel point at the position where the scaled coordinate value is located is updated according to the pixel characteristic value of the first key point 1, so as to obtain a second image to be processed, which is shown in fig. 4 and includes the scaled second actual eyebrow area.
Referring to fig. 8, a schematic diagram of the pixel scaling mapping process is shown. It can be seen that, after the native eyebrow is shrunk, the first key point 1 on the native eyebrow is shifted to the position of the pixel point 1', that is, the pixel value of the position of the pixel point 1' is substantially replaced with the pixel characteristic value of the first key point 1.
Specifically, in the set scaling mapping relationship adopted in the embodiment of the present application, the amount of scaling is negatively correlated with the distance to the reference point of the deformation area to be processed; that is, the closer a pixel point is to the reference point of the deformation area to be processed, the more it is deformed, and the closer it is to the edge of the deformation area to be processed, the less it is deformed, until no deformation occurs at the edge of the deformation area to be processed. Any scaling mapping relationship satisfying this deformation behavior is applicable.
In one possible implementation, the following scaling mapping may be employed:
$$f(r)=1+(r-1)^2\cdot\varepsilon$$

where f(r) is the scaling mapping expression, r is the normalized distance from the pixel point to the reference point, and ε indicates the degree of eyebrow scaling, with a value range of [0, 1].
Thus, for any one pixel point (x, y) in the deformation area to be processed, the mapped pixel position (x′, y′) can be expressed as follows:

$$x'=x_0+\frac{x-x_0}{f(r)},\qquad y'=y_0+\frac{y-y_0}{f(r)}$$

so that each pixel point is moved toward the reference point, while pixel points at the edge of the deformation area remain unchanged. In general, since the eyebrows are elongated along the x-axis, the eyebrows may also be mapped using asymmetric scaling in addition to the scaling mapping relationship described above; for example, the left and right parts of the eyebrow may be scaled differently, or the upper and lower parts of the eyebrow may be scaled differently.
In the embodiment of the application, since the number of pixel points before reduction is greater than the number of pixel points after reduction, some pixel points cannot be mapped to a pixel after scaling and are therefore discarded, or a plurality of pixel points are mapped to the same pixel after scaling, in which case the scaled pixel can take the feature average value or the feature maximum value of the plurality of pixel points.
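Putting the above together, a possible sketch of the forward scaling mapping for one deformation area is given below. It relies on the reconstructed mapping x' = x_0 + (x − x_0)/f(r) and assumes an H×W×3 color image; in this sketch, source pixels that land on the same target are averaged, and target pixels that receive no source pixel simply keep their original value.

```python
import numpy as np

def shrink_eyebrow(image, center, a, b, theta, eps=0.6):
    """Forward scaling mapping for one elliptical deformation area (sketch)."""
    h, w = image.shape[:2]
    x0, y0 = center
    cos_t, sin_t = np.cos(theta), np.sin(theta)

    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    dx, dy = xs - x0, ys - y0
    # rotate offsets into the ellipse-aligned frame to get the normalized distance r
    xr = dx * cos_t + dy * sin_t
    yr = -dx * sin_t + dy * cos_t
    r = np.sqrt((xr / a) ** 2 + (yr / b) ** 2)
    inside = r < 1.0

    f = 1.0 + (r - 1.0) ** 2 * eps                     # scaling mapping relation
    tx = np.clip(np.round(x0 + dx / f), 0, w - 1).astype(int)
    ty = np.clip(np.round(y0 + dy / f), 0, h - 1).astype(int)

    out = image.astype(np.float64)
    acc = np.zeros_like(out)
    cnt = np.zeros((h, w), dtype=np.int64)
    for y, x in zip(*np.nonzero(inside)):              # forward-map each source pixel
        acc[ty[y, x], tx[y, x]] += image[y, x]
        cnt[ty[y, x], tx[y, x]] += 1
    hit = cnt > 0
    out[hit] = acc[hit] / cnt[hit][:, None]            # average colliding pixels
    return out.astype(image.dtype)
```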
Step 203, obtaining a target eyebrow template containing a virtual makeup effect, and establishing a pixel mapping relation between a first actual eyebrow area and a simulated eyebrow area in a one-to-one correspondence manner based on a plurality of first key points and a plurality of second key points corresponding to the simulated eyebrow area in the eyebrow template.
In one possible implementation, the target eyebrow template may be obtained by triggering a user selection operation. For example, the user may select a target eyebrow template from the provided eyebrow templates before capturing the video or image, or may select a corresponding target eyebrow template for the image after capturing the video or image.
In another possible implementation manner, after the face is identified based on the image to be processed provided by the user, the most suitable target eyebrow template is matched to the face.
In order to perform rendering processing on the second to-be-processed image with the eyebrow material in real time, a pixel mapping relationship in which the first actual eyebrow area corresponds to the simulated eyebrow area one by one can be established, so that the rendering processing of each pixel point can be performed independently and in parallel, which greatly reduces the time complexity and improves the processing speed.
Specifically, when the pixel mapping relationship is constructed, gridding processing can be performed on the first actual eyebrow area and the simulated eyebrow area in advance respectively to obtain a corresponding first grid structure and a corresponding second grid structure, and then the pixel mapping relationship is established based on the grid correspondence relationship between the first grid structure and the second grid structure.
In the second grid structure, a plurality of second key points in the simulated eyebrow area are vertexes of grids in the second grid structure, one grid corresponds to one sub-area of the simulated eyebrow area, and the grids in the first grid structure correspond to the grids of the second grid structure one by one.
In one possible implementation, the grid may be constructed by using a grid construction method based on a triangular grid, and performing triangular gridding on the eyebrow area.
Specifically, the grid construction method based on the triangular grid can construct closed line segments by taking the key points as the end points based on a given key point set, and constructs a triangular mesh structure by the closed line segments, wherein the triangular mesh structure meets the following conditions:
 (1) Edges in a triangular mesh structure do not contain any points in the set of points other than the end points.
(2) There are no intersecting edges.
(3) All faces in the triangular mesh structure are triangular faces, and the union of all the triangular faces is the convex hull of the key point set.
Among them, a common triangulation method is the Delaunay triangulation algorithm, which produces a special triangulation. A Delaunay edge satisfies the empty-circle property: for a closed line segment e formed by two key points 1 and 2 in the key point set, there exists a circle passing through the two points 1 and 2 that contains no other point of the key point set; such a segment e is called a Delaunay edge. Therefore, if a triangulation of the key point set contains only Delaunay edges, the triangulation is called a Delaunay triangulation.
Furthermore, the triangular mesh structure obtained by Delaunay triangulation meets the following conditions:
 Any triangulation T of the key point set is a Delaunay triangulation of the set if and only if the interior of the circumscribed circle of each triangle in T does not contain any point of the key point set.
In the embodiment of the application, the key points of the actual eyebrow area in the image to be processed and the key points of the simulated eyebrow area in the eyebrow template are consistent in number; the only difference is that the distances among the key points differ because the eyebrow shapes differ, while the relative positional relationships of the key points remain consistent. Therefore, the connection relationship of the grid structure can be constructed in advance based on the standard face, that is, the connections between the key points in the finally constructed grid structure form a subdivision surface, so that in actual application the connection relationship can be referenced directly and the grid structure can be formed quickly.
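As a concrete illustration, the triangulation and the reuse of its connectivity could be sketched as follows using SciPy's Delaunay implementation; the key point coordinates are placeholders.

```python
import numpy as np
from scipy.spatial import Delaunay

# Placeholder coordinates standing in for the eyebrow key points produced by
# the key point model (plus any auxiliary key points).
eyebrow_points = np.array([
    [120.0, 80.0], [135.0, 74.0], [150.0, 72.0], [165.0, 74.0], [180.0, 80.0],
    [175.0, 88.0], [160.0, 90.0], [145.0, 90.0], [130.0, 88.0],
])

tri = Delaunay(eyebrow_points)
# tri.simplices is an (M, 3) array of vertex-index triples; because the key
# points of the actual eyebrow and of the template are numbered consistently,
# the same index triples can be reused for the template key points, giving the
# one-to-one grid correspondence described above.
print(tri.simplices)
```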
Referring to fig. 9, which is a schematic diagram of the correspondence of the constructed grid structures: in the first actual eyebrow area, the first key points form a first grid structure, and the first key points 1, 2 and 18 shown in fig. 9 form one grid, which corresponds to one sub-area of the first actual eyebrow area, and so on. Similarly, in the simulated eyebrow area, the second key points form a second grid structure whose connection relationships are identical to those of the first grid structure; as shown in fig. 9, the second key points 1, 2 and 18 form a grid corresponding to one sub-area of the simulated eyebrow area, and this grid corresponds to the grid formed by the first key points 1, 2 and 18 in the first grid structure. The dashed arrows shown in fig. 9 indicate this correspondence, and the other grids are similar.
Further, based on the correspondence between the respective meshes in the first actual eyebrow area and the respective meshes in the simulated eyebrow area, a pixel mapping relationship is established for each corresponding mesh, for example, for the mesh constituted by the first key points 1,2, and 18 and the mesh constituted by the second key points 1,2, and 18 shown in fig. 9, and a pixel mapping relationship between the two meshes is established.
In the embodiment of the present application, as shown in fig. 9, considering that the grid constructed only from the key points may not cover the whole eyebrow area, auxiliary key points may be added. In this case, when performing the gridding processing, each key point and at least one set auxiliary key point are taken as vertexes, and the gridding processing is performed based on the set gridding processing mode to obtain the corresponding grid structure. Referring to fig. 10, which shows the grid structure after the auxiliary key points are added, points A to K are the auxiliary key points; it can be seen that after the auxiliary key points are added, the whole eyebrow region can be well contained, so that the makeup effect is more consistent with the actual effect.
The auxiliary key points are pixel points located outside the first actual eyebrow area, and the gridding processing mode is determined based on the standard eyebrow area in a standard face image, where the standard face image is obtained by averaging various face templates and is in a front-view state; the set gridding processing mode is, for example, a grid connection relationship obtained in advance.
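One simple, purely hypothetical way to generate such auxiliary key points, not prescribed by this embodiment, is to push the eyebrow key points outward from their centroid:

```python
import numpy as np

def add_auxiliary_keypoints(keypoints, scale=1.3):
    """Hypothetical helper: generate auxiliary key points by pushing the
    original eyebrow key points outward from their centroid, so that the
    resulting mesh encloses the whole eyebrow region. The outward scale
    factor is an assumption, not a value given by this embodiment.
    """
    pts = np.asarray(keypoints, dtype=np.float64)
    center = pts.mean(axis=0)
    return center + (pts - center) * scale
```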
Step 204, extracting rendering materials of each pixel point from the simulated eyebrow area based on the pixel mapping relation, and respectively performing fusion rendering processing on the corresponding pixel points in the second image to be processed to obtain a target image.
In the embodiment of the application, the pixel mapping relationship is established by a gridding-based mapping method, and the texture mapping technology of computer graphics can then be used to sample materials from the eyebrow template, so that the texture of the material is mapped into the second image to be processed to obtain the target image.
For example, a triangle-mesh-based texture mapping method may be used: the simulated eyebrow area in the eyebrow template is triangulated, that is, the eyebrow key points are used as the vertexes of the mesh and the triangular mesh structure of the eyebrow area is obtained by the Delaunay algorithm; similarly, such a triangular mesh structure can be constructed for the first actual eyebrow area in the image to be processed, so that the eyebrow template can be sampled by means of the triangular mesh topology to obtain the texture map for the second to-be-processed image in which the eyebrows have been reduced. In this way, all the pixel points can be processed in parallel by a graphics processing unit (GPU), thereby greatly reducing complexity and improving processing speed.
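A per-triangle texture mapping step could be sketched as follows; because each destination pixel is handled independently, the same computation maps directly onto GPU-style parallel processing. Nearest-neighbour sampling and the absence of bounds checks are simplifications of this sketch.

```python
import numpy as np

def warp_triangle(src, dst, src_tri, dst_tri):
    """Fill every pixel inside the destination triangle (in the shrunken face
    image `dst`) by sampling the eyebrow template `src` at the same barycentric
    coordinates in the corresponding source triangle.
    """
    src_tri = np.asarray(src_tri, dtype=np.float64)  # (3, 2) template vertices
    dst_tri = np.asarray(dst_tri, dtype=np.float64)  # (3, 2) face-image vertices

    xmin, ymin = np.floor(dst_tri.min(axis=0)).astype(int)
    xmax, ymax = np.ceil(dst_tri.max(axis=0)).astype(int)

    # matrix that converts a destination point into barycentric coordinates
    T = np.array([[dst_tri[0, 0] - dst_tri[2, 0], dst_tri[1, 0] - dst_tri[2, 0]],
                  [dst_tri[0, 1] - dst_tri[2, 1], dst_tri[1, 1] - dst_tri[2, 1]]])
    T_inv = np.linalg.inv(T)

    for y in range(ymin, ymax + 1):
        for x in range(xmin, xmax + 1):
            l1, l2 = T_inv @ (np.array([x, y], dtype=np.float64) - dst_tri[2])
            l3 = 1.0 - l1 - l2
            if min(l1, l2, l3) < 0:        # pixel lies outside the triangle
                continue
            # same barycentric coordinates in the template triangle
            sx, sy = l1 * src_tri[0] + l2 * src_tri[1] + l3 * src_tri[2]
            dst[y, x] = src[int(round(sy)), int(round(sx))]
    return dst
```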
The image rendering process according to the embodiment of the present application may be applied to various image processing scenes, such as image rendering in a video stream, and will be described below by taking image rendering in a video stream as an example. Referring to fig. 11, a flow chart of image rendering in a video stream is shown.
Step 1101, obtaining a video stream to be processed.
Specifically, the video stream to be processed may be selected by the user from the local album, or may be photographed by the user in real time.
Step 1102, determining whether the current video frame contains a human face.
Step 1103, if the result of step 1102 is yes, determining the current video frame as the image to be processed.
Step 1104, performing image rendering processing on the image to be processed to obtain a corresponding target image.
Step 1105, obtaining a target video stream containing virtual makeup effects based on the target images corresponding to the images to be processed.
If the result of step 1102 is negative, the process jumps to the next video frame for further processing.
In the embodiment of the application, the rendering processing of each image to be processed can be sequentially performed according to the queue, or the rendering processing of a plurality of images to be processed can be performed in parallel.
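The per-frame flow of fig. 11 could be sketched as follows with OpenCV; the Haar-cascade face detector and the render_frame callback (standing in for steps 201 to 204) are illustrative assumptions.

```python
import cv2

def render_makeup_stream(video_path, output_path, render_frame):
    """Sketch of the per-frame flow of fig. 11: frames that contain a face are
    handed to `render_frame` (the image rendering process described above,
    assumed to be provided elsewhere); other frames are passed through.
    """
    face_detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_detector.detectMultiScale(gray, 1.1, 5)
        if len(faces) > 0:                # frame contains a face -> image to be processed
            frame = render_frame(frame)   # steps 201 to 204 applied to this frame
        writer.write(frame)

    cap.release()
    writer.release()
```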
In a possible implementation manner, the processing of the video stream may be performed locally by the terminal device.
In another possible implementation manner, the processing procedure of the video stream may be that the terminal device uploads the video stream to a background server, where the background server provides a background processing service. For example, after a user selects a virtual makeup effect on a shooting application of the terminal device, the shooting application is utilized to start shooting a video, a corresponding video stream is uploaded to a background server, and the background server processes each video frame by utilizing the processing flow and returns a target video stream to the terminal device, so that the terminal device displays the target video stream in real time, and the user can view the target video stream added with the virtual makeup effect.
In summary, the embodiment of the application adopts an augmented reality method: the eyebrow area is liquefied and deformed to locally shrink the eyebrows, and different eyebrows are then attached in an augmented reality manner, so that a diversified and natural eyebrow makeup effect is obtained. The eyebrow shrinking scheme prevents the eyebrow template from visibly mixing and overlapping with the original eyebrows, making the eyebrow makeup effect more natural, and sampling the eyebrow template texture by means of graphics texture attachment enables the eyebrows to be made up in real time.
Referring to fig. 12, based on the same inventive concept, an embodiment of the present application further provides an image rendering processing apparatus 120, which includes:
 The face recognition unit 1201 is configured to perform face key point recognition on a first to-be-processed image including a face, and recognize a plurality of first key points corresponding to a first actual eyebrow area of the face from the first to-be-processed image;
 A scaling mapping unit 1202, configured to determine, based on a plurality of first key points, a to-be-processed deformation area including a first actual eyebrow area in a first to-be-processed image, and perform scaling mapping processing on each pixel point in the to-be-processed deformation area, to obtain a second to-be-processed image including a scaled second actual eyebrow area;
 The grid construction unit 1203 is configured to obtain a target eyebrow template including a virtual makeup effect, and establish a pixel mapping relationship between a first actual eyebrow area and a simulated eyebrow area based on a plurality of first key points and a plurality of second key points corresponding to the simulated eyebrow area in the eyebrow template;
 The texture fitting unit 1204 is configured to extract rendering materials of each pixel point from the simulated eyebrow area based on the pixel mapping relationship, and respectively perform fusion rendering processing on the corresponding pixel point in the second image to be processed, so as to obtain a target image.
Optionally, the scaling mapping unit 1202 is specifically configured to:
 determining a reference point of a deformation region to be processed in a first image to be processed based on coordinate values of a plurality of first key points in a constructed reference coordinate system;
 based on the set shape model, a deformation region to be processed is determined in the first image to be processed centering on the reference point.
Optionally, the scaling mapping unit 1202 is specifically configured to:
 and determining the pixel point corresponding to the average value of the coordinate values corresponding to each of the plurality of first key points as a datum point.
Optionally, the scaling mapping unit 1202 is specifically configured to:
 Determining the distance between the reference point and each first key point, determining a first distance with the maximum value from the obtained distances, and determining the semi-major axis based on the first distance;
 Selecting a second distance with a larger value from the distances between the datum point and the nearest two first key points, and determining a semi-minor axis based on the second distance;
 and determining an elliptical area surrounded by a semi-major axis and a semi-minor axis by taking the datum point as a center as a deformation area to be processed.
Optionally, the scaling mapping unit 1202 is specifically configured to:
 Determining an included angle value between a connecting line between the reference point and a first key point corresponding to the first distance and a set coordinate axis in a reference coordinate system;
 For each pixel point in the image to be processed, the following operations are respectively executed:
 For one pixel point, taking the datum point as the center, rotating the one pixel point by an included angle value, and determining the distance between the position of the rotated one pixel point and the datum point;
 If the distance between the position of the rotated pixel point and the reference point is smaller than the set distance threshold value, determining the pixel point as the pixel point in the deformation area to be processed.
Optionally, the scaling mapping unit 1202 is specifically configured to:
 determining the distance between the datum point and each first key point, and determining the distance with the maximum value from the obtained distances;
 and determining a circular area taking the datum point as a center and the distance with the maximum value as the radius as a deformation area to be processed.
Optionally, the scaling mapping unit 1202 is specifically configured to:
 For each pixel point in the deformation area to be processed, the following operations are respectively executed to obtain a second image to be processed:
 Determining a coordinate value of a pixel point after scaling based on the coordinate value of the pixel point and a set scaling mapping relationship, wherein, in the set scaling mapping relationship, the amount of scaling is negatively correlated with the distance to the reference point of the deformation area to be processed;
 and updating the pixel point at the position of the scaled coordinate value by using the pixel characteristic value of one pixel point.
Optionally, the grid construction unit 1203 is specifically configured to:
 based on a plurality of first key points, performing gridding treatment on the first actual eyebrow area to obtain a first grid structure, wherein the plurality of first key points are vertexes of grids in the first grid structure, one grid corresponds to one sub-area of the first actual eyebrow area, and
Performing gridding treatment on the simulated eyebrow area based on a plurality of second key points to obtain a second grid structure, wherein the plurality of second key points are vertexes of grids in the second grid structure, one grid corresponds to one sub-area of the simulated eyebrow area, and the grids in the first grid structure correspond to the grids of the second grid structure one by one;
 and establishing a pixel mapping relation based on the grid corresponding relation between the first grid structure and the second grid structure.
Optionally, the grid construction unit 1203 is specifically configured to:
 taking a plurality of first key points and at least one set auxiliary key point as vertexes, and carrying out gridding treatment on a first actual eyebrow area based on a set gridding treatment mode to obtain a first grid structure;
 The gridding processing mode is determined based on a standard eyebrow area in a standard face image, and the standard face image is obtained by averaging various face templates and is in a front view state.
Optionally, the apparatus further comprises a video stream processing unit 1205 for:
 Acquiring a video stream to be processed, and selecting an image containing a human face from the video stream to be processed as an image to be processed; and obtaining a target video stream containing the virtual makeup effect based on the target image corresponding to each image to be processed.
The device may be used to execute the method shown in the embodiment shown in fig. 2 to 11, so the description of the embodiment shown in fig. 2 to 11 may be referred to for the functions that can be implemented by each functional module of the device, and will not be repeated. Among them, the video stream processing unit 1205 is an optional functional module, and is therefore shown in broken lines in fig. 12.
Based on the same technical concept as the above method embodiment, the embodiment of the present application further provides a computer device 130, which may include a memory 1301 and a processor 1302.
The memory 1301 is configured to store a computer program executed by the processor 1302. The memory 1301 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the computer device, and the like. The processor 1302 may be a central processing unit (CPU), a digital processing unit, or the like. The specific connection medium between the memory 1301 and the processor 1302 is not limited in the embodiments of the present application. In fig. 13, the memory 1301 and the processor 1302 are connected by a bus 1303, which is shown as a bold line in fig. 13; the manner of connection between other components is merely illustrative and not limiting. The bus 1303 may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 13, but this does not mean that there is only one bus or one type of bus.
The memory 1301 may be a volatile memory such as a random-access memory (RAM); the memory 1301 may also be a non-volatile memory such as a read-only memory, a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 1301 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1301 may also be a combination of the above.
A processor 1302, configured to execute a method executed by the apparatus in the embodiment shown in fig. 2 to 11 when invoking the computer program stored in the memory 1301.
A computing device 140 according to such an embodiment of the application is described below with reference to fig. 14. The computing device 140 of fig. 14 is only one example and should not be taken as limiting the functionality and scope of use of embodiments of the present application.
As shown in fig. 14, computing device 140 is in the form of a general purpose computing device. The components of computing device 140 may include, but are not limited to, at least one processing unit 1401 as described above, at least one memory unit 1402 as described above, and a bus 1403 that connects the various system components, including memory unit 1402 and processing unit 1401.
Bus 1403 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, and a local bus using any of a variety of bus architectures.
The storage unit 1402 may include a readable medium in the form of a volatile memory, such as a Random Access Memory (RAM) 14021 and/or a cache storage unit 14022, and may further include a Read Only Memory (ROM) 14023.
The storage unit 1402 may also include a program/utility 14025 having a set (at least one) of program modules 14024, such program modules 14024 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The computing device 140 may also communicate with one or more external devices 1404 (e.g., keyboard, pointing device, etc.), one or more devices that enable a user to communicate with other computing devices, and/or any apparatus (e.g., router, modem, etc.) that enables the computing device 140 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 1405. Moreover, computing device 140 may also communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 1406. As shown, the network adapter 1406 communicates with other modules for the computing device 140 over the bus 1403. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with computing device 140, including, but not limited to, microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In some possible embodiments, aspects of the method provided by the present application may also be implemented in the form of a program product comprising program code for causing a computer device to carry out the steps of the method according to the various exemplary embodiments of the application described in this specification, when said program product is run on the computer device, for example, the computer device may carry out the method as carried out by the device in the example shown in fig. 2-11.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of a readable storage medium include an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.