CN100505840C - Method and device for transmitting face synthesized video - Google Patents

Method and device for transmitting face synthesized video

Info

Publication number
CN100505840C
CN100505840C
Authority
CN
China
Prior art keywords
face
information
video
texture information
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2007101767909A
Other languages
Chinese (zh)
Other versions
CN101179665A (en)
Inventor
何健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CNB2007101767909A (patent CN100505840C)
Publication of CN101179665A
Application granted
Publication of CN100505840C
Legal status: Active (current)
Anticipated expiration


Abstract

The invention discloses a method and a device for transmitting a face synthesized video. In the method, unmodified face position and texture information is processed according to the user's wishes to obtain modified face position and texture information, and the resulting video information is then transmitted. The invention gives users an opportunity to modify their own faces, better satisfying the presentation desires of particular users and thereby improving user satisfaction and user experience. In addition, the invention brings value-added potential to services that employ video communication.

Description

Method and device for transmitting face synthesis video
Technical Field
The present invention relates to video processing technologies, and in particular, to a method and an apparatus for transmitting a face synthesized video.
Background
Face detection refers to the process of determining the position, size, and pose of all faces present in an input image, and is a key technology in face information processing. A face image contains abundant pattern features, including color features (skin color, hair color, etc.), contour features, histogram features, mosaic features, structural features, transform-domain features, template features, heuristic features, and the like. Which of these pattern features are most useful, and how to utilize them, is a key question in face detection research. Because face patterns exhibit complex and subtle variations, multiple pattern features generally need to be combined, for example through simple combination, statistical inference, fuzzy decision, or machine learning. In summary, according to the color attribute of the pattern features used, face detection methods can be classified into two types: methods based on skin-color features and methods based on gray-scale features. Skin-color-based methods are suitable for constructing fast face detection and face tracking algorithms; gray-scale-based methods exploit the more essential features that distinguish faces from other objects, and are the focus of research in the face detection field. According to the model employed to combine pattern features, gray-scale-based methods can be further divided into two broad categories: heuristic (knowledge-based) methods and statistical-model-based methods.
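The skin-color-based approach mentioned above can be sketched as a simple classifier in the YCbCr color space. The following is an illustrative assumption, not part of the patent disclosure: the BT.601 conversion, the Cb/Cr thresholds, and the toy frame are all stand-ins chosen for the example. The detector marks pixels whose Cb/Cr values fall inside a typical skin box and returns the bounding box of the marked region.

```python
def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 conversion from RGB to YCbCr."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin(pixel):
    """Illustrative Cb/Cr box test for skin tones (thresholds are assumptions)."""
    _, cb, cr = rgb_to_ycbcr(*pixel)
    return 77 <= cb <= 127 and 133 <= cr <= 173

def detect_face_bbox(image):
    """Return the bounding box (x0, y0, x1, y1) of skin pixels, or None."""
    coords = [(x, y)
              for y, row in enumerate(image)
              for x, px in enumerate(row)
              if is_skin(px)]
    if not coords:
        return None
    xs, ys = zip(*coords)
    return (min(xs), min(ys), max(xs), max(ys))

# Tiny synthetic frame: a 2x2 skin-coloured patch inside a blue background.
skin, blue = (200, 140, 120), (20, 40, 200)
frame = [[blue] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = skin

print(detect_face_bbox(frame))  # -> (1, 1, 2, 2)
```

A real detector would additionally filter out small blobs and verify the candidate region with the gray-scale cues the paragraph above describes.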
Face tracking is generally built on face detection and tracks the position of a moving face through a video sequence. Face tracking techniques include motion-based methods and model-based methods. Motion-based methods employ techniques such as motion segmentation, optical flow, and stereoscopic vision, and use spatio-temporal gradients, Kalman filters, and the like to follow the motion of the face. Model-based methods first acquire prior knowledge of the target and construct a target model, then perform model matching on each input frame with a sliding window. In practice, the two approaches are often used in combination.
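The Kalman-filter tracking mentioned above can be illustrated with a minimal constant-velocity filter on one coordinate of the face centre. This is a sketch under assumed noise parameters (the values of `q`, `r`, and the initial covariance are illustrative), not the patent's tracker.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate of a face centre."""

    def __init__(self, pos, q=1.0, r=4.0):
        self.x = [pos, 0.0]                  # state: [position, velocity]
        self.p = [[10.0, 0.0], [0.0, 10.0]]  # state covariance
        self.q, self.r = q, r                # process / measurement noise

    def predict(self):
        x, v = self.x
        self.x = [x + v, v]                  # constant-velocity model, dt = 1
        p = self.p
        self.p = [[p[0][0] + 2 * p[0][1] + p[1][1] + self.q,
                   p[0][1] + p[1][1]],
                  [p[1][0] + p[1][1],
                   p[1][1] + self.q]]
        return self.x[0]

    def update(self, z):
        # Measurement is the observed face-centre position only (H = [1, 0]).
        s = self.p[0][0] + self.r
        k0, k1 = self.p[0][0] / s, self.p[1][0] / s
        innov = z - self.x[0]
        self.x = [self.x[0] + k0 * innov, self.x[1] + k1 * innov]
        p = self.p
        self.p = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
                  [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]
        return self.x[0]

kf = Kalman1D(pos=100.0)
track = []
for z in [102.0, 104.0, 106.0, 108.0]:  # face centre drifting right
    kf.predict()
    track.append(kf.update(z))
print([round(t, 1) for t in track])
```

The filter smooths noisy detections and supplies a motion prediction that a tracker can use to narrow its search window in the next frame.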
Face synthesis refers to the process of generating face images in other poses from a known face image in a given pose, and is thus a problem of face-image synthesis. A face-image synthesis system is based on mathematical modeling and uses a mathematical model to realize deformation, transition, and aging rendering of images. Face synthesis can be applied together with the face detection and face tracking techniques described above.
The main existing application of face synthesis is as follows: a photo containing a face image, or a video sequence, is taken as input, and after processing, a changed virtual face picture (for example, an aged appearance or a childlike appearance) or a cartoon picture (i.e., a face cartoon) is output. However, face synthesis technology has not been directly combined with video communication.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for transmitting a face synthesized video, which give a user an opportunity to modify his or her own video.
In order to achieve the purpose, the technical scheme of the invention is realized as follows:
An apparatus for face synthesized video transmission comprises a face synthesis unit, a video segmentation unit, and a video communication unit. The video segmentation unit is used for segmenting the combined face position, texture information, and background information to obtain separated, unmodified face position and texture information and background information. The face synthesis unit is used for processing the unmodified face position and texture information according to an established mathematical model of the face to obtain modified face position and texture information. The video communication unit is used for sending the modified face position and texture information together with the background information outwards.
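The dataflow through the three mandatory units (segmentation, then synthesis, then communication) can be sketched as below. The data structures, the frame fields, and the `lighten` model are hypothetical stand-ins chosen for illustration; they are not the patent's implementation.

```python
def video_segmentation_unit(frame):
    """Split a frame dict into unmodified face info and background info."""
    face_info = {"position": frame["face_pos"], "texture": frame["face_tex"]}
    return face_info, frame["background"]

def face_synthesis_unit(face_info, model):
    """Apply the face model to the unmodified face texture."""
    return {"position": face_info["position"],
            "texture": model(face_info["texture"])}

def video_communication_unit(face_info, background):
    """Bundle the modified face info plus background for transmission."""
    return {"face": face_info, "background": background}

# A hypothetical face model that lightens the (scalar) texture value.
lighten = lambda tex: tex + 10

frame = {"face_pos": (40, 30), "face_tex": 120, "background": "office"}
face, bg = video_segmentation_unit(frame)
modified = face_synthesis_unit(face, lighten)
packet = video_communication_unit(modified, bg)
print(packet["face"]["texture"])  # -> 130
```

The point of the sketch is the separation of concerns: only the face information passes through the synthesis step, while the background travels unchanged to the communication unit.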
The apparatus further comprises: the system comprises a preprocessing unit and a face modeling unit, wherein the preprocessing unit is used for determining a characteristic mode desired by a user; the face modeling unit is used for calculating image change schemes of the face of the user at various acquisition angles according to the characteristic modes provided by the preprocessing unit, establishing a face mathematical model and providing the face mathematical model to the face synthesis unit.
The apparatus further comprises: and the video synthesis unit is used for synthesizing the modified human face position and texture information with the background information and providing the synthesized information to the video communication unit.
The apparatus further comprises: the face detection and tracking unit is used for detecting and tracking the facial features and changes of the face in the video to obtain the synthetic information of the face position, the texture information and the background information and supplying the synthetic information to the face synthesis unit.
A method for transmitting a face synthesis video comprises the following steps:
A. detecting and tracking facial features and changes of a face in a video to obtain unmodified face position and texture information, and processing the unmodified face position and texture information according to an established mathematical model of the face to obtain modified face position and texture information;
B. transmitting the modified face position and texture information and the background information.
The step A comprises the following steps:
a1, calculating image change schemes of the face of the user at various acquisition angles according to the determined characteristic mode, and establishing a mathematical model of the face;
a2, detecting and tracking the facial features and the changes of the facial features to obtain the unmodified facial position and texture information;
and A3, processing the unmodified face position and texture information according to the mathematical model of the face to obtain the modified face position and texture information.
The step A2 includes: the method comprises the steps of detecting and tracking facial features and changes of a human face to obtain synthetic information of a human face position, texture information and background information, and carrying out segmentation processing on the synthetic information of the human face position, the texture information and the background information to obtain unmodified separation information of the human face position, the texture information and the background information.
The step B specifically includes: transmitting the modified face position and texture information separately from the background information; or synthesizing the modified face position and texture information with the background information and then transmitting the result.
The step A2 includes: extracting features from an input video sequence; if the current frame is the first frame, or no face was detected in the frames preceding the current frame, performing a face detection operation on the current frame; and if a face was detected in a preceding frame, performing a face tracking operation on the current frame.
Before the step of detecting the facial features of the face and their changes in the video, the method further comprises: setting agreed conditions. The step of detecting the facial features and their changes then comprises: performing the face detection operation according to the agreed conditions.
In the invention, the unmodified face position and texture information is processed according to the determined pattern features desired by the user to obtain modified face position and texture information, and the resulting video information is then transmitted. This gives the user an opportunity for self-modification, better satisfies the presentation desires of particular users, and improves user satisfaction, thereby achieving a better user experience; it also brings value-added potential to services that use the corresponding video communication.
Drawings
FIG. 1 is a schematic structural diagram of an apparatus for implementing face synthesis video transmission according to the present invention;
FIG. 2 is a schematic diagram of the principle of face detection and tracking in the present invention;
fig. 3 is a flow chart of video transmission for realizing face synthesis in the present invention.
Detailed Description
With the popularization of instant messaging technology and the continuous growth of network bandwidth, more and more relatives and friends use cameras for video chat to enhance interactivity. Current video communication systems, however, simply feed the video captured by the camera into a video encoder without any change, encode it, and transmit it to the receiving end. Ordinary users are typically not one hundred percent satisfied with their own appearance; if users are given a chance to modify their own video, their satisfaction should improve. The invention therefore provides a video communication system combined with face synthesis technology, giving users an opportunity for self-modification and thus a better user experience.
In the invention, the unmodified face position and texture information are processed according to the determined characteristic mode expected by the user to obtain the modified face position and texture information, and then the obtained video information is transmitted.
Fig. 1 is a schematic structural diagram of an apparatus for implementing face synthesis video transmission in the present invention, as shown in fig. 1, the apparatus includes: the system comprises a preprocessing unit, a face modeling unit, a face synthesizing unit, a face detecting and tracking unit, a video segmentation unit, a video synthesizing unit and a video communication unit.
The preprocessing unit determines the feature pattern desired by the user and provides it to the face modeling unit. In one implementation, the preprocessing unit collects pictures and/or video sequences desired by the user together with pictures and/or video sequences of the user, establishes a correspondence between them (i.e., the feature pattern desired by the user), and provides this feature pattern to the face modeling unit. Before video communication, the user provides the preprocessing unit with a picture or video sequence containing certain pattern features that the user wishes to show to the opposite end; these pattern features may include face shape, skin color, distribution of facial features, and the like. In another implementation, the user sets the desired feature pattern directly in the preprocessing unit. A pattern feature is a specific feature possessed by a picture or video sequence, and a specific feature pattern consists of one or more pattern features.
The face modeling unit is used for calculating image change schemes of the face of the user at various acquisition angles according to the characteristic modes provided by the preprocessing unit, establishing a face mathematical model and then providing the face mathematical model to the face synthesis unit. The mathematical model of the face can be established by any one of the known face modeling methods.
The face detection and tracking unit detects and tracks the facial features of the face and their changes in the video to obtain combined information of face position, texture information, and background information; a specific implementation is shown in Fig. 2. First, the face detection and tracking unit extracts features from the input video sequence; the extracted features include the skin color, contour, histogram, motion vectors, and the like of the face, and these feature data are stored in a database of model parameters. If the current frame is the first frame, or no face was detected in the frames preceding the current frame, a face detection operation is performed on the current frame. A fast face detection algorithm may be used here, such as one based on a skin-color model. When performing face detection on the current frame, default conditions may be set, for example detecting only the largest face or the face in the most prominent position in the picture, or requiring that the face be no smaller than a set size. If a face was detected in a preceding frame, a face tracking operation is performed on the current frame; the tracking result is obtained by jointly analyzing the feature data provided by the model-parameter database and the picture content of the current frame. Finally, the obtained combined information of face position, texture information, and background information is provided to the video segmentation unit.
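The per-frame detect-or-track rule described above can be sketched as a small controller. `detect` and `track` here are hypothetical stand-ins for real detection and tracking routines; each returns a face position or `None` when the face is absent.

```python
def process_sequence(frames, detect, track):
    """Apply the rule: detect on the first frame or after the face is lost,
    otherwise track. Returns the list of operations performed per frame."""
    ops, face = [], None
    for frame in frames:
        if face is None:
            face = detect(frame)      # first frame, or face lost earlier
            ops.append("detect")
        else:
            face = track(frame, face) # face known from a preceding frame
            ops.append("track")
    return ops

# Stand-ins: a frame is just a boolean "contains a face" flag; detection and
# tracking succeed only when the flag is set.
detect = lambda f: (0, 0) if f else None
track = lambda f, prev: (0, 0) if f else None

ops = process_sequence([True, True, False, True], detect, track)
print(ops)  # -> ['detect', 'track', 'track', 'detect']
```

Frame 3 loses the face during tracking, so the controller falls back to full detection on frame 4, exactly as the unit is described to do.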
The video segmentation unit segments the combined face position, texture information, and background information provided by the face detection and tracking unit, then provides the unmodified face position and texture information to the face synthesis unit and the background information to the video synthesis unit. Because the face detection and tracking unit delivers the face position, texture information, and background information in the form of a complete video, the video segmentation unit must separate them.
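A minimal sketch of this segmentation step, assuming the face region is given as a bounding box; a real system would use a pixel-accurate mask, and the toy frame and box are illustrative.

```python
def segment(frame, bbox):
    """Split a 2-D frame into (face_region, background) pixel lists,
    where each entry is ((x, y), pixel_value)."""
    x0, y0, x1, y1 = bbox
    face, background = [], []
    for y, row in enumerate(frame):
        for x, px in enumerate(row):
            if x0 <= x <= x1 and y0 <= y <= y1:
                face.append(((x, y), px))
            else:
                background.append(((x, y), px))
    return face, background

frame = [[1, 2],
         [3, 4]]
face, bg = segment(frame, (0, 0, 0, 0))  # face occupies the top-left pixel
print(face)  # -> [((0, 0), 1)]
```

Keeping the coordinates alongside the pixel values lets a later stage paste the (modified) face region back into the background at the correct position.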
The face synthesis unit processes the unmodified face position and texture information provided by the video segmentation unit according to the mathematical model of the face provided by the face modeling unit to obtain modified face position and texture information, and provides it to the video synthesis unit. Using the image-change scheme and any known face synthesis technique, the face synthesis unit converts the face features before transformation into the face features desired by the user, based on the face position and texture information, yielding the modified face position and texture information.
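One simple way to realize such an image-change scheme is linear blending (morphing) of face feature values toward the target pattern. The feature vectors, their interpretation, and the blend factor below are illustrative assumptions, not the patent's method.

```python
def blend_features(source, target, alpha):
    """Move each source feature a fraction alpha of the way to the target.
    alpha = 0 keeps the user's face; alpha = 1 reproduces the target."""
    return [s + alpha * (t - s) for s, t in zip(source, target)]

user_face   = [100.0, 150.0, 90.0]   # e.g. jaw width, eye spacing, nose length
target_face = [110.0, 140.0, 100.0]  # features of the desired pattern

print(blend_features(user_face, target_face, 0.5))  # -> [105.0, 145.0, 95.0]
```

Exposing `alpha` as a user control would let the modification range continuously from "no change" to "fully matched to the desired pattern".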
The video synthesis unit synthesizes the modified face position and texture information provided by the face synthesis unit with the background information provided by the video segmentation unit, and provides the synthesized video to the video communication unit.
The video communication unit is used for sending out the video information provided by the video synthesis unit.
In the above apparatus, the preprocessing unit, the face modeling unit, and the face detection and tracking unit serve as pre-processing components and can be omitted from the main structure; that is, the main structure of the apparatus for transmitting a face synthesized video comprises the face synthesis unit, the video segmentation unit, and the video communication unit.
In addition, the video synthesis unit is not strictly necessary: in that case, the video segmentation unit provides the background information directly to the video communication unit, and the face synthesis unit provides the modified face position and texture information directly to the video communication unit. The video communication unit may select any video coding algorithm to transmit the obtained video information, which may be either the modified face position and texture information synthesized with the background information, or the modified face position and texture information and the background information kept separate.
Fig. 3 is a flow chart of implementing transmission of a face synthesized video in the present invention, and as shown in fig. 3, a processing flow of implementing transmission of a face synthesized video includes the following steps:
step 301: a user desired characteristic pattern is determined. Before the user performs video communication, a picture or a video sequence containing certain mode characteristics, which are expected to be shown to the opposite terminal in the video communication, is set, wherein the mode characteristics can comprise face shape, skin color, five sense organs distribution and the like. The user may also preset a desired feature pattern. The mode feature is a specific feature possessed by a picture or a video sequence, and a specific feature mode is composed of one or more mode features. The feature pattern refers to a specific pattern composed of related features desired by the user.
Step 302: and according to the determined characteristic mode, calculating image change schemes of the face of the user at various acquisition angles, and establishing a mathematical model of the face. The mathematical model of the face can be established by any one of the known face modeling methods.
Step 303: detect and track the facial features of the face and their changes in the video to obtain combined information of face position, texture information, and background information; a specific implementation is shown in Fig. 2. First, features are extracted from the input video sequence; the extracted features include the skin color, contour, histogram, motion vectors, and the like of the face, and these feature data are stored in a database of model parameters. If the current frame is the first frame, or no face was detected in the frames preceding the current frame, a face detection operation is performed on the current frame; a fast face detection algorithm may be used, such as one based on a skin-color model, and default conditions may be set, for example detecting only the largest or most prominently positioned face, or requiring that the face be no smaller than a set size. If a face was detected in a preceding frame, a face tracking operation is performed on the current frame; the tracking result is obtained by jointly analyzing the feature data in the model-parameter database and the picture content of the current frame. Finally, the combined information of face position, texture information, and background information is obtained.
Step 304: segment the combined information of face position, texture information, and background information to obtain separated, unmodified face position, texture information, and background information. Because the face position, texture information, and background information appear in the form of complete video information, they must be segmented into separate components.
Steps 305 to 306: process the unmodified face position and texture information according to the mathematical model of the face to obtain the modified face position and texture information. Using the image-change scheme and any known face synthesis technique, the face features before transformation are converted into the face features desired by the user, based on the face position and texture information, yielding the modified face position and texture information.
For example, if the feature pattern desired by the user is the face shape of a film star, a mathematical model of the face is created from that feature pattern. After the unmodified face position and texture information is obtained, the user's face is modified to match the star's face, yielding the modified face position and texture information.
Steps 307 to 308: synthesize the modified face position and texture information with the background information, and then transmit the resulting video information.
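The synthesis of the modified face with the background in step 307 can be sketched as pasting the modified face patch back into the background frame; the pixel values and the paste position are illustrative.

```python
def composite(background, face_patch, top_left):
    """Paste face_patch (a 2-D list) into a copy of background at top_left,
    leaving the original background untouched."""
    out = [row[:] for row in background]  # copy so the input is preserved
    x0, y0 = top_left
    for dy, row in enumerate(face_patch):
        for dx, px in enumerate(row):
            out[y0 + dy][x0 + dx] = px
    return out

bg = [[0] * 3 for _ in range(3)]  # 3x3 background frame
face = [[9, 9],
        [9, 9]]                   # 2x2 modified face patch
result = composite(bg, face, (1, 1))
print(result)  # -> [[0, 0, 0], [0, 9, 9], [0, 9, 9]]
```

The composited frame would then be handed to the video communication unit for encoding and transmission.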
Alternatively, the face position and texture information may be transmitted separately from the background information, without synthesizing them.
Any video coding algorithm may be selected for the transmission of the video information.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (10)

CNB2007101767909A | 2007-11-02 | 2007-11-02 | Method and device for transmitting face synthesized video | Active | CN100505840C (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CNB2007101767909A | 2007-11-02 | 2007-11-02 | Method and device for transmitting face synthesized video

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CNB2007101767909A | 2007-11-02 | 2007-11-02 | Method and device for transmitting face synthesized video

Publications (2)

Publication Number | Publication Date
CN101179665A (en) | 2008-05-14
CN100505840C (en) | 2009-06-24

Family

ID=39405737

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CNB2007101767909A (Active, CN100505840C (en)) | Method and device for transmitting face synthesized video | 2007-11-02 | 2007-11-02

Country Status (1)

Country | Link
CN | CN100505840C (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101287093B (en)* | 2008-05-30 | 2010-06-09 | 北京中星微电子有限公司 | Method for adding special effect in video communication and video customer terminal
CN102194443B (en)* | 2010-03-04 | 2014-12-10 | 腾讯科技(深圳)有限公司 | Display method and system for window of video picture in picture and video processing equipment
CN102497530A (en)* | 2011-05-09 | 2012-06-13 | 苏州阔地网络科技有限公司 | Secure transmission method and system for image in community network
CN102238362A (en)* | 2011-05-09 | 2011-11-09 | 苏州阔地网络科技有限公司 | Image transmission method and system for community network
CN102142154B (en)* | 2011-05-10 | 2012-09-19 | 中国科学院半导体研究所 | Method and device for generating facial virtual image
CN102254336B (en)* | 2011-07-14 | 2013-01-16 | 清华大学 | Method and device for synthesizing face video
CN104637078B (en)* | 2013-11-14 | 2017-12-15 | 腾讯科技(深圳)有限公司 | A kind of image processing method and device
CN105184249B (en) | 2015-08-28 | 2017-07-18 | 百度在线网络技术(北京)有限公司 | Method and apparatus for face image processing
CN105405094A (en)* | 2015-11-26 | 2016-03-16 | 掌赢信息科技(上海)有限公司 | Method for processing face in instant video and electronic device


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN1462561A (en)* | 2001-05-08 | 2003-12-17 | 皇家菲利浦电子有限公司 | Method for multiple view synthesis
CN1932847A (en)* | 2006-10-12 | 2007-03-21 | 上海交通大学 | Method for detecting colour image human face under complex background

Also Published As

Publication number | Publication date
CN101179665A (en) | 2008-05-14

Similar Documents

Publication | Title
CN100505840C (en) | Method and device for transmitting face synthesized video
CN108537743B (en) | A facial image enhancement method based on generative adversarial networks
CN105843386B (en) | A kind of market virtual fitting system
Pearson | Developments in model-based video coding
CN112037320B (en) | Image processing method, device, equipment and computer readable storage medium
CN102271241A (en) | Image communication method and system based on facial expression/action recognition
CN101141608B (en) | Video instant communication system and method
Butler et al. | Real-time adaptive foreground/background segmentation
CN107578435B (en) | A kind of picture depth prediction technique and device
CN111383307A (en) | Portrait-based video generation method, device, and storage medium
CN114245215A (en) | Method, device, electronic equipment, medium and product for generating speaking video
CN112200816B (en) | Method, device and equipment for region segmentation and hair replacement of video images
CN113297944A (en) | Human body posture transformation method and system for virtual fitting of clothes
CN114862716B (en) | Image enhancement method, device, equipment and storage medium for face image
US20020164068A1 (en) | Model switching in a communication system
CN118433962B (en) | Intelligent fill light control method and system
CN116129013A (en) | A method, device and storage medium for generating virtual human animation video
CN111460945A (en) | Algorithm for acquiring 3D expression in RGB video based on artificial intelligence
CN109389076A (en) | Image partition method and device
KR20160049191A (en) | Wearable device
CN118887324A (en) | Method and device for generating action-guided video
CN116959125A (en) | Data processing method and related device
CN116524606A (en) | Face living body recognition method, device, electronic equipment and storage medium
CN102238362A (en) | Image transmission method and system for community network
CN108399358B (en) | A method and system for displaying facial expressions in video chat

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C14 | Grant of patent or utility model
GR01 | Patent grant
