CN102034247A - Motion capture method for binocular vision image based on background modeling - Google Patents

Motion capture method for binocular vision image based on background modeling

Info

Publication number
CN102034247A
CN102034247A (application CN201010602544)
Authority
CN
China
Prior art keywords
background
image
binocular vision
foreground
binocular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 201010602544
Other languages
Chinese (zh)
Other versions
CN102034247B (en)
Inventor
王阳生
时岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201010602544 (granted as CN102034247B)
Publication of CN102034247A
Application granted
Publication of CN102034247B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

Translated from Chinese


The invention is a motion capture method for binocular vision images based on background segmentation. It can segment the human body as the foreground and simultaneously perform motion capture on the upper torso, thereby enabling human-computer interaction. Building on background modeling, the method establishes a Gaussian model of a clean background captured by the camera, compares newly captured video against that background model, and uses the depth information obtained from the binocular camera to assign each pixel of the scene a probability of belonging to the foreground or background; the scene is then segmented into foreground and background with a graph cut algorithm. When the segmented foreground is the upper torso of a human body, the basic skeleton model is obtained by thinning the foreground contour, denoising, and determining key points, completing the motion capture process.


Description

A motion capture method for binocular vision images based on background modeling
Technical field
The invention belongs to the fields of computer vision and interactive digital entertainment, and relates to background segmentation and motion capture carried out with a binocular camera and background modeling techniques.
Background technology
Motion capture technology refers to the use of computer vision or other means to capture the motion of the human body accurately and in real time. With the development of computer software and hardware and the growing demands of computer users, motion capture plays an increasingly visible role in fields such as digital entertainment, video surveillance, and motion analysis.
However, the development of motion capture technology is constrained by various conditions and exhibits many limitations, such as lighting changes, complex backgrounds, and occlusion during motion. These factors make the motion capture process more difficult. Yet when background segmentation is performed with binocular vision and the only foreground in the scene is the human body, the motion capture problem reduces to analyzing the foreground contour of the scene, which greatly simplifies the computation. Meanwhile, in interactive digital entertainment, motion capture as a video-based interaction technique has been a research focus of human-computer interaction in games in recent years. Cameras have become standard equipment on PCs, and natural, immersive human-computer interaction is increasingly the focus of digital entertainment research. Therefore, binocular-vision motion capture based on background segmentation has broad application prospects in many fields.
Summary of the invention
The objective of the invention is to use a binocular camera to segment the foreground and background of a scene and, on that basis, to complete the motion capture process. The method first trains on a clean background: it collects background pictures for a certain number of frames and establishes the background model. Newly acquired images are then compared against this model; the color difference from the background model and the depth information from binocular vision are used to build the network graph for graph cut, and a dynamic graph cut method segments the scene into foreground and background. On the basis of the segmentation, the structure of the foreground human body is analyzed to locate the parts of the upper torso, completing motion capture.
To achieve the above objective, the invention provides a motion capture method for binocular vision images based on background modeling, comprising the following steps:
Step S1: fix the position of the binocular camera, turn off white balance, and acquire binocular vision images;
Step S2: perform background modeling on the acquired binocular vision images under clean background images of a set number of frames to obtain a background model;
Step S3: use the binocular depth information obtained by computer binocular vision to calculate the probability that each pixel belongs to the foreground or background;
Step S4: use the binocular depth information, the background modeling data, and a dynamic graph cut algorithm to segment the foreground and background of the binocular vision image and extract the foreground contour;
Step S5: thin the foreground contour, determine the key points of the human body, and complete motion capture.
Beneficial effects of the invention:
The invention uses computer vision and image processing techniques to naturally separate the foreground human body from the scene and to capture the motion of the upper torso, realizing natural human-computer interaction. Traditional interaction is based on hand contact, such as the mouse and keyboard. With the development of computer vision, more and more systems complete human-computer interaction naturally through a camera; users can experience the enjoyment of interaction more conveniently through vision, and as a game interface this gives players a greater sense of immersion.
In addition, the invention combines binocular vision acquisition with the building of a background model. Binocular vision is adopted mainly to make full use of depth information, since the foreground usually lies in the region closer to the camera; this also avoids segmentation errors caused by shadows and occlusion. Building a background model allows the segmentation cost to be computed more effectively, and the dynamic graph cut method makes the segmentation faster.
Description of drawings
Figure 1A is the overall flowchart of the invention;
Fig. 1 shows a binocular vision image of the invention;
Fig. 2 shows the left image, the right image, and the disparity obtained with binocular vision;
Fig. 3 is the max-flow/min-cut network flow graph of the graph cut algorithm;
Fig. 4 is a flowchart of the invention;
Fig. 5 shows a group of video background segmentation results of the invention;
Fig. 6 is a schematic diagram of edge smoothing of a background segmentation result;
Fig. 7 shows the result of contour thinning and key position extraction.
Embodiment
The invention is described in detail below with reference to the accompanying drawings. The described embodiments are intended only to aid understanding of the invention and do not limit it in any way.
The operating process of the motion capture method based on background modeling is further illustrated below by an example.
All code in this example is written in C++ and runs under Microsoft Visual Studio 2005; other software and hardware configurations may also be used and are not repeated here.
Figure 1A shows the overall flowchart of the motion capture method for binocular vision images based on background modeling.
The method, based on binocular vision and background segmentation, comprises the following steps:
Step S1: fix the position of the binocular camera, turn off white balance, and acquire binocular vision images;
Step S2: perform background modeling on the acquired binocular vision images under clean background images of a set number of frames to obtain a background model;
Step S3: use the binocular depth information obtained by computer binocular vision to calculate the probability that each pixel belongs to the foreground or background;
Step S4: use the binocular depth information, the background modeling data, and a dynamic graph cut algorithm to segment the foreground and background of the binocular vision image and extract the foreground contour;
Step S5: thin the foreground contour, determine the key points of the human body, and complete motion capture.
The step of acquiring binocular vision images described in step S2 comprises:
Step S211: ensure that the camera position is fixed and that there is no obvious change of light and shade in the scene;
Step S212: turn off the camera's automatic white balance; camera hardware generally provides automatic exposure and automatic white balance functions that adjust picture quality automatically when scene lighting changes, so for background modeling the white balance parameters must be set to fixed values;
Step S213: collect clean background images for a fixed number of frames (100 frames) and store them in memory.
The step of performing background modeling under clean background images of a set number of frames described in step S2 comprises:
Step S221: using a Gaussian background model, collect the color image of each frame of the binocular vision images; R, G, and B denote the values of the red, green, and blue channels, each in the range 0 to 255;
Step S222: N images are acquired during background modeling, each containing 320 x 240 pixels; compute the brightness I and chromaticity (r, g) of each pixel, where r = R/(R+G+B), g = G/(R+G+B), and R, G, B are the values of the red, green, and blue components of the color channels;
Step S223: establish a pixel-level fused background model by computing the mean and variance of each pixel's brightness and chromaticity over the N images and storing them in memory;
Step S224: establish a feature background model in brightness space and a chromaticity-based model in chromaticity space, and store the resulting background models in memory.
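As a concrete illustration of steps S221-S224, the minimal C++ sketch below accumulates the per-pixel mean and variance of brightness and chromaticity over N clean background frames. The structure, the function name, and the choice of brightness I = (R+G+B)/3 are illustrative assumptions, not taken from the patent.

```cpp
#include <cstdint>
#include <vector>

// Per-pixel Gaussian background statistics over N clean frames
// (illustrative sketch; names and the brightness formula are assumptions).
struct PixelModel {
    double meanI = 0, varI = 0;   // brightness I = (R+G+B)/3 (assumed)
    double meanR = 0, varR = 0;   // chromaticity r = R/(R+G+B)
    double meanG = 0, varG = 0;   // chromaticity g = G/(R+G+B)
};

// frames: N images, each W*H pixels of interleaved 8-bit RGB.
std::vector<PixelModel> buildBackgroundModel(
    const std::vector<std::vector<uint8_t>>& frames, int W, int H)
{
    const int N = static_cast<int>(frames.size());
    std::vector<PixelModel> model(W * H);
    for (int p = 0; p < W * H; ++p) {
        double sI = 0, sI2 = 0, sr = 0, sr2 = 0, sg = 0, sg2 = 0;
        for (int n = 0; n < N; ++n) {
            double R = frames[n][3*p], G = frames[n][3*p + 1], B = frames[n][3*p + 2];
            double sum = R + G + B + 1e-6;          // avoid division by zero
            double I = sum / 3.0, r = R / sum, g = G / sum;
            sI += I; sI2 += I*I; sr += r; sr2 += r*r; sg += g; sg2 += g*g;
        }
        PixelModel& m = model[p];                   // mean/variance per channel
        m.meanI = sI/N; m.varI = sI2/N - m.meanI*m.meanI;
        m.meanR = sr/N; m.varR = sr2/N - m.meanR*m.meanR;
        m.meanG = sg/N; m.varG = sg2/N - m.meanG*m.meanG;
    }
    return model;
}
```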
The depth data cost of each pixel in the binocular vision image described in step S2 is calculated to obtain the depth cost of each pixel, thereby introducing the binocular depth information. The specific steps comprise:
Step 231: collect and save the binocular vision images, denoted the left image and the right image respectively;
Step 232: set a depth value for each pixel of the left image, the depth value being expressed as the disparity between the left image and the right image;
Step 233: for each depth value, compute the difference cost between the left image and the right image;
Step 234: collect the cost values in the left image and divide them into four groups by magnitude;
Step 235: use the cost value of each group to update the foreground and background costs of the pixel, where the cost of belonging to the foreground decreases exponentially with disparity and the cost of the background increases exponentially with disparity.
The use of the binocular depth information, the background modeling data, and the dynamic graph cut algorithm described in step S4 to segment the foreground and background of the binocular vision image and extract the foreground contour comprises the following specific steps:
Step S41: after background modeling is finished, read in the newly acquired binocular vision image, comprising a left image and a right image;
Step S42: obtain the data cost of the binocular information from the binocular vision data cost results;
Step S43: compare the pixels of the left image with the background model to obtain color-based cost values, and use the basic principle of the graph cut algorithm to build the max-flow/min-cut network flow;
Step S44: combine the two data cost values obtained in steps S42 and S43 into the data cost of the graph cut algorithm;
Step S45: assign the smoothness term of the graph cut algorithm using the contrast relationship between pixels of the left image;
Step S46: use the dynamic graph cut algorithm to segment the pixel-level video stream; the result is divided into two parts, one the foreground and the other the background;
Step S47: store the segmented foreground/background as 0 or 1 in a picture of the same size, and obtain the edge contour from this 0/1 foreground-background picture;
Step S48: denoise the edge by high-frequency filtering to make it smoother;
Step S49: correct erroneous segmented regions using data from the previous frames.
According to step S5, the key points of the human torso are obtained by picture denoising and thinning, achieving motion capture. The steps comprise:
Step S51: scale the post-processed human contour;
Step S52: thin the scaled human contour;
Step S53: enlarge the thinned human contour back to its original size;
Step S54: thin the contour once more;
Step S55: find the nodes with more than 2 neighborhood pixels and take their centroid as the center of gravity of the human body;
Step S56: search up and down along the center of gravity to find nodes, set as the head and waist;
Step S57: search left and right along the center of gravity to find the left arm and right arm, and determine the elbows and shoulders by proportion and eccentricity;
Step S58: compare the 9 determined key points with the previous frames to obtain a relatively stable and accurate torso position.
The first step, as shown in Figure 1, is image acquisition. The method takes binocular video as input. In the figure, (x, y, z) denotes coordinates in the world coordinate system; (xL, yL) and (xR, yR) denote the pixel coordinates of the same object in the left and right images.
(1) Digital image processing mostly deals with two-dimensional information, and the amount of data is very large. An image here is represented by a two-dimensional function f(x, y), where x, y are two-dimensional coordinates and f(x, y) is the color information at point (x, y). The camera gathers all optical information entering the lens; once this enters the computer it is converted to a color model that meets computer standards, and digital image processing is performed by the program while guaranteeing the continuity and real-time performance of the video. Each acquired image of 320 x 240 pixels, 76800 pixels in total, is processed pixel by pixel. The initial video is shown in Figure 1. All subsequent operations and computations of the project are based on these 320 x 240 pixels of each frame. In binocular vision, the same pixel appears at different positions in the left and right images, and the size of the position difference reflects the depth of the image. The relative displacement of the two pixels can be computed by pixel matching. The method uses this information to assist the segmentation of foreground and background. As shown in Figure 2, the binocular information is exploited through the matching cost between the left and right images, where P denotes the position of a pixel in the left image, P + d its position in the right image, and d the disparity of that pixel.
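To make the pixel matching concrete, the sketch below scans candidate disparities for each left-image pixel and keeps the lowest-cost match. The patent does not specify the matching metric or window, so a plain per-pixel absolute-difference cost on grayscale images is assumed here.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

const int MAX_DISPARITY = 32;   // maximum disparity, as set in the patent

// left/right: W*H grayscale images. Returns the best (minimum-cost)
// disparity per pixel of the left image (illustrative sketch only).
std::vector<int> bestDisparity(const std::vector<uint8_t>& left,
                               const std::vector<uint8_t>& right,
                               int W, int H)
{
    std::vector<int> disp(W * H, 0);
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            int bestCost = 1 << 30, bestD = 0;
            for (int d = 0; d < MAX_DISPARITY && d <= x; ++d) {
                // pixel P at (x,y) in the left image matches P+d in the
                // right image; the sign convention depends on the camera rig.
                int cost = std::abs(int(left[y*W + x]) - int(right[y*W + x - d]));
                if (cost < bestCost) { bestCost = cost; bestD = d; }
            }
            disp[y*W + x] = bestD;
        }
    }
    return disp;
}
```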
(2) The use of the binocular depth information consists of two parts.
Step 1: the matching costs computed at pixel x_i are divided into four groups according to the disparity value (the maximum disparity d is set to 32):
Group A: the best-matching (optimal) disparity of pixel x_i satisfies d > 16, indicating the pixel very likely belongs to the foreground;
Group B: the optimal disparity satisfies d <= 16 and d > 12, indicating the pixel likely belongs to the foreground;
Group C: the optimal disparity satisfies d <= 12 and d > 5, indicating the pixel likely belongs to the background;
Group D: the optimal disparity satisfies d <= 5, indicating the pixel very likely belongs to the background.
Under this assumption, the invention needs less time to divide pixels into four groups, instead of evaluating 32 possible disparity hypotheses for every pixel.
Step 2: set suitable data cost values for the graph cut algorithm. The data term comprises the costs of a pixel belonging to the background and to the foreground, denoted D_i(B) and D_i(F) respectively. The larger the disparity of a pixel, the more likely it belongs to the foreground, so D_i(F) is correspondingly reduced and D_i(B) correspondingly increased. The invention expresses this correspondence with the following formulas:

$$D_{i,t}(B) = D_i(B) + \lambda_t e^{-d c_t}, \qquad D_{i,t}(F) = D_i(F) - \lambda_t e^{-d c_t}$$

for all t = A, B, C, D, with λ_t > 0. Here D_{i,t}(B) is the background model data term incorporating binocular information for group t; D_i(B) is the background segmentation data term of monocular vision; λ_t is the parameter of the binocular data cost; i is the pixel coordinate; D_{i,t}(F) is the foreground model data term incorporating binocular information; d is the disparity value; and c_t is the parameter controlling the influence of d.
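A direct transcription of these update formulas into C++, for illustration only: the grouping thresholds follow Step 1 above, while the numeric values of lambda_t and c_t are invented placeholders, since the patent does not disclose them.

```cpp
#include <cmath>

// Data-cost update from the disparity grouping (sketch of
// D_{i,t}(B) = D_i(B) + lambda_t*exp(-d*c_t) and
// D_{i,t}(F) = D_i(F) - lambda_t*exp(-d*c_t); parameter values assumed).
enum Group { GA, GB, GC, GD };          // disparity groups from Step 1

Group classify(int d) {                 // max disparity 32, thresholds 16/12/5
    if (d > 16) return GA;              // very likely foreground
    if (d > 12) return GB;              // likely foreground
    if (d > 5)  return GC;              // likely background
    return GD;                          // very likely background
}

void updateDataCost(double& costB, double& costF, int d) {
    static const double lambda[4] = {4.0, 2.0, 2.0, 4.0};  // lambda_t > 0 (assumed)
    static const double c[4]      = {0.1, 0.1, 0.1, 0.1};  // c_t controls d (assumed)
    Group t = classify(d);
    double delta = lambda[t] * std::exp(-d * c[t]);
    costB += delta;   // larger disparity -> pixel more likely foreground,
    costF -= delta;   // so the foreground cost falls and the background cost rises
}
```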
As shown in Figure 3, the graph cut algorithm uses a max-flow/min-cut network flow graph, where p and q denote two adjacent pixels. Figure 4 shows the flowchart of the graph cut algorithm, comprising the front-end assignment and the back-end partitioning.
(3) The graph cut algorithm is an important component of background segmentation. Its essence is to use the max-flow/min-cut principle to partition the pixels of the image along a certain path and compute which pixels belong to the foreground and which to the background.
The foreground/background segmentation problem can be regarded as a binary labeling problem in computer vision. If pixel i belongs to the foreground, its label is marked f_i = F, where F denotes foreground; likewise, if the pixel belongs to the background it is labeled f_i = B. For this binary labeling problem the label set contains only two labels. The weighted graph constructed by the graph cut algorithm contains two corresponding terminal vertices s and t. In Figure 3, the left diagram is a weighted graph G built from a 3 x 3 original image, G = <V, E>, where V is the vertex set, consisting of the ordinary nodes and two terminals called the source node S and the sink node T. S and T represent the binary labels of foreground and background; E is the set of edges connecting the vertices, and the edge weights are indicated by line thickness in the figure.
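The sketch below shows the kind of max-flow/min-cut computation this graph encodes, using a simple Edmonds-Karp augmenting-path scheme for brevity. The patent does not prescribe a particular max-flow implementation; practical graph cut systems usually use the faster Boykov-Kolmogorov algorithm.

```cpp
#include <algorithm>
#include <queue>
#include <utility>
#include <vector>

// Minimal max-flow/min-cut over the segmentation graph: node s = source S
// (foreground terminal), node t = sink T (background terminal), remaining
// nodes = pixels. Illustrative sketch, not the patent's implementation.
struct Edge { int to, rev; double cap; };

struct Graph {
    std::vector<std::vector<Edge>> adj;
    explicit Graph(int n) : adj(n) {}
    void addEdge(int u, int v, double cap, double capRev = 0) {
        adj[u].push_back({v, (int)adj[v].size(), cap});
        adj[v].push_back({u, (int)adj[u].size() - 1, capRev});
    }
    double maxflow(int s, int t) {
        double flow = 0;
        for (;;) {
            // BFS for an augmenting path; par = (parent node, edge index)
            std::vector<std::pair<int,int>> par(adj.size(), {-1, -1});
            std::queue<int> q; q.push(s); par[s] = {s, -1};
            while (!q.empty() && par[t].first < 0) {
                int u = q.front(); q.pop();
                for (int i = 0; i < (int)adj[u].size(); ++i) {
                    const Edge& e = adj[u][i];
                    if (e.cap > 1e-12 && par[e.to].first < 0) {
                        par[e.to] = {u, i}; q.push(e.to);
                    }
                }
            }
            if (par[t].first < 0) return flow;   // no augmenting path left
            double f = 1e18;                     // bottleneck capacity
            for (int v = t; v != s; v = par[v].first)
                f = std::min(f, adj[par[v].first][par[v].second].cap);
            for (int v = t; v != s; v = par[v].first) {
                Edge& e = adj[par[v].first][par[v].second];
                e.cap -= f;
                adj[v][e.rev].cap += f;          // residual edge gains capacity
            }
            flow += f;
        }
    }
};
// Usage: for each pixel p add S->p with capacity D_p(B) and p->T with
// capacity D_p(F); for neighbours p, q add edges weighted by the
// contrast-based smoothness term. After maxflow(), pixels still reachable
// from S in the residual graph are labelled foreground.
```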
The flow of the dynamic graph cut is shown in Fig. 4. The energy function comprises a data term and a smoothness term, whose settings directly affect the final segmentation result of the graph cut algorithm. Fig. 5 shows several groups of video segmentation results of the invention: the 3 images on the left are left-camera frames of the input video, and the 3 images on the right are the corresponding segmentation results.
(4) The invention designs a low-pass filter in the frequency domain to smooth the boundary. The edge smoothing process follows the boundary curve C, as shown in Figure 6: the upper-left image is the input source image and the upper-right the segmentation result; the lower-left shows the foreground/background edge to be smoothed, and the lower-right the result after smoothing. Sampling the boundary at fixed intervals yields the point sequence z(i) = [x(i), y(i)], whose complex representation is

$$z(i) = x(i) + j\,y(i)$$

The discrete Fourier transform of z(i) is

$$f(u) = \frac{1}{K}\sum_{i=0}^{K-1} z(i)\, e^{-j 2\pi u i / K}$$

where j is the imaginary unit, u the frequency, and K a constant. f(u), the Fourier transform of z(i), is called the Fourier descriptor of the boundary and is the representation of the boundary point sequence in the frequency domain. By Fourier transform theory, the high-frequency components carry detail while the low-frequency components determine the overall shape. A curve is rough because it is jagged, and these rough regions contain the high-frequency components, so filtering out the high-frequency part of f(u) yields a smooth curve. The invention defines the low-frequency energy ratio and filters out 5% of the high-frequency energy:

$$r(l) = \sum_{u=0}^{l} |f(u)|^2 \Big/ \sum_{u=0}^{K-1} |f(u)|^2$$

where |·| is the modulus. The smallest value of l for which r(l) > 0.95 holds is taken as the cutoff frequency of the low-pass filter. Using the conjugate-symmetry property of the Fourier coefficients (f*(u) denoting the complex conjugate of f(u)), the high-frequency components of the coefficients f(u) in the range from l to K-1-l are eliminated. An inverse Fourier transform is then applied, and the abrupt parts of the curve are smoothed.
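A compact C++ sketch of this smoothing recipe, assuming the boundary has already been extracted as a complex point sequence. The direct O(K^2) DFT and the helper names are illustrative choices, not from the patent; an FFT would be used in practice.

```cpp
#include <complex>
#include <vector>

const double PI = 3.14159265358979323846;

// Unnormalised DFT (sign = -1) / inverse kernel (sign = +1); the 1/K
// normalisation of the text's f(u) is applied once at the inverse step,
// which leaves the energy ratio r(l) unchanged.
std::vector<std::complex<double>> dft(const std::vector<std::complex<double>>& z,
                                      int sign) {
    const int K = (int)z.size();
    std::vector<std::complex<double>> out(K);
    for (int u = 0; u < K; ++u)
        for (int i = 0; i < K; ++i)
            out[u] += z[i] * std::polar(1.0, sign * 2.0 * PI * u * i / K);
    return out;
}

// Keep coefficients carrying 95% of the energy, zero the band l..K-1-l,
// and invert (sketch of the boundary-smoothing scheme in the text).
std::vector<std::complex<double>> smoothBoundary(
    const std::vector<std::complex<double>>& z) {
    const int K = (int)z.size();
    std::vector<std::complex<double>> f = dft(z, -1);
    double total = 0;                         // total spectral energy
    for (const auto& c : f) total += std::norm(c);
    double acc = 0; int l = 0;                // smallest l with r(l) > 0.95
    for (; l < K; ++l) { acc += std::norm(f[l]); if (acc / total > 0.95) break; }
    for (int u = l + 1; u < K - 1 - l; ++u) f[u] = 0;   // cut high band
    std::vector<std::complex<double>> s = dft(f, +1);   // inverse transform
    for (auto& c : s) c /= K;                 // restore 1/K scaling
    return s;                                 // smoothed boundary points
}
```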
Figure 7 shows the motion capture result of the invention: the left-hand images are two left-camera frames of the video, and on the right are the key points and skeleton extracted from the segmentation result. Key points are drawn as circles and the skeleton as lines.
(5) Motion capture on the basis of the segmentation comprises three steps, as illustrated by the sketch after this list.
Step 1: post-process the segmentation result to obtain a smooth and relatively stable contour region; since only the contour is involved, the boundary does not need to be computed precisely. Provided there are no large holes, the skeleton tracking required here can be completed well.
Step 2: locate positions on the segmented contour and determine the nine basic points A1, A2, A3, A4, A5, A6, A7, A8, A9. A1, A2, A3 represent three points of the head and trunk; A4, A5, A6 and A7, A8, A9 represent three points of the left arm and of the right arm respectively.
Step 3: connect the nine points in the order of the skeleton profile, completing motion capture.
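As an illustration of the branch-node search of Step S55, the sketch below counts 8-neighbourhood skeleton pixels and averages all branch nodes into a single centre of gravity. The names and the fallback behaviour are assumptions made for the example.

```cpp
#include <vector>

// skel is a W*H binary image (1 = pixel on the thinned skeleton).
struct Point { int x, y; };

// Count the 8-neighbourhood skeleton pixels of (x, y).
int neighbourCount(const std::vector<int>& skel, int W, int H, int x, int y) {
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;
            int nx = x + dx, ny = y + dy;
            if (nx >= 0 && nx < W && ny >= 0 && ny < H && skel[ny*W + nx]) ++n;
        }
    return n;
}

// Nodes with more than 2 neighbours are branch points of the skeleton;
// their centroid serves as the body's centre of gravity (Step S55).
Point bodyCentre(const std::vector<int>& skel, int W, int H) {
    long sx = 0, sy = 0, cnt = 0;
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            if (skel[y*W + x] && neighbourCount(skel, W, H, x, y) > 2) {
                sx += x; sy += y; ++cnt;
            }
    if (cnt == 0) return {W/2, H/2};          // fallback: no branch nodes found
    return {int(sx / cnt), int(sy / cnt)};
}
// The head and waist are then found by scanning up/down from this centre
// along the skeleton, and the arms by scanning left/right (Steps S56-S57).
```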
The above is only an embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any transformation or replacement that a person skilled in the art could conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the claims of the present invention.

Claims (6)

Translated from Chinese
1. A motion capture method for binocular vision images based on background modeling, the method being based on binocular vision and background segmentation, characterized by comprising the following steps:
Step S1: fix the position of the binocular camera, turn off white balance, and acquire binocular vision images;
Step S2: perform background modeling on the acquired binocular vision images under clean background images of a set number of frames to obtain a background model;
Step S3: use the binocular depth information obtained by computer binocular vision to calculate the probability that each pixel belongs to the foreground or background;
Step S4: use the binocular depth information, the background modeling data, and a dynamic graph cut algorithm to segment the foreground and background of the binocular vision image and extract the foreground contour;
Step S5: thin the foreground contour, determine the key points of the human body, and complete motion capture.

2. The motion capture method for binocular vision images based on background modeling according to claim 1, characterized in that the step of acquiring binocular vision images in step S2 comprises:
Step S211: ensure that the camera position is fixed and that there is no obvious change of light and shade in the scene;
Step S212: turn off the camera's automatic white balance; camera hardware generally provides automatic exposure and automatic white balance functions that adjust picture quality automatically when scene lighting changes, so for background modeling the white balance parameters must be set to fixed values;
Step S213: collect clean background images for a fixed number of frames and store them in memory.

3. The motion capture method for binocular vision images based on background modeling according to claim 1, characterized in that the step of performing background modeling under clean background images of a set number of frames in step S2 comprises:
Step S221: using a Gaussian background model, collect the color image of each frame of the binocular vision images; R, G, and B denote the values of the red, green, and blue channels, each in the range 0 to 255;
Step S222: N images are acquired during background modeling, each containing 320 x 240 pixels; compute the brightness I and chromaticity (r, g) of each pixel, where r = R/(R+G+B), g = G/(R+G+B), and R, G, B are the values of the red, green, and blue components of the color channels;
Step S223: establish a pixel-level fused background model by computing the mean and variance of each pixel's brightness and chromaticity over the N images and storing them in memory;
Step S224: establish a feature background model in brightness space and a chromaticity-based model in chromaticity space, and store the resulting background models in memory.

4. The motion capture method for binocular vision images based on background modeling according to claim 1, characterized in that the depth data cost of each pixel of the binocular vision image in step S2 is calculated to obtain the depth cost of each pixel, thereby introducing the binocular depth information; the specific steps comprise:
Step 231: collect and save the binocular vision images, denoted the left image and the right image respectively;
Step 232: set a depth value for each pixel of the left image, the depth value being expressed as the disparity between the left image and the right image;
Step 233: for each depth value, compute the difference cost between the left image and the right image;
Step 234: collect the cost values in the left image and divide them into four groups by magnitude;
Step 235: use the cost value of each group to update the foreground and background costs of the pixel, where the cost of belonging to the foreground decreases exponentially with disparity and the cost of the background increases exponentially with disparity.

5. The motion capture method for binocular vision images based on background modeling according to claim 1, characterized in that in step S4 the binocular depth information, the background modeling data, and the dynamic graph cut algorithm are used to segment the foreground and background of the binocular vision image and extract the foreground contour; the specific steps comprise:
Step S41: after background modeling is finished, read in the newly acquired binocular vision image, comprising a left image and a right image;
Step S42: obtain the data cost of the binocular information from the binocular vision data cost results;
Step S43: compare the pixels of the left image with the background model to obtain color-based cost values, and use the basic principle of the graph cut algorithm to build the max-flow/min-cut network flow;
Step S44: combine the two data cost values obtained in steps S42 and S43 into the data cost of the graph cut algorithm;
Step S45: assign the smoothness term of the graph cut algorithm using the contrast relationship between pixels of the left image;
Step S46: use the dynamic graph cut algorithm to segment the pixel-level video stream; the result is divided into two parts, one the foreground and the other the background;
Step S47: store the segmented foreground/background as 0 or 1 in a picture of the same size, and obtain the edge contour from this 0/1 foreground-background picture;
Step S48: denoise the edge by high-frequency filtering to make it smoother;
Step S49: correct erroneous segmented regions using data from the previous frames.

6. The motion capture method for binocular vision images based on background modeling according to claim 1, characterized in that the key points of the human torso are obtained by picture denoising and thinning, thereby achieving motion capture; the steps comprise:
Step S51: scale the post-processed human contour;
Step S52: thin the scaled human contour;
Step S53: enlarge the thinned human contour back to its original size;
Step S54: thin the contour once more;
Step S55: find the nodes with more than 2 neighborhood pixels and take their centroid as the center of gravity of the human body;
Step S56: search up and down along the center of gravity to find nodes, set as the head and waist;
Step S57: search left and right along the center of gravity to find the left arm and right arm, and determine the elbows and shoulders by proportion and eccentricity;
Step S58: compare the 9 determined key points with the previous frames to obtain a relatively stable and accurate torso position.
CN201010602544 (priority date 2010-12-23, filing date 2010-12-23): Motion capture method for binocular vision image based on background modeling; Expired - Fee Related; granted as CN102034247B

Priority Applications (1)

CN201010602544 (CN102034247B): priority date 2010-12-23, filing date 2010-12-23; Motion capture method for binocular vision image based on background modeling

Applications Claiming Priority (1)

CN201010602544 (CN102034247B): priority date 2010-12-23, filing date 2010-12-23; Motion capture method for binocular vision image based on background modeling

Publications (2)

CN102034247A: published 2011-04-27
CN102034247B: published 2013-01-02

Family

ID: 43887100

Family Applications (1)

CN201010602544 (Expired - Fee Related, granted as CN102034247B): priority date 2010-12-23, filing date 2010-12-23; Motion capture method for binocular vision image based on background modeling

Country Status (1)

CN: CN102034247B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party

US20040125207A1 *: priority 2002-08-01, published 2004-07-01; Anurag Mittal; Robust stereo-driven video-based surveillance
US20070031037A1 *: priority 2005-08-02, published 2007-02-08; Microsoft Corporation; Stereo image segmentation
CN101389004A *: priority 2007-09-13, published 2009-03-18; 中国科学院自动化研究所; A moving target classification method based on online learning
CN101344965A *: priority 2008-09-04, published 2009-01-14; 上海交通大学; Tracking system based on binocular camera

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party

Vladimir Kolmogorov et al., "Probabilistic Fusion of Stereo with Color and Contrast for Bilayer Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 9, September 2006.
Xiaoyu Wu et al., "Video Background Segmentation Using Adaptive Background Models," LNCS, vol. 5716, 2009.

Cited By (16)

* Cited by examiner, † Cited by third party

CN102184008A *: priority 2011-05-03, published 2011-09-14; 北京天盛世纪科技发展有限公司; Interactive projection system and method
CN102927652B *: priority 2012-10-09, published 2015-06-24; 清华大学; Intelligent air conditioner control method based on positions of indoor persons and objects
CN102927652A *: priority 2012-10-09, published 2013-02-13; 清华大学; Intelligent air conditioner control method based on positions of indoor persons and objects
CN104243951B *: priority 2013-06-07, published 2017-01-11; 索尼电脑娱乐公司; Image processing device, image processing system and image processing method
CN104243951A *: priority 2013-06-07, published 2014-12-24; 索尼电脑娱乐公司; Image processing device, image processing system and image processing method
US10293252B2: priority 2013-06-07, published 2019-05-21; Sony Interactive Entertainment Inc.; Image processing device, system and method based on position detection
CN103826071A *: priority 2014-03-11, published 2014-05-28; 深圳市中安视科技有限公司; Three-dimensional camera shooting method for three-dimensional identification and continuous tracking
CN105516579A *: priority 2014-09-25, published 2016-04-20; 联想(北京)有限公司; Image processing method and device and electronic equipment
CN105374043A *: priority 2015-12-02, published 2016-03-02; 福州华鹰重工机械有限公司; Method and device of background filtering of visual odometry
CN106056056A *: priority 2016-05-23, published 2016-10-26; 浙江大学; Long-distance non-contact luggage volume detection system and method thereof
CN111567036A *: priority 2017-12-07, published 2020-08-21; 微软技术许可有限责任公司; Video capture system and method
CN109064511A *: priority 2018-08-22, published 2018-12-21; 广东工业大学; A method, device and related equipment for measuring the height of the center of gravity of a human body
CN109064511B *: priority 2018-08-22, published 2022-02-15; 广东工业大学; Method and device for measuring height of center of gravity of human body and related equipment
CN109214996A *: priority 2018-08-29, published 2019-01-15; 深圳市元征科技股份有限公司; An image processing method and device
CN109214996B *: priority 2018-08-29, published 2021-11-12; 深圳市元征科技股份有限公司; Image processing method and device
CN110490877A *: priority 2019-07-04, published 2019-11-22; 西安理工大学; Target segmentation method for binocular stereo image pairs based on Graph Cuts

Also Published As

CN102034247B (en): published 2013-01-02

Similar Documents

CN102034247A: Motion capture method for binocular vision image based on background modeling
CN107423698B: A gesture estimation method based on a parallel convolutional neural network
CN113240691A: Medical image segmentation method based on U-shaped network
CN110853026B: Remote sensing image change detection method integrating deep learning and region segmentation
CN108062525B: A deep learning hand detection method based on hand region prediction
CN108388882B: Gesture recognition method based on global-local RGB-D multi-modal features
CN102567727B: Method and device for replacing background target
US9317970B2: Coupled reconstruction of hair and skin
CN112150493B: Semantic-guidance-based screen area detection method in natural scenes
CN112464847B: Human action segmentation method and device in video
CN110555857A: Semantic-edge-dominant high-resolution remote sensing image segmentation method
CN108399361A: A pedestrian detection method based on a convolutional neural network (CNN) and semantic segmentation
CN102184551A: Automatic target tracking method and system combining multi-feature matching and particle filtering
CN103856727A: Multichannel real-time video splicing processing system
CN111462027B: Multi-focus image fusion method based on multi-scale gradients and matting
CN114495170B: Pedestrian re-identification method and system based on local suppression of self-attention
CN114120389B: Method, device, equipment and storage medium for network training and video frame processing
CN109543632A: A deep-network pedestrian detection method guided by shallow feature fusion
CN103440662A: Kinect depth image acquisition method and device
CN110705634B: Heel model identification method and device and storage medium
CN109754440A: A shadow region detection method based on fully convolutional networks and mean shift
CN101930606A: Depth-of-field extension method for image edge detection
CN107025672A: A saliency detection method based on an improved convex hull
CN106951829A: A video salient object detection method based on minimum spanning trees
CN117133032A: Person identification and positioning method based on RGB-D images under face occlusion conditions

Legal Events

C06: Publication
PB01: Publication
C10: Entry into substantive examination
SE01: Entry into force of request for substantive examination
C14: Grant of patent or utility model
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2013-01-02; termination date: 2015-12-23)
EXPY: Termination of patent right or utility model
