CN103425987A - Intelligent wheelchair man-machine interaction method based on double-mixture lip feature extraction - Google Patents

Intelligent wheelchair man-machine interaction method based on double-mixture lip feature extraction

Info

Publication number
CN103425987A
CN103425987A (application CN201310396167.XA)
Authority
CN
China
Prior art keywords
lip
cwt
image
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310396167XA
Other languages
Chinese (zh)
Other versions
CN103425987B (en)
Inventor
张毅
罗元
刘想德
徐晓东
林海波
崔叶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201310396167.XA
Publication of CN103425987A
Application granted
Publication of CN103425987B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a human-computer interaction method for an intelligent wheelchair based on dual-hybrid lip-shape feature extraction, relating to the field of feature extraction and recognition control within lip-shape recognition technology. The method first applies DT_CWT filtering to the lip region. Because the DT_CWT is approximately shift-invariant, the filtered feature values of the same lip shape at different positions within the ROI differ only slightly, which overcomes recognition errors caused by the lips being offset within the ROI. A DCT is then applied to the lip feature vector extracted by the DT_CWT, so that the extracted lip features are concentrated in the larger post-DCT coefficients; the resulting feature vector carries the maximum amount of lip information while simultaneously reducing dimensionality. The method greatly improves the lip-shape recognition rate and the robustness of the lip-shape recognition system.

Description

Human-computer interaction method for an intelligent wheelchair based on dual-hybrid lip-shape feature extraction

Technical Field

The invention relates to the field of lip visual information analysis and recognition control, and in particular to a feature extraction method in a lip-shape processing system.

Background

In today's society the world's population is aging at an ever-increasing rate, and the number of people disabled by disease or accident rises year by year. These factors leave many elderly and disabled people with physical impairments of varying severity; lower-limb motor disorders in particular cause enormous inconvenience and prevent a normal life. Barrier-free (assistive) technology has therefore come into public view and attracted wide attention.

Barrier-free technology uses advanced science and technology to provide effective assistance to the elderly and disabled so that they can reintegrate into society. Human-computer interaction is one of its key research topics. By control mode, human-computer interaction techniques fall into two categories. The first operates through hardware, such as a mouse, keyboard, or joystick; this is easy to operate but unsuitable for people who have lost, or have impairments of, the upper limbs. The second uses pattern recognition to exploit the body's own signals, such as the hands, wrists, head, and brain activity: electronic devices are controlled through speech recognition, gesture recognition, head movement, wrist movement, electromyographic signals, and electroencephalographic (EEG) signals. This style of interaction is contactless, more intuitive, and applicable to a wider range of users, so it has considerable research value and significance.

In daily life most communication between people is spoken. In visual human-computer interaction we can likewise capture lip-movement information with a camera to achieve harmonious, friendly interaction. Using lip visual information to control an intelligent wheelchair is a current research hotspot: for deaf-mute users and for elderly people with slurred speech, it offers a way to "speak" normally to the machine. Because control is exerted through lip movement alone, the user's body can remain still, which is essential for severely disabled users.

As a mobility aid, the intelligent wheelchair mainly serves the elderly and disabled, integrating technologies such as autonomous navigation, obstacle avoidance, and human-computer interaction. A conventional intelligent wheelchair is steered with a manual joystick, which is unsuitable for users with upper-limb impairments and therefore limits the population it can serve. With rapid technological progress, new pattern-recognition-based control techniques have been widely applied to intelligent wheelchairs, including gestures, head movement, electromyographic signals, and EEG-based BCI techniques. Because the mouth moves flexibly and quickly and can form many shapes, lip-shape-based human-computer interaction offers more disabled and elderly users a way to "speak" normally to the machine, and its application prospects in intelligent wheelchairs are broad. Applying lip-shape recognition to an intelligent wheelchair preserves the functions of a conventional wheelchair while allowing motion control through different lip shapes. Research on lip-shape-based intelligent wheelchair systems therefore has important application value and practical significance.

Summary of the Invention

In view of this, the technical problem to be solved by the present invention concerns the lip-shape feature extraction step in the field of lip-shape recognition: a method is proposed that combines the Dual-Tree Complex Wavelet Transform (DT_CWT) with the Discrete Cosine Transform (DCT) to extract lip-shape features.

The purpose of the present invention is achieved as follows.

The intelligent wheelchair human-computer interaction method based on dual-hybrid lip-shape feature extraction provided by the present invention comprises the following steps:

S1: capture an image containing a human face;

S2: preprocess the image and extract the lip image;

S3: extract a lip-shape feature vector from the lip image;

S4: obtain a lip-shape recognition result from the lip-shape feature vector;

S5: generate a control command from the lip-shape recognition result and drive the intelligent wheelchair.
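The S1-S5 loop above can be sketched as follows. This is a hypothetical skeleton: the patent does not specify the capture, classification, or command interfaces, so every callable and the lip-shape-to-command mapping here are illustrative assumptions.

```python
# Assumed mapping from recognized lip shapes to wheelchair commands.
COMMANDS = {"open": "forward", "pout": "backward", "left": "turn_left",
            "right": "turn_right", "closed": "stop"}

def interaction_step(frame, extract_lip_roi, extract_features, classify,
                     send_command):
    """Run one S1-S5 cycle on a captured camera frame."""
    roi = extract_lip_roi(frame)        # S2: preprocessing + lip ROI
    x = extract_features(roi)           # S3: DT_CWT + DCT feature vector
    lip_shape = classify(x)             # S4: lip-shape recognition
    command = COMMANDS.get(lip_shape, "stop")  # unrecognized shape -> safe stop
    send_command(command)               # S5: wireless command to the wheelchair
    return command
```

Defaulting to "stop" on an unrecognized shape is a safety choice for a mobility device, not something the patent specifies.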

Further, extracting the lip-shape feature vector in step S3 specifically comprises the following steps:

S31: apply DT_CWT filtering to the lip image and extract a lip feature vector with the DT_CWT algorithm;

S32: apply a DCT to the lip feature vector to form the lip-shape feature vector and perform feature classification;

S33: convert the feature classification result into a lip-shape recognition result.

Further, the control command is transmitted to the intelligent wheelchair wirelessly.

Further, the specific steps in step S31 of applying DT_CWT filtering to the lip image and extracting the lip feature vector with the DT_CWT algorithm are as follows:

S311: set the lip image as the ROI image and normalize the ROI image;

S312: divide the normalized ROI image into several sub-images;

S313: apply DT_CWT multi-scale two-dimensional filtering to each sub-image, forming a high-frequency coefficient matrix at each scale;

S314: compute the magnitudes of the complex coefficients of the high-frequency coefficient matrices at all scales to form real coefficient matrices;

S315: arrange the real coefficient matrices column by column to form the feature vector X as follows:

X = [V_{1,+15°}^T, V_{1,+45°}^T, ..., V_{4,-45°}^T, V_{4,-15°}^T]^T

where the superscript T denotes transposition, V_{l,θ} denotes the column vector formed by arranging the real matrix at each scale column by column, l denotes the DT_CWT decomposition level, and θ denotes the direction parameter of the DT_CWT.

Further, the specific steps in step S32 of applying a DCT to the lip feature vector to form the lip-shape feature vector and performing feature classification are as follows:

S321: reduce the dimensionality of the lip feature vector with the following formula:

Y = AX,

where X is an N-dimensional feature vector, Y an M-dimensional low-dimensional feature, and A a linear transformation matrix;

S322: select the DCT feature coefficients that satisfy a preset condition, the DCT feature coefficients being computed by the following formula:

x(u,v) = a(u) a(v) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x,y) cos[(2x+1)uπ / 2M] cos[(2y+1)vπ / 2N]

where x(u,v) is the DCT feature coefficient, u = 0,1,2,...,M-1; v = 0,1,2,...,N-1; f(x,y) denotes an image of size M×N; and a(u), a(v) are defined as:

a(u) = √(1/M) for u = 0, and a(u) = √(2/M) for u = 1,2,...,M-1;
a(v) = √(1/N) for v = 0, and a(v) = √(2/N) for v = 1,2,...,N-1

S323: construct the lip feature vector with the Zig-Zag method as follows:

y = [x_0^1, x_1^1, ..., x_{K-1}^1, x_0^2, x_1^2, ..., x_{K-1}^2, ..., x_0^9, x_1^9, ..., x_{K-1}^9]^T,

where K denotes the number of feature coefficients selected by the Zig-Zag scan in each sub-image, and x_n^m denotes the n-th feature coefficient of the m-th sub-image.

Further, in step S1 a camera is used to capture the image containing the human face.

Further, the image preprocessing, lip-shape feature vector extraction, and lip-shape recognition are performed on a notebook computer or single-chip microcomputer serving as the host computer.

Further, the driven intelligent wheelchair serves as the slave device controlled by the host computer.

The advantage of the present invention is that it uses a hybrid DT_CWT and DCT feature extraction method to recognize lip shapes. The lip region is first filtered with the DT_CWT; because the DT_CWT is approximately shift-invariant, the filtered feature values of the same lip shape at different positions within the ROI differ only slightly, overcoming recognition errors caused by the lips being offset within the ROI. A DCT is then applied to the lip feature vector extracted by the DT_CWT, so that the extracted lip features are concentrated in the larger post-DCT coefficients; the feature vector thus carries the maximum amount of lip information while simultaneously reducing dimensionality. The method greatly improves the lip-shape recognition rate and the robustness of the lip-shape recognition system.

Brief Description of the Drawings

To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings, in which:

Figure 1 is the framework of the lip-shape-based intelligent wheelchair control system;

Figure 2 is a block diagram of DCT lip-shape feature extraction;

Figure 3 is the DT_CWT decomposition structure diagram.

Detailed Description

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings; it should be understood that the preferred embodiments only illustrate the invention and are not intended to limit its scope of protection.

Embodiment 1

The dual-hybrid lip-shape feature extraction method in this embodiment refers to lip-shape feature extraction using a combination of the DT_CWT and the DCT. Aimed at the lip-shape feature extraction step in the field of lip-shape recognition technology, the present invention proposes a method that combines the Dual-Tree Complex Wavelet Transform (DT_CWT) with the Discrete Cosine Transform (DCT) to extract lip-shape features.

Because DT_CWT filtering is approximately shift-invariant, the filtered feature values of the same lip shape at different positions within the ROI differ only slightly, overcoming recognition errors caused by the lips being offset within the ROI. A DCT is then applied to the lip feature vector extracted by the DT_CWT, so that the extracted lip features are concentrated in the larger post-DCT coefficients; the feature vector thus carries the maximum amount of lip information while simultaneously reducing dimensionality.

Figure 1 is the framework of the lip-shape-based intelligent wheelchair control system, Figure 2 is a block diagram of DCT lip-shape feature extraction, and Figure 3 is the DT_CWT decomposition structure diagram. As the figure shows, the DT_CWT is realized with two sets of orthogonal, perfectly reconstructing filters (tree a and tree b) that are Hilbert-transform pairs of each other. One filter set (tree a) generates the real part of the transform and the other (tree b) the imaginary part; the DT_CWT output is composed of the outputs of tree a and tree b, where 0000a, 0001a, 0000b, and 0001b denote the fourth-level DT_CWT wavelet coefficients.

As shown in the figures, the intelligent wheelchair human-computer interaction method based on dual-hybrid lip-shape feature extraction provided by the present invention comprises the following steps:

S1: capture an image containing a human face;

S2: preprocess the image and extract the lip image;

S3: extract a lip-shape feature vector from the lip image;

Extracting the lip-shape feature vector in step S3 specifically comprises the following steps:

S31: apply DT_CWT filtering to the lip image and extract a lip feature vector with the DT_CWT algorithm;

S32: apply a DCT to the lip feature vector to form the lip-shape feature vector and perform feature classification;

S33: convert the feature classification result into a lip-shape recognition result.

The specific steps in step S31 of applying DT_CWT filtering to the lip image and extracting the lip feature vector with the DT_CWT algorithm are as follows:

S311: set the lip image as the ROI image and normalize the ROI image;

S312: divide the normalized ROI image into sub-images of size n×n;

S313: apply DT_CWT multi-scale two-dimensional filtering to each sub-image, forming a high-frequency coefficient matrix at each scale;

S314: compute the magnitudes of the complex coefficients of the high-frequency coefficient matrices at all scales to form real coefficient matrices;

S315: arrange the real coefficient matrices column by column to form the feature vector X as follows:

X = [V_{1,+15°}^T, V_{1,+45°}^T, ..., V_{4,-45°}^T, V_{4,-15°}^T]^T

where the superscript T denotes transposition, V_{l,θ} denotes the column vector formed by arranging the real matrix at each scale column by column, l denotes the DT_CWT decomposition level, and θ denotes the direction parameter of the DT_CWT.
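Steps S314-S315 (magnitude computation and column stacking) can be sketched directly. The complex highpass coefficients themselves are assumed to come from some existing DT_CWT implementation (for example the third-party `dtcwt` Python package); synthetic complex arrays with the per-level sizes used later in the text stand in for them here.

```python
import numpy as np

def stack_dtcwt_features(highpasses):
    """S314-S315: take per-level complex DT_CWT highpass arrays of shape
    (H_l, W_l, 6), convert them to real magnitude matrices, flatten each of
    the 6 directional matrices column by column, and concatenate the
    resulting column vectors into one feature vector X."""
    parts = []
    for level in highpasses:                 # l = 1..4
        mags = np.abs(level)                 # complex -> real magnitudes (S314)
        for d in range(mags.shape[2]):       # 6 directions per level
            parts.append(mags[:, :, d].flatten(order="F"))  # column-major
    return np.concatenate(parts)

# Synthetic stand-in for a 4-level decomposition of a 48x48 lip ROI:
# level l has shape (48 / 2**l, 48 / 2**l, 6), matching the sizes in the text.
rng = np.random.default_rng(0)
hp = [rng.normal(size=(48 // 2**l, 48 // 2**l, 6))
      + 1j * rng.normal(size=(48 // 2**l, 48 // 2**l, 6))
      for l in range(1, 5)]
X = stack_dtcwt_features(hp)
```

With these sizes the 24 stacked magnitude matrices yield a 4590-dimensional X, consistent with the dimension discussed in Embodiment 2.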

The specific steps in step S32 of applying a DCT to the lip feature vector to form the lip-shape feature vector and performing feature classification are as follows:

S321: reduce the dimensionality of the lip feature vector with the following formula:

Y = AX,

where X is an N-dimensional feature vector, Y an M-dimensional low-dimensional feature, and A a linear transformation matrix;

S322: select the DCT feature coefficients that satisfy a preset condition.

The DCT feature coefficients satisfying the preset condition are selected with the following formula. For an image f(x,y) of size M×N, where x = 0,1,2,...,M-1 and y = 0,1,2,...,N-1, its two-dimensional DCT is defined as:

x(u,v) = a(u) a(v) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x,y) cos[(2x+1)uπ / 2M] cos[(2y+1)vπ / 2N]

where u = 0,1,2,...,M-1; v = 0,1,2,...,N-1, and

a(u) = √(1/M) for u = 0, and a(u) = √(2/M) for u = 1,2,...,M-1;
a(v) = √(1/N) for v = 0, and a(v) = √(2/N) for v = 1,2,...,N-1,

where x(u,v) is called the DCT coefficient.

S323: construct the lip feature vector with the Zig-Zag method as follows:

y = [x_0^1, x_1^1, ..., x_{K-1}^1, x_0^2, x_1^2, ..., x_{K-1}^2, ..., x_0^9, x_1^9, ..., x_{K-1}^9]^T,

where K denotes the number of feature coefficients selected by the Zig-Zag scan in each sub-image, and x_n^m denotes the n-th feature coefficient of the m-th sub-image.
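Steps S322-S323 can be sketched with an orthonormal 2-D DCT and a Zig-Zag scan. `scipy.fft.dctn` with `norm="ortho"` matches the a(u)a(v)-normalized definition above; the block size and K below are illustrative.

```python
import numpy as np
from scipy.fft import dctn

def zigzag_indices(h, w):
    """Zig-Zag scan order over an h x w coefficient matrix: walk the
    anti-diagonals, alternating direction as in JPEG."""
    return sorted(((i, j) for i in range(h) for j in range(w)),
                  key=lambda ij: (ij[0] + ij[1],
                                  ij[0] if (ij[0] + ij[1]) % 2 else ij[1]))

def dct_zigzag_features(block, k):
    """S322-S323: compute the 2-D DCT coefficients x(u, v) of a sub-image
    and keep the first k coefficients in Zig-Zag order."""
    coeffs = dctn(block, norm="ortho")   # orthonormal DCT-II, as in the text
    return np.array([coeffs[i, j]
                     for i, j in zigzag_indices(*coeffs.shape)[:k]])
```

Because the DCT compacts most of the signal energy into the low-frequency corner, the Zig-Zag prefix keeps the largest-information coefficients: for a constant block only the DC coefficient is nonzero.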

S4: obtain a lip-shape recognition result from the lip-shape feature vector;

S5: generate a control command from the lip-shape recognition result and drive the intelligent wheelchair.

The control command is transmitted to the intelligent wheelchair wirelessly.
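The patent leaves the wireless transport unspecified. As one illustrative stand-in, the recognized command could be sent from the host computer to the wheelchair's slave controller as a plain-text UDP datagram; the address and encoding below are assumptions, not part of the invention.

```python
import socket

def send_wheelchair_command(command, host="192.168.0.10", port=9000):
    """Send one recognized lip-shape command (e.g. 'forward', 'stop') to the
    slave controller as a plain-text UDP datagram. Host, port, and the ASCII
    encoding are hypothetical; the patent only states transmission is wireless."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(command.encode("ascii"), (host, port))
```

UDP keeps each command a single self-contained datagram, which suits a stream of frequently refreshed motion commands where a lost packet is simply superseded by the next one.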

In step S1 a camera is used to capture the image containing the human face.

The image preprocessing, lip-shape feature vector extraction, and lip-shape recognition are performed on a notebook computer or single-chip microcomputer serving as the host computer.

The driven intelligent wheelchair serves as the slave device controlled by the host computer.

Embodiment 2

This embodiment differs from Embodiment 1 only in the following:

In this embodiment DT_CWT filtering is first applied to the lips. Because the DT_CWT is approximately shift-invariant, the filtered feature values of the same lip shape at different positions within the ROI differ only slightly, overcoming recognition errors caused by the lips being offset within the ROI. A DCT is then applied to the lip feature vector extracted by the DT_CWT, so that the extracted lip features are concentrated in the larger post-DCT coefficients; the feature vector thus carries the maximum amount of lip information while simultaneously reducing dimensionality.

According to the DT_CWT decomposition principle, after the transform an image produces at each level six high-frequency sub-band matrices, one for each direction θ ∈ {+15°, +45°, +75°, -75°, -45°, -15°}, together with one low-frequency sub-band matrix. The low-frequency sub-band matrix is the initial input to the next level of decomposition, while the high-frequency sub-band matrices contain the coefficients of the texture features in the six directions. Studies have shown that high-frequency coefficients matter more than low-frequency coefficients in target recognition, and that low-frequency coefficients characterize the illumination of the image and interfere with recognition; therefore only the high-frequency coefficients in the six directions at each level are normally used when constructing the feature vector.

If an M×N lip image is decomposed over L levels, 6×L high-frequency coefficient matrices are obtained. The six high-frequency matrices of the first level all have dimensions M/2 × N/2, i.e. half those of the original image, and by extension each high-frequency matrix at a given level has half the dimensions of the level above.

In this embodiment the lip region of interest is first normalized to 48×48 and four-level two-dimensional DT_CWT filtering is applied to it. Filtering the lip image produces four scales, with a high-frequency coefficient matrix in each of six directions at every scale, so the image features comprise 24 high-frequency coefficient matrices in total. The high-frequency coefficient matrices of the first, second, third, and fourth levels have sizes 24×24, 12×12, 6×6, and 3×3 respectively. Because the coefficients produced by the DT_CWT are complex, the magnitude of every coefficient matrix is computed, turning each complex matrix into a real matrix. Each real matrix is then arranged column by column into a single column vector, denoted V_{l,θ}, where l and θ respectively denote the DT_CWT decomposition level and direction parameter, with l ∈ {1,...,4} and θ ∈ {+15°, +45°, +75°, -75°, -45°, -15°}. The feature vector X of the lip image after the four-level DT_CWT is formed by combining the column vectors corresponding to the 24 magnitude matrices, and can be expressed as equation (6):

X = [V_{1,+15°}^T, V_{1,+45°}^T, ..., V_{4,-45°}^T, V_{4,-15°}^T]^T    (6)

where the superscript T denotes transposition. From equation (6), the dimension of the feature vector X is the total number of coefficients produced by the six directional DT_CWT filters at each of the four decomposition levels:

6 × (24×24 + 12×12 + 6×6 + 3×3) = 4590.

So large a dimension would place a heavy burden on computation and recognition speed. After the DT_CWT filtering, the feature matrix X is therefore DCT-transformed using equation (1) to extract the DCT coefficients of X carrying the most lip information, and the Zig-Zag method is then used to select the first 81 larger DCT coefficients to construct the final feature vector y. In this way y holds the maximum amount of lip information, the image signal loses as little as possible, and the dimension is reduced at the same time, improving the lip-shape recognition rate.
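The dimension bookkeeping in this paragraph can be checked directly; the 81 retained coefficients are the figure given in the text.

```python
# Per-level highpass matrix sizes for a 48x48 ROI over 4 DT_CWT levels,
# with 6 directional matrices per level, as described above.
sizes = [48 // 2 ** level for level in range(1, 5)]   # [24, 12, 6, 3]
dim_X = 6 * sum(s * s for s in sizes)                 # full DT_CWT vector length
dim_y = 81                                            # coefficients kept by Zig-Zag
print(sizes, dim_X, dim_y)                            # [24, 12, 6, 3] 4590 81
```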

The above are only preferred embodiments of the present invention and are not intended to limit it. Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the invention is intended to encompass them as well.

Claims (8)

Translated fromChinese
1. An intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction, characterized by comprising the following steps:
S1: capturing an image containing a human face;
S2: preprocessing the image and extracting a lip image from it;
S3: extracting a lip-shape feature vector from the lip image;
S4: obtaining a lip-shape recognition result from the lip-shape feature vector;
S5: generating a control command according to the lip-shape recognition result and driving the intelligent wheelchair accordingly.

2. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 1, characterized in that extracting the lip-shape feature vector in step S3 specifically comprises the following steps:
S31: applying DT_CWT filtering to the lip image and extracting a lip feature vector by means of the DT_CWT algorithm;
S32: applying a DCT transform to the lip feature vector to form the lip-shape feature vector and performing feature classification;
S33: converting the feature-classification result into a lip-shape recognition result.

3. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 1, characterized in that the control command is transmitted to the intelligent wheelchair by wireless transmission.

4. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 2, characterized in that applying DT_CWT filtering to the lip image and extracting the lip feature vector by means of the DT_CWT algorithm in step S31 specifically comprises the following steps:
S311: setting the lip image as the ROI image and normalizing the ROI image;
S312: dividing the normalized ROI image into several sub-images;
S313: applying DT_CWT multi-scale two-dimensional filtering to each sub-image to form a high-frequency coefficient matrix at each scale;
S314: computing the magnitudes of the complex coefficients of the high-frequency coefficient matrices at all scales to form real coefficient matrices;
S315: arranging the real coefficient matrices column by column to form the feature vector X as follows:

X = [V_{1,θ}^T, V_{2,θ}^T, ..., V_{l,θ}^T]^T

where the superscript T denotes the transpose operation, V_{l,θ} denotes the column vector formed by arranging the columns of the real coefficient matrix at each scale in sequence, l denotes the number of decomposition levels of the DT_CWT transform, and θ denotes the direction parameter of the DT_CWT transform.

5. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 2, characterized in that applying the DCT transform to the lip feature vector to form the lip-shape feature vector and performing feature classification in step S32 specifically comprises the following steps:
S321: reducing the dimensionality of the lip feature vector by the formula

Y = AX,

where X denotes the N-dimensional feature vector, Y denotes the M-dimensional low-dimensional feature, and A denotes the linear transformation matrix;
S322: selecting the DCT feature coefficients that satisfy a preset condition, the DCT feature coefficients being computed by the formula

x(u,v) = a(u) a(v) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x,y) cos[(2x+1)uπ/(2M)] cos[(2y+1)vπ/(2N)]

where x(u,v) denotes the DCT feature coefficient; u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; f(x,y) denotes an image of size M×N; and a(u), a(v) are respectively defined as

a(u) = √(1/M) for u = 0, and a(u) = √(2/M) for u = 1, 2, ..., M-1;
a(v) = √(1/N) for v = 0, and a(v) = √(2/N) for v = 1, 2, ..., N-1;

S323: constructing the lip feature vector by the Zig-Zag method as follows:

y = [x_0^1, x_1^1, ..., x_{K-1}^1, x_0^2, x_1^2, ..., x_{K-1}^2, ..., x_0^9, x_1^9, ..., x_{K-1}^9]^T,

where K denotes the number of feature coefficients selected by the Zig-Zag scan in each sub-image, and x_n^m denotes the n-th feature coefficient of the m-th sub-image.

6. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 1, characterized in that a camera is used in step S1 to capture the image containing the human face.

7. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 1, characterized in that the image preprocessing, the extraction of the lip-shape feature vector, and the obtaining of the lip-shape recognition result are performed by a notebook computer or a single-chip microcomputer serving as the host computer.

8. The intelligent wheelchair human-computer interaction method based on dual-mixture lip feature extraction according to claim 7, characterized in that the intelligent wheelchair serves as the slave machine controlled by the host computer.
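For illustration only (not part of the claims), steps S314 and S315 of claim 4 can be sketched as follows. The complex high-frequency subband matrices are assumed to come from an external DT_CWT implementation (for example, the third-party `dtcwt` package); the small hand-made complex matrices below merely stand in for them, and the helper names are hypothetical:

```python
# Sketch of claim 4, steps S314-S315 (hypothetical helper names):
# take magnitudes of complex subband coefficients, then stack columns into X.

def magnitude_matrix(subband):
    """S314: replace each complex coefficient a+bi with its magnitude |a+bi|."""
    return [[abs(c) for c in row] for row in subband]

def stack_columns(matrix):
    """Stack the columns of a matrix into one list (the column vector V_{l,theta})."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]

def feature_vector(subbands):
    """S315: concatenate the per-subband column vectors into the feature vector X."""
    X = []
    for sb in subbands:
        X.extend(stack_columns(magnitude_matrix(sb)))
    return X
```

For a single 2×2 subband [[3+4j, 0], [0, 1j]] the magnitudes are 5, 0, 0, 1, and stacking them in column order yields the real-valued vector X of step S315.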
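Likewise for illustration only, steps S322 and S323 of claim 5 can be sketched in pure Python: the 2-D DCT computed term by term from the formula in the claim, and a Zig-Zag scan that keeps the first K coefficients of a block. The function names `dct2` and `zigzag_topk` are hypothetical:

```python
# Sketch of claim 5, steps S322-S323 (hypothetical helper names).
import math

def dct2(f):
    """2-D DCT of an M x N block f, following the x(u,v) formula of step S322."""
    M, N = len(f), len(f[0])

    def a(k, L):
        # a(0) = sqrt(1/L); a(k) = sqrt(2/L) for k >= 1
        return math.sqrt(1.0 / L) if k == 0 else math.sqrt(2.0 / L)

    out = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            s = 0.0
            for x in range(M):
                for y in range(N):
                    s += (f[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * M))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = a(u, M) * a(v, N) * s
    return out

def zigzag_topk(coeffs, K):
    """S323: the first K coefficients of a block in Zig-Zag (anti-diagonal) order."""
    M, N = len(coeffs), len(coeffs[0])
    order = sorted(
        ((u, v) for u in range(M) for v in range(N)),
        # sort by anti-diagonal u+v; alternate the walk direction per diagonal
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [coeffs[u][v] for u, v in order[:K]]
```

For the 2×2 block [[1, 2], [3, 4]] the DCT coefficients are approximately [[5, -1], [-2, 0]], so `zigzag_topk(dct2(block), 3)` keeps 5, -1, -2; per claim 5, one such length-K vector would be taken from each of the 9 sub-images and concatenated into y.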
CN201310396167.XA | 2013-09-03 | 2013-09-03 | Intelligent wheelchair man-machine interaction method based on double-mixture lip feature extraction | Active | granted as CN103425987B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201310396167.XA (granted as CN103425987B) | 2013-09-03 | 2013-09-03 | Intelligent wheelchair man-machine interaction method based on double-mixture lip feature extraction


Publications (2)

Publication Number | Publication Date
CN103425987A | 2013-12-04
CN103425987B | 2016-09-28

Family

ID=49650697

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201310396167.XA (Active; granted as CN103425987B) | Intelligent wheelchair man-machine interaction method based on double-mixture lip feature extraction | 2013-09-03 | 2013-09-03

Country Status (1)

Country | Link
CN | CN103425987B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104331160A (en)* | 2014-10-30 | 2015-02-04 | Chongqing University of Posts and Telecommunications | Lip state recognition-based intelligent wheelchair human-computer interaction system and method
CN106203235A (en)* | 2015-04-30 | 2016-12-07 | Tencent Technology (Shenzhen) Co., Ltd. | Living-body discrimination method and device
CN109492692A (en)* | 2018-11-07 | 2019-03-19 | Beijing Knownsec Information Technology Co., Ltd. | Webpage backdoor detection method, device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102048621A (en)* | 2010-12-31 | 2011-05-11 | Chongqing University of Posts and Telecommunications | Human-computer interaction system and method of intelligent wheelchair based on head posture
CN102319155A (en)* | 2011-05-30 | 2012-01-18 | Chongqing University of Posts and Telecommunications | Method for controlling intelligent wheelchair based on lip detecting and tracking


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANG Yi et al.: "Human-computer interaction for intelligent wheelchairs based on lip shape", Control Engineering of China *
LIANG Yaling, DU Minghui: "Lip feature extraction method based on DT-CWT and PCA", Video Application and Engineering *


Also Published As

Publication number | Publication date
CN103425987B (en) | 2016-09-28

Similar Documents

Publication | Title
Rabhi et al. | A facial expression controlled wheelchair for people with disabilities
CN104463100B (en) | Intelligent wheel chair man-machine interactive system and method based on human facial expression recognition pattern
CN103473294B (en) | MSVM (multi-class support vector machine) electroencephalogram feature classification based method and intelligent wheelchair system
CN110013248A (en) | EEG tensor pattern recognition technology and brain-computer interaction rehabilitation system
CN109620223A (en) | A kind of rehabilitation of stroke patients system brain-computer interface key technology method
CN109366508A (en) | A kind of advanced machine arm control system and its implementation based on BCI
CN109924977A (en) | A kind of surface electromyogram signal classification method based on CNN and LSTM
Meng et al. | A motor imagery EEG signal classification algorithm based on recurrence plot convolution neural network
CN107908288A (en) | A kind of quick human motion recognition method towards human-computer interaction
Zhang et al. | Research on sEMG-based gesture recognition by dual-view deep learning
CN114082169B (en) | Disabled hand soft body rehabilitation robot motor imagery identification method based on electroencephalogram signals
CN103903011A (en) | Intelligent wheelchair gesture recognition control method based on image depth information
CN103735262A (en) | Dual-tree complex wavelet and common spatial pattern combined electroencephalogram characteristic extraction method
CN103425987B (en) | Intelligent wheelchair man-machine interaction method based on double-mixture lip feature extraction
Narayan et al. | Pattern recognition of sEMG signals using DWT based feature and SVM Classifier
Wang et al. | Recognition of semg hand actions based on cloud adaptive quantum chaos ions motion algorithm optimized svm
Shen et al. | ICA-CNN: Gesture recognition using CNN with improved channel attention mechanism and multimodal signals
CN114861731A (en) | Cross-scene universal electromyographic pattern recognition method
CN116844189A (en) | Detection methods and applications of human body part anchor frames and acupuncture points
CN108805067A (en) | Surface electromyogram signal gesture identification method
CN102930250B (en) | A kind of action identification method of multi-scale random field models
CN107292329B (en) | Event idea classification method based on CI-CSP algorithm
CN107272893B (en) | Human-computer interaction system and method based on gesture control of non-touch screen
CN103778439B (en) | Human body contour outline reconstructing method based on dynamic space-time information excavating
CN109491504A (en) | A kind of projection interactive system based on gesture electromyography signal

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C14 | Grant of patent or utility model
GR01 | Patent grant
OL01 | Intention to license declared
