CN111428871B - A sign language translation method based on BP neural network - Google Patents

A sign language translation method based on BP neural network

Info

Publication number
CN111428871B
CN111428871B (application CN202010243856.7A, also published as CN111428871A)
Authority
CN
China
Prior art keywords
sign language
neural network
gesture
layer
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010243856.7A
Other languages
Chinese (zh)
Other versions
CN111428871A (en)
Inventor
谢张宁
朱惠臣
孙晓光
吴俊杰
李智玮
傅云霞
雷李华
孔明
管钰晴
刘娜
王道档
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Jiliang University
Shanghai Institute of Measurement and Testing Technology
Original Assignee
China Jiliang University
Shanghai Institute of Measurement and Testing Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Jiliang University and Shanghai Institute of Measurement and Testing Technology
Priority to CN202010243856.7A
Publication of CN111428871A
Application granted
Publication of CN111428871B
Legal status: Active (current)
Anticipated expiration


Abstract

The invention relates to a sign language translation method based on a BP neural network, comprising the following steps: 1. a Raspberry Pi 3B collects gesture voltage signals through wearable data gloves; 2. a signal screening program compiles the sign language words and common sign language sentences corresponding to each group of gesture voltage signals into a sign language sentence library; 3. a neural network classification program is written, comprising a BP neural network structural framework model, a data transmission module and a storage module, where the BP neural network structural framework model adopts a three-layer neural network comprising an input layer, an output layer and a hidden layer; 4. each received group of gesture voltage signals is converted into sign language words through the BP neural network framework model; 5. the sign language words obtained in step 4 within a period of time are converted into sign language word groups, which are matched against the sign language sentence library and associatively filled into sentences as the output result. The invention realizes automatic real-time translation and recognition of sign language by combining a neural network with sensing technology.

Description

A sign language translation method based on BP neural network

Technical Field

The invention relates to a sign language translation method, and in particular discloses a sign language translation method based on a BP neural network, which combines a neural network with sensing technology to realize automatic sign language translation and recognition.

Background Art

In today's society, because hearing people cannot understand sign language, a communication gap exists between deaf-mute people and hearing people, which limits the social circle of deaf-mute people and greatly restricts their living and development space. Two kinds of assistive devices for deaf-mute people currently exist on the market. One is the electronic larynx, dating back to the 1950s: worn at the throat, it senses the vibration of the vocal cords and amplifies it to assist vocalization. However, the materials used for vocalization are expensive, and disabled people without social-security coverage generally cannot afford them. The other is computer-vision-based sign language translation equipment that has appeared in recent years. Such equipment is inexpensive, but body-movement recognition technology is still in its infancy, and image processing places strict requirements on the acquisition environment.

A neural network is a computational model composed of a large number of interconnected nodes (or neurons). Each node represents a specific output function, called an activation function. Each connection between two nodes carries a weighted value for the signal passing through it, called a weight; it is in this way that a neural network simulates human memory. The output of the network depends on its structure, its connection pattern, the weights and the activation functions, and the weights of each layer constitute the model that must be stored. Neural networks are widely used in machine learning, for example in function approximation, pattern recognition, classification, data compression and data mining. Using a neural network to build a nonlinear data classification model is therefore a good approach.

Summary of the Invention

The purpose of the present invention is to overcome the deficiencies of the prior art by designing a sign language translation method based on a BP (back propagation) neural network. Sensing technology is used to collect gesture voltage signals as the input of the BP neural network, and the computation and weighting between neurons convert the input voltage signals into the corresponding semantic output. This demonstrates the feasibility of building a sign language translation system based on gesture voltages and a BP neural network, facilitates communication between disabled and hearing people, shortens the distance between them, and allows deaf-mute people to integrate better into society.

The present invention is realized as follows: a sign language translation method based on a BP neural network, characterized in that the sign language translation method comprises the following steps:

Step 1: A Raspberry Pi 3B single-board computer collects gesture voltage signals through the flexible sensors and acceleration sensors mounted on wearable data gloves; the gesture voltage signals are filtered, amplified, and transmitted via its integrated Bluetooth module to a memory for storage.

The flexible sensors on the wearable data gloves in step 1 are strain gauges fixed at the 10 finger positions. The gesture voltage signal is characterized by the degree of finger bending followed by the strain gauges, together with the relative positions of two three-axis acceleration sensors fixed respectively on the backs of the left and right hands. Collecting the gesture voltage signal means collecting 10 finger-bending signals and 6 gesture-direction signals, 16 signals in total.

Step 2: A signal screening program compiles the sign language words and common sign language sentences corresponding to each group of signals into a sign language library to form a sign language sentence library, and the gesture voltage signals collected over many sessions and the corresponding sign language words are divided 7:3 into a training set and a test set.

In step 2, the sign language sentence library comprises a sign language word library and a sign language sentence library; it is made by first recording the 16 currently received gesture voltage signals in Excel, normalizing and organizing them, and then saving them to an Access database.

Step 3: A program is written to build the BP neural network structural framework model. The program mainly comprises three modules: the neural network structural framework model, a data transmission module and a storage module. The BP neural network structural framework model is trained on the training set described in step 2, and the trained model is then run on the test set; once the test results meet expectations, the model is stored in the storage module. The BP neural network structural framework model adopts a three-layer neural network with an input layer, an output layer and a hidden layer.

In step 3, the three-layer neural network of the BP neural network structural framework model has 16 neurons in the input layer, 64 neurons in the middle (hidden) layer, and 18 neurons in the output layer. Transmission is divided into two segments of 8 voltage signals each, 16 voltage signals in total. The output indices of the 18 output-layer neurons run from 0 to 17 and correspond in order to 18 common word groups, which combine randomly into 53 common phrases.

Step 4: Convert each collected group of gesture voltage signals into sign language words through the BP neural network framework model, comprising the following steps:

Step 4.1: Receive the gesture voltage signals of the wearable data gloves, and use the signal screening program to select complete signals;

Step 4.2: Convert the gesture voltage signals into words through the trained BP neural network structural framework model.

Step 5: Convert the sign language words obtained in step 4 from gesture voltage signal conversion over a period of time into a sign language word group; match the word group against the sign language sentence library and associatively fill it into a sentence as the output result, comprising the following steps:

Step 5.1: Segment the sentences in the sign language sentence library (or the collected word groups) into words and compute statistics; words that occur frequently in a sentence and carry symbolic meaning are assigned element 1, and the remaining words element 0;

Step 5.2: Convert all common sign language sentences in the sign language sentence library into the word-frequency-vector format specified in step 5.1, generating the corresponding word frequency vectors;

Step 5.3: Convert the sign language words obtained from the gesture voltage signals collected in step 4 over a period of time into a sign language word group, and convert this word group into the word-frequency-vector format of step 5.1 to obtain the corresponding word frequency vector;

Step 5.4: Compute the cosine similarity between the word frequency vector from step 5.3 and the word frequency vectors of the sign language sentence library from step 5.2, and select the sign language words of the library entry with the largest cosine similarity as the output words;

Step 5.5: Match the output words obtained in step 5.4 to the corresponding written-language sentence according to the index of all common sign language sentences in the sign language sentence library, and take the matched written-language sentence as the final output result.

The beneficial effects of the present invention are as follows. According to statistics, there are more than 20 million people with hearing and speech impairments in China, and their communication with hearing people is obstructed; poor communication is an important reason why deaf-mute people cannot work normally. By connecting audio and/or video playback devices to the Raspberry Pi 3B single-board computer, the method of the present invention can convert gestures into normal sentences output as audio or video, with high translation accuracy and fast response. It greatly facilitates communication between disabled and hearing people and meets the urgent need of deaf-mute people to communicate with the hearing population. The hardware used by the method is inexpensive, can quickly and effectively remove the communication barriers encountered by deaf-mute people, opens up more employment opportunities for them, and helps them integrate better into society and improve their standard of living.

Brief Description of the Drawings

Figure 1 is a schematic block diagram of the working steps of the method of the present invention.

Figure 2 is a schematic structural diagram of a single wearable data glove used to collect gesture voltage signals in the method of the present invention.

Figure 3 is a schematic diagram of the working principle and workflow of the signal screening program of the method of the present invention.

Figure 4 is a schematic diagram of the BP neural network structural framework model of the method of the present invention.

Figure 5 is a schematic diagram of the processing flow in which the words generated by the BP neural network conversion are matched, associatively filled into sentences and output, in the method of the present invention.

Detailed Description of the Embodiments

According to Figure 1, the present invention is a sign language translation method based on a BP neural network, comprising the following steps:

Step 1: A Raspberry Pi 3B single-board computer collects gesture voltage signals through the flexible sensors and acceleration sensors on the wearable data gloves; the signals are filtered, amplified, and transmitted via its integrated Bluetooth module to a memory for storage.

Step 2: A signal screening program compiles the sign language words and common sign language sentences corresponding to each group of signals into a sign language library to form a sign language sentence library, and the gesture voltage signals collected over many sessions and the corresponding sign language words are divided 7:3 into a training set and a test set;

Step 2.1: Write sign language library recording software in C#; the software records each received gesture voltage signal together with the semantics it represents in Excel;

Step 2.2: Use the signal screening program to check whether each received signal group is complete; complete groups are recorded in the Excel sheet, incomplete ones are discarded;

Step 2.3: Divide the gesture voltage signals collected over many sessions and the corresponding sign language words into a training set and a test set at a ratio of 7:3;

Step 2.4: Normalize and organize the sign language words and common sign language sentences corresponding to the gesture voltage signals recorded in the above Excel sheet, then import them into the Access database to form the sign language sentence library.

Step 3: A program is written to build the BP neural network structural framework model. The program mainly comprises three modules: the neural network structural framework model, a data transmission module and a storage module. The BP neural network structural framework model is trained on the training set described in step 2, and the trained model is then run on the test set; once the test results meet expectations, the model is stored in the storage module. The BP neural network structural framework model adopts a three-layer neural network with an input layer, an output layer and a hidden layer.

Step 4: Convert each collected group of gesture voltage signals into sign language words through the BP neural network framework model;

Step 4.1: Receive the gesture voltage signals of the wearable data gloves and use the signal screening program to select complete signals;

Step 4.2: Use the trained BP neural network structural framework model to convert the gesture voltage signals into sign language words.

Step 5: Convert the sign language words obtained in step 4 from gesture voltage signal conversion over a period of time into a sign language word group, match the word group against the sign language sentence library and associatively fill it into a sentence as the output result;

Step 5.1: Segment the sentences in the sign language sentence library (or the collected word groups) into words and compute statistics; words that occur frequently in a sentence and carry symbolic meaning are assigned element 1, and the remaining words element 0;

Step 5.2: Convert all common sign language sentences in the sign language sentence library into the word-frequency-vector format specified in step 5.1, generating the corresponding word frequency vectors;

Step 5.3: Convert the sign language words obtained from the gesture voltage signals collected in step 4 over a period of time into a sign language word group, and convert this word group into the word-frequency-vector format of step 5.1 to obtain the corresponding word frequency vector;

Step 5.4: Compute the cosine similarity between the word frequency vector from step 5.3 and the word frequency vectors of the sign language sentence library from step 5.2, and select the sign language words of the library entry with the largest cosine similarity as the output words;

Step 5.5: Match the output words obtained in step 5.4 to the corresponding written-language sentence according to the index of all common sign language sentences in the sign language sentence library, and take the matched written-language sentence as the final output result.

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

The specific working steps of the sign language translation method based on a BP neural network of the present invention are as follows:

Step 1: Connect the Raspberry Pi 3B single-board computer in series with the flexible sensors, acceleration sensors and constant resistors on the wearable data gloves. The flexible sensors and acceleration sensors change their voltage according to the bending of the fingers and the relative motion of the hands; the collected gesture voltage signals are filtered, amplified, and transmitted by the Bluetooth module of the Raspberry Pi 3B to a memory for storage.

According to Figure 2, the flexible sensors on the wearable data gloves in step 1 are strain gauges fixed at the 10 finger positions, and the acceleration sensors are two three-axis acceleration sensors fixed on the backs of the left and right hands. Each strain gauge outputs one voltage value, and each three-axis acceleration sensor outputs three voltage values (x, y, z). Each transmission, including the start signal Num_i and the end signal Num_o, comprises 18 signals in sequence: Num_i, X_1, X_2, ..., X_16, Num_o, where X_1, X_2, ..., X_16 represent the voltage values of the gesture.

Step 2: Compile the sign language words and common sign language sentences corresponding to each collected group of gesture voltage signals into the sign language library to form the sign language sentence library, and divide it 7:3 into a training set and a test set.

Step 2.1: The sign language library recording software is written in C#; it records each received gesture voltage signal and the semantics the signal represents in an Excel sheet.

Step 2.2: Figure 3 shows the working principle and steps of the signal screening program, which checks whether each received signal group is complete; complete groups are recorded in the Excel sheet, incomplete ones are discarded. The program receives the gesture voltage signals, starts counting when it encounters the start signal Num_i, and stops counting when it encounters the stop signal Num_o. When the count K equals 16, the transmission is complete and the data are recorded in the Excel sheet; otherwise the group is discarded and counting restarts.
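As an illustration, this screening logic can be sketched in Python, the language the patent names for the recognition program. The marker values and the stream interface are hypothetical placeholders; the patent only specifies that a frame is accepted when exactly K = 16 voltage values arrive between the start and stop signals:

```python
# Minimal sketch of the signal screening program (Figure 3).
# Assumption: the stream yields tokens in the order Num_i, X1 ... X16, Num_o;
# NUM_I / NUM_O are hypothetical marker values.
NUM_I = "NUM_I"   # start-of-frame marker (placeholder)
NUM_O = "NUM_O"   # end-of-frame marker (placeholder)

def screen_frames(stream):
    """Yield only complete 16-value frames; discard incomplete ones."""
    frame, counting = [], False
    for token in stream:
        if token == NUM_I:                   # start signal: begin counting
            frame, counting = [], True
        elif token == NUM_O:                 # stop signal: check the count K
            if counting and len(frame) == 16:
                yield frame                  # K == 16: record the frame
            frame, counting = [], False      # otherwise discard and restart
        elif counting:
            frame.append(float(token))       # a voltage value X_k
```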

Step 2.3: Divide the gesture voltage signals collected over many sessions and the corresponding sign language words into a training set and a test set at a ratio of 7:3.

Step 2.4: Normalize and organize the gesture voltage signals, corresponding sign language words and common sign language sentences in the Excel sheet, then import them into the Access database.
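A minimal sketch of steps 2.3 and 2.4 with Pandas, which the patent names for data import. The file name, the column layout (16 voltage columns followed by a word column, matching the library layout described later) and min-max normalization are assumptions for illustration:

```python
import pandas as pd

# Assumption: 'sign_library.xlsx' holds columns V1..V16 (voltages) and 'word'.
df = pd.read_excel("sign_library.xlsx")

# Min-max normalize each voltage column to [0, 1] (one plausible reading
# of the normalization the patent describes).
volt = df.iloc[:, :16]
df.iloc[:, :16] = (volt - volt.min()) / (volt.max() - volt.min())

# Divide 7:3 into a training set and a test set.
train = df.sample(frac=0.7, random_state=0)
test = df.drop(train.index)
```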

Step 3: Write the program that builds the BP neural network structural framework model; the program mainly comprises three modules: the neural network structural framework model, a data transmission module and a storage module. Train the BP neural network structural framework model on the training set described in step 2, run the trained model on the test set, and once the test results meet expectations store the model in the storage module. The BP neural network structural framework model adopts a three-layer neural network with an input layer, an output layer and a hidden layer.

Each time a gesture is made, the wearable data glove equipped with flexible sensors transmits 16 sensor values, from which a model is built. One approach is to build an index library that maps different groups of voltage values to gestures. However, Chinese sign language contains many gestures, and building such a grammar library would take a long time. Moreover, the voltage value groups produced by different people making the same gesture are not identical; as the number of users grows, the set of voltage groups corresponding to each gesture grows as well, which increases lookup time. Furthermore, owing to limited device sensitivity, the voltage value groups of different gestures sometimes differ only slightly, which limits accurate recognition. To achieve fast and accurate recognition, the present invention uses a BP neural network from machine learning to build the classification model for the voltage groups.

The neural network classification program for gesture recognition in the present invention consists of three parts: the BP neural network algorithm part, the model prediction part and the data transmission part.

The BP neural network algorithm part comprises forward propagation, back propagation, model training and evaluation, and model storage. The gesture recognition neural network of the present invention is written in Python; the advantage of building the neural network in Python is that the number of neural units, the number of layers and the activation function can be modified conveniently, and large amounts of data can be computed quickly. The programming environment is PyCharm, and the libraries used are NumPy, Pandas and SciPy: NumPy for writing the BP neural network algorithm, Pandas for data import, and SciPy for output storage.

The framework of the BP neural network algorithm part is a BP neural network structural framework model composed of a three-layer neural network with an input layer, a hidden layer and an output layer. The input layer has 16 input units and the hidden layer has 64 units. The output layer covers 18 common word groups; each output index from 0 to 17 corresponds in order to one of the 18 common word groups, and random combinations of the common word groups form some 50 common phrases. The correspondence between common word groups and indices is shown in the table below.

Table 1: Correspondence between some common word groups and their indices

[Table 1 is rendered as an image in the original publication; its contents are not reproduced here.]

Since there are 16 sensor values as input, the input layer is given 16 neuron units; each data group $X_1, X_2, \ldots, X_{16}$ is normalized and used as the input layer. Let the normalized input be $x_1, x_2, \ldots, x_{16}$, and let $W^{(1)}_{ij}$ be the weight parameter on the connection between the $j$-th neuron unit of the first layer and the $i$-th neuron unit of the second layer. Because the sign language library contains many samples, the hidden layer is given 64 neuron units to avoid overfitting, so $W^{(1)} \in \mathbb{R}^{64 \times 16}$. The bias term of the $i$-th unit in the second layer is $b^{(2)}_i$, and the node activation function is the Sigmoid function:

$$f(z) = \frac{1}{1 + e^{-z}}$$

The output of each neuron node in the second layer is then:

$$a^{(2)}_i = f\!\left(\sum_{j=1}^{16} W^{(1)}_{ij}\, x_j + b^{(2)}_i\right), \qquad i = 1, 2, \ldots, 64$$

Let the weighted input sum of the $i$-th unit in the second layer be $z^{(2)}_i$, so that $a^{(2)}_i = f(z^{(2)}_i)$, and let the weighted input sum of the $k$-th unit in the third layer be $z^{(3)}_k$. Let $W^{(2)}_{ki}$ be the weight parameter on the connection between the $i$-th neuron unit of the second layer and the $k$-th neuron unit of the third layer, and let $b^{(3)}_k$ be the bias term of the $k$-th unit in the third layer. Since the output layer is set to 18 outputs, $W^{(2)} \in \mathbb{R}^{18 \times 64}$. The third-layer output is $a^{(3)}$, the node activation function is again the Sigmoid function, and the BP neural network output is $h_{W,b}(x)$. By the same reasoning:

$$z^{(3)}_k = \sum_{i=1}^{64} W^{(2)}_{ki}\, a^{(2)}_i + b^{(3)}_k, \qquad a^{(3)}_k = f(z^{(3)}_k), \qquad k = 1, 2, \ldots, 18$$

$$h_{W,b}(x) = a^{(3)}$$

After each forward propagation produces the third-layer output $a^{(3)}$, the result must be corrected. Since the activation function used is the Sigmoid function, differentiating it ($f'(z) = f(z)\,(1 - f(z))$) and substituting the difference between $a^{(3)}$ and the target $Y$ gives the error term $\delta^{(3)}_k$ of the output layer (the third layer) and the error term $\delta^{(2)}_i$ of each neuron unit in the hidden layer (the second layer):

$$\delta^{(3)}_k = \left(a^{(3)}_k - Y_k\right) f'(z^{(3)}_k)$$

$$\delta^{(2)}_i = \left(\sum_{k=1}^{18} W^{(2)}_{ki}\, \delta^{(3)}_k\right) f'(z^{(2)}_i)$$

Finally, the weight on each connection is updated. Let $\eta$ be the learning rate constant; the connection weights $W^{(2)}_{ki}$ between the second and third layers and $W^{(1)}_{ij}$ between the first and second layers are updated as:

$$W^{(2)}_{ki} \leftarrow W^{(2)}_{ki} - \eta\, \delta^{(3)}_k\, a^{(2)}_i$$

$$W^{(1)}_{ij} \leftarrow W^{(1)}_{ij} - \eta\, \delta^{(2)}_i\, x_j$$
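Continuing the sketch above, one training step implementing these error terms and updates might look as follows; the learning-rate value and the bias updates (standard BP practice, not written out in the patent text) are assumptions:

```python
ETA = 0.1  # learning rate constant eta (value assumed)

def train_step(x, y):
    """One forward + backward pass; y is the 18-element one-hot target."""
    global W1, b2, W2, b3
    z2, a2, z3, a3 = forward(x)             # forward() from the sketch above
    d3 = (a3 - y) * a3 * (1.0 - a3)         # output-layer error, f'(z) = f(z)(1 - f(z))
    d2 = (W2.T @ d3) * a2 * (1.0 - a2)      # hidden-layer error
    W2 -= ETA * np.outer(d3, a2)            # update second->third layer weights
    W1 -= ETA * np.outer(d2, x)             # update first->second layer weights
    b3 -= ETA * d3                          # bias updates: standard practice,
    b2 -= ETA * d2                          # not written out in the patent
```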

The BP neural network structural framework model of the present invention is shown in Figure 4.

One forward propagation and one back propagation with weight update complete one training iteration of the neural network. One iteration is of course not enough; hundreds or thousands are needed, and the maximum number of training iterations in the present invention is 30,000. The training samples are imported from the sign language library, saved as an Excel file, using the read_excel function of the Pandas library. In the sign language library Excel file, columns 1 to 16 hold the voltage values and column 17 the corresponding gesture word group.
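Putting the data import and the training step together, a sketch of the training loop follows (continuing the sketches above; the file name, the label encoding and the per-sample update scheme are assumptions, while the 30,000-iteration cap is stated in the patent):

```python
import numpy as np
import pandas as pd

# Columns 1-16 hold voltages, column 17 the gesture word group (per the patent).
df = pd.read_excel("sign_library.xlsx")            # file name assumed
X = df.iloc[:, :16].to_numpy(dtype=float)
labels = df.iloc[:, 16].astype("category").cat.codes.to_numpy()
Y = np.eye(18)[labels]                             # one-hot targets, indices 0..17

for it in range(30000):                            # patent's maximum training count
    i = it % len(X)                                # cycle through the samples
    train_step(X[i], Y[i])                         # from the sketch above
```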

After training, the model is evaluated; the present invention uses mean-square-error evaluation. As the glove module is used by more people and more often, the sign language library accumulates more and more samples. The voltage values of the same gesture made by different people differ subtly, and letting the neural network fit these differences leads to overfitting. To prevent overfitting, the samples are divided into a training set and a test set: the training set is used for training, and the test set measures prediction accuracy. The BP neural network prediction of the present invention reaches an accuracy of 80%.

After training, the model must be stored. The model of a trained neural network consists of the weight W of each connection and the bias term b of each neural node; in the present invention these are W^(1), b^(2), W^(2) and b^(3). The savemat function of Python's SciPy library saves the trained parameter matrices as a data file in MAT format, and the loadmat function imports the data in the MAT file back into the program for prediction calls.
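A sketch of this storage step with SciPy's savemat/loadmat, which the patent names (the file name and key names are assumptions), continuing the code above:

```python
from scipy.io import savemat, loadmat

# Save the trained model: the weights and biases W(1), b(2), W(2), b(3).
savemat("bp_model.mat", {"W1": W1, "b2": b2, "W2": W2, "b3": b3})

# Later, load the parameters back for prediction calls.
m = loadmat("bp_model.mat")
W1, b2 = m["W1"], m["b2"].ravel()   # loadmat returns 2-D arrays,
W2, b3 = m["W2"], m["b3"].ravel()   # so flatten the bias vectors
```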

The model prediction part differs from the training and evaluation of the neural network algorithm part: its data are the gesture voltage values sent back in real time from the mobile client. After receiving the real-time voltage values, the group of voltage values is first normalized; then the stored model with the highest evaluation accuracy (i.e. the weights W and bias terms b) is loaded and one forward propagation is performed. The index of the maximum value of the resulting output $h_{W,b}(x)$ is the predicted gesture number, and the corresponding gesture is the predicted result.
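The prediction step then reduces to normalizing the incoming voltage group, one forward pass, and an argmax over the 18 outputs. A sketch, continuing the code above (the normalization scheme must match the one used on the training data and is assumed here):

```python
import numpy as np

def predict(voltages):
    """Map one group of 16 raw voltage values to a gesture index 0..17."""
    x = np.asarray(voltages, dtype=float)
    x = (x - x.min()) / (x.max() - x.min())   # normalization scheme assumed
    _, _, _, a3 = forward(x)                  # forward() from the sketch above
    return int(np.argmax(a3))                 # index of the maximum output
```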

Step 4: Convert each collected group of gesture voltage signals into sign language words through the BP neural network framework model.

Step 4.1: Receive the gesture voltage signals of the wearable data gloves and use the signal screening program to select complete signals.

Step 4.2: Use the trained BP neural network framework model to convert the gesture voltage signals into sign language words.

The specific operation is: collect the gesture voltage signals of the current 2 seconds and convert the collected signals into sign language words through the BP neural network framework model.

Step 5: Convert the sign language words obtained in step 4 from gesture voltage signals over a period of time into a sign language word group, and match and fill the word group into a sentence. Figure 5 shows the processing flow from the sign language words generated by the BP neural network framework model to the matched, associatively filled sentence output.

Step 5.1: Segment the sentences in the sign language sentence library (or the collected word groups) into words and compute statistics; words that occur frequently and carry symbolic meaning are assigned element 1, the remaining words element 0. For example, segmenting the library sentences ["I receive have", "she pretty very", "you clothes poor"] (sign-language word order) yields the words and their frequencies [I: 1, you: 1, she: 1, clothes: 1, poor: 1, pretty: 1, receive: 1, have: 1, very: 1], and the word-frequency-vector format is specified as [I, you, she, clothes, poor, pretty, receive];

Step 5.2: Convert all common sign language sentences in the sign language sentence library into the word-frequency-vector format of step 5.1 and generate the corresponding word frequency vectors; for example, the sentence "I receive have" corresponds to the word frequency vector [1,0,0,0,0,0,1];

Step 5.3: Convert the sign language words obtained in step 4 over a period of time into a sign language word group, and convert this word group into the word-frequency-vector format of step 5.1; for example, the word group "she pretty very" corresponds to the word frequency vector [0,0,1,0,0,1,0];

Step 5.4: Compute the cosine similarity between the word frequency vector from step 5.3 and the word frequency vectors of the sign language sentence library from step 5.2, and select the sign language words of the library entry with the largest cosine similarity as the output words.

Step 5.5: Match the output words obtained in step 5.4 to the corresponding written-language sentence according to the index of all common sign language sentences in the sign language sentence library, and take the matched written-language sentence as the final output result.
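Steps 5.1 to 5.5 amount to bag-of-words vectors compared by cosine similarity. A minimal sketch using the seven-word vocabulary of the example in step 5.1 (the written-language sentences paired with each library entry are illustrative assumptions):

```python
import numpy as np

VOCAB = ["I", "you", "she", "clothes", "poor", "pretty", "receive"]  # element-1 words

def freq_vector(words):
    """Binary word-frequency vector over the element-1 vocabulary (step 5.1)."""
    return np.array([1.0 if w in words else 0.0 for w in VOCAB])

# Library of common sign-language sentences mapped to written-language forms
# (the written sentences are assumed for illustration).
library = {
    "I receive have": "I have received it.",
    "she pretty very": "She is very beautiful.",
    "you clothes poor": "Your clothes are shabby.",
}
lib_vecs = {s: freq_vector(s.split()) for s in library}

def cosine(a, b):
    n = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / n if n else 0.0

def match(word_group):
    """Steps 5.3-5.5: output the library sentence with maximum similarity."""
    v = freq_vector(word_group)
    best = max(lib_vecs, key=lambda s: cosine(v, lib_vecs[s]))
    return library[best]

print(match(["she", "pretty", "very"]))   # -> "She is very beautiful."
```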

The output result can be played through audio and/or video playback hardware or devices connected to the Raspberry Pi 3B single-board computer, outputting the translation result in the form of sentence audio or video.

Claims (6)

1. A sign language translation method based on a BP neural network, characterized in that the sign language translation method comprises the following steps:

Step (1): a Raspberry Pi 3B single-board computer collects gesture voltage signals through flexible sensors and acceleration sensors mounted on wearable data gloves, filters and amplifies the gesture voltage signals, and transmits them via its integrated Bluetooth module to a memory for storage;

Step (2): a signal screening program compiles the sign language words and common sign language sentences corresponding to each group of signals into a sign language library to form a sign language sentence library, and the gesture voltage signals collected over many sessions and the corresponding sign language words are divided 7:3 into a training set and a test set;

Step (3): a program is written to build the BP neural network structural framework model, mainly comprising three modules: the neural network structural framework model, a data transmission module and a storage module; the BP neural network structural framework model is trained on the training set of step (2), the trained model is run on the test set, and once the test results meet expectations the model is stored in the storage module; the BP neural network structural framework model adopts a three-layer neural network with an input layer, an output layer and a hidden layer;

Step (4): each collected group of gesture voltage signals is converted into sign language words through the BP neural network framework model;

Step (5): the sign language words obtained in step (4) from gesture voltage signal conversion over a period of time are converted into a sign language word group, which is matched against the sign language sentence library and associatively filled into a sentence as the output result.

2. The sign language translation method based on a BP neural network according to claim 1, characterized in that: the flexible sensors on the wearable data gloves in step (1) are strain gauges fixed at the 10 finger positions; the gesture voltage signal is characterized by the degree of finger bending followed by the strain gauges and by the relative positions of two three-axis acceleration sensors fixed respectively on the backs of the left and right hands; collecting the gesture voltage signal means collecting 10 finger-bending signals and 6 gesture-direction signals, 16 signals in total.

3. The sign language translation method based on a BP neural network according to claim 1, characterized in that: the sign language sentence library in step (2) comprises a sign language word library and a sign language sentence library, and is made by first recording the 16 currently received gesture voltage signals in Excel, normalizing and organizing them, and then saving them to an Access database.

4. The sign language translation method based on a BP neural network according to claim 1, characterized in that: the three-layer neural network of the BP neural network structural framework model in step (3) has 16 neurons in the input layer, 64 neurons in the middle layer and 18 neurons in the output layer; transmission is divided into two segments of 8 voltage signals each, 16 voltage signals in total; the output indices of the 18 output-layer neurons run from 0 to 17 and correspond in order to 18 common word groups, which combine randomly into 53 common phrases.

5. The sign language translation method based on a BP neural network according to claim 1, characterized in that converting the gesture voltage signals into sign language words through the BP neural network framework model in step (4) comprises the following steps:

Step (4.1): receive the gesture voltage signals of the wearable data gloves and use the signal screening program to select complete signals;

Step (4.2): convert the gesture voltage signals into words through the trained BP neural network structural framework model.

6. The sign language translation method based on a BP neural network according to claim 1, characterized in that matching against the sign language sentence library after conversion by the BP neural network framework model and associatively filling into a sentence in step (5) comprises the following steps:

Step (5.1): segment the sentences in the sign language sentence library or the collected word groups into words and compute statistics; words that occur frequently in a sentence and carry symbolic meaning are assigned element 1, the remaining words element 0;

Step (5.2): convert all common sign language sentences in the sign language sentence library into the word-frequency-vector format of step (5.1) and generate the corresponding word frequency vectors;

Step (5.3): convert the sign language words obtained from the gesture voltage signals collected in step (4) over a period of time into a sign language word group, and convert the word group into the word-frequency-vector format of step (5.1) to obtain the corresponding word frequency vector;

Step (5.4): compute the cosine similarity between the word frequency vector converted in step (5.3) and the word frequency vectors of the sign language sentence library of step (5.2), and select the sign language words of the library entry with the largest cosine similarity as the output words;

Step (5.5): match the output words obtained in step (5.4) to the corresponding written-language sentence according to the index of all common sign language sentences in the sign language sentence library, and take the matched written-language sentence as the final output result.
CN202010243856.7A · 2020-03-31 · 2020-03-31 · A sign language translation method based on BP neural network · Active · CN111428871B (en)

Priority Applications (1)

Application Number: CN202010243856.7A · Priority Date: 2020-03-31 · Filing Date: 2020-03-31 · Title: A sign language translation method based on BP neural network

Applications Claiming Priority (1)

Application Number: CN202010243856.7A · Priority Date: 2020-03-31 · Filing Date: 2020-03-31 · Title: A sign language translation method based on BP neural network

Publications (2)

Publication Number · Publication Date
CN111428871A (en) · 2020-07-17
CN111428871B · 2023-02-24

Family

ID=71556171

Family Applications (1)

Application Number · Title · Priority Date · Filing Date
CN202010243856.7A (Active, granted as CN111428871B) · A sign language translation method based on BP neural network · 2020-03-31 · 2020-03-31

Country Status (1)

Country · Link
CN (1) · CN111428871B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN112149540A (en)* · 2020-09-14 · 2020-12-29 · 东北大学 · Yolov3-based end-to-end sign language recognition technology
CN113081703A (en)* · 2021-03-10 · 2021-07-09 · 上海理工大学 · Method and device for distinguishing direction intention of user of walking aid
CN113111156B (en)* · 2021-03-15 · 2022-05-13 · 天津理工大学 · A system for human-computer interaction between intelligent hearing impaired and able-bodied people and its working method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN101539994A (en)* · 2009-04-16 · 2009-09-23 · 西安交通大学 · Mutually translating system and method of sign language and speech
CN110363077A (en)* · 2019-06-05 · 2019-10-22 · 平安科技(深圳)有限公司 · Sign Language Recognition Method, device, computer installation and storage medium
CN110532912A (en)* · 2019-08-19 · 2019-12-03 · 合肥学院 · A kind of sign language interpreter implementation method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
US20100023314A1 (en)* · 2006-08-13 · 2010-01-28 · Jose Hernandez-Rebollar · ASL Glove with 3-Axis Accelerometers


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Mongolian-Chinese machine translation based on part-of-speech tagging with gated recurrent neural networks; 刘婉婉 et al.; Journal of Chinese Information Processing (《中文信息学报》); 2018-08-15 (No. 08); full text *

Also Published As

Publication number · Publication date
CN111428871A (en) · 2020-07-17

Similar Documents

Publication · Publication Date · Title
CN111428871B (en) · A sign language translation method based on BP neural network
CN116110565B (en) · Method for auxiliary detection of crowd depression state based on multi-modal deep neural network
JP3168779B2 (en) · Speech recognition device and method
CN117765981A (en) · Emotion recognition method and system based on cross-modal fusion of voice text
CN110853680A (en) · Double-BiLSTM structure with multi-input multi-fusion strategy for speech emotion recognition
CN118821045B (en) · Knowledge-enhanced product question-answering community user conversation emotion recognition method and system
CN110297887A (en) · Service robot personalization conversational system and method based on cloud platform
CN114864099B (en) · A method and system for automatic generation of clinical data based on causal relationship mining
CN118228194B (en) · A multimodal personality prediction method and system integrating spatiotemporal graph attention network
CN119068393B (en) · Personality prediction method and device integrating CLIP and adaptive graph transformation network
CN119295994B (en) · A multimodal sentiment analysis method based on cross-modal attention
CN117711398A (en) · A voice interactive teaching method, device and glasses
CN111798980B (en) · Complex medical biological signal processing method and device based on deep learning network
CN119541499A (en) · Customer voice analysis system based on large model
CN116303947A (en) · Method, device and electronic equipment for emotion recognition of question and answer text
Fang et al. · Bidirectional LSTM with Multiple Input Multiple Fusion Strategy for Speech Emotion Recognition
CN111428802B (en) · Sign language translation method based on support vector machine
CN118571265A (en) · Emotion state monitoring method based on voice recognition
Yang · [Retracted] Design of Service Robot Based on User Emotion Recognition and Environmental Monitoring
Anindya et al. · Development of Indonesian speech recognition with deep neural network for robotic command
CN117122321A (en) · Psychological state detection method and device, readable storage medium and terminal equipment
CN111813924B (en) · Class detection algorithm and system based on scalable dynamic selection and attention mechanism
Zhu · English pronunciation standards based on multimodal acoustic sensors
CN117059283B (en) · Speech database classification and processing system based on pulmonary tuberculosis early warning
CN119323002B (en) · Multimodal personality perception method and device based on multiple correlation features and graph relationship attention

Legal Events

Date · Code · Title · Description
PB01 · Publication
SE01 · Entry into force of request for substantive examination
GR01 · Patent grant
