Technical Field
The present invention relates to the field of vehicle technology and, in particular, to a vehicle control method, a vehicle control device and a vehicle.
Background
At present, a vehicle can be controlled manually, or by a voice system in the vehicle that captures the speech of the driver and passengers. However, such a voice system is unfriendly to deaf-mute persons: because they cannot produce speech, they can only control the functions of the vehicle manually. The related art therefore still has the technical problem that a vehicle cannot be controlled through lip shape recognition.
No effective solution has yet been proposed for the above technical problem in the related art that a vehicle cannot be controlled through lip shape recognition.
Summary of the Invention
Embodiments of the present invention provide a vehicle control method, a vehicle control device and a vehicle, so as to at least solve the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition.
According to one aspect of the embodiments of the present invention, a vehicle control method is provided. The method may include: acquiring a lip shape image to be processed of a target object in a vehicle, wherein the lip shape image to be processed represents a functional requirement of the target object for the vehicle; inputting the lip shape image to be processed into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, wherein the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data; determining a control instruction corresponding to the target lip shape recognition result; and controlling a function in the vehicle based on the control instruction.
Optionally, before the lip shape image to be processed is input into the lip shape recognition model for lip shape recognition to obtain the target lip shape recognition result, the method includes: acquiring lip shape image samples, wherein the lip shape image samples represent at least one lip shape image entered for each of the different functional requirements in the vehicle; performing data augmentation on the lip shape image samples to obtain corresponding augmented image samples; classifying the augmented image samples by the category of the corresponding functional requirement to obtain a corresponding lip shape data set, wherein the lip shape data set includes the classified lip shape image samples and augmented image samples; and determining the residual data based on the lip shape data set.
Optionally, performing data augmentation on the lip shape image samples to obtain the corresponding augmented image samples includes: augmenting the lip shape image samples based on a generative adversarial network model to determine first augmented image samples, wherein the first augmented image samples represent lip shape images produced by the generative adversarial network model; randomly rotating the lip shape image samples to determine second augmented image samples, wherein the second augmented image samples represent lip shape images to which rotation has been applied; adding noise to the lip shape image samples to determine third augmented image samples, wherein the third augmented image samples represent lip shape images to which noise has been added; and determining the lip shape image samples, the first augmented image samples, the second augmented image samples and the third augmented image samples as the augmented image samples.
Optionally, classifying the augmented image samples by the category of the corresponding functional requirement to obtain the corresponding lip shape data set includes: performing data preprocessing on the classified augmented image samples to determine the lip shape data set, wherein the data preprocessing includes at least one of the following: pixel brightness transformation, geometric transformation and local neighborhood preprocessing.
Optionally, acquiring the lip shape image samples includes: acquiring a function lip shape selection instruction on a graphical user interface of the vehicle, wherein the function lip shape selection instruction is used to select the category of functional requirement to which the lip shape to be entered corresponds; and generating, based on an entry confirmation instruction on the graphical user interface, the lip shape image samples corresponding to that category of functional requirement, wherein the entry confirmation instruction is used to start entry of the lip shape of the target object.
Optionally, a lip shape collection consent instruction is acquired on the graphical user interface of the vehicle; based on the lip shape collection consent instruction, lip shape images of the target object are collected by an image acquisition device in the vehicle while the vehicle is in operation.
Optionally, after the lip shape images of the target object are collected by the image acquisition device in the vehicle during vehicle operation based on the lip shape collection consent instruction, the method includes: inputting the lip shape images collected by the image acquisition device into the lip shape recognition model; and determining residual data based on the lip shape images so as to optimize the lip shape recognition model.
Optionally, inputting the lip shape image to be processed into the lip shape recognition model for lip shape recognition to obtain the target lip shape recognition result includes: in response to the image acquisition device on the vehicle collecting the lip shape image to be processed, classifying the lip shape image to be processed by category of functional requirement through the lip shape recognition model, and determining the target lip shape recognition result corresponding to the lip shape image to be processed.
According to another aspect of the embodiments of the present invention, a vehicle control device is also provided. The device may include: an acquisition unit, configured to acquire a lip shape image to be processed of a target object in a vehicle, wherein the lip shape image to be processed represents a functional requirement of the target object for the vehicle; a recognition unit, configured to input the lip shape image to be processed into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, wherein the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data; a determination unit, configured to determine a control instruction corresponding to the target lip shape recognition result; and a control unit, configured to control a function in the vehicle based on the control instruction.
According to another aspect of the embodiments of the present invention, a computer-readable storage medium is also provided. The computer-readable storage medium includes a stored program, wherein, when the program runs, the device in which the computer-readable storage medium is located is controlled to execute the vehicle control method of the embodiments of the present invention.
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is configured to run a program, wherein the program, when run, executes the vehicle control method of the embodiments of the present invention.
According to another aspect of the embodiments of the present invention, a vehicle is also provided. The vehicle is configured to execute the vehicle control method of the embodiments of the present invention.
In the embodiments of the present invention, a lip shape image to be processed of a target object in a vehicle is acquired, wherein the lip shape image to be processed represents a functional requirement of the target object for the vehicle; the lip shape image to be processed is input into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, wherein the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data; a control instruction corresponding to the target lip shape recognition result is determined; and a function in the vehicle is controlled based on the control instruction. In other words, the embodiments of the present invention can train a lip shape recognition model and use the collected lip shape image to be processed of the target object in the vehicle as its input. Once the lip shape image to be processed has been input, the lip shape recognition model performs lip shape recognition on it and determines which function in the vehicle the image is intended to activate or deactivate, yielding the target lip shape recognition result. A corresponding control instruction is then generated, and the vehicle is controlled based on that instruction so that the corresponding function is activated or deactivated. Because deaf-mute persons are no longer limited to controlling the vehicle manually, the diversity of vehicle control is improved, which solves the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition and achieves the technical effect that a vehicle can be controlled through lip shape recognition.
Brief Description of the Drawings
The accompanying drawings described here are provided for a further understanding of the present invention and constitute a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of a vehicle control method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of controlling a vehicle based on lip shape according to an embodiment of the present invention;
FIG. 3 is a flowchart of a vehicle control device according to an embodiment of the present invention.
Detailed Description
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.
Embodiment 1
According to an embodiment of the present invention, a vehicle control method is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
FIG. 1 is a flowchart of a vehicle control method according to an embodiment of the present invention. As shown in FIG. 1, the method may include the following steps:
Step S102: acquire a lip shape image to be processed of a target object in the vehicle, wherein the lip shape image to be processed represents a functional requirement of the target object for the vehicle.
In the technical solution provided by step S102 above, the lip shape image to be processed can be generated by capturing the lip shape of the target object in the vehicle, where the target object may be a driver or passenger of the vehicle. The lip shape image to be processed can represent a functional requirement of the target object for the vehicle, and may be a lip shape image for activating or deactivating a certain function of the vehicle.
Optionally, if the target object wants to control the vehicle by lip shape, the target object can issue, on the graphical user interface of the vehicle, an instruction permitting lip shape collection. The image acquisition device in the vehicle is then controlled to collect the lip shape image to be processed of the target object, so that the vehicle function corresponding to the image can be determined and then activated or deactivated.
Step S104: input the lip shape image to be processed into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, wherein the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data.
In the technical solution provided by step S104 above, after the lip shape image to be processed of the target object in the vehicle is acquired, it can be input into the lip shape recognition model for lip shape recognition to determine which function of the vehicle the image corresponds to, yielding the target lip shape recognition result. The target lip shape recognition result may be the vehicle function corresponding to the lip shape image to be processed. The lip shape recognition model may be obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data. The neural network model may be a convolutional neural network (CNN) model. The lip shape recognition model may be a residual network (ResNet) model. The residual data may be a residual function.
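As a non-authoritative illustration of the residual idea mentioned above, the following minimal sketch assumes the standard ResNet formulation, in which a block outputs F(x) + x and F is the residual function learned by the block. The weights and the helper names `linear`, `relu` and `residual_block` are hypothetical and are not taken from the present embodiment; fully connected layers stand in for convolutions to keep the sketch short.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Multiply a weight matrix by an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Return relu(F(x) + x), where F(x) = w2 * relu(w1 * x) is the residual function."""
    residual = linear(w2, relu(linear(w1, x)))
    # The skip connection adds the unchanged input back onto the residual.
    return relu([r + v for r, v in zip(residual, x)])


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With all-zero weights the residual is zero, so the block passes x through.
    print(residual_block([1.0, 2.0], zeros, zeros))
```

Because the block only has to learn the difference between its input and the desired output, a block whose residual is zero leaves its input unchanged, which is the property the skip connection is meant to provide.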
Optionally, before the lip shape recognition model is used to perform lip shape recognition on the lip shape image to be processed, the original residual network model can be trained with the collected lip shape image samples and the lip shape recognition result sample corresponding to each lip shape image sample, so as to determine the residual function of the residual network model. With a large number of both kinds of samples, a final residual function that better fits the needs of lip shape control of the vehicle is determined, and the lip shape recognition model of the embodiments of the present invention can be determined based on that final residual function.
If control through lip shape recognition is not provided for, the technical problem remains that the vehicle cannot be controlled through lip shape recognition. In the embodiments of the present invention, however, a large number of lip shape image samples and lip shape recognition result samples can be collected and used to train the residual network model, yielding the lip shape recognition model. While the vehicle is being driven, the lip shape image to be processed of the target object can be collected and recognized by the lip shape recognition model to determine the function the target object currently requires of the vehicle. Because the vehicle can be controlled through lip shape recognition in addition to manual and voice control, the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition is solved.
Step S106: determine the control instruction corresponding to the target lip shape recognition result.
In the technical solution provided by step S106 above, after the lip shape image to be processed is input into the lip shape recognition model for lip shape recognition and the target lip shape recognition result is determined, the control instruction corresponding to the target lip shape recognition result can be determined, where the control instruction can be used to control the vehicle function corresponding to the target lip shape recognition result.
Optionally, after the final lip shape recognition model is obtained through training, it can be deployed on a cloud server. While the target object is using the vehicle, the image acquisition device in the vehicle can capture the lip shape of the target object to form the lip shape image to be processed, and the image acquisition device is controlled to transmit that image to the cloud server. The lip shape recognition model on the cloud server performs lip shape recognition on the image and determines the target lip shape recognition result, which is sent to the vehicle. The vehicle is then controlled to generate the control instruction corresponding to the target lip shape recognition result, so that the corresponding function of the vehicle can be controlled. The cloud server may be a cloud-based intelligent recommendation and voice system.
Step S108: control a function in the vehicle based on the control instruction.
In the technical solution provided by step S108 above, after the control instruction corresponding to the target lip shape recognition result is determined, the various components of the vehicle can be controlled based on the control instruction, so that the corresponding function of the vehicle is activated or deactivated.
For example, if the image acquisition device deployed in the vehicle collects a lip shape image to be processed of the target object, the image can be sent to the cloud server, which performs lip shape recognition on it. If the image indicates that the target object wants to open a window of the vehicle, the target lip shape recognition result is "open the vehicle window". This result can be sent to the vehicle, and the vehicle is controlled to generate the corresponding control instruction "open window", based on which the window of the vehicle is opened. It should be noted that this is only an example and does not limit the specific content of the lip shape image to be processed or of the control instruction; any method or process that controls a vehicle based on lip shape recognition falls within the protection scope of the embodiments of the present invention.
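The mapping from a recognition result to a control instruction (step S106) and its application to the vehicle (step S108) can be sketched as a simple lookup table. This is only an illustration under stated assumptions: the label strings, the `ControlInstruction` type, the `COMMAND_TABLE` contents and the toy `Vehicle` class are all hypothetical, and only the window example is taken from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ControlInstruction:
    """Hypothetical control instruction: which function to act on and how."""
    function: str
    action: str  # "activate" or "deactivate"


# Hypothetical table from recognition result to control instruction (step S106).
COMMAND_TABLE: Dict[str, ControlInstruction] = {
    "open the vehicle window": ControlInstruction("window", "activate"),
    "close the vehicle window": ControlInstruction("window", "deactivate"),
    "turn on the air conditioner": ControlInstruction("air_conditioner", "activate"),
}


@dataclass
class Vehicle:
    """Toy vehicle that only records the last action applied to each function."""
    state: Dict[str, str] = field(default_factory=dict)

    def apply(self, instruction: ControlInstruction) -> None:
        # Step S108: control the function named by the instruction.
        self.state[instruction.function] = instruction.action


def handle_result(vehicle: Vehicle, result: str) -> Optional[ControlInstruction]:
    """Look up the instruction for a recognition result and apply it if known."""
    instruction = COMMAND_TABLE.get(result)
    if instruction is not None:
        vehicle.apply(instruction)
    return instruction


if __name__ == "__main__":
    car = Vehicle()
    handle_result(car, "open the vehicle window")
    print(car.state)
```

In this sketch an unrecognized result returns `None` and leaves the vehicle state unchanged, which is one possible design choice; the source does not specify how unknown results are handled.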
In the embodiments of the present invention, the residual network model can be trained with a large number of lip shape image samples and lip shape recognition result samples to determine the most suitable residual data, and the final lip shape recognition model is determined based on that residual data. The lip shape recognition model is used to determine the vehicle function that the lip shape of the target object corresponds to, so that functions of the vehicle can be controlled through lip shape recognition, which achieves the technical effect that a vehicle can be controlled through lip shape recognition.
In steps S102 to S108 of the embodiments of the present invention, a lip shape image to be processed of a target object in a vehicle is acquired, wherein the lip shape image to be processed represents a functional requirement of the target object for the vehicle; the lip shape image to be processed is input into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, wherein the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data; a control instruction corresponding to the target lip shape recognition result is determined; and a function in the vehicle is controlled based on the control instruction. In other words, the embodiments of the present invention can train a lip shape recognition model and use the collected lip shape image to be processed of the target object in the vehicle as its input. Once the lip shape image to be processed has been input, the lip shape recognition model performs lip shape recognition on it and determines which function in the vehicle the image is intended to activate or deactivate, yielding the target lip shape recognition result. A corresponding control instruction is then generated, and the vehicle is controlled based on that instruction so that the corresponding function is activated or deactivated. Because deaf-mute persons are no longer limited to controlling the vehicle manually, the diversity of vehicle control is improved, which solves the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition and achieves the technical effect that a vehicle can be controlled through lip shape recognition.
The above method of this embodiment is described further below.
As an optional implementation, in step S104, before the lip shape image to be processed is input into the lip shape recognition model for lip shape recognition to obtain the target lip shape recognition result, lip shape image samples are acquired, wherein the lip shape image samples represent at least one lip shape image entered for each of the different functional requirements in the vehicle; data augmentation is performed on the lip shape image samples to obtain corresponding augmented image samples; the augmented image samples are classified by the category of the corresponding functional requirement to obtain a corresponding lip shape data set, wherein the lip shape data set includes the classified lip shape image samples and augmented image samples; and the residual data is determined based on the lip shape data set.
In this embodiment, before the lip shape image to be processed is input into the lip shape recognition model to determine which function of the vehicle it corresponds to and obtain the target lip shape recognition result, the lip shape recognition model can be determined through the following steps. Lip shape image samples are first acquired, and data augmentation is performed on them to determine the corresponding augmented image samples. The augmented image samples are then classified by the category of the corresponding vehicle functional requirement to determine the corresponding lip shape data set. The residual data can be determined from the lip shape data set, and the lip shape recognition model can thereby be determined. The lip shape image samples can represent at least one lip shape image entered for each of the different functional requirements in the vehicle. The lip shape data set may include the classified lip shape image samples and augmented image samples.
Optionally, a request to collect lip shape image samples can be sent to all drivers and passengers of the vehicle through the graphical user interface. After the drivers and passengers permit the collection, the image acquisition device of the vehicle can collect at least one lip shape image sample for each corresponding function of the vehicle, thereby obtaining the lip shape image samples.
由于采集到的唇形图像样本有限,不足以训练得到唇形识别模型,因此,需要对唇形图像样本进行数据扩充。在本发明实施例中,可以对唇形图像样本进行数据扩充,比如,可以采用生成对抗网络模型(Generative Adversarial Network,简称为GAN)、对唇形图像样本进行随机旋转,也可以对唇形图像样本添加噪声的方式,对唇形图像样本进行数据扩充,得到扩充处理图像样本,由于考虑到单一的唇形图像样本无法代表一类车辆中的功能,通过对单一唇形图像样本进行数据扩充,可以将该唇形图像样本扩充成能够代表对应功能的一类唇形图像样本,从而解决了唇形图像样本较为单一的技术问题。需要说明的是,此处仅为举例说明,不对唇形图像样本进行数据扩充处理采用的方法及实施过程做具体限制,只要是对唇形图像样本进行数据扩充,训练得到唇形识别模型,从而基于唇形识别对车辆进行控制的方法和过程,均在本发明实施例的保护范围之内。Since the collected lip shape image samples are limited, it is not enough to train the lip shape recognition model. Therefore, data expansion of the lip shape image samples is required. In the embodiment of the present invention, data expansion can be performed on the lip image sample, for example, a Generative Adversarial Network (GAN for short) can be used to randomly rotate the lip image sample, or the lip image can be The way of adding noise to the sample is to expand the data of the lip image sample to obtain the expanded processing image sample. Considering that a single lip image sample cannot represent the function of a class of vehicles, by expanding the data of a single lip image sample, The lip shape image sample can be expanded into a type of lip shape image sample that can represent the corresponding function, thereby solving the technical problem that the lip shape image sample is relatively single. It should be noted that this is just an example, and there are no specific restrictions on the method and implementation process for data expansion processing of lip image samples. As long as the data expansion is performed on lip image samples, the lip recognition model is obtained through training, so that The method and process for controlling a vehicle based on lip shape recognition are within the protection scope of the embodiments of the present invention.
Optionally, the functional requirements in the vehicle may be classified in advance, for example, into categories such as opening a window, turning on the air conditioner, and going to a gas station. After data augmentation is performed on the lip shape image samples to obtain the augmented image samples, the vehicle function to which each augmented image sample corresponds is determined, the augmented image samples may be classified according to the predefined categories of functional requirements, and the classified augmented image samples may be preprocessed to obtain the lip shape data set.
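The classification step described above can be sketched as a simple grouping of labeled samples. This is a minimal illustration, not the method of the embodiment: the category identifiers, the `build_lip_dataset` helper, and the string placeholders standing in for images are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical category identifiers; the text names these three functions only as examples.
CATEGORIES = ("open_window", "turn_on_air_conditioner", "go_to_gas_station")


def build_lip_dataset(samples):
    """Group (image, category) pairs by functional-requirement category."""
    dataset = defaultdict(list)
    for image, category in samples:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        dataset[category].append(image)
    return dict(dataset)


# Strings stand in for image arrays (original and augmented samples alike).
samples = [
    ("img_a", "open_window"),
    ("img_a_rotated", "open_window"),
    ("img_b", "go_to_gas_station"),
]
dataset = build_lip_dataset(samples)
```

Each augmented sample inherits the category of the original sample it was derived from, so original and augmented images end up in the same class.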
Optionally, a convolutional neural network model with good performance, for example a residual network model, may be selected as the initial model before training. The residual network model is built and trained with the lip shape data set and the lip shape recognition result samples corresponding to each category in the data set, so that the parameters of the residual network model, for example the residual function, are determined and the trained lip shape recognition model is obtained.
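For orientation, the residual relation learned by a residual network can be written as below. This is the standard residual-network formulation rather than a formula taken from this text, and the symbols x, y, F, W, K, t_k and p_k are assumptions introduced for illustration; the text itself does not specify the loss function.

```latex
% Residual block: the stacked layers learn the residual function F,
% and the block input x is added back through a skip connection.
y = \mathcal{F}(x; W) + x

% One common training objective for K functional-requirement categories
% (an assumption), with one-hot label t and predicted probabilities p:
\mathcal{L}(W) = -\sum_{k=1}^{K} t_k \log p_k(x; W)
```

Under this reading, "determining the residual data" would correspond to fitting the weights W of the residual functions on the lip shape data set.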
As an optional embodiment, in step S104, performing data augmentation on the lip shape image samples to obtain the corresponding augmented image samples includes: augmenting the lip shape image samples based on a generative adversarial network model to determine first augmented image samples, wherein the first augmented image samples represent lip shape images obtained through data augmentation by the generative adversarial network model; randomly rotating the lip shape image samples to determine second augmented image samples, wherein the second augmented image samples represent lip shape images to which rotation has been applied; performing noise processing on the lip shape image samples to determine third augmented image samples, wherein the third augmented image samples represent lip shape images to which noise has been added; and determining the lip shape image samples, the first augmented image samples, the second augmented image samples and the third augmented image samples as the augmented image samples.
In this embodiment, in the process of performing data augmentation on the lip shape image samples to determine the corresponding augmented image samples, the lip shape image samples may be augmented based on a generative adversarial network model to obtain the first augmented image samples, may be randomly rotated to determine the second augmented image samples, and may also be subjected to noise processing to determine the third augmented image samples. The lip shape image samples, the first augmented image samples, the second augmented image samples and the third augmented image samples may then be determined as the augmented image samples. The first augmented image samples may represent lip shape images obtained through data augmentation by the generative adversarial network model. The second augmented image samples may represent lip shape images to which rotation has been applied. The third augmented image samples may represent lip shape images to which noise has been added.
Optionally, the original lip shape image samples collected by the image acquisition device may be processed by a generative adversarial network model to obtain the first augmented image samples. The images of the original lip shape image samples may be randomly rotated, for example by various angles, and all the rotated lip shape images are determined as the second augmented image samples. Noise may be added to the original lip shape image samples, for example Gaussian noise, salt-and-pepper noise or random noise, to obtain the third augmented image samples. The first, second and third augmented image samples obtained through data augmentation, together with the original lip shape image samples, are then determined as the augmented image samples. It should be noted that this is only an example, and the process and method of adding noise to the lip shape image samples are not specifically limited; any method and process that adds noise to lip shape image samples, determines a lip shape recognition model, recognizes the lip shape image to be processed based on that model, and controls the vehicle falls within the protection scope of the embodiments of the present invention.
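The rotation and noise augmentations described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: images are 8-bit grayscale lists of rows, all function names and default parameters (`max_angle`, `sigma`, `sp_prob`) are hypothetical, and the GAN-based branch is omitted because it requires a trained generator that the text does not describe.

```python
import math
import random


def rotate(image, angle_deg, fill=0):
    """Rotate a grayscale image (list of rows) about its center, nearest neighbor."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: look up the source pixel that lands on (x, y).
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clamp to the 0..255 range."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]


def add_salt_pepper_noise(image, prob, rng):
    """Set a fraction `prob` of pixels to 0 (pepper) or 255 (salt)."""
    out = []
    for row in image:
        new_row = []
        for p in row:
            r = rng.random()
            new_row.append(0 if r < prob / 2 else 255 if r < prob else p)
        out.append(new_row)
    return out


def augment(image, rng, n_rotations=3, max_angle=15.0, sigma=8.0, sp_prob=0.02):
    """Return the original image plus its rotated and noisy variants."""
    rotated = [rotate(image, rng.uniform(-max_angle, max_angle))
               for _ in range(n_rotations)]
    noisy = [add_gaussian_noise(image, sigma, rng),
             add_salt_pepper_noise(image, sp_prob, rng)]
    return [image] + rotated + noisy
```

Passing a seeded `random.Random` instance as `rng` keeps the augmented set reproducible, which helps when comparing training runs.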
As an optional embodiment, in step S104, classifying the augmented image samples by the category of the corresponding functional requirement to obtain the corresponding lip shape data set includes: performing data preprocessing on the classified augmented image samples to determine the lip shape data set, wherein the data preprocessing includes at least one of the following: pixel brightness transformation, geometric transformation and local neighborhood preprocessing.
In this embodiment, in the process of classifying the augmented image samples by the category of the corresponding functional requirement to obtain the corresponding lip shape data set, data preprocessing may be performed on the classified augmented image samples to determine the lip shape data set, wherein the data preprocessing may include at least one of the following: pixel brightness transformation, geometric transformation and local neighborhood preprocessing.
Optionally, data preprocessing operations may be performed on the augmented image samples that have been classified according to the functional requirements in the vehicle. For example, after a pixel brightness transformation, a geometric transformation and/or local neighborhood preprocessing is performed, the lip shape data set used for training the residual network model is obtained. It should be noted that this is only an example, and the method and process of performing data preprocessing on the augmented image samples are not specifically limited.
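One possible instance of each of the three preprocessing families named above is sketched below. The text does not say which concrete operations are used, so the choices here (linear contrast stretch, nearest-neighbor resize, 3x3 mean filter), the function names, and the target size are all assumptions; images are again 8-bit grayscale lists of rows.

```python
def stretch_brightness(image):
    """Pixel brightness transformation: linear contrast stretch to 0..255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image, nothing to stretch
        return [row[:] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


def resize_nearest(image, new_h, new_w):
    """Geometric transformation: nearest-neighbor resize to a fixed input size."""
    h, w = len(image), len(image[0])
    return [[image[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def mean_filter_3x3(image):
    """Local neighborhood preprocessing: 3x3 mean, window truncated at the borders."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = round(sum(window) / len(window))
    return out


def preprocess(image, size=(4, 4)):
    """Apply the three steps in sequence: resize, stretch, smooth."""
    return mean_filter_3x3(stretch_brightness(resize_nearest(image, *size)))
```

Applying the same `preprocess` function to training samples and to images captured at inference time keeps the model input distribution consistent.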
As an optional embodiment, in step S104, acquiring the lip shape image samples includes: acquiring a select-function lip shape instruction on the graphical user interface of the vehicle, wherein the select-function lip shape instruction is used to select the category of functional requirement corresponding to the lip shape to be entered; and generating, based on a confirm-entry instruction on the graphical user interface, the lip shape image sample corresponding to the category of functional requirement, wherein the confirm-entry instruction is used to start entry of the lip shape of the target object.
In this embodiment, in the process of acquiring the lip shape image samples, the select-function lip shape instruction issued by the target object on the graphical user interface of the vehicle may be collected. After that, the confirm-entry instruction issued by the target object on the graphical user interface may be collected, and the lip shape image sample corresponding to the category of functional requirement may be generated based on the confirm-entry instruction. The select-function lip shape instruction may be used to select the category of functional requirement corresponding to the lip shape to be entered. The confirm-entry instruction may be used to start entry of the target object's lip shape for that functional requirement. The graphical user interface may be a display device of the vehicle, such as the head unit screen or the in-vehicle navigation display.
For example, controls for selecting the vehicle function whose lip shape is to be entered may be displayed on the graphical user interface of the vehicle. For instance, rectangular controls such as "Open window" and "Turn on air conditioner" may be present on the graphical user interface. When the target object selects and taps one of these controls, that is, based on the target object's select-function lip shape instruction, the lip shape for the function corresponding to that instruction can be entered. The image acquisition device in the vehicle may capture the target object's lip shape for the function and generate a lip shape image. A text box reading "Confirm entry of the current lip shape?" and "Yes" and "No" controls may be displayed on the graphical user interface, so that the target object can confirm whether the lip shape image just captured is to be used as the lip shape image sample for the function. When the target object taps the "Yes" control, that is, based on the target object's confirm-entry instruction, the lip shape image sample corresponding to each function can be generated. It should be noted that this is only an example, and the form in which lip shape image samples are collected is not specifically limited.
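The select, capture and confirm flow in this example can be sketched as a small state holder. The `LipEnrollment` class, its method names, and the `capture` callable are hypothetical names introduced for illustration; a real head unit would wire these to GUI callbacks and a camera driver.

```python
class LipEnrollment:
    """Tracks one pending capture and stores only confirmed samples."""

    def __init__(self, capture):
        self.capture = capture  # hypothetical callable returning one lip image
        self.samples = {}       # category -> list of confirmed images
        self.pending = None     # (category, image) awaiting Yes/No

    def select_function(self, category):
        """Handle the select-function lip shape instruction and capture an image."""
        self.pending = (category, self.capture())

    def confirm(self, accepted):
        """Handle the Yes/No answer to 'Confirm entry of the current lip shape?'."""
        if self.pending is None:
            raise RuntimeError("no capture awaiting confirmation")
        category, image = self.pending
        self.pending = None
        if accepted:
            self.samples.setdefault(category, []).append(image)
        return accepted


# Strings stand in for captured frames.
frames = iter(["frame_1", "frame_2"])
enrollment = LipEnrollment(capture=lambda: next(frames))
enrollment.select_function("open_window")
enrollment.confirm(True)    # user tapped "Yes": sample is kept
enrollment.select_function("turn_on_air_conditioner")
enrollment.confirm(False)   # user tapped "No": sample is discarded
```

Keeping the captured image in a pending slot until confirmation means a rejected capture never reaches the training samples.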
As an optional embodiment, in step S108, a lip shape collection consent instruction on the graphical user interface of the vehicle is acquired; and, based on the lip shape collection consent instruction, lip shape images of the target object are collected by the image acquisition device in the vehicle while the vehicle is running.
In this embodiment, by collecting the lip shape collection consent instruction on the graphical user interface of the vehicle, lip shape images of the target object can be collected in real time while the vehicle is running, wherein the image acquisition device may be a camera.
Optionally, before or while the vehicle is being driven, a text box reading "Collect lip shape in real time?" and "Yes" and "No" controls may be displayed on the graphical user interface. When the target object taps the "Yes" control, that is, through the lip shape collection consent instruction, real-time collection of the target object's lip shape images by the image acquisition device can be started, and the lip shape images collected in real time are transmitted to the cloud server.
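The consent gate described above can be sketched as follows. The `ConsentGatedCollector` class and its `capture` and `upload` callables are hypothetical; the point illustrated is only that no frame is captured or sent to the cloud server unless the consent instruction has been received.

```python
class ConsentGatedCollector:
    """Captures and uploads frames only after the occupant has agreed."""

    def __init__(self, capture, upload):
        self.capture = capture    # hypothetical callable returning one lip image
        self.upload = upload      # hypothetical callable sending an image to the cloud
        self.consented = False

    def set_consent(self, agreed):
        """Record the answer to 'Collect lip shape in real time?'."""
        self.consented = bool(agreed)

    def tick(self):
        """Process one collection cycle; return True if a frame was uploaded."""
        if not self.consented:
            return False
        self.upload(self.capture())
        return True
```

Because consent is checked on every cycle, tapping "No" later (calling `set_consent(False)`) stops collection immediately.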
As an optional embodiment, in step S108, after the lip shape images of the target object are collected by the image acquisition device in the vehicle while the vehicle is running, based on the lip shape collection consent instruction, the method includes: inputting the lip shape images collected by the image acquisition device into the lip shape recognition model; and determining the residual data based on the lip shape images to optimize the lip shape recognition model.
In this embodiment, after the lip shape images are collected in real time by the image acquisition device in the vehicle while the vehicle is running, based on the lip shape collection consent instruction, the collected lip shape images may be transmitted to the cloud server. The cloud server determines the residual data of the lip shape recognition model based on the lip shape images and thereby updates the residual data, achieving the purpose of optimizing the lip shape recognition model.
Optionally, the cloud server receives the lip shape images of the target object in real time and uses them to update the residual data of the already trained lip shape recognition model, so that the model is continuously optimized.
The lip shape of the target object may change to some extent over time. If the lip shape recognition model stays unchanged while the lip shape changes, the accuracy of lip shape recognition is likely to decrease. To solve this problem, in the embodiments of the present invention, a lip shape collection consent instruction may be acquired; on the premise that the target object permits collection, the lip shape images of the target object may be collected in real time and aggregated on the cloud server, and the lip shape recognition model is trained continuously to optimize it, thereby achieving the technical effect of improving the accuracy of lip shape recognition.
As an optional embodiment, in step S104, inputting the lip shape image to be processed into the lip shape recognition model for lip shape recognition to obtain the target lip shape recognition result includes: in response to the image acquisition device on the vehicle collecting the lip shape image to be processed, classifying the lip shape image to be processed by the category of functional requirement through the lip shape recognition model, and determining the target lip shape recognition result corresponding to the lip shape image to be processed.
In this embodiment, in the process of inputting the lip shape image to be processed into the lip shape recognition model for lip shape recognition to obtain the target lip shape recognition result, when the image acquisition device on the vehicle collects the lip shape image to be processed, the lip shape recognition model may classify it by the category of functional requirement and determine the corresponding target lip shape recognition result.
Optionally, the functional requirements in the vehicle may be classified in advance to determine the categories corresponding to the different functional requirements. Based on the lip shape recognition model, the lip shape image to be processed may be classified into these categories to determine which vehicle function it corresponds to, so that the vehicle can be controlled to turn that function on or off.
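Mapping a classification result to a control instruction can be sketched as a lookup table. The category identifiers, command strings, and the `min_confidence` threshold below are assumptions for illustration; the text does not mention a confidence threshold, which is added here only to show how an uncertain recognition could be ignored rather than acted on.

```python
# Hypothetical mapping from recognized category to a vehicle control instruction.
COMMANDS = {
    "open_window": "WINDOW_OPEN",
    "turn_on_air_conditioner": "AC_ON",
    "go_to_gas_station": "NAVIGATE_GAS_STATION",
}


def to_control_command(scores, min_confidence=0.6):
    """Pick the highest-scoring category and return its command, or None.

    `scores` maps each category to the model's predicted probability.
    """
    category = max(scores, key=scores.get)
    if scores[category] < min_confidence:
        return None  # too uncertain: do not actuate anything
    return COMMANDS.get(category)


command = to_control_command({"open_window": 0.9, "turn_on_air_conditioner": 0.1})
```

Returning `None` for low-confidence or unmapped categories lets the caller fall back to manual control instead of triggering the wrong function.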
In the embodiments of the present invention, a lip shape image to be processed of a target object in the vehicle is acquired, wherein the lip shape image to be processed represents the target object's functional requirement for the vehicle; the lip shape image to be processed is input into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, wherein the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples and training the neural network model based on the residual data; a control instruction corresponding to the target lip shape recognition result is determined; and a function in the vehicle is controlled based on the control instruction. That is, the embodiments of the present invention can train a lip shape recognition model and use the collected lip shape image to be processed of the target object in the vehicle as the input data of the model. After the lip shape image to be processed is input, the lip shape recognition model performs lip shape recognition on it, determines which vehicle function the image is intended to turn on or off, obtains the target lip shape recognition result, and generates the corresponding control instruction. Based on the control instruction, the vehicle is controlled to turn on or off the function corresponding to the instruction. Because deaf-mute persons are thereby no longer limited to controlling the vehicle manually, the diversity of vehicle control is improved, the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition is solved, and the technical effect that a vehicle can be controlled through lip shape recognition is achieved.
Embodiment 2
The technical solutions of the embodiments of the present invention are illustrated below with reference to preferred implementations.
At present, to make in-vehicle functions easier for owners to use, vehicles are equipped with a large number of intelligent devices, and intelligent recommendation and intelligent voice systems are among them. An intelligent recommendation system can provide the owner with timely services at the right moment and, supported by voice recognition, allows the owner to accept or reject a pushed service without using their hands, which is convenient and fast.
However, intelligent recommendation and voice systems are very unfriendly to deaf-mute persons. Because deaf-mute persons cannot produce speech, they can only control the vehicle manually. Therefore, the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition still exists.
In one implementation, a lip-reading recognition method and device based on an edge computing terminal is proposed, including the following steps: constructing a lip-reading recognition model, pre-training the model on a Chinese lip-reading data set, and extracting the pre-trained model; compressing the pre-trained model; continuously collecting video data from the edge computing terminal and saving it as a target data set; preprocessing the target data set; fine-tuning the pre-trained model on the target data set to obtain a lightweight model; and optimizing the lightweight model, adapting it to the hardware platform of the edge computing terminal and deploying it there, collecting the user's video data, recognizing the user's lip-reading information, and outputting the recognition result of the user's instruction. Compared with the prior art, this method has the advantages of improving the recognition performance and accuracy of speech recognition, improving the robustness of the lip-reading recognition model, and being suitable for deployment on edge computing terminals with limited computing power.
In another implementation, a lip-reading recognition method, service device and storage medium are proposed. The service device first captures video of the target object and then, for each frame of target image from which lip information needs to be extracted, performs the following: extracting the lip image of the target object from the target image, and classifying the lip image as either a speaking frame or a silent frame. If the classification results of multiple consecutive lip images follow the pattern of silent frame, then speaking frame, then silent frame, the start and end positions of the lip-reading utterance are located in the consecutive lip images based on that pattern. After the lip image sequence between the start and end positions is obtained, a preliminary coarse classification is performed on the sequence to filter out utterances that are coupled but not supported, and lip-reading recognition is performed on the filtered lip image sequence to obtain the lip-reading recognition result. In this way, in addition to voice interaction, a multimodal signal based on the lip-reading recognition result can be added to improve the applicability and stability of human-computer interaction.
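The silent, speaking, silent pattern used to locate an utterance can be sketched as a scan over per-frame labels. This is a simplified reading of the approach summarized above, not the cited method itself; the label strings and the `find_utterances` helper are assumptions, and the sketch only keeps runs of speaking frames that are bounded by silent frames on both sides.

```python
def find_utterances(labels):
    """Return (start, end) index pairs, inclusive, of speaking runs bounded by silence.

    `labels` is a per-frame sequence of "speech" or "silent".
    """
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label == "speech":
            # An utterance starts only on a silent-to-speech transition.
            if start is None and i > 0 and labels[i - 1] == "silent":
                start = i
        elif start is not None:
            # A speech-to-silent transition closes the current utterance.
            segments.append((start, i - 1))
            start = None
    return segments


labels = ["silent", "speech", "speech", "silent", "silent", "speech", "silent", "speech"]
segments = find_utterances(labels)
```

A run that begins at the first frame or is still open at the last frame is not reported, since it lacks the bounding silence on one side.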
However, because the above methods do not consider controlling a vehicle by means of lip shape recognition, the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition still exists.
To solve the above problem, an embodiment of the present invention proposes a method for controlling a vehicle. The method may include: using the collected lip shape image to be processed of the target object in the vehicle as the input data of a lip shape recognition model; after the lip shape image to be processed is input, performing lip shape recognition on it through the lip shape recognition model to determine which vehicle function the image is intended to turn on or off and obtain the target lip shape recognition result; generating the corresponding control instruction; and, based on the control instruction, controlling the vehicle to turn on or off the function corresponding to the instruction. Because deaf-mute persons are thereby no longer limited to controlling the vehicle manually, the diversity of vehicle control is improved, the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition is solved, and the technical effect that a vehicle can be controlled through lip shape recognition is achieved.
The above method of this embodiment is further described below.
Fig. 2 is a schematic diagram of controlling a vehicle based on lip shape according to an embodiment of the present invention. As shown in Fig. 2, the user may actively record the lip shapes corresponding to the head unit functions to be triggered; that is, the user actively records lip shape images so that the image acquisition device of the vehicle collects lip shape image samples corresponding to the different functions in the vehicle. The collected lip shape image samples (data) are then augmented, which may be done with a generative adversarial network model, by adding noise (Gaussian noise, salt-and-pepper noise and random noise) and by random rotation. The augmented data are classified according to the total number of services in the vehicle to obtain the lip shape data set. The lip shape data set may be input into a residual network model for tuning to obtain the lip shape recognition model, and the lip shape recognition model may be deployed in the cloud-based intelligent recommendation and voice system. While the user is using the vehicle, that is, in a vehicle-use scenario, a camera may capture the user's lip shape to obtain the lip shape image to be processed, which is uploaded to the cloud-based intelligent recommendation and voice system. The lip shape recognition model in that system performs lip shape recognition on the image and outputs the recognition result, so that the head unit can be controlled to provide a service based on the result; that is, based on the recognition result, the vehicle can be controlled to turn the corresponding function on or off to meet the user's needs. During this process, data may be collected periodically and the lip shape recognition model trained continuously to optimize the model.
在该实施例中,可以通过如下步骤确定出唇形识别模型:可以先获取唇形图像样本,可以对唇形图像样本进行数据扩充处理,确定出对应的扩充处理图像样本,然后可以对扩充处理图像样本按照对应车辆的功能需求的类别进行分类,确定出对应的唇形数据集,通过唇形数据集,可以确定残差数据,从而可以确定出唇形识别模型。In this embodiment, the lip shape recognition model can be determined through the following steps: the lip shape image sample can be obtained first, the data expansion process can be performed on the lip shape image sample, and the corresponding image sample for expansion processing can be determined, and then the expansion processing can be carried out. The image samples are classified according to the category corresponding to the functional requirements of the vehicle, and the corresponding lip shape data set is determined. Through the lip shape data set, the residual data can be determined, so that the lip shape recognition model can be determined.
可选地,可以通过图形用户界面向车辆的所有驾乘人员发送采集唇形图像样本的信息,在驾乘人员允许采集后,可以通过车辆的图像采集设备对车辆中相应功能对应的至少一个唇形图像样本进行采集,从而可以得到唇形图像样本。Optionally, the information of collecting lip image samples can be sent to all drivers and passengers of the vehicle through the graphical user interface. After the drivers and passengers allow the collection, at least one lip shape corresponding to the corresponding function in the vehicle can be detected by the image acquisition device of the vehicle. Lip shape image samples are collected, so that lip shape image samples can be obtained.
可选地,对图像采集设备采集到的原始的唇形图像样本生成对抗网络模型,得到第一扩充处理图像样本。可以对原始的唇形图像样本的图像本身进行随机旋转,比如,旋转各种角度,将旋转后的所有唇形图像确定为第二扩充处理图像样本。可以对原始的唇形图像样本进行添加噪声处理,比如,可以添加高斯噪声、椒盐噪声或随机噪声的方式得到第三扩充处理图像样本,从而将经过数据扩充处理后的第一扩充处理图像样本、第二扩充处理图像样本和第三扩充处理图像样本和原始的唇形图像样本,确定为扩充处理图像样本。Optionally, an adversarial network model is generated for the original lip shape image sample collected by the image collection device to obtain the first expanded image sample. The image itself of the original lip shape image sample may be randomly rotated, for example, by various angles, and all the lip shape images after rotation are determined as the second expanded image sample. Noise processing can be added to the original lip shape image sample. For example, Gaussian noise, salt and pepper noise or random noise can be added to obtain the third expanded processing image sample, so that the first expanded processing image sample after data expansion processing, The second expanded image sample, the third expanded image sample and the original lip shape image sample are determined as expanded image samples.
In this embodiment of the present invention, data augmentation can be performed on the lip shape image samples, for example by using a generative adversarial network model, by randomly rotating the samples, or by adding noise to them, to obtain the augmented image samples. A single lip shape image sample cannot represent a class of function in the vehicle. Augmenting a single sample expands it into a class of lip shape image samples that can represent the corresponding function, which solves the technical problem that the lip shape image samples lack variety.
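The rotation and noise steps described above can be sketched as follows. This is a minimal illustration and not the patented implementation: it represents a grayscale image as nested lists, restricts rotation to multiples of 90 degrees to avoid external dependencies, and omits the GAN-based step entirely. The function names and the parameters `sigma` and `prob` are assumptions introduced for illustration.

```python
import random

Image = list[list[int]]  # grayscale image as rows of 0-255 pixel values


def rotate90(img: Image, k: int) -> Image:
    """Rotate the image clockwise by k * 90 degrees (returns a copy)."""
    for _ in range(k % 4):
        # Reversing the rows and transposing gives one clockwise quarter turn.
        img = [list(row) for row in zip(*img[::-1])]
    return [list(row) for row in img]


def add_gaussian_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise and clamp to the valid pixel range."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in img]


def add_salt_pepper_noise(img: Image, prob: float, rng: random.Random) -> Image:
    """Replace each pixel by 0 or 255 with probability prob."""
    return [[rng.choice((0, 255)) if rng.random() < prob else p for p in row]
            for row in img]


def augment(original: Image, rng: random.Random) -> list[Image]:
    """Return the original sample plus one rotated and two noisy variants."""
    rotated = rotate90(original, rng.randint(1, 3))
    gaussian = add_gaussian_noise(original, sigma=10.0, rng=rng)
    salt_pepper = add_salt_pepper_noise(original, prob=0.05, rng=rng)
    return [original, rotated, gaussian, salt_pepper]


sample = [[0, 50, 100], [150, 200, 250]]
augmented = augment(sample, random.Random(0))
```

A production pipeline would more likely rotate by arbitrary angles with an image library and generate additional samples with a trained GAN; the sketch only shows how the original and augmented samples end up in one collection.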
Optionally, the functional requirements in the vehicle can be classified in advance, for example into categories such as opening a window, turning on the air conditioner, and going to a gas station. After data augmentation has been performed on the lip shape image samples to obtain the augmented image samples, the vehicle function corresponding to each augmented image sample is determined. The augmented image samples are classified according to the predefined categories of functional requirements, and the classified samples are preprocessed to obtain the lip shape data set.
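Grouping the augmented samples by functional-requirement category can be sketched as below. The category identifiers are hypothetical names derived from the examples in the text, and the string placeholders stand in for real image data.

```python
from collections import defaultdict

# Hypothetical category labels based on the examples given in the text.
CATEGORIES = ("open_window", "turn_on_air_conditioner", "go_to_gas_station")


def build_lip_dataset(samples):
    """Group (image, category) pairs into a dict keyed by category."""
    dataset = defaultdict(list)
    for image, category in samples:
        if category not in CATEGORIES:
            raise ValueError(f"unknown functional-requirement category: {category}")
        dataset[category].append(image)
    return dict(dataset)


# Placeholder strings stand in for augmented lip shape images.
labelled = [
    ("img_a", "open_window"),
    ("img_a_rotated", "open_window"),
    ("img_b", "turn_on_air_conditioner"),
]
lip_dataset = build_lip_dataset(labelled)
```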
Optionally, data preprocessing can be performed on the augmented image samples that have been classified by functional requirement, for example pixel brightness transformation, geometric transformation, and/or local neighborhood preprocessing, to obtain the lip shape data set used to train the residual network model. It should be noted that this is only an example; the method and process of preprocessing the augmented image samples are not specifically limited.
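The three kinds of preprocessing named above can each be illustrated with one simple operation. The specific choices here (linear contrast stretching, horizontal flipping, and a 3x3 mean filter) are assumptions picked for illustration; the text does not say which operations are actually used.

```python
def stretch_brightness(img):
    """Pixel brightness transformation: rescale values to span 0-255."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch.
        return [list(row) for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def flip_horizontal(img):
    """Geometric transformation: mirror the image left to right."""
    return [row[::-1] for row in img]


def mean_filter3x3(img):
    """Local neighborhood preprocessing: average each pixel with its neighbors.

    Pixels on the border use only the neighbors that exist.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbors = [img[j][i]
                         for j in range(max(0, y - 1), min(h, y + 2))
                         for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(round(sum(neighbors) / len(neighbors)))
        out.append(row)
    return out


stretched = stretch_brightness([[10, 20], [30, 40]])
smoothed = mean_filter3x3([[0, 0, 0], [0, 9, 0], [0, 0, 0]])
```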
Optionally, a well-performing convolutional neural network model, for example a residual network model, can be selected as the initial model before training. The residual network model is built and trained using the lip shape data set and the lip shape recognition result sample corresponding to each category in the data set, so that the parameters of the residual network model, for example the residual function, are determined and the trained lip shape recognition model is obtained.
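For readers unfamiliar with residual networks, the following sketch shows the forward pass of one residual block on a small vector. It assumes that the "residual function" mentioned above corresponds to the learned mapping F in the usual formulation y = relu(F(x) + x); the text does not spell this out. The weights `w1` and `w2` are hypothetical, and real models use convolutional layers rather than the tiny dense layers shown here.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def residual_block(x, w1, w2):
    """Compute y = relu(F(x) + x) with F(x) = w2 @ relu(w1 @ x)."""
    fx = matvec(w2, relu(matvec(w1, x)))          # residual function F(x)
    return relu([f + xi for f, xi in zip(fx, x)])  # skip connection adds x


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# With zero weights the residual F(x) vanishes and the block passes x through.
passthrough = residual_block([1.0, 2.0], zeros, zeros)
combined = residual_block([1.0, -2.0], identity, identity)
```

The skip connection is what lets the block learn only the difference between its input and the desired output, which is why training adjusts the residual rather than the full mapping.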
In this embodiment, the lip shape recognition model can be deployed on a cloud server. While the target object is using the vehicle, the image acquisition device in the vehicle can capture the lip shape of the target object to form the lip shape image to be processed, and the image acquisition device is controlled to transmit that image to the cloud server. The lip shape recognition model on the cloud server performs lip shape recognition on the image, determines the target lip shape recognition result, and sends the result to the vehicle. The vehicle is then controlled to generate the control instruction corresponding to the target lip shape recognition result, so that the corresponding function in the vehicle can be controlled. The cloud server may be a cloud-based intelligent recommendation and voice system.
In this embodiment, if the target object wants to control the vehicle by lip shape, the target object can issue an instruction on the graphical user interface of the vehicle that permits lip shape collection. In response, the image acquisition device in the vehicle is controlled to capture the lip shape image to be processed of the target object, so that the vehicle function corresponding to that image can be determined and then activated or deactivated.
In this embodiment, the functional requirements in the vehicle can be classified in advance to determine the categories corresponding to the different functional requirements. Based on the lip shape recognition model, the lip shape image to be processed can be classified into one of these categories to determine which vehicle function it corresponds to, so that the vehicle can be controlled to activate or deactivate that function.
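The classification step can be illustrated by turning the model's raw output scores into a category. The sketch assumes the model emits one score (logit) per category; the category names and the `min_confidence` rejection threshold are assumptions added for illustration and do not come from the text.

```python
import math

# Hypothetical category labels based on the examples given in the text.
CATEGORIES = ["open_window", "turn_on_air_conditioner", "go_to_gas_station"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, min_confidence=0.6):
    """Return the most probable category, or None if confidence is too low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return None
    return CATEGORIES[best]


confident = classify([0.1, 3.0, 0.2])
ambiguous = classify([1.0, 1.0, 1.0])
```

Returning `None` for low-confidence inputs is one way to avoid triggering a vehicle function on an ambiguous lip movement.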
In this embodiment, after the lip shape image to be processed has been input into the lip shape recognition model for lip shape recognition and the target lip shape recognition result has been determined, the control instruction corresponding to the target lip shape recognition result can be determined.
Optionally, while the target object is using the vehicle, the image acquisition device in the vehicle can capture the lip shape of the target object to form the lip shape image to be processed, and the image acquisition device is controlled to transmit that image to the cloud server. The lip shape recognition model on the cloud server performs lip shape recognition on the image, determines the target lip shape recognition result, and sends the result to the vehicle. The vehicle is then controlled to generate the control instruction corresponding to the target lip shape recognition result, so that the corresponding function in the vehicle can be controlled.
In this embodiment, after the control instruction corresponding to the target lip shape recognition result has been determined, the various components in the vehicle can be controlled based on the control instruction to activate or deactivate the corresponding function in the vehicle.
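Mapping a recognition result to a control instruction and applying it can be sketched with a lookup table. Everything named here (`ControlInstruction`, `INSTRUCTION_TABLE`, the `Vehicle` stand-in, and the component and action strings) is hypothetical; a real vehicle would dispatch instructions over its own control bus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlInstruction:
    component: str  # vehicle component the instruction targets
    action: str     # action to apply to that component


# Hypothetical mapping from recognition results to control instructions.
INSTRUCTION_TABLE = {
    "open_window": ControlInstruction("window", "open"),
    "turn_on_air_conditioner": ControlInstruction("air_conditioner", "on"),
    "go_to_gas_station": ControlInstruction("navigation", "route_to_gas_station"),
}


class Vehicle:
    """Stand-in for the vehicle; records the last action per component."""

    def __init__(self):
        self.state = {}

    def execute(self, instruction):
        self.state[instruction.component] = instruction.action


def control_vehicle(vehicle, recognition_result):
    """Look up and execute the instruction; return False if none matches."""
    instruction = INSTRUCTION_TABLE.get(recognition_result)
    if instruction is None:
        return False
    vehicle.execute(instruction)
    return True


car = Vehicle()
handled = control_vehicle(car, "open_window")
ignored = control_vehicle(car, "unrecognized_result")
```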
In this embodiment of the present invention, a lip shape recognition model can be trained, and the captured lip shape image to be processed of the target object in the vehicle is used as the input data of the model. After the lip shape image to be processed is input into the lip shape recognition model, the model performs lip shape recognition on it, determines which vehicle function the image is intended to activate or deactivate, and produces the target lip shape recognition result. The corresponding control instruction is then generated, and the vehicle is controlled based on that instruction to activate or deactivate the corresponding function. Because deaf-mute users are no longer limited to controlling the vehicle manually, the diversity of vehicle control is improved. This solves the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition, and achieves the technical effect that a vehicle can be controlled through lip shape recognition.
Embodiment 3
According to an embodiment of the present invention, a vehicle control device is also provided. It should be noted that the vehicle control device can be used to perform the vehicle control method of Embodiment 1.
FIG. 3 is a schematic diagram of a vehicle control device according to an embodiment of the present invention. As shown in FIG. 3, the vehicle control device 300 may include an acquisition unit 302, a recognition unit 304, a determination unit 306, and a control unit 308.
The acquisition unit 302 is configured to acquire a lip shape image to be processed of a target object in the vehicle, where the lip shape image to be processed represents a functional requirement of the target object for the vehicle.
The recognition unit 304 is configured to input the lip shape image to be processed into a lip shape recognition model for lip shape recognition to obtain a target lip shape recognition result, where the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data.
The determination unit 306 is configured to determine the control instruction corresponding to the target lip shape recognition result.
The control unit 308 is configured to control a function in the vehicle based on the control instruction.
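The way the four units cooperate can be sketched as a simple pipeline in which each unit is a replaceable callable. This is only an illustration of the data flow between units 302 to 308; the class name, field names, and the stub functions in the usage example are assumptions and do not reflect an actual implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VehicleControlDevice:
    acquire: Callable[[], bytes]        # acquisition unit 302
    recognize: Callable[[bytes], str]   # recognition unit 304
    determine: Callable[[str], str]     # determination unit 306
    control: Callable[[str], None]      # control unit 308

    def run_once(self) -> str:
        """Pass one image through all four units and return the instruction."""
        image = self.acquire()
        result = self.recognize(image)
        instruction = self.determine(result)
        self.control(instruction)
        return instruction


# Stub units for demonstration; real units would wrap a camera, a model,
# an instruction table, and the vehicle's actuators.
executed = []
device = VehicleControlDevice(
    acquire=lambda: b"fake-image-bytes",
    recognize=lambda image: "open_window",
    determine=lambda result: f"CMD:{result}",
    control=executed.append,
)
issued = device.run_once()
```

Keeping the units as independent callables mirrors the point made later in the text that the division into units is logical and may be realized differently in practice.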
Optionally, the recognition unit 304 may include: a first acquisition module, configured to acquire lip shape image samples, where the lip shape image samples represent at least one lip shape image entered for each of the different functional requirements in the vehicle; an augmentation module, configured to perform data augmentation on the lip shape image samples to obtain the corresponding augmented image samples; a classification module, configured to classify the augmented image samples according to the categories of the corresponding functional requirements to obtain the corresponding lip shape data set, where the lip shape data set includes the classified lip shape image samples and augmented image samples; and a determination module, configured to determine the residual data based on the lip shape data set.
Optionally, the augmentation module may include: a first augmentation sub-module, configured to perform data augmentation on the lip shape image samples based on a generative adversarial network model to determine first augmented image samples, where the first augmented image samples represent lip shape images augmented by the generative adversarial network model; a second augmentation sub-module, configured to randomly rotate the lip shape image samples to determine second augmented image samples, where the second augmented image samples represent lip shape images to which rotation has been applied; a third augmentation sub-module, configured to apply noise processing to the lip shape image samples to determine third augmented image samples, where the third augmented image samples represent lip shape images to which noise has been added; and a first determination sub-module, configured to determine the lip shape image samples, the first augmented image samples, the second augmented image samples, and the third augmented image samples as the augmented image samples.
Optionally, the classification module may include: a second determination sub-module, configured to perform data preprocessing on the classified augmented image samples to determine the lip shape data set, where the data preprocessing includes at least one of the following: pixel brightness transformation, geometric transformation, and local neighborhood preprocessing.
Optionally, the first acquisition module may include: a first acquisition sub-module, configured to acquire a function selection instruction on the graphical user interface of the vehicle, where the function selection instruction is used to select the category of functional requirement corresponding to the lip shape to be entered; and a generation sub-module, configured to generate the lip shape image sample corresponding to the category of functional requirement based on a confirm-entry instruction on the graphical user interface, where the confirm-entry instruction is used to start recording the lip shape of the target object.
Optionally, the device may include: a second acquisition module, configured to acquire a consent-to-collect instruction on the graphical user interface of the vehicle; and a collection module, configured to collect, based on the consent-to-collect instruction, lip shape images of the target object through the image acquisition device in the vehicle while the vehicle is running.
Optionally, the device may further include: an input module, configured to input the lip shape images collected by the image acquisition device into the lip shape recognition model; and a model optimization module, configured to determine residual data based on the lip shape images and optimize the lip shape recognition model.
Optionally, the recognition unit 304 may include: a processing module, configured to, in response to the image acquisition device on the vehicle capturing the lip shape image to be processed, classify the lip shape image to be processed by category of functional requirement through the lip shape recognition model and determine the target lip shape recognition result corresponding to the lip shape image to be processed.
In this embodiment of the present invention, the acquisition unit acquires the lip shape image to be processed of the target object in the vehicle, where the lip shape image to be processed represents a functional requirement of the target object for the vehicle. The recognition unit inputs the lip shape image to be processed into the lip shape recognition model for lip shape recognition to obtain the target lip shape recognition result, where the lip shape recognition model is obtained by determining residual data of a neural network model from lip shape image samples and corresponding lip shape recognition result samples, and training the neural network model based on the residual data. The determination unit determines the control instruction corresponding to the target lip shape recognition result, and the control unit controls a function in the vehicle based on the control instruction. This solves the technical problem in the related art that a vehicle cannot be controlled through lip shape recognition, and achieves the technical effect that a vehicle can be controlled through lip shape recognition.
Embodiment 4
According to an embodiment of the present invention, a computer-readable storage medium is also provided. The storage medium includes a stored program, where the program, when run, performs the vehicle control method of Embodiment 1.
Embodiment 5
According to an embodiment of the present invention, a processor is also provided. The processor is configured to run a program, where the program, when run, performs the vehicle control method of Embodiment 1.
Embodiment 6
According to an embodiment of the present invention, a vehicle is also provided. The vehicle is configured to perform any one of the vehicle control methods of Embodiment 1.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the relevant descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310390827.7A | 2023-04-12 | 2023-04-12 | Vehicle control method and device and vehicle |

| Publication Number | Publication Date |
|---|---|
| CN116503941A | 2023-07-28 |

| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202310390827.7A | Pending | 2023-04-12 | 2023-04-12 | Vehicle control method and device and vehicle |
| Country | Link |
|---|---|
| CN (1) | CN116503941A (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109145815A (en)* | 2018-08-21 | 2019-01-04 | 深圳大学 | A kind of SAR target identification method, device, computer equipment and storage medium |
| CN109409195A (en)* | 2018-08-30 | 2019-03-01 | 华侨大学 | A kind of lip reading recognition methods neural network based and system |
| CN109840512A (en)* | 2019-02-28 | 2019-06-04 | 北京科技大学 | A kind of Facial action unit recognition methods and identification device |
| CN111242029A (en)* | 2020-01-13 | 2020-06-05 | 湖南世优电气股份有限公司 | Device control method, device, computer device and storage medium |
| CN111831570A (en)* | 2020-07-23 | 2020-10-27 | 深圳慕智科技有限公司 | Test case generation method oriented to automatic driving image data |
| CN112330713A (en)* | 2020-11-26 | 2021-02-05 | 南京工程学院 | Improvement method of speech comprehension in severely hearing impaired patients based on lip recognition |

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |