CN109243457A - Voice-based control method, device, equipment and storage medium - Google Patents

Voice-based control method, device, equipment and storage medium

Info

Publication number
CN109243457A
Authority
CN
China
Prior art keywords
voice
signal
voice signal
echo signal
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811311482.7A
Other languages
Chinese (zh)
Other versions
CN109243457B (en)
Inventor
杨亮
雷宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Rubu Technology Co ltd
Original Assignee
Beijing Intelligent Housekeeper Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Intelligent Housekeeper Technology Co Ltd
Priority to CN201811311482.7A
Publication of CN109243457A
Application granted
Publication of CN109243457B
Expired - Fee Related
Anticipated expiration


Abstract

Translated from Chinese

本发明实施例公开了一种基于语音的控制方法、装置、设备及存储介质。其中,该方法包括:采集至少两个语音信号;依据采集所述至少两个语音信号的时间及各语音信号与当前场景的匹配度,确定用户的意图。本发明实施例提供的技术方案,能够在存在多路语音信号及外部环境干扰的情况下,准确确定用户对终端的控制指令,从而提升了用户的体验。

Embodiments of the present invention disclose a voice-based control method, apparatus, device and storage medium. The method includes: collecting at least two voice signals; and determining the user's intention according to the times at which the at least two voice signals are collected and the matching degree between each voice signal and the current scene. The technical solution provided by the embodiments can accurately determine the user's control instruction for the terminal in the presence of multiple voice signals and external environmental interference, thereby improving the user experience.

Description

Voice-based control method, device, equipment and storage medium
Technical field
The embodiments of the present invention relate to the technical field of speech recognition, and in particular to a voice-based control method, apparatus, device and storage medium.
Background
Speech recognition technology is increasingly widely used in automobile cabins. Voice interaction lets the driver and passengers access in-vehicle services more naturally and quickly, and also avoids the danger and accidents that can arise when the driver's eyes leave the road.
However, when the in-vehicle environment is noisy, existing speech recognition technology cannot accurately determine the user's control instruction for the vehicle-mounted terminal. For example, if other people nearby are chatting or making sounds while the user is giving a voice instruction, interfering voices are produced, and it is difficult to determine from these multiple voices the control instruction that corresponds to the voice instruction. As a result, the vehicle-mounted terminal cannot be controlled accurately, which degrades the voice interaction.
Summary of the invention
The embodiment of the invention provides a kind of voice-based control method, device, equipment and storage mediums, can be accurateKnow and determines that user to the control instruction of terminal, improves the experience of user.
In a first aspect, the embodiment of the invention provides a kind of voice-based control methods, this method comprises:
Acquire at least two voice signals;
According to the time and at least two voice signal and current scene for acquiring at least two voice signalMatching degree determines that the target control to terminal instructs.
Second aspect, the embodiment of the invention also provides a kind of voice-based control device, which includes:
Acquisition module, for acquiring at least two voice signals;
Target instruction target word determining module, for according to the time and described at least two for acquiring at least two voice signalThe matching degree of voice signal and current scene determines that the target control to terminal instructs.
The third aspect, the embodiment of the invention also provides a kind of equipment, which includes:
One or more processors;
Storage device, for storing one or more programs;
When one or more of programs are executed by one or more of processors, so that one or more of processingDevice realizes any voice-based control method in first aspect.
Fourth aspect, the embodiment of the invention also provides a kind of storage mediums, are stored thereon with computer program, the programAny voice-based control method in first aspect is realized when being executed by processor.
Technical solution provided in an embodiment of the present invention, by each voice signal for being acquired to speech collecting system according to acquisitionTime and the control instruction to terminal is determined with matching for current scene, the program can be there are multi-path voice signals and outerIt is accurate to determine user to the control instruction of terminal, to improve the experience of user in the case where portion's environmental disturbances.
Brief description of the drawings
Figure 1A is a flowchart of a voice-based control method provided in Embodiment one of the present invention;
Figure 1B is a schematic diagram of a voice collection system to which the embodiments of the present invention are applicable;
Fig. 2 is a flowchart of a voice-based control method provided in Embodiment two of the present invention;
Fig. 3 is a flowchart of a voice-based control method provided in Embodiment three of the present invention;
Fig. 4 is a structural block diagram of a voice-based control apparatus provided in Embodiment four of the present invention;
Fig. 5 is a structural schematic diagram of a device provided in Embodiment five of the present invention.
Detailed description of the embodiments
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described here serve only to explain the embodiments of the present invention and do not limit the invention. It should also be noted that, for ease of description, the drawings show only the parts related to the embodiments of the present invention rather than the entire structure.
Embodiment one
Figure 1A is a flowchart of the voice-based control method provided in Embodiment one of the present invention. This embodiment is applicable to accurately determining the user's control instruction for a terminal, and is particularly suited to a vehicle environment with fixed seats, where the control instruction must be determined accurately despite external interference in the cabin such as multiple voices. The method can be executed by the voice-based control apparatus provided by the embodiments of the present invention, and the apparatus can be implemented in software and/or hardware. Referring to Figure 1A, the method specifically includes:
S110: Collect at least two voice signals.
A voice signal is a signal that contains a user's voice instruction and can be collected with a collection device such as a microphone. Illustratively, a voice collection system can be used to collect the at least two voice signals, where the voice collection system is constructed in advance for collecting voice signals. Optionally, the voice collection system can consist of multiple microphones or a microphone array.
In an environment with fixed vehicle seats, the goal is to accurately recognize the intention of any occupant, for example the driver or the front passenger, when there is interference such as other voices in the cabin, that is, when multiple voice signals are present. To this end, the voice collection system can be constructed according to the seat layout in the vehicle. Optionally, the voice collection system includes at least two dual-microphone units, each consisting of two microphones, and the position of each dual-microphone unit is determined according to the position of the corresponding sounding point.
Here, two microphones are regarded as one dual-microphone unit, and the sounding point is the occupant's mouth. The sounding point directly faces the perpendicular bisector of the line connecting the two microphones; that is, the perpendicular bisector plane of the line between the two microphones of each unit contains the sounding point, and each sounding point corresponds to one dual-microphone unit.
Illustratively, the position of each dual-microphone unit can be determined by a user or by a position determination model performing the following operations. The position determination model is a model trained in advance to determine the position of each dual-microphone unit: the position of the sounding point, the preset mounting plane and the center point position are input to the model, and the model combines them with its own parameters to output the installation position of that dual-microphone unit.
A. Determine the projection point of the sounding point on the mounting plane according to the position of the sounding point and the preset mounting plane.
The preset mounting plane is a plane set in advance for installing the microphones, such as the center console. Note that, because of the seat layout in the vehicle, different sounding points may correspond to different mounting planes or to the same mounting plane. Because occupants differ in height, the position of the sounding point varies, which would leave the microphone position unfixed. Therefore, to make it possible to fix the microphones, the sounding point position is set using a standard or average height, within a controllable range such as 3-5 degrees.
For example, as shown in Figure 1B, the driver and the front passenger are both in the front row, so a dual-microphone unit can be provided for each of them on the same mounting plane. Specifically, the two sounding point positions are the driver's mouth position S1 and the front passenger's mouth position S2, and the mounting plane is M1. In the vertical plane, a perpendicular is drawn from the sounding point position to the mounting plane, and the foot of that perpendicular on the mounting plane is the projection point of the sounding point. For example, as shown in Figure 1B, the perpendicular from point S1 to mounting plane M1 meets M1 at projection point S1′, and the perpendicular from point S2 to M1 meets M1 at projection point S2′.
B. Determine the installation position of the dual-microphone unit corresponding to the sounding point according to the first distance between the position of the projection point and the center point position, and the linear relationship between the first distance and the second distance.
The center point position is preset according to the sounding point, and each sounding point corresponds to one center point. Specifically, the center point position is the position in the region of the mounting plane directly facing the occupant, for example the driver. In Figure 1B, the center point position corresponding to S1 is O1, the position on the mounting plane directly facing S1, and the center point position corresponding to S2 is O2, the position on the mounting plane directly facing S2. Figure 1B also shows three microphones: MIC1, MIC0 and MIC2.
For each microphone, the distance between the microphone's position and the center point position is the second distance, for example the distance between MIC1 and O1 in Figure 1B. The distance between the position of the projection point and the center point position is the first distance, for example the distance between S1′ and O1 in Figure 1B. Optionally, the first distance is 50 times the second distance; for example, in Figure 1B the distance S1′O1 is 50 times the distance between MIC1 and O1.
Specifically, once the projection point of the sounding point has been determined and the center point position has been preset according to the sounding point, the position of each microphone in each dual-microphone unit can be uniquely determined from the linear relationship between the first distance (projection point to center point position) and the second distance (microphone to center point position).
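Operations A and B can be illustrated with a minimal geometric sketch. It is not the patented position determination model. It assumes that the mounting plane is given by a point and a normal vector, that the center point lies in that plane and is the midpoint of the two microphones, and that the microphone axis is chosen so the sounding point falls on the perpendicular bisector plane of the pair. The function names and the default ratio of 50 (taken from the optional example above) are illustrative.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, k): return tuple(x * k for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def project_onto_plane(point, plane_point, plane_normal):
    """Operation A: foot of the perpendicular from `point` to the plane."""
    n = scale(plane_normal, 1.0 / norm(plane_normal))
    offset = dot(sub(point, plane_point), n)  # signed distance to the plane
    return sub(point, scale(n, offset))

def dual_mic_positions(sounding_point, plane_point, plane_normal, center, ratio=50.0):
    """Operation B (sketch): place two microphones symmetrically about `center`.

    Assumption: first_distance = ratio * second_distance, and the microphone
    axis lies in the plane, perpendicular to the line from center to projection.
    """
    n = scale(plane_normal, 1.0 / norm(plane_normal))
    projection = project_onto_plane(sounding_point, plane_point, plane_normal)
    to_projection = sub(projection, center)
    first_distance = norm(to_projection)
    if first_distance == 0.0:
        raise ValueError("projection coincides with center; axis is undefined")
    second_distance = first_distance / ratio
    axis = cross(n, to_projection)
    axis = scale(axis, 1.0 / norm(axis))
    offset = scale(axis, second_distance)
    return add(center, offset), sub(center, offset)

# Mounting plane z = 0, sounding point 0.6 above it and 0.5 from the center point.
mic_a, mic_b = dual_mic_positions((0.5, 0.0, 0.6), (0.0, 0.0, 0.0),
                                  (0.0, 0.0, 1.0), (0.0, 0.0, 0.0))
```

With this choice of axis, both microphones are equidistant from the sounding point, which is the perpendicular bisector condition stated for each dual-microphone unit.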
Note that each sounding point normally corresponds to one dual-microphone unit. If the installation positions of the two dual-microphone units for two sounding points overlap, the voice collection system can be built with a shared microphone. Figure 1B shows such a case: two dual-microphone units made up of three microphones, where the driver's unit is MIC1 and MIC0 and the front passenger's unit is MIC2 and MIC0.
Illustratively, the number of dual-microphone units can match the number of seats, according to the seat layout in the vehicle. For example, in a five-seat vehicle, two dual-microphone units can be placed on the center console for the driver and the front passenger, and three dual-microphone units can be placed at corresponding positions behind the front seats for the three rear occupants.
Note that, in a voice collection system built this way, the voice signal collected by one dual-microphone unit is one voice signal; that is, the speech collected by the unit's two microphones is combined into a single voice signal. In a conventional voice collection system, by contrast, the speech collected by each microphone is one voice signal.
S120: Determine the target control instruction for the terminal according to the times at which the at least two voice signals are collected and the matching degree between the at least two voice signals and the current scene.
The matching degree is the degree of correlation between the semantics of a voice signal and the current environment, and can be determined by semantic analysis of the text content of the voice signal. For example, if the current scene is driving, voice control of the vehicle's operating system, such as opening navigation, turning off the air conditioner or opening a window, is related to the driving scene, whereas other chat speech is unrelated to it. The terminal is a device with intelligent capabilities; optionally, in this embodiment the terminal is a vehicle-mounted terminal. The target control instruction is a voice instruction that can control the terminal to perform a series of operations.
Specifically, voice signals unrelated to the current scene can be rejected according to the matching degrees of the at least two voice signals; the remaining voice signals are sorted by collection time to obtain the earliest one; the corresponding control instruction is obtained from that earliest voice signal; and the user's intention is carried out according to the control instruction.
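The selection rule in S120 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Candidate` type, the numeric matching-degree scale from 0 to 1 and the `threshold` cut-off for "unrelated" are all assumptions, since the text does not specify how relevance is quantified.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Candidate:
    text: str               # recognized text of the voice signal
    collected_at: float     # collection time in seconds (hypothetical unit)
    matching_degree: float  # assumed scale: 0.0 (unrelated) to 1.0 (fully related)

def select_target(candidates: Sequence[Candidate],
                  threshold: float = 0.5) -> Optional[Candidate]:
    """Reject scene-unrelated signals, then return the earliest remaining one."""
    relevant = [c for c in candidates if c.matching_degree >= threshold]
    if not relevant:
        return None  # nothing matches the current scene, so nothing is executed
    return min(relevant, key=lambda c: c.collected_at)

signals = [
    Candidate("did you watch the game", 0.1, 0.1),  # earliest, but only chat
    Candidate("open navigation", 0.4, 0.9),
    Candidate("open window", 0.7, 0.8),
]
chosen = select_target(signals)
```

Here the chat utterance arrives first but is rejected, so the earliest scene-related signal is the one passed on to produce the control instruction.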
In the technical solution provided by this embodiment, the control instruction for the terminal is determined from the collection time of each voice signal collected by the voice collection system and from how well each signal matches the current scene. The solution can accurately determine the user's control instruction for the terminal in the presence of multiple voice signals and external environmental interference, thereby improving the user experience.
Embodiment two
Fig. 2 is a flowchart of a voice-based control method provided in Embodiment two of the present invention. This embodiment is a further optimization on the basis of Embodiment one. Referring to Fig. 2, the method specifically includes:
S210: Collect at least two voice signals.
The at least two voice signals can be collected using a voice collection system that includes at least two dual-microphone units, each consisting of two microphones, where the position of each dual-microphone unit is determined according to the position of the sounding point.
S220: Process the at least two voice signals using preset rules to obtain, for each of the at least two voice signals, the text content corresponding to its target signal and the start time of each target signal.
The preset rules are rules set in advance for processing the voice signal collected by each dual-microphone unit. They can be used to separate each voice signal, retaining speech within a specific angular range and suppressing speech outside it, and to perform speech recognition on each separated voice signal.
The target signal is the voice signal obtained after the preset rules remove the non-speech portion of each voice signal, that is, the speech portion. Correspondingly, the start time of the target signal is the time at which the speech portion begins, and the text content corresponding to the target signal is the text obtained by converting the speech portion to text. Optionally, the text content can be the full text of the target signal or keywords.
Specifically, processing the voice signal collected by each dual-microphone unit with the preset rules yields the target signal of each voice signal, the text content corresponding to the target signal, and the start time of the target signal.
S230: Input each text content to a semantic understanding engine to obtain the matching degree between each target signal and the current scene.
The semantic understanding engine can be a semantic analysis model trained in advance, which can split the input text content into sentences, segment words, extract keywords and perform semantic analysis. The matching degree is the degree of correlation between the semantics of the target signal and the current environment, and can be determined by semantic analysis of the text content of the target signal. For example, if the current scene is driving, voice control of the vehicle's operating system, such as opening navigation, turning off the air conditioner or opening a window, is related to the driving scene, whereas other chat speech is unrelated to it. Illustratively, if the text content is a keyword, the semantic understanding engine performs semantic analysis directly on the keyword to determine its matching degree with the current scene.
The matching degree can be determined by the priority or the number of control instructions contained in the target signal. For example, the control instruction for opening navigation takes priority over the control instruction for opening a window. If target signal A contains the instruction to open navigation and target signal B contains the instruction to open a window, then the matching degree between A and the current scene is higher than that between B and the current scene.
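The priority-based comparison can be sketched with a simple lookup table standing in for the semantic understanding engine. The table contents are assumptions: the text only states that opening navigation outranks opening a window, so the value given to the air conditioner command and the substring matching are purely illustrative.

```python
# Hypothetical priorities for a driving scene; higher means a better match.
COMMAND_PRIORITY = {
    "open navigation": 3,
    "close air conditioner": 2,  # assumed position in the ranking
    "open window": 1,
}

def matching_degree(text: str) -> int:
    """Return the highest priority of any known command found in the text.

    Text that contains no known command scores 0, meaning it is treated as
    unrelated to the current scene.
    """
    lowered = text.lower()
    return max((priority for command, priority in COMMAND_PRIORITY.items()
                if command in lowered), default=0)

degree_a = matching_degree("Please open navigation")
degree_b = matching_degree("open window")
degree_chat = matching_degree("nice weather today")
```

A trained semantic model would replace the substring test, but the ordering it produces for signals A and B would follow the same idea.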
S240: Determine the target control instruction for the terminal according to each matching degree and the start time of each target signal.
Specifically, if multiple target signals exist at the same time, target signals unrelated to the current scene can be rejected according to the matching degree of each target signal; the remaining target signals are sorted by start time to obtain the earliest one; the corresponding control instruction is obtained from the text content of that earliest target signal; and the user's intention is carried out according to the control instruction.
For example, as shown in Figure 1B, if the dual-microphone units facing the driver and the front passenger have both collected voice signals, then after the two voice signals are processed to obtain the corresponding target signals and their start times, the semantic understanding engine must judge which direction's target signal has the higher matching degree with the current scene, and the target signal from that direction is processed first.
Illustratively, if one target signal has the earliest start time but the semantic understanding engine judges it to be unrelated to the current scene, the other target signals can be processed instead.
Optionally, if there is only one target signal and the semantic understanding engine judges it to be related to the current scene, the corresponding control instruction can be obtained from the text content of the target signal, and the user's intention is carried out according to the control instruction. If there is only one target signal and the semantic understanding engine judges it to be unrelated to the current scene, no processing is performed.
In the technical solution provided by this embodiment, each voice signal collected by the voice collection system is processed using preset rules to obtain the text content corresponding to the target signal in each voice signal and the start time of each target signal; each text content is input to the semantic understanding engine to obtain the matching degree between each target signal and the current scene; and the target control instruction for the terminal is determined from the start time and matching degree of each target signal. The solution can accurately determine the user's control instruction for the terminal in the presence of multiple voice signals and external environmental interference, thereby improving the user experience.
Embodiment three
Fig. 3 is a flowchart of a voice-based control method provided in Embodiment three of the present invention. This embodiment is a further optimization on the basis of the above embodiments. Referring to Fig. 3, the method specifically includes:
S310: Collect at least two voice signals.
S320: Process the at least two voice signals using a beamforming algorithm to obtain the preliminary voice signal corresponding to each of the at least two voice signals.
Beamforming is a method for reducing signal dimensionality or extracting the signal from a particular range, and also a method for separating signals. In this embodiment, beamforming can retain speech within a specific angular range, such as 10 degrees, around the perpendicular bisector of each dual-microphone unit and suppress speech outside that range. A preliminary voice signal is the voice signal obtained after the collected voice signal is processed by the beamforming algorithm.
Specifically, applying the beamforming algorithm to the voice signal collected by each dual-microphone unit yields the preliminary voice signal corresponding to each voice signal. If the voice collection system is built as shown in Figure 1B, the beamforming algorithm yields a preliminary voice signal containing the driver's or the front passenger's speech; that is, beamforming makes the position of the speech source clear.
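The text does not say which beamforming algorithm is used. As one common possibility, a delay-and-sum beamformer for a two-microphone pair can be sketched as below; the function name, the integer sample delay and the toy signals are assumptions for illustration. A source on the perpendicular bisector reaches both microphones at the same time, so a steering delay of zero keeps it intact, while a source off to one side arrives with a relative delay and is partly cancelled.

```python
def delay_and_sum(mic_a, mic_b, delay_samples=0):
    """Average two channels after delaying mic_b by a non-negative sample count.

    delay_samples = 0 steers the beam toward the perpendicular bisector of the
    microphone pair (broadside), where both channels are already aligned.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    length = min(len(mic_a), len(mic_b))
    output = []
    for i in range(length):
        j = i - delay_samples
        b = mic_b[j] if 0 <= j < length else 0.0  # zero-pad outside the signal
        output.append(0.5 * (mic_a[i] + b))
    return output

# On-axis talker: identical samples at both microphones.
on_axis = [1.0, -1.0, 1.0, -1.0]
kept = delay_and_sum(on_axis, on_axis)

# Off-axis talker: the second microphone hears the same signal one sample late.
off_axis_b = [0.0, 1.0, -1.0, 1.0]
suppressed = delay_and_sum(on_axis, off_axis_b)

def energy(signal):
    return sum(x * x for x in signal)
```

In this toy case the on-axis signal passes unchanged while the off-axis signal loses most of its energy, which is the retain-and-suppress behavior described for S320. A production system would use fractional delays or an adaptive beamformer.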
S330: Perform speech endpoint detection on each preliminary voice signal to obtain the target signal corresponding to each preliminary voice signal and the start time of each target signal.
Speech endpoint detection accurately determines, within an input signal that contains background and environmental noise, the start and end points of the speech segments. In other words, it distinguishes speech from non-speech in the signal stream of a complex application environment and determines where the speech begins and ends.
The target signal is the voice signal obtained after the non-speech portion of the preliminary voice signal is removed, that is, the speech portion. Correspondingly, the start time of the target signal is the time at which the speech portion begins.
Specifically, performing speech endpoint detection on the preliminary voice signal corresponding to the voice signal collected by each dual-microphone unit yields the target signal corresponding to that preliminary voice signal and the start time of each target signal.
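Speech endpoint detection can be done in many ways; the text does not name one. A minimal energy-threshold sketch is shown below, assuming a fixed frame length and a hand-picked threshold, both of which are illustrative values rather than parameters from the source.

```python
def detect_speech(samples, sample_rate, frame_len=160, threshold=0.01):
    """Return (start_s, end_s) spanning the first to last active frame, or None.

    A frame is active when its mean squared amplitude exceeds `threshold`.
    Trailing samples that do not fill a whole frame are ignored.
    """
    active_starts = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_energy = sum(x * x for x in frame) / frame_len
        if mean_energy > threshold:
            active_starts.append(start)
    if not active_starts:
        return None  # no speech portion found
    start_s = active_starts[0] / sample_rate
    end_s = (active_starts[-1] + frame_len) / sample_rate
    return start_s, end_s

# 20 ms of silence, 30 ms of "speech", 10 ms of silence at 16 kHz.
signal = [0.0] * 320 + [0.5] * 480 + [0.0] * 160
endpoints = detect_speech(signal, sample_rate=16000)
```

The returned start time plays the role of the target signal's start time, and slicing the samples between the two endpoints gives the speech portion. Real detectors usually add smoothing and noise-adaptive thresholds.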
S340: Perform speech recognition on each target signal to obtain the text content corresponding to each target signal.
The text content corresponding to a target signal is the text obtained by converting the speech portion to text. Specifically, speech recognition technology can be used to recognize each target signal and obtain its text content. Optionally, the text content can be the full text of the target signal or keywords.
S350: Input each text content to the semantic understanding engine to obtain the matching degree between each target signal and the current scene.
S360: Determine the target control instruction for the terminal according to each matching degree and the start time of each target signal.
In the technical solution provided by this embodiment, each voice signal collected by the voice collection system is processed with a beamforming algorithm and speech endpoint detection to obtain the text content corresponding to the target signal in each voice signal and the start time of each target signal; each text content is input to the semantic understanding engine to obtain the matching degree between each target signal and the current scene; and the control instruction for the terminal is determined from the start time and matching degree of each target signal. The solution can accurately determine the user's control instruction for the terminal in the presence of multiple voice signals and external environmental interference, thereby improving the user experience.
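The flow of steps S320 to S360 can be sketched as a small pipeline whose stages are injected as callables. This is an illustrative skeleton only: the function name, the stage signatures, the numeric matching degree and the `threshold` are assumptions, and the stand-in stages in the usage example are stubs rather than real signal processing.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

def determine_target_text(
    raw_signals: Iterable[Any],
    beamform: Callable[[Any], Any],
    detect_endpoints: Callable[[Any], Optional[Tuple[Any, float]]],
    recognize: Callable[[Any], str],
    match_degree: Callable[[str], float],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the text of the earliest scene-related target signal, if any."""
    candidates = []
    for raw in raw_signals:
        preliminary = beamform(raw)              # S320: preliminary voice signal
        segment = detect_endpoints(preliminary)  # S330: (target signal, start time)
        if segment is None:
            continue                             # no speech portion in this signal
        target, start_time = segment
        text = recognize(target)                 # S340: text content
        if match_degree(text) >= threshold:      # S350: matching degree
            candidates.append((start_time, text))
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]  # S360: earliest related signal

# Stub stages: each "signal" already carries its text and start time.
stub_signals = [
    {"text": "what a long day", "start": 0.1},
    {"text": "open navigation", "start": 0.3},
    {"text": "", "start": 0.0},  # silence only
]
result = determine_target_text(
    stub_signals,
    beamform=lambda s: s,
    detect_endpoints=lambda s: (s, s["start"]) if s["text"] else None,
    recognize=lambda s: s["text"],
    match_degree=lambda text: 1.0 if "open" in text else 0.0,
)
```

Passing the stages in as parameters keeps the ordering of the steps visible while leaving the choice of beamformer, endpoint detector, recognizer and semantic engine open.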
Embodiment four
Fig. 4 is a structural block diagram of a voice-based control apparatus provided in Embodiment four of the present invention. The apparatus can execute the voice-based control method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the method. As shown in Fig. 4, the apparatus may include:
an acquisition module 410, configured to collect at least two voice signals;
a target instruction determining module 420, configured to determine a target control instruction for a terminal according to the times at which the at least two voice signals are collected and the matching degree between the at least two voice signals and the current scene.
In the technical solution provided by this embodiment, the control instruction for the terminal is determined from the collection time of each voice signal collected by the voice collection system and from how well each signal matches the current scene. The solution can accurately determine the user's control instruction for the terminal in the presence of multiple voice signals and external environmental interference, thereby improving the user experience.
Illustratively, the target instruction determining module may include:
a signal time determination unit, configured to process the at least two voice signals using preset rules to obtain, for each of the at least two voice signals, the text content corresponding to its target signal and the start time of each target signal;
a matching degree determination unit, configured to input each text content to the semantic understanding engine to obtain the matching degree between each target signal and the current scene;
a target instruction determination unit, configured to determine the target control instruction for the terminal according to the matching degree and the start time of each target signal.
Illustratively, the signal time determination unit is specifically configured to:
process the at least two voice signals using a beamforming algorithm to obtain the preliminary voice signal corresponding to each of the at least two voice signals;
perform speech endpoint detection on each preliminary voice signal to obtain the target signal corresponding to each preliminary voice signal and the start time of each target signal;
perform speech recognition on each target signal to obtain the text content corresponding to each target signal.
Optionally, the acquisition module 410 can be configured to collect the at least two voice signals using a voice collection system. Note that the voice collection system in this embodiment includes at least two dual-microphone units, each consisting of two microphones, and the position of each dual-microphone unit is determined according to the position of the corresponding sounding point.
Illustratively, the position of each dual-microphone unit is determined by the following operations:
determining the projection point of the sounding point on the mounting plane according to the position of the sounding point and the preset mounting plane;
determining the installation position of the dual-microphone unit corresponding to the sounding point according to the first distance between the position of the projection point and the center point position, and the linear relationship between the first distance and the second distance, where the second distance is the distance between the position of a microphone and the center point position, and the center point position is preset according to the sounding point.
Embodiment five
Fig. 5 is a kind of structural schematic diagram of the equipment provided in the embodiment of the present invention five, and Fig. 5, which is shown, to be suitable for being used to realizingThe block diagram of the example devices of embodiment of the embodiment of the present invention.The equipment 12 that Fig. 5 is shown is only an example, should not be to thisThe function and use scope of inventive embodiments bring any restrictions.
As shown in figure 5, equipment 12 is showed in the form of universal computing device.The component of equipment 12 may include but unlimitedIn one or more processor or processing unit 16, system storage 28, connecting different system components, (including system is depositedReservoir 28 and processing unit 16) bus 18.
Bus 18 indicates one of a few class bus structures or a variety of, including memory bus or Memory Controller,Peripheral bus, graphics acceleration port, processor or the local bus using any bus structures in a variety of bus structures.It liftsFor example, these architectures include but is not limited to industry standard architecture (ISA) bus, microchannel architecture (MAC)Bus, enhanced isa bus, Video Electronics Standards Association (VESA) local bus and peripheral component interconnection (PCI) bus.
Equipment 12 typically comprises a variety of computer system readable media.These media can be it is any can be by equipment 12The usable medium of access, including volatile and non-volatile media, moveable and immovable medium.
System storage 28 may include the computer system readable media of form of volatile memory, such as arbitrary accessMemory (RAM) 30 and/or cache memory 32.Equipment 12 may further include it is other it is removable/nonremovable,Volatile/non-volatile computer system storage medium.Only as an example, storage system 34 can be used for reading and writing irremovable, non-volatile magnetic media (Fig. 5 do not show, commonly referred to as " hard disk drive ").Although being not shown in Fig. 5, use can be providedIn the disc driver read and write to removable non-volatile magnetic disk (such as " floppy disk "), and to removable anonvolatile optical diskThe CD drive of (such as CD-ROM, DVD-ROM or other optical mediums) read-write.In these cases, each driver canTo be connected by one or more data media interfaces with bus 18.System storage 28 may include that at least one program producesProduct, the program product have one group of (for example, at least one) program module, these program modules are configured to perform of the invention realApply the function of each embodiment of example.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in system storage 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. Program modules 42 generally carry out the functions and/or methods of the embodiments described in the present invention.
Equipment 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with equipment 12, and/or with any device (such as a network card, a modem, etc.) that enables equipment 12 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interface 22. Equipment 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through network adapter 20. As shown, network adapter 20 communicates with the other modules of equipment 12 over bus 18. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with equipment 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Processing unit 16 runs the programs stored in system storage 28 and thereby performs various functional applications and data processing, for example implementing the voice-based control method provided by the embodiments of the present invention.
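As a rough illustration of the kind of program processing unit 16 might run, the sketch below follows the method as summarized in the abstract: collect at least two voice signals, then determine the user's intent from their capture times and their matching degree with the current scene. The names `VoiceSignal`, `determine_intent`, and `match_threshold`, and the selection rule itself (discard poorly matching signals, then prefer the most recent), are assumptions made for illustration; this section does not specify how time and matching degree are combined.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VoiceSignal:
    """One collected voice signal (hypothetical structure)."""
    text: str            # recognized text of the signal
    captured_at: float   # capture time, in seconds
    scene_match: float   # matching degree with the current scene, in [0, 1]


def determine_intent(signals: Sequence[VoiceSignal],
                     match_threshold: float = 0.5) -> Optional[VoiceSignal]:
    """Return the signal taken as the user's intent, or None if none fits."""
    if len(signals) < 2:
        raise ValueError("the method collects at least two voice signals")
    # Assumption: signals that fit the current scene poorly are interference.
    candidates = [s for s in signals if s.scene_match >= match_threshold]
    if not candidates:
        return None
    # Assumption: among fitting signals the latest one wins;
    # equal capture times are broken by the higher matching degree.
    return max(candidates, key=lambda s: (s.captured_at, s.scene_match))
```

For example, given "open the window" at 1.0 s (match 0.9), "close the window" at 1.5 s (match 0.8), and "play music" at 2.0 s (match 0.2), this rule discards the last signal as interference and returns "close the window". A real implementation could weight time and matching degree differently.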
Embodiment six
Embodiment six of the present invention further provides a computer-readable storage medium on which a computer program (also called computer-executable instructions) is stored. When executed by a processor, the program implements the voice-based control method described in any of the above embodiments.
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium, other than a computer-readable storage medium, that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code for carrying out the operations of the embodiments of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above describes only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the embodiments of the present invention have been described in detail through the above embodiments, they are not limited to those embodiments and may include further equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

CN201811311482.7A | 2018-11-06 | 2018-11-06 | Voice-based control method, device, equipment and storage medium | Expired - Fee Related | CN109243457B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201811311482.7A | 2018-11-06 | 2018-11-06 | Voice-based control method, device, equipment and storage medium (CN109243457B (en))

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201811311482.7A | 2018-11-06 | 2018-11-06 | Voice-based control method, device, equipment and storage medium (CN109243457B (en))

Publications (2)

Publication Number | Publication Date
CN109243457A (en) | 2019-01-18
CN109243457B (en) | 2023-01-17

Family

ID=65077002

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201811311482.7A | Voice-based control method, device, equipment and storage medium (Expired - Fee Related, CN109243457B (en)) | 2018-11-06 | 2018-11-06

Country Status (1)

Country | Link
CN (1) | CN109243457B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109785123A (en) * | 2019-01-21 | 2019-05-21 | 中国平安财产保险股份有限公司 | A business handling assistance method, device and terminal device
CN110472095A (en) * | 2019-08-16 | 2019-11-19 | 百度在线网络技术(北京)有限公司 | Voice guide method, apparatus, equipment and medium
CN112133307A (en) * | 2020-08-31 | 2020-12-25 | 百度在线网络技术(北京)有限公司 | Human-computer interaction method, device, electronic device and storage medium
CN114999471A (en) * | 2022-04-20 | 2022-09-02 | 青岛海尔空调器有限总公司 | Method, device and server for controlling smart device, storage medium
CN116246653A (en) * | 2022-12-20 | 2023-06-09 | 小米汽车科技有限公司 | Voice endpoint detection method, device, electronic device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104572258A (en) * | 2013-10-18 | 2015-04-29 | 通用汽车环球科技运作有限责任公司 | Methods and apparatus for processing multiple audio streams at vehicle onboard computer system
US20170004826A1 (en) * | 2014-06-11 | 2017-01-05 | Honeywell International Inc. | Adaptive beam forming devices, methods, and systems
CN206312567U (en) * | 2016-12-15 | 2017-07-07 | 北京塞宾科技有限公司 | A kind of portable intelligent household speech control system
CN106940997A (en) * | 2017-03-20 | 2017-07-11 | 海信集团有限公司 | A kind of method and apparatus that voice signal is sent to speech recognition system
CN107742522A (en) * | 2017-10-23 | 2018-02-27 | 科大讯飞股份有限公司 | Target voice acquisition methods and device based on microphone array
CN108286386A (en) * | 2018-01-22 | 2018-07-17 | 奇瑞汽车股份有限公司 | The method and apparatus of vehicle window control

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109785123A (en) * | 2019-01-21 | 2019-05-21 | 中国平安财产保险股份有限公司 | A business handling assistance method, device and terminal device
CN110472095A (en) * | 2019-08-16 | 2019-11-19 | 百度在线网络技术(北京)有限公司 | Voice guide method, apparatus, equipment and medium
CN110472095B (en) * | 2019-08-16 | 2023-03-10 | 百度在线网络技术(北京)有限公司 | Voice guidance method, device, equipment and medium
CN112133307A (en) * | 2020-08-31 | 2020-12-25 | 百度在线网络技术(北京)有限公司 | Human-computer interaction method, device, electronic device and storage medium
CN114999471A (en) * | 2022-04-20 | 2022-09-02 | 青岛海尔空调器有限总公司 | Method, device and server for controlling smart device, storage medium
CN114999471B (en) * | 2022-04-20 | 2025-03-21 | 青岛海尔空调器有限总公司 | Method, device, server, and storage medium for controlling smart devices
CN116246653A (en) * | 2022-12-20 | 2023-06-09 | 小米汽车科技有限公司 | Voice endpoint detection method, device, electronic device and storage medium

Also Published As

Publication number | Publication date
CN109243457B (en) | 2023-01-17

Similar Documents

Publication | Publication Date | Title
CN109243457A (en) | | Voice-based control method, device, equipment and storage medium
JP6977004B2 (en) | | In-vehicle devices, methods and programs for processing vocalizations
US12347448B2 (en) | | Wearable system speech processing
CN109410978A (en) | | A kind of speech signal separation method, apparatus, electronic equipment and storage medium
CN111883166B (en) | | Voice signal processing method, device, equipment and storage medium
US20150325240A1 (en) | | Method and system for speech input
CN108573702A (en) | | Speech-enabled systems with domain disambiguation
JP2009080309A (en) | | Voice recognition apparatus, voice recognition method, voice recognition program, and recording medium on which voice recognition program is recorded
KR20220130739A (en) | | speech recognition
EP4139920B1 (en) | | Text-based echo cancellation
CN113674742A (en) | | Man-machine interaction method, device, equipment and storage medium
CN117133292A (en) | | In-car voice interaction method, device and vehicle based on audio and visual fusion
CN109308909A (en) | | A signal separation method, device, electronic device and storage medium
CN110111782A (en) | | Voice interactive method and equipment
CN109215648A (en) | | Vehicle-mounted voice identifying system and method
CN112927688B (en) | | Voice interaction method and system for vehicles
JP2022116285A (en) | | Speech processing method, device, electronic device, storage medium and computer program for vehicle
CN110737422B (en) | | Sound signal acquisition method and device
CN109712606A (en) | | A kind of information acquisition method, device, equipment and storage medium
CN118197315A (en) | | Cabin voice interaction method, system and computer readable medium
CN113936649A (en) | | Speech processing method, device and computer equipment
CN116580713A (en) | | Vehicle-mounted voice recognition method, device, equipment and storage medium
WO2020144857A1 (en) | | Information processing device, program, and information processing method
JP2020079865A (en) | | Information processing device, agent system, information processing method, and program
JP7192561B2 (en) | | Audio output device and audio output method

Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | CB02 | Change of applicant information | Address after: Room 508-598, Xitian Gezhuang Town Government Office Building, No. 8 Xitong Road, Miyun District Economic Development Zone, Beijing 101500; Applicant after: BEIJING ROOBO TECHNOLOGY Co.,Ltd.; Address before: Room 508-598, Xitian Gezhuang Town Government Office Building, No. 8 Xitong Road, Miyun District Economic Development Zone, Beijing 101500; Applicant before: BEIJING INTELLIGENT STEWARD Co.,Ltd.
 | TA01 | Transfer of patent application right | Effective date of registration: 20210823; Address after: Room 301-112, floor 3, building 2, No. 18, YANGFANGDIAN Road, Haidian District, Beijing 100089; Applicant after: Beijing Rubu Technology Co.,Ltd.; Address before: Room 508-598, Xitian Gezhuang Town Government Office Building, No. 8 Xitong Road, Miyun District Economic Development Zone, Beijing 101500; Applicant before: BEIJING ROOBO TECHNOLOGY Co.,Ltd.
 | GR01 | Patent grant |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20230117
