CN110136722A

Info

Publication number: CN110136722A
Application number: CN201910281493.3A
Authority: CN
Inventors: 李赛; 娄晓磊; 王重乐
Original assignee: Beijing Xiaoniao Tingting Technology Co Ltd
Current assignee: Beijing Xiaoniao Tingting Technology Co Ltd
Priority date: 2019-04-09
Filing date: 2019-04-09
Publication date: 2019-08-16

Abstract

The present invention relates to a voice signal processing method, a voice signal processing apparatus, an electronic device, and a voice signal processing system. In the method, each device in a device group made up of multiple devices: receives a current voice signal; decides whether the voice signal needs a response, obtaining a decision result; and, if the decision result is that the voice signal needs a response, responds to the voice signal.

Description

Voice signal processing method, apparatus, device and system

Technical field

The present invention relates to the field of speech recognition, and more particularly to a voice signal processing method, a voice signal processing apparatus, an electronic device, and a voice signal processing system.

Background technique

With the development of speech recognition technology, more and more electronic devices interact with users by voice, making products smarter and more convenient. With smart speaker products, for example, a user can wake the device, control music playback, or check the weather by voice.

Because of the inherent nature of sound, the same voice signal can be received by multiple electronic devices and answered by all of them, which easily confuses the user. For example, when several devices share the same wake word and the user issues a voice wake-up, the devices' response times and even their response contents may be inconsistent, so the user receives muddled responses and the user experience suffers.

Summary of the invention

One object of the embodiments of the present invention is to provide a new technical solution for voice signal processing.

According to a first aspect of the present invention, a voice signal processing method is provided, characterized in that, for each device in a device group made up of multiple devices:

Receiving a current voice signal;

Deciding whether the voice signal needs a response, to obtain a decision result;

Responding to the voice signal if the decision result is that the voice signal needs a response.

Optionally, deciding whether the voice signal needs a response comprises:

Obtaining a set indicator of the voice signal as received by the device itself;

Obtaining the set indicator of the voice signal as received by the other devices in the device group;

Deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs a response.

Optionally, obtaining the set indicator of the voice signal as received by the other devices in the device group comprises:

Receiving the set indicators that the other devices send within a preset time period.

Optionally, the set indicator includes at least one of: the time at which the voice signal is received, and the intensity of the received voice signal.

Optionally, the set indicator is the time at which the voice signal is received; deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs a response comprises:

Determining that the decision result is that the voice signal needs a response if the device itself received the voice signal earliest.

Optionally, the set indicator is the intensity of the received voice signal; deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs a response comprises:

Determining that the decision result is that the voice signal needs a response if the intensity of the voice signal received by the device itself is the highest.

Optionally, the set indicator includes both the time at which the voice signal is received and the intensity of the received voice signal; deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs a response comprises:

Determining a composite indicator of the voice signal for each device according to the time and the intensity at which that device received the voice signal;

Determining that the decision result is that the voice signal needs a response if the composite indicator of the voice signal received by the device itself is optimal.

Optionally, the method further comprises:

If the decision result is that the voice signal does not need a response, not responding to the voice signal, and setting the device itself to no longer receive or respond to subsequent voice signals.

Optionally, deciding whether the voice signal needs a response, to obtain a decision result, comprises:

Obtaining the set indicator of the received current voice signal as a current indicator;

Obtaining the set indicator of a received previous voice signal as a reference indicator;

Comparing the current indicator with the reference indicator to obtain a comparison result;

If the comparison result satisfies a set condition, using the decision result for the previous voice signal as the decision result for the current voice signal.

Optionally, the set condition is determined as follows:

Obtaining the set indicators of multiple voice signals as historical data;

Determining the set condition according to the historical data.

Optionally, deciding whether the voice signal needs a response, to obtain a decision result, comprises:

Determining whether the device itself is the master device of the device group;

If the device itself is determined to be the master device, determining that the decision result is that the voice signal needs a response;

Wherein the master device is the device in the device group that pushes audio data to the other devices.

According to a second aspect of the present invention, a voice signal processing apparatus is also provided. The voice signal processing apparatus is arranged in each device of a device group made up of multiple devices, and comprises:

A receiving module, for receiving a current voice signal;

A decision module, for deciding whether the voice signal needs a response, to obtain a decision result; and

A response module, for responding to the voice signal if the decision result is that the voice signal needs a response.

According to a third aspect of the present invention, an electronic device is also provided, comprising the voice signal processing apparatus according to the second aspect of the present invention; alternatively, the electronic device comprises:

A memory, for storing executable instructions;

A processor, for performing, under the control of the executable instructions, any of the methods according to the first aspect of the present invention.

According to a fourth aspect of the present invention, a voice signal processing system is also provided, comprising multiple electronic devices according to the third aspect of the present invention, wherein, for the same voice signal, each electronic device performs any of the methods according to the first aspect of the present invention.

Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

One beneficial effect of the present invention is that, in the voice signal processing method provided by the embodiments, any device in the device group decides, after receiving a voice signal, whether it needs to respond, and responds to the voice signal according to the decision result. This avoids the confusion of multiple devices responding to the same voice signal, improves the user experience, and makes voice devices smarter and more convenient.

In addition, in the voice signal processing method provided by the embodiments, each voice device makes its own response decision for a voice signal, and the decision processes of the devices in the group are relatively independent. A decision failure in some of the devices therefore does not greatly affect the response performance of the device group, so the method offers stable interaction and high reliability.

In addition, in the embodiments the set indicators of a voice signal can be exchanged over a local network without communicating with a server, which avoids interaction stutter caused by network latency.

Brief description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.

Fig. 1 is a schematic diagram of a voice device that can be used to implement embodiments of the present invention.

Fig. 2 is a schematic diagram of an application scenario of the voice signal processing method provided by embodiments of the present invention.

Fig. 3 is a flow chart of the voice signal processing method provided by Embodiment One of the present invention.

Fig. 4 is a schematic diagram of the voice signal processing apparatus provided by Embodiment Five of the present invention.

Fig. 5 is a schematic diagram of the electronic device provided by Embodiment Six of the present invention.

Detailed description of the embodiments

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present invention.

The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its uses.

Techniques, methods, and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.

In all examples shown and discussed here, any specific value should be interpreted as merely illustrative and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.

It should also be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing it need not be discussed further in subsequent drawings.

Fig. 1 is a schematic diagram of a voice device that can be used to implement embodiments of the present invention. The voice device can, for example, recognize a voice signal and respond to it.

As shown in Fig. 1, the voice device 1000 includes a processor 1010, a memory 1020, a communication device 1030, a display device 1040, a microphone 1050, and a loudspeaker 1060.

The processor 1010 is, for example, a central processing unit (CPU) or a microprocessor (MCU). The memory 1020 includes, for example, ROM (read-only memory), RAM (random access memory), and non-volatile memory such as a hard disk. The communication device 1030 can, for example, communicate over a wired or wireless link. The display device 1040 can be used, for example, to show information about the music being played, and is for instance a liquid crystal display. The microphone 1050 can be used, for example, to receive voice signals, and is for instance a dynamic microphone, a condenser microphone, or a piezoelectric microphone. The loudspeaker 1060 can be used, for example, to play sound, and is for instance a dynamic, electromagnetic, electrostatic, or piezoelectric loudspeaker.

The voice device 1000 shown in Fig. 1 is merely illustrative and is in no way intended to limit the invention, its application, or its uses.

The voice devices in Fig. 2 include voice device 210, voice device 220, and voice device 230. These voice devices are configured, for example, in the same way as the voice device 1000 in Fig. 1.

The multiple voice devices in Fig. 2 can form a device group. The device group is, for example, a set of voice devices that can respond to the same voice signal and can communicate with one another. As another example, it is a set of speakers that form an audio group; through the established audio group, the speakers can play an audio signal, such as streamed music, in synchrony. When the voice devices communicate with one another, communication between any two devices may be direct or may go through another device such as a router.

As shown in Fig. 2, a user produces a voice signal by speaking. Because sound propagates outward as sound waves, the voice signal can be received by multiple voice devices, which easily leads to confused responses from multiple devices. To address this, the present embodiment provides a voice signal processing method that can be applied to the scenario shown in Fig. 2.

The voice signal processing method provided by this embodiment is carried out by each voice device in the device group, for example simultaneously by each of voice device 210, voice device 220, and voice device 230 in Fig. 2. As shown in Fig. 3, the method includes the following steps S3100-S3300:

Step S3100: receive a current voice signal.

For example, the current voice signal is received by voice device 210 in Fig. 2. The current voice signal is, for example, a wake-up voice signal for waking the voice device, such as a short voice signal made up of a few syllables, characters, or words, used to wake the voice device so that it goes on to receive voice instructions from the user. The current voice signal may also be an utterance from the user such as "turn up the volume", "check the weather", or "set an alarm". Voice device 210 can, for example, receive the voice signal through its microphone and convert it from sound waves into an electrical signal.

After receiving the voice signal, the voice device performs the following step S3200:

Step S3200: decide whether to respond to the voice signal, obtaining a decision result.

For example, after voice device 210 in Fig. 2 receives the current voice signal, it decides whether it needs to respond to the voice signal and obtains a decision result.

In one example, step S3200 includes the following steps S3210-S3230:

Step S3210: the voice device obtains the set indicator of the voice signal as it received it.

For example, voice device 210 in Fig. 2 determines, from the voice signal it received, its own set indicator for receiving the voice signal. The set indicator is, for example, the time at which the voice signal is received, or the intensity of the received voice signal, or both.

To determine the time at which a voice device receives a voice signal, the clocks of the voice devices can be synchronized in advance, and each voice device then records the time at which it receives the voice signal, so that the times recorded by different devices are comparable.

To determine the intensity of the voice signal received by a voice device, the device can measure quantities such as the volume or loudness of the signal it received to characterize the intensity.
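As one possible way to characterize intensity, the sketch below computes the root-mean-square (RMS) level of a block of samples and expresses it in decibels relative to full scale. The specification does not prescribe a particular measure; the function names `rms_level` and `to_dbfs`, and the assumption that samples are floats normalized to [-1, 1], are illustrative only.

```python
import math


def rms_level(samples):
    """Root-mean-square amplitude of a block of samples (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def to_dbfs(rms, floor=1e-12):
    """Express an RMS amplitude in decibels relative to full scale.

    `floor` is a hypothetical lower bound that avoids log10(0) for silence.
    """
    return 20.0 * math.log10(max(rms, floor))
```

Whatever measure is chosen, every device in the group would need to use the same one, since the values are compared across devices.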

Step S3220: obtain the set indicators with which the other devices in the device group received the voice signal.

Taking the voice devices in Fig. 2 as an example, once voice device 210 has obtained its own set indicator for the voice signal, voice devices 220 and 230 have each obtained theirs in the same way. Voice devices 220 and 230 can then each send their set indicator to voice device 210, so that voice device 210 obtains the set indicators of voice devices 220 and 230.

In one example, voice device 210 receives the set indicators sent by the other devices within a set time period. The set time period starts, for example, at the time voice device 210 receives the voice signal or, as another example, at the time voice device 210 first receives a set indicator from another device. Voice device 210 may stop accepting set indicators from other devices once the set time period has elapsed. Choosing a suitable length for the set time period keeps the process of receiving the other devices' set indicators from taking too long.
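The time-limited collection described above might look like the following sketch. It assumes, purely for illustration, that reports from the other devices arrive on an in-process `queue.Queue` as `(device_id, indicator)` tuples; the transport, the function name `collect_peer_indicators`, and the window length are not specified in the source.

```python
import queue
import time


def collect_peer_indicators(inbox, window_s):
    """Gather (device_id, indicator) reports from `inbox` for at most `window_s` seconds.

    Reports that arrive after the window has elapsed are simply not read.
    """
    deadline = time.monotonic() + window_s
    reports = {}
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            device_id, indicator = inbox.get(timeout=remaining)
        except queue.Empty:
            # The wait ran for the rest of the window without a new report.
            break
        reports[device_id] = indicator
    return reports
```

Here the window starts when the function is called, which corresponds to starting it when the device itself receives the voice signal; starting it at the first incoming report would need one blocking `get` before the deadline is computed.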

Voice devices 220 and 230 can obtain the set indicators of the other devices in the device group in a similar way.

Step S3230: decide, according to the device's own set indicator and the set indicators of the other devices, whether the voice signal needs a response.

Any device in the device group, once it has its own set indicator and those of the other devices, can decide on that basis whether it needs to respond to the voice signal. Step S3230 can be implemented, for example, as follows:

(1) In step S3230, the voice device compares the intensity at which it received the voice signal with the intensities at which the other devices received it, and decides that it needs to respond when its own intensity is the highest. This approach helps the device in the group that is closer to the user to respond.

(2) In step S3230, the voice device compares the time at which it received the voice signal with the times at which the other devices received it, and decides that it needs to respond when its own time is the earliest. This approach helps the device group respond to the voice signal more quickly.

(3) In step S3230, the voice device determines a composite indicator of the voice signal from the order of the times and the magnitudes of the intensities at which each device received the voice signal, and decides that it needs to respond when the composite indicator of the voice signal it received is optimal. This approach takes both the time indicator and the intensity indicator into account, which helps optimize the response strategy. In addition, a composite indicator can determine the distance between a device and the user more accurately than a single indicator. In one example, the device closest to the user can be selected to respond according to the composite indicator. According to the relevant laws of acoustics, suppose the sound intensity at a point is I, the distance from that point to the sound source is d, and the propagation time of the sound is t. On the one hand, the sound intensity I is inversely proportional to the square of the distance d, that is, I ∝ 1/d², so d ∝ 1/√I; on the other hand, the time t is proportional to the distance d, that is, t ∝ d. The distance d between a device and the user can thus be reflected by two indicators, the sound intensity I and the propagation time t. To measure the distance d, a composite indicator can be computed from 1/√I and t, each with a corresponding weight. Taking the voice devices in Fig. 2 as an example, suppose the intensity values I at which voice devices 210, 220, and 230 receive the voice signal are 1, 2, and 3 respectively; the corresponding values of 1/√I are then 1, 0.71, and 0.57. Suppose voice devices 210, 220, and 230 receive the voice signal at times 1, 2, and 3 respectively. The propagation time is approximated here by taking the propagation time of the device that receives the voice signal earliest to be 0 s, so the propagation times t of voice devices 210, 220, and 230 are 0, 1, and 2. With weights of, for example, 0.8 for 1/√I and 0.2 for t, the composite indicators 0.8 × (1/√I) + 0.2 × t of voice devices 210, 220, and 230 are 0.8, 0.768, and 0.856 respectively. The smaller the composite indicator, the smaller the distance between the responding device and the user; that is, a smaller value is better. The composite indicator of voice device 220 is therefore optimal in this example, and that device accordingly decides that it needs to respond to the voice signal. In this way, the device closest to the user can be selected more accurately.
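The worked example in (3) can be reproduced with a short sketch. The weights 0.8 and 0.2 and the sample values come from the text; the function names `composite_indicators` and `should_respond` and the dictionary layout are assumptions made for illustration. Because the code does not round intermediate values, its results (about 0.766 and 0.862 for devices 220 and 230) differ slightly from the rounded figures quoted in the text, but the ranking is the same.

```python
import math


def composite_indicators(intensity, arrival, w_intensity=0.8, w_time=0.2):
    """Weighted distance proxy per device: w_intensity / sqrt(I) + w_time * t.

    `intensity` and `arrival` map device id -> received intensity / receive time.
    The propagation time t is approximated as the delay after the earliest receiver.
    """
    earliest = min(arrival.values())
    return {
        dev: w_intensity / math.sqrt(intensity[dev])
        + w_time * (arrival[dev] - earliest)
        for dev in intensity
    }


def should_respond(self_id, scores):
    """A device responds when its own composite indicator is the smallest."""
    return min(scores, key=scores.get) == self_id


scores = composite_indicators(
    intensity={210: 1, 220: 2, 230: 3},
    arrival={210: 1, 220: 2, 230: 3},
)
```

Each device would run the same computation over the same set of reports, so that exactly one of them concludes that it should respond.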

The different implementations of step S3230 above suit different device-group response strategies. For example, if the response strategy is to have the device closest to the user respond, implementation (1) or (3) can be chosen, where the intensity at which a device receives the voice signal, or the weighted combination of intensity and time, serves as the measure of the distance between the device and the user. As another example, if the response strategy is to have the fastest-reacting device in the group respond, implementation (2) can be chosen, where the time at which a device receives the voice signal serves as the measure of its reaction speed.
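A minimal sketch of selecting between the single-indicator strategies (1) and (2) follows. The strategy names, the function name `decide`, and the rule of breaking ties by device id are assumptions added here so that every device reaches the same conclusion; the source does not say how ties are handled.

```python
def decide(self_id, indicators, strategy):
    """Return True if this device should respond.

    `indicators` maps device id -> (receive_time, intensity) and must include
    this device's own entry. Ties are broken by the smaller device id
    (an assumption, not something the specification states).
    """
    if strategy == "strongest":      # implementation (1): highest intensity
        winner = min(indicators, key=lambda d: (-indicators[d][1], d))
    elif strategy == "earliest":     # implementation (2): earliest receive time
        winner = min(indicators, key=lambda d: (indicators[d][0], d))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return winner == self_id
```

For example, with `{"210": (1.0, 1.0), "220": (2.0, 2.0), "230": (3.0, 3.0)}` the "earliest" strategy selects device 210, while "strongest" selects device 230.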

After obtaining the decision result, the voice device performs the following step S3300:

Step S3300: respond to the voice signal if the decision result is that the voice signal needs a response.

Through step S3200 the voice device has determined the decision result of whether it needs to respond to the voice signal. If the decision result is that a response is needed, the voice device can use its own hardware to respond to the voice signal.

The device responds to the voice signal, for example, by emitting a spoken response through the loudspeaker, by showing a response graphic or response text on the display device, or by giving a response prompt through changes and animation of an indicator light.

If the voice device determines through step S3200 that it does not need to respond to the voice signal, it does not respond to the current voice signal.

In the voice signal processing method provided by this embodiment, any device in the device group decides, after receiving a voice signal, whether it needs to respond, and responds to the voice signal according to the decision result. This avoids the confusion of multiple devices responding to the same voice signal, improves the user experience, and makes voice devices smarter and more convenient.

A specific example of the voice signal processing method in this embodiment is as follows:

As shown in Fig. 2, voice devices 210, 220, and 230 are speakers, and the device group formed by the three speakers is playing the same song. The user then wants to know tomorrow's weather and utters the voice signal "What will the weather be like tomorrow?" (in this case there is no need for all three speakers to respond to the voice signal). For this voice signal, all three speakers perform steps S3100-S3300 described above, with the set indicator being the time at which a device receives the voice signal. Through its decision, device 210 determines that it received the voice signal earliest, so it responds to the voice signal and starts announcing tomorrow's weather. Devices 220 and 230 determine that they did not receive the voice signal earliest, so they do not respond and continue playing the song. The voice signal processing method in this embodiment can thus make the device group's handling of voice signals more orderly and intelligent.

This embodiment provides a voice signal processing method that builds on the voice signal processing method of Embodiment One: a particular device is selected from the device group, and that device receives and responds to subsequent voice signals.

The voice signal processing method in this embodiment is carried out by any voice device in the device group, for example by any of voice device 210, voice device 220, and voice device 230 in Fig. 2. The method includes the following steps S4100-S4400:

Step S4100: receive a current voice signal.

Step S4200: decide whether to respond to the voice signal, obtaining a decision result.

Step S4300: respond to the voice signal if the decision result is that the voice signal needs a response.

For the specific implementation of steps S4100-S4300, refer to the description and explanation of steps S3100-S3300 in Embodiment One; it is not repeated here.

Step S4400: if the decision result is that the voice signal does not need a response, do not respond to the voice signal, and set the device itself to no longer receive or respond to subsequent voice signals.

In step S4400, a voice device that does not need to respond to the current voice signal not only refrains from responding to this voice signal but also sets itself to no longer receive or respond to subsequent voice signals. For example, the voice device mutes its own microphone so that it no longer receives subsequent voice signals. As another example, the voice device keeps its microphone on but sets the decision result for subsequent voice signals to "no response needed".

The length of time for which the voice device no longer receives or responds to subsequent voice signals can be configured. For example, it can be set to last one hour, one day, or until the device is shut down.

In step S4400, a voice device that needs to respond to the current voice signal not only responds to the current voice signal but also receives and responds to subsequent voice signals. For example, after the device receives a subsequent voice signal, it sets the decision result to "response needed". As another example, after the device receives a subsequent voice signal, it responds directly without making a decision.
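The behaviour of this embodiment can be pictured as a small piece of per-device state. The sketch below is one hypothetical arrangement: the class name `ResponderState`, its methods, and the injectable `clock` are all illustrative, and it models only the variant in which the microphone stays on and later decisions are short-circuited.

```python
import time


class ResponderState:
    """Remembers the outcome of the last full decision on this device."""

    def __init__(self, suppress_s, clock=time.monotonic):
        self.suppress_s = suppress_s      # how long a non-responder stays silent
        self.clock = clock                # injectable for testing
        self.designated = False           # True once this device was chosen
        self.suppressed_until = None

    def record_decision(self, respond):
        """Store the result of a full decision (steps S4200-S4400)."""
        self.designated = respond
        self.suppressed_until = None if respond else self.clock() + self.suppress_s

    def shortcut(self):
        """Return True or False if no negotiation is needed, else None."""
        if self.designated:
            return True
        if self.suppressed_until is not None and self.clock() < self.suppressed_until:
            return False
        return None
```

A device would call `shortcut()` on each new voice signal and fall back to the full decision of Embodiment One only when it returns `None`, for example after the configured suppression period has run out.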

The voice signal processing method in this embodiment selects a particular device from the device group to receive and respond to the voice signals that follow the current one, while the other devices no longer receive or respond to subsequent voice signals. Besides avoiding confused responses, this simplifies the handling of subsequent voice signals and helps improve the device group's response speed for them.

This embodiment provides a voice signal processing method that builds on the voice signal processing method of Embodiment One: two adjacent voice signals are preferentially responded to by the same voice device.

The voice signal processing method in this embodiment is carried out by any voice device in the device group, for example by any of voice device 210, voice device 220, and voice device 230 in Fig. 2. The method includes the following steps S5100-S5300:

Step S5100: receive a current voice signal.

Step S5200: decide whether to respond to the voice signal, obtaining a decision result.

Step S5300: respond to the voice signal if the decision result is that the voice signal needs a response.

For the specific implementation of steps S5100-S5300, refer to the description and explanation of steps S3100-S3300 in Embodiment One; it is not repeated here.

In this embodiment, step S5200 further includes the following steps S5210-S5240:

Step S5210: obtain the set indicator with which the current voice signal was received, as the current indicator.

In this step, the set indicator for the current voice signal includes both the voice device's own set indicator and the set indicators of the other devices.

Step S5220: obtain the set indicator with which a previous voice signal was received, as the reference indicator.

In this step, the set indicator for the previous voice signal includes both the set indicator of the previous voice signal as received by the voice device itself and the set indicators of the previous voice signal as received by the other devices.

The set indicator for the previous voice signal is obtained, for example, as follows: during the decision process for the previous voice signal, the voice device records the set indicators with which the voice signal was received, and retrieves that record during the decision process for the current voice signal.

Step S5230: compare the current indicator with the reference indicator to obtain a comparison result.

Step S5240: if the comparison result satisfies the set condition, use the decision result for the previous voice signal as the decision result for the current voice signal.

In steps S5230 and S5240, the set indicator of the previous voice signal serves as a reference for the decision on the current voice signal. When the comparison between the set indicator of the current voice signal and that of the previous voice signal satisfies the set condition, the decision result for the previous voice signal is used as the decision result for the current voice signal, regardless of whether the decision the current voice signal would otherwise produce agrees with it.
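Steps S5210-S5240 can be sketched as follows. The source leaves the form of the comparison open; this example assumes, for illustration only, that the condition is met when every device's indicator has moved by no more than a tolerance since the previous voice signal. The function name `decide_with_reference` and its parameters are hypothetical.

```python
def decide_with_reference(current, reference, previous_decision,
                          fresh_decision, tolerance):
    """Reuse the previous decision when the indicators have barely changed.

    `current` and `reference` map device id -> indicator value for the current
    and the previous voice signal. `fresh_decision` is a zero-argument callable
    that runs the normal decision and is only invoked when the condition fails.
    """
    unchanged = current.keys() == reference.keys() and all(
        abs(current[dev] - reference[dev]) <= tolerance for dev in current
    )
    return previous_decision if unchanged else fresh_decision()
```

Passing the normal decision as a callable means it is skipped entirely when the previous result is reused, which matches the text's point that the previous result takes precedence whatever a fresh decision would have produced.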

Audio signal processing method provided in an embodiment of the present invention can pass through ratio while avoiding response confusionCompared with current criteria and referring to index, so that in the case where setting index variation is relatively little, by identical speech ciphering equipment to phaseAdjacent voice signal twice is responded, and is conducive to the consistency for keeping the response of equipment group, is avoided the equipment responded frequentVariation, therefore it is able to ascend user experience.

In a specific example of this embodiment, the set condition in step S5240 may be determined as follows:

Obtain the set indicators of multiple voice signals as historical data; determine the set condition from the historical data.

For example, the set indicators with which the device group has received voice signals on multiple occasions are recorded as historical data. The mean of the set indicators in the historical data, the frequency with which each value occurs, and so on are determined, and a suitable set condition is chosen according to the characteristics of the user's voice signals that the historical data reflects.

Determining the set condition in this way allows voice signal processing to be tailored to the user's individual characteristics, which further improves the user experience.
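One way to derive a set condition from historical data is sketched below. The rule used here (a threshold equal to a multiple of the mean change between consecutive recorded indicators) is an assumption made for illustration; the text only says that statistics such as the mean and value frequencies are considered. The function name `threshold_from_history` and the parameter `scale` are hypothetical.

```python
from statistics import mean
from typing import Sequence


def threshold_from_history(history: Sequence[float], scale: float = 2.0) -> float:
    """Return an assumed change threshold derived from recorded set indicators.

    history: set indicators (e.g. received strengths) in the order recorded.
    scale:   hypothetical multiplier applied to the typical fluctuation.
    """
    # Absolute change between each pair of consecutive recordings.
    changes = [abs(b - a) for a, b in zip(history, history[1:])]
    if not changes:
        # With fewer than two recordings there is nothing to learn from.
        return 0.0
    return scale * mean(changes)
```

A user whose recorded strengths fluctuate widely would get a looser threshold than one whose strengths are stable, which is one reading of "personalized" in the paragraph above.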

This embodiment provides a voice signal processing method that builds on the voice signal processing method of Embodiment 1 and determines the decision result according to whether the device is the master device of the device group.

The voice signal processing method of this embodiment is carried out by any voice device in the device group, for example by any of voice device 210, voice device 220, and voice device 230 in Fig. 2. The method includes the following steps S6100-S6300:

Step S6100: receive a current voice signal.

Step S6200: decide whether the voice signal needs to be responded to, and obtain a decision result.

Step S6300: if the decision result is that the voice signal needs to be responded to, respond to the voice signal.

For specific implementations of steps S6100-S6300, refer to the description of steps S3100-S3300 in Embodiment 1; they are not described again here.

In this embodiment, step S6200 further includes the following steps:

Step S6210: determine whether the device itself is the master device in the device group;

Step S6220: if the device itself is determined to be the master device, determine that the decision result is that the voice signal needs to be responded to.

Here, the master device is the device in the device group that pushes audio data to the other devices.

In this embodiment, the devices in the device group form an audio group, in which the master device pushes audio data to the slave devices. The devices in the audio group are, for example, speakers.

The master device in the audio group is determined, for example, as follows:

(1) for two devices that are both in the playing state, or both not in the playing state, when the group is formed, the device that first initiated the group-formation request serves as the master device;

(2) for two devices of which one is in the playing state and the other is not when the group is formed, the device in the playing state serves as the master device.

When a device in the audio group makes its response decision, it determines whether it is itself the master device. If it is, it responds to the voice signal; if it is not, it does not respond.
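The two master-selection rules and the resulting response decision can be sketched as follows. The `Device` type, the `initiator_id` argument, and the function names are hypothetical; how devices actually exchange their playing state and group-formation requests is not covered by this sketch.

```python
from dataclasses import dataclass


@dataclass
class Device:
    device_id: str
    playing: bool  # whether the device is in the playing state at group formation


def choose_master(a: Device, b: Device, initiator_id: str) -> str:
    """Return the id of the master device for a two-device audio group.

    initiator_id: id of the device that first initiated the group-formation
    request (assumed to be known to both devices).
    """
    if a.playing != b.playing:
        # Rule (2): exactly one device is playing, so it becomes the master.
        return a.device_id if a.playing else b.device_id
    # Rule (1): both playing or both idle, so the first initiator becomes the master.
    return initiator_id


def should_respond(self_id: str, master_id: str) -> bool:
    # Steps S6210/S6220: a device responds only if it is the master.
    return self_id == master_id
```

Because the decision reduces to a single identity check, no indicators need to be exchanged before responding, which is consistent with the speed benefit claimed for this embodiment.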

The voice signal processing method provided in this embodiment avoids confused responses while increasing decision speed, and thereby increases the response speed of the device group.

This embodiment provides a voice signal processing apparatus. As shown in Fig. 4, the voice signal processing apparatus 400 includes:

a receiving module 410, configured to receive a current voice signal;

a decision module 420, configured to decide whether the voice signal needs to be responded to and obtain a decision result; and

a response module 430, configured to respond to the voice signal if the decision result is that the voice signal needs to be responded to.

For the functions of the modules in this embodiment, refer to the description of the voice signal processing method in Embodiment 1; they are not described again here.

In a specific example of this embodiment, the decision module 420 is further configured to:

obtain the set indicator of the voice signal received by the device itself;

obtain the set indicators of the voice signal received by the other devices in the device group;

decide, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs to be responded to.

In a specific example of this embodiment, the decision module 420 is further configured to: receive, within a preset time period, the set indicators sent by the other devices. The set indicator includes at least one of the time at which the voice signal is received and the strength of the received voice signal.

In a specific example of this embodiment, the decision module 420 is further configured to: determine that the decision result is that the voice signal needs to be responded to if the device itself received the voice signal at the earliest time.

In a specific example of this embodiment, the decision module 420 is further configured to: determine that the decision result is that the voice signal needs to be responded to if the strength of the voice signal received by the device itself is the highest.

In a specific example of this embodiment, the decision module 420 is further configured to: determine a composite indicator of the voice signal according to the order of the times at which the devices received the voice signal and the magnitudes of the received strengths; and determine that the decision result is that the voice signal needs to be responded to if the composite indicator of the voice signal received by the device itself is the best.
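The composite-indicator variant can be sketched as below. The text does not say how reception time and strength are combined, so the weighted sum of min-max normalized values, the weights `w_time` and `w_strength`, and the tie-break by device id are all assumptions made for illustration; the function names are hypothetical.

```python
from typing import Dict, Tuple

# device id -> (reception time in seconds, received strength)
Reports = Dict[str, Tuple[float, float]]


def _norm(x: float, lo: float, hi: float) -> float:
    # Min-max normalization; returns 0.0 when all values are equal.
    return 0.0 if hi == lo else (x - lo) / (hi - lo)


def composite_scores(reports: Reports,
                     w_time: float = 0.5,
                     w_strength: float = 0.5) -> Dict[str, float]:
    times = [t for t, _ in reports.values()]
    strengths = [s for _, s in reports.values()]
    t_lo, t_hi = min(times), max(times)
    s_lo, s_hi = min(strengths), max(strengths)
    # Earlier reception and higher strength both raise the score.
    return {
        dev: w_time * (1.0 - _norm(t, t_lo, t_hi)) + w_strength * _norm(s, s_lo, s_hi)
        for dev, (t, s) in reports.items()
    }


def should_respond(self_id: str, reports: Reports) -> bool:
    scores = composite_scores(reports)
    # Deterministic tie-break by device id so that every device picks the same winner.
    best = min(scores, key=lambda dev: (-scores[dev], dev))
    return best == self_id
```

Since every device evaluates the same reports with the same rule, exactly one device concludes that it should respond.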

In a specific example of this embodiment, the voice signal processing apparatus 400 further includes a follow-up response module (not shown in the figures), configured to: if the decision result is that the voice signal does not need to be responded to, not respond to the voice signal, and set the device itself to no longer receive or respond to subsequent voice signals;

if the decision result is that the voice signal needs to be responded to, set the device itself to receive and respond to subsequent voice signals.
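The behaviour of the follow-up response module can be pictured as a simple gate, sketched below. The class name `FollowUpGate` and its methods are hypothetical, and the sketch deliberately leaves out when a silenced device starts listening again, since that is not specified in this passage.

```python
from typing import Optional


class FollowUpGate:
    """Tracks whether this device handles subsequent voice signals."""

    def __init__(self) -> None:
        # Before any decision has been made, the device accepts signals.
        self.accepting = True

    def apply_decision(self, respond: bool) -> None:
        # A "respond" decision keeps the device receiving and responding;
        # a "do not respond" decision silences it for subsequent signals.
        self.accepting = respond

    def handle(self, signal: str) -> Optional[str]:
        # Returns a placeholder response, or None if the device is silenced.
        if not self.accepting:
            return None
        return f"handled: {signal}"
```

After the wake-up decision, only the device that won the decision keeps handling the commands that follow, so the user hears a single device throughout the interaction.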

In a specific example of this embodiment, the voice signal processing apparatus 400 further includes a comparison module (not shown in the figures), configured to:

obtain the set indicator with which the current voice signal is received, as the current indicator;

obtain the set indicator with which the previous voice signal was received, as the reference indicator;

compare the current indicator with the reference indicator to obtain a comparison result;

if the comparison result satisfies a set condition, use the decision result for the previous voice signal as the decision result for the current voice signal.

In a specific example of this embodiment, the comparison module is further configured to:

obtain the set indicators of multiple voice signals as historical data;

determine the set condition from the historical data.

In a specific example of this embodiment, the decision module 420 is further configured to:

determine whether the device itself is the master device in the device group;

if the device itself is determined to be the master device, determine that the decision result is that the voice signal needs to be responded to.

This embodiment provides an electronic device that includes the voice signal processing apparatus described in Embodiment 5; for details, refer to the description of the voice signal processing apparatus in Embodiment 5.

Alternatively, the electronic device is, for example, the electronic device 500 in Fig. 5, which includes:

a memory 510, configured to store executable commands;

a processor 520, configured to perform, under the control of the executable commands, the method of any one of Embodiments 1 to 3. For details, refer to the description of the voice signal processing method in Embodiments 1 to 3.

This embodiment provides a voice signal processing system that includes a plurality of the electronic devices described in Embodiment 6; for the same voice signal, each electronic device performs the method of any one of Embodiments 1 to 3.

The voice signal processing system is, for example, the device group formed by voice devices 210, 220, and 230 in Fig. 2. For details, refer to the description of the device group in Embodiments 1 to 3, which is not repeated here.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present invention.

The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse passing through a fiber-optic cable), or an electrical signal transmitted through a wire.

The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions in order to implement aspects of the present invention.

Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.

Various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.

Claims

1. A voice signal processing method, characterized in that each device in a device group composed of multiple devices simultaneously performs:

receiving a current voice signal;

2. The method according to claim 1, wherein deciding whether the voice signal needs to be responded to comprises:

obtaining the set indicator of the voice signal received by the device itself;

deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs to be responded to.

3. The method according to claim 2, wherein obtaining the set indicators of the voice signal received by the other devices in the device group comprises:

4. The method according to claim 2, wherein the set indicator comprises at least one of: the time at which the voice signal is received and the strength of the received voice signal.

5. The method according to claim 2, wherein the set indicator is the time at which the voice signal is received; and deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs to be responded to comprises:

if the device itself received the voice signal at the earliest time, determining that the decision result is that the voice signal needs to be responded to.

6. The method according to claim 2, wherein the set indicator is the strength of the received voice signal; and deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs to be responded to comprises:

if the strength of the voice signal received by the device itself is the highest, determining that the decision result is that the voice signal needs to be responded to.

7. The method according to claim 2, wherein the set indicator comprises both the time at which the voice signal is received and the strength of the received voice signal; and deciding, according to the set indicator of the device itself and the set indicators of the other devices, whether the voice signal needs to be responded to comprises:

determining a composite indicator of the voice signal according to the time at which each device received the voice signal and the strength received by each device;

if the composite indicator of the voice signal received by the device itself is the best, determining that the decision result is that the voice signal needs to be responded to.

8. The method according to claim 1, wherein the method further comprises:

if the decision result is that the voice signal does not need to be responded to, not responding to the voice signal, and setting the device itself to no longer receive or respond to subsequent voice signals.

9. The method according to claim 1, wherein deciding whether the voice signal needs to be responded to, and obtaining a decision result, comprises:

obtaining the set indicator with which the current voice signal is received, as a current indicator;

if the comparison result satisfies a set condition, using the decision result for the previous voice signal as the decision result for the current voice signal.

10. The method according to claim 9, wherein the set condition is determined as follows:

obtaining the set indicators of multiple voice signals as historical data;

determining the set condition from the historical data.

11. The method according to claim 1, wherein deciding whether the voice signal needs to be responded to, and obtaining a decision result, comprises:

if the device itself is determined to be the master device, determining that the decision result is that the voice signal needs to be responded to;

12. A voice signal processing apparatus, located in each device of a device group composed of multiple devices, comprising:

a receiving module, configured to receive a current voice signal;

a response module, configured to respond to the voice signal if the decision result is that the voice signal needs to be responded to.

13. An electronic device, comprising:

a memory, configured to store executable commands;

a processor, configured to perform, under the control of the executable commands, the method according to any one of claims 1-11.

14. A voice signal processing system, comprising a plurality of electronic devices according to claim 13, the plurality of electronic devices forming a device group; wherein, for the same voice signal, each electronic device performs the method according to any one of claims 1-11.