CN107910013B - A kind of output processing method and device of voice signal - Google Patents


Info

Publication number
CN107910013B
Authority
CN
China
Prior art keywords
signal
voice
amplitude
noise
background noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711104384.1A
Other languages
Chinese (zh)
Other versions
CN107910013A (en)
Inventor
杨宗业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201711104384.1A
Publication of CN107910013A
Application granted
Publication of CN107910013B
Legal status: Active
Anticipated expiration


Abstract

The present invention is applicable to the technical field of signal processing and provides a method and device for output processing of a voice signal. The method comprises: identifying a voice signal and a background noise signal from a sound signal acquired in real time; obtaining the amplitude difference between the voice signal and the background noise signal; and performing noise reduction processing on the sound signal based on the amplitude difference. This makes the processing of the voice signal more targeted and avoids the low signal-to-noise ratio that results when volume is controlled by AGC based only on the level of the voice signal.

Figure 201711104384

Description

Voice signal output processing method and device
Technical Field
The invention belongs to the technical field of signal processing, and particularly relates to a method and a device for outputting and processing a voice signal.
Background
A user can use a mobile phone in hands-free mode to make calls while driving. In the prior art, the voice signal of the mobile phone is processed by adaptively adjusting the gain through Automatic Gain Control (AGC): the gain is reduced when the voice signal is large and increased when the voice signal is small, so that the amplitude of the output voice signal is automatically kept within a small range. However, when AGC increases the sound signal it amplifies the noise as well, so the signal-to-noise ratio of the transmitted speech is poor and the call experience suffers.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for processing a speech signal to solve the problem of poor signal-to-noise ratio caused by using AGC to adjust a speech signal in the prior art.
A first aspect of an embodiment of the present invention provides a method for processing an output of a speech signal, including:
recognizing a voice signal and a background noise signal from a sound signal acquired in real time;
acquiring an amplitude difference value of the voice signal and the background noise signal;
and performing noise reduction processing on the sound signal based on the amplitude difference value.
A second aspect of an embodiment of the present invention provides an apparatus for processing an output of a speech signal, including:
the audio signal acquisition unit is used for identifying a voice signal and a background noise signal from a sound signal acquired in real time;
the amplitude difference value calculation unit is used for acquiring the amplitude difference value of the voice signal and the background noise signal;
and the processing unit is used for carrying out noise reduction processing on the sound signal based on the amplitude difference value.
A third aspect of the present application provides a terminal device comprising a memory, a processor, and a computer program stored in said memory and executable on said processor, characterized in that said processor, when executing said computer program, implements the steps of the method for output processing of a speech signal provided in the first aspect of the present application.
A fourth aspect of the present application provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for processing the output of a speech signal provided in the first aspect of the present application.
A fifth aspect of the present application provides a computer program product comprising a computer program which, when executed by one or more processors, performs the steps of the method of output processing of a speech signal as provided by the first aspect of the present application.
Compared with the prior art, the embodiments of the invention have the following beneficial effects: a voice signal and a background noise signal are recognized from a sound signal acquired in real time; the amplitude difference between the voice signal and the background noise signal is obtained; and noise reduction processing is performed on the sound signal based on the amplitude difference. The processing of the voice signal is therefore more targeted, and the low signal-to-noise ratio that results when volume is controlled by AGC based only on the level of the voice signal is avoided.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart illustrating an implementation of a method for processing an output of a speech signal according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart illustrating an implementation of a speech signal output processing method according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of an output processing apparatus for a speech signal according to a third embodiment of the present invention;
Fig. 4 is a schematic diagram of a terminal device according to a fourth embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Example one
Fig. 1 is a schematic flow chart of an implementation of a voice noise reduction method according to an embodiment of the present invention. The method may be applied to an electronic device with a voice receiving element, such as a mobile phone, a notebook, a tablet computer, a vehicle-mounted system, or a wearable electronic device. As shown in the figure, the method may include the following steps:
Step S101: identify a speech signal and a background noise signal from the sound signal acquired in real time.
In this embodiment, a user can start the hands-free mode to make a call with the mobile phone while driving. In hands-free mode the mobile phone is generally placed on a phone holder in the car, at some distance from the user, so in addition to the user's voice it also picks up background noise generated while the car is running, such as tire noise from friction between the tires and the road surface, air-conditioning noise from the air-conditioning fan, and wind noise from friction between the air and the gaps and corners of the car body. This background noise is steady-state noise whose amplitude changes little and which repeats at a high rate. It generally persists throughout the call, and it can significantly degrade the quality of the voice signal when the device applies AGC adjustment to the sound signal.
In this embodiment, the signals are first identified according to the characteristics of the speech signal and of the background noise signal. Speech and background noise can be recognized by storing a human voice model and a noise model in advance. Each model contains features of the sound such as frequency, zero-crossing rate, short-term average energy, and short-term average amplitude. For example, after the sound signal is sampled, it is matched against the human voice model. If the sound signal exhibits all the features in the human voice model, a person is currently speaking. If the sound signal cannot be matched to the human voice model, the user's voice may be too quiet or the background noise too loud, so that no voice can be recognized in the currently acquired sound. In that case the mobile phone may issue an error prompt to the user, for example the message "the sound of the user cannot be acquired". Similarly, the acquired sound signal may be matched against a pre-stored tire noise model, air-conditioning noise model, or wind noise model, so that the type of noise contained in the current sound signal can be determined.
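The feature matching described above can be sketched roughly as follows. This is a minimal illustration, not the patented recognizer: the function names, the representation of a model as a list of (low, high) ranges, and all numeric ranges are assumptions introduced here, and only three of the listed features are computed.

```python
def frame_features(frame):
    """Return (zero_crossing_rate, short_term_energy, short_term_amplitude)
    for one frame of samples. Hypothetical helper, not from the patent text."""
    n = len(frame)
    if n == 0:
        raise ValueError("frame must not be empty")
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1) if n > 1 else 0.0
    energy = sum(x * x for x in frame) / n        # short-term average energy
    amplitude = sum(abs(x) for x in frame) / n    # short-term average amplitude
    return zcr, energy, amplitude


def matches_model(features, model):
    """A model is assumed to be one (low, high) range per feature; the frame
    matches only if every feature falls inside its range."""
    return all(low <= value <= high for value, (low, high) in zip(features, model))


# Illustrative ranges only; a real model would be trained or measured.
steady_model = [(0.0, 0.1), (0.2, 0.3), (0.4, 0.6)]
print(matches_model(frame_features([0.5, 0.5, 0.5, 0.5]), steady_model))
```

Requiring every feature to fall in range mirrors the text's condition that the sound signal include all features in the model; a practical recognizer would more likely use a statistical score than hard ranges.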
Step S102: obtain the amplitude difference between the voice signal and the background noise signal.
In this embodiment, after step S101 has recognized from the models that the sound signal contains human voice and background noise, the human voice and the background noise may be extracted according to the features of those models, and the average amplitude of the voice signal representing the human voice and the average amplitude of the background noise signal over a period of time may be calculated from the recorded sound waveform.
Step S103: perform noise reduction processing on the sound signal based on the amplitude difference.
In this embodiment, the absolute values of the speech signal amplitude and of the background noise signal amplitude obtained in step S102 are taken first, and then the difference between the two absolute values is calculated. A preset processing method is then selected according to this difference, so that the processing of the voice signal is more targeted and the low signal-to-noise ratio caused by AGC processing based only on the voice signal is avoided.
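As a small worked example of steps S102 and S103, the amplitude difference can be computed from already separated speech and noise samples. The sketch below assumes the separation has been done elsewhere and that "average amplitude" means the mean of the absolute sample values; both function names are illustrative.

```python
def mean_abs_amplitude(samples):
    """Average of the absolute sample values over the observed period."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(abs(x) for x in samples) / len(samples)


def amplitude_difference(speech_samples, noise_samples):
    """Speech average amplitude minus noise average amplitude.
    Positive means the speech is stronger than the background noise."""
    return mean_abs_amplitude(speech_samples) - mean_abs_amplitude(noise_samples)


print(amplitude_difference([0.5, -0.5], [0.25, -0.25]))
```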
Optionally, before recognizing the speech signal and the background noise signal, the method includes: pre-filtering the sound signal.
In this embodiment, it is considered that the sound signal may be interfered by random noise, such as gaussian noise. If a sound signal with random noise is identified, an identification error may be generated due to interference of the random noise. Therefore, in the present embodiment, pre-filtering is performed before the sound signal is recognized. Since random noise in the environment is uncorrelated and exhibits high-frequency characteristics in the signal, the acquired sound signal can be low-pass filtered in the frequency domain, so that the high-frequency part obviously belonging to the noise signal in the sound signal is filtered, and the accuracy of subsequent human voice identification and noise identification is improved.
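The effect of pre-filtering can be illustrated with a very simple low-pass filter. Note the substitution: the text describes low-pass filtering in the frequency domain, whereas this sketch uses a time-domain moving average as a stand-in because it needs no FFT. The function name and window length are assumptions.

```python
def moving_average_lowpass(samples, window=5):
    """Smooth the signal with a causal moving average of up to `window`
    samples. Rapid sample-to-sample fluctuations are attenuated while
    slowly varying content passes through."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, x in enumerate(samples):
        running += x
        if i >= window:
            running -= samples[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


# A signal alternating at the highest possible rate is almost removed,
# while a constant signal is left unchanged.
print(moving_average_lowpass([1, -1, 1, -1, 1, -1], window=2))
print(moving_average_lowpass([2, 2, 2], window=2))
```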
Example two
Based on the first embodiment, specifically, performing noise reduction processing on the sound signal based on the amplitude difference includes: if the amplitude difference is a positive value and is greater than or equal to a first threshold, performing noise reduction processing on the sound signal, wherein the intensity of the noise reduction processing is directly proportional to the amplitude difference.
In this embodiment, when the amplitude of the user's voice signal is greater than the amplitude of the noise signal, the difference is compared with the first threshold, and a corresponding noise reduction method is adopted according to the result of the comparison.
When the difference is greater than the first threshold, the amplitude of the noise signal is significantly smaller than the amplitude of the speech signal. This indicates that the current speech signal quality is good and not obviously affected by noise, so the sound signal can be processed with a conventional noise reduction algorithm, such as the amplitude spectrum subtraction, harmonic enhancement, and noise cancellation methods commonly used in the field; the noise reduction algorithm is not limited here. For the specific use of these algorithms, reference may be made to the prior art, which is not described in detail here. In this embodiment, the intensity of the noise reduction processing is selected according to the magnitude of the difference. Generally speaking, the stronger the noise reduction, the better the noise is removed, but the more severely the user's normal speech is distorted. Therefore, optionally, when the difference is larger, the voice signal is less affected by noise and aliasing between noise and voice is not serious, so the intensity of the noise reduction can be increased; when the difference is smaller, the noise signal has some influence on the voice signal, that is, some aliasing exists between noise and voice, so the intensity of the noise reduction is reduced accordingly. Dynamically adjusting the intensity of the noise reduction according to the difference between the voice signal amplitude and the noise signal amplitude improves the sound quality after noise reduction.
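One way to picture "intensity directly proportional to the amplitude difference" is an over-subtraction factor applied in magnitude subtraction. The sketch below is an assumption-laden illustration: the proportionality constant `k`, the cap `max_strength`, and the use of precomputed per-bin magnitudes are all introduced here and are not specified by the source.

```python
def noise_reduction_strength(diff, first_threshold, k=1.0, max_strength=2.0):
    """Map the amplitude difference to a noise reduction strength that grows
    in proportion to the difference, up to a hypothetical cap."""
    if diff <= 0 or diff < first_threshold:
        raise ValueError("this branch applies only when diff is positive and >= first_threshold")
    return min(k * diff, max_strength)


def subtract_noise_magnitude(signal_mag, noise_mag, strength):
    """Magnitude subtraction per frequency bin: remove `strength` times the
    estimated noise magnitude and floor the result at zero."""
    return [max(s - strength * n, 0.0) for s, n in zip(signal_mag, noise_mag)]


strength = noise_reduction_strength(0.5, first_threshold=0.25, k=2.0)
print(strength, subtract_noise_magnitude([1.0, 0.5], [0.25, 1.0], strength))
```

The cap reflects the trade-off stated in the text: stronger subtraction removes more noise but distorts speech more, so an implementation would likely bound it.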
Optionally, as shown in fig. 2, if the amplitude difference is a positive value, and the amplitude difference is smaller than or equal to the second threshold, and the amplitude of the speech signal is greater than the third threshold, the method includes:
step S201, amplifying the sound signal by a preset gain to obtain a first intermediate signal.
Step S202, noise reduction processing is carried out on the first intermediate signal to obtain a second intermediate signal.
Step S203, attenuating the second intermediate signal according to the preset gain to obtain the noise-reduced sound signal.
In this embodiment, when the difference S is a positive value and is smaller than a second threshold T, the current noise condition is relatively serious. The second threshold T may be equal to the first threshold or smaller than it. If noise reduction were applied directly to the current sound signal, the resulting voice signal would not be ideal. Therefore, in this embodiment, the sound signal is first amplified by a preset gain A, that is, both the speech signal and the noise signal in the sound signal are amplified by the gain A, to obtain a first intermediate signal. The difference between the speech signal amplitude and the noise signal amplitude in the first intermediate signal is amplified by the gain A as well (S' = A × S), so performing noise reduction after the amplification significantly reduces the impact on the user's normal speech. The value of the gain A can be calculated from the difference S and the second threshold T, specifically A ≥ T/S. With a gain A calculated in this way, the difference S' after the gain adjustment is no smaller than the second threshold T. Noise reduction processing is then performed on the first intermediate signal to obtain a second intermediate signal. The second intermediate signal is attenuated according to the gain A, and the result is the noise-reduced sound signal. To ensure the noise reduction effect, in this embodiment the amplitude of the voice signal must be greater than the third threshold. The third threshold can be determined from the performance of the mobile phone together with the noise reduction algorithm used.
The gain A must not exceed a maximum value, so that the amplitude of the amplified sound signal does not exceed the clipping point. If the amplified signal exceeds the clipping point it cannot be recorded faithfully, and the original signal cannot be recovered even by attenuating with the same gain.
In this embodiment, when the amplitude difference is a positive value, the amplitude difference is smaller than or equal to the second threshold, and the amplitude of the voice signal is greater than the third threshold, the voice signal is subjected to noise reduction processing after being amplified by a preset gain, and then the voice signal subjected to noise reduction processing is attenuated according to the preset gain, so that not only is the signal size unchanged, but also the signal-to-noise ratio of the voice signal is improved, and the user experience is improved.
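The amplify, denoise, attenuate sequence of steps S201 to S203 can be sketched as below, using the relation A ≥ T/S and the clipping limit described in the text. The clip level of 1.0, the function names, and the toy noise gate used as the denoiser are assumptions made purely for illustration.

```python
def choose_gain(diff, second_threshold, peak, clip_level=1.0):
    """Pick the smallest gain A with A * diff >= second_threshold (A >= T/S),
    then limit it so that A * peak does not exceed the clipping point."""
    if diff <= 0 or peak <= 0:
        raise ValueError("diff and peak must be positive")
    wanted = second_threshold / diff
    max_gain = clip_level / peak
    return min(wanted, max_gain)


def amplify_denoise_attenuate(samples, gain, denoise):
    first_intermediate = [gain * x for x in samples]      # step S201
    second_intermediate = denoise(first_intermediate)     # step S202
    return [x / gain for x in second_intermediate]        # step S203


def toy_gate(samples, floor=0.5):
    """Stand-in denoiser: zero out samples whose magnitude is below `floor`."""
    return [x if abs(x) >= floor else 0.0 for x in samples]


gain = choose_gain(diff=0.25, second_threshold=0.5, peak=0.25)
print(gain, amplify_denoise_attenuate([0.5, 0.125], gain, toy_gate))
```

When the clipping limit is the tighter bound, the returned gain is below T/S and the amplified difference does not reach T; how that case should be handled is not stated in the source.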
Optionally, the voice model includes a voice model of a specific user. During driving, passengers other than the user who is talking in hands-free mode may also be speaking. Therefore, when the voice model is used to recognize a voice signal, the voice of the specific user is first recognized according to the voice model of that user, and the human voice model is then used to identify whether any human voice other than the specific user's is present. If such voices are present and their signal amplitude is greater than the amplitude of the specific user's voice signal, the noise reduction method of this embodiment may be unable to enhance the specific user's speech. In this case, a prompt may be sent to the user indicating that another person is currently speaking too loudly, which may affect the call.
Optionally, the noise reduction method disclosed above may be combined with conventional Automatic Gain Control (AGC). Specifically, after the sound signal is recognized as containing the user's voice signal and background noise, the acquired sound signal is first subjected to AGC processing so that its amplitude varies within a small range, and the amplitude of the voice signal and the amplitude of the background noise signal are then obtained. The difference between the amplitude of the voice signal and the amplitude of the background noise signal is calculated. When the difference is a positive value and is greater than or equal to a first threshold, noise reduction processing is performed on the sound signal, with the intensity of the noise reduction directly proportional to the difference. When the difference is a positive value and is smaller than the first threshold, the sound signal is amplified by a preset gain, noise reduction processing is performed, and the noise-reduced sound signal is then attenuated according to the preset gain. Combining the conventional AGC method with the noise reduction method of this embodiment can further improve the voice noise reduction effect.
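A minimal sketch of the AGC stage follows, under the assumption that AGC here means scaling a frame toward a target average amplitude with a capped gain; the target level, the cap, and the function names are illustrative. Because the same gain multiplies speech and noise alike, the amplitude difference scales by that gain but the ratio between the two does not change, which is why the difference-based noise reduction is still needed afterwards.

```python
def mean_abs(samples):
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(abs(x) for x in samples) / len(samples)


def agc_gain(samples, target_level, max_gain=8.0):
    """Gain that brings the frame's average amplitude to target_level,
    limited to max_gain; a silent frame is left unchanged."""
    level = mean_abs(samples)
    if level == 0:
        return 1.0
    return min(target_level / level, max_gain)


def apply_gain(samples, gain):
    return [gain * x for x in samples]


speech = [0.25, -0.25]
noise = [0.125, -0.125]
g = agc_gain(speech, target_level=0.5)
diff_before = mean_abs(speech) - mean_abs(noise)
diff_after = mean_abs(apply_gain(speech, g)) - mean_abs(apply_gain(noise, g))
print(g, diff_before, diff_after)
```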
Example three
Based on the first embodiment, specifically, the performing noise reduction processing on the sound signal based on the amplitude difference includes: and if the amplitude difference value is not a positive value, outputting prompt information, wherein the prompt information is used for prompting a user to approach a microphone for communication or increasing the speaking volume.
In this embodiment, the difference S between the amplitude of the user's voice signal and the amplitude of the noise signal may be 0, meaning that the two amplitudes are equal, or negative, meaning that the amplitude of the speech signal is smaller than that of the noise signal. Because the background noise is additive, the user's speech is masked by the noise signal whenever the amplitude difference is non-positive. In this case, noise reduction applied to the noise signal would also act on the user's normal speech and severely distort it. Therefore, in this embodiment, when the difference is a non-positive value, a prompt may be sent to the user through the audio output unit of the mobile phone, for example prompting the user to speak closer to the mobile phone or to speak more loudly because the voice is too quiet or the noise is too loud.
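Taken together, the branches described for the positive and non-positive cases amount to a small decision rule, which might be sketched as follows. The enum, the function name, and the `UNSPECIFIED` fallback are assumptions: the text does not say what happens when the difference is positive but none of the stated conditions holds.

```python
from enum import Enum


class Action(Enum):
    PROMPT_USER = "prompt_user"
    DENOISE = "denoise"
    AMPLIFY_DENOISE_ATTENUATE = "amplify_denoise_attenuate"
    UNSPECIFIED = "unspecified"  # case not covered by the description


def select_action(diff, speech_amplitude, first_threshold, second_threshold, third_threshold):
    """Choose a processing branch from the amplitude difference."""
    if diff <= 0:
        # Speech is masked by noise: ask the user to move closer or speak up.
        return Action.PROMPT_USER
    if diff >= first_threshold:
        # Speech clearly dominates: denoise with strength proportional to diff.
        return Action.DENOISE
    if diff <= second_threshold and speech_amplitude > third_threshold:
        # Noise is serious but speech is usable: amplify, denoise, attenuate.
        return Action.AMPLIFY_DENOISE_ATTENUATE
    return Action.UNSPECIFIED


print(select_action(0.1, 0.2, first_threshold=0.3, second_threshold=0.3, third_threshold=0.1))
```

When the second threshold equals the first and the difference equals both, this sketch gives precedence to direct noise reduction; the source does not fix that ordering.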
Optionally, the content of the cell phone prompt may be adjusted by detecting the distance between the user and the cell phone.
In this embodiment, as in the first embodiment, the voice signal containing the human voice and the noise signal are identified according to the stored models, and the noise signal is specifically identified as tire noise, wind noise, or air-conditioning noise. If the difference between the voice signal amplitude and the noise signal amplitude is non-positive and the distance between the user and the mobile phone is smaller than a preset value, the user is already close to the mobile phone, and speaking closer to it cannot solve the call noise problem. Therefore, while prompting the user about the voice, the source of the noise can be indicated at the same time so that the user can address the problem in a targeted way. For example, if recognition finds that the air-conditioning noise amplitude is larger than the user's voice signal amplitude, the user is reminded that the air-conditioning noise is too loud and informed that the fan speed can be reduced. If recognition finds that the wind noise amplitude is larger than the user's voice signal amplitude, the user is told that the wind noise is too loud and advised to reduce the vehicle speed. In this way, when speaking closer to the mobile phone cannot solve the noise problem, the embodiment uses the identified noise source to remind the user of the main source of the current noise so that the user can take a corresponding measure, which makes it easier to reduce the noise accurately and improves the user experience.
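The prompt selection described above could look roughly like this. Everything concrete in the sketch is hypothetical: the 0.5 m proximity threshold, the dictionary keys for noise types, and the wording of the messages. The source gives advice only for air-conditioning and wind noise, so tire noise is merely named.

```python
def build_prompt(diff, distance_m, speech_amplitude, noise_amplitudes, near_distance_m=0.5):
    """Return a prompt string, or None when no prompt is needed.
    noise_amplitudes maps a noise type name to its measured amplitude."""
    if diff > 0:
        return None
    if distance_m >= near_distance_m:
        # The user is far from the phone, so moving closer can still help.
        return "Please move closer to the microphone or speak louder."
    advice = {
        "air_conditioner": "Air-conditioner noise is too loud; consider lowering the fan speed.",
        "wind": "Wind noise is too loud; consider reducing the vehicle speed.",
        "tire": "Tire noise is too loud.",
    }
    # Noise types that are louder than the user's voice.
    loud = [name for name, amp in noise_amplitudes.items() if amp > speech_amplitude]
    if not loud:
        return "Please speak louder."
    dominant = max(loud, key=lambda name: noise_amplitudes[name])
    return "Please speak louder. " + advice.get(dominant, "Background noise is too loud.")


print(build_prompt(-0.1, 0.3, 0.2, {"air_conditioner": 0.3, "wind": 0.25}))
```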
Example four
Fig. 3 shows a constituent structure of the speech signal output processing apparatus provided in the present embodiment, and for convenience of explanation, only the portions related to the present embodiment are shown.
In this embodiment, the apparatus is used to implement the method for processing the output of the voice signal in the embodiment of fig. 1, and may be a software unit, a hardware unit or a unit combining software and hardware that is built in the mobile terminal. The mobile terminal includes but is not limited to a smart phone, a tablet computer, a learning machine or a smart car device.
As shown in fig. 3, the speech signal output processing apparatus 3 includes:
an audio signal acquisition unit 301 for recognizing a speech signal and a background noise signal from a sound signal acquired in real time;
an amplitude difference calculation unit 302, configured to obtain an amplitude difference between the speech signal and the background noise signal;
a processing unit 303, configured to perform noise reduction processing on the sound signal based on the amplitude difference.
Optionally, the apparatus for processing the output of the voice signal further includes:
and the prompting unit is used for outputting prompting information if the amplitude difference value is a non-positive value, wherein the prompting information is used for prompting a user to approach a microphone for conversation or increasing the speaking volume.
Optionally, the processing unit further includes:
the first processing subunit is configured to, if the amplitude difference is a positive value, and the amplitude difference is smaller than or equal to a second threshold, and the amplitude of the voice signal is greater than a third threshold, amplify the voice signal by a preset gain, so as to obtain a first intermediate signal;
carrying out noise reduction processing on the first intermediate signal to obtain a second intermediate signal;
and attenuating the second intermediate signal according to the preset gain to obtain the sound signal subjected to noise reduction processing.
Optionally, the processing unit further includes:
and the second processing subunit is used for performing noise reduction processing on the sound signal if the amplitude difference value is a positive value and is greater than or equal to a first threshold value, and the intensity of the noise reduction processing is in direct proportion to the amplitude difference value.
Optionally, the apparatus for processing an output of a voice signal further includes:
a pre-processing unit for pre-filtering the sound signal before recognizing the speech signal and the background noise signal.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 4 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 4, the terminal device 4 of this embodiment includes: aprocessor 40, amemory 41 and acomputer program 42 stored in saidmemory 41 and executable on saidprocessor 40. Theprocessor 40, when executing thecomputer program 42, implements the steps in the above-described respective speech signal output processing method embodiments, such as the steps 101 to 103 shown in fig. 1. Alternatively, theprocessor 40, when executing thecomputer program 42, implements the functions of the units in the device embodiments described above, such as the functions of theunits 301 to 303 shown in fig. 3.
Illustratively, thecomputer program 42 may be partitioned into one or more modules/units that are stored in thememory 41 and executed by theprocessor 40 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of thecomputer program 42 in the terminal device 4. For example, thecomputer program 42 may be divided into a synchronization module, a summary module, an acquisition module, and a return module (a module in a virtual device), and each module has the following specific functions:
the terminal device 4 may be a computing device with a voice input/output function, such as a notebook, a palm computer, a mobile phone, a tablet computer, and a navigator. The terminal device may include, but is not limited to, aprocessor 40, amemory 41. Those skilled in the art will appreciate that fig. 4 is merely an example of a terminal device 4 and does not constitute a limitation of terminal device 4 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the terminal device may also include input-output devices, network access devices, buses, etc.
TheProcessor 40 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or a memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used for storing the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative: the division of the modules or units is only one logical division, and other divisions are possible in an actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (7)

Translated from Chinese

1. An output processing method for a speech signal, comprising:
identifying a speech signal and a background noise signal from a sound signal acquired in real time, including: identifying the speech and the background noise by means of a pre-stored human voice model and a pre-stored noise model, and extracting the human voice and the background noise according to the features of the models; wherein the human voice model contains speech features of the human voice, including: frequency, zero-crossing rate, short-time average energy, and short-time average amplitude;
obtaining an amplitude difference between the speech signal and the background noise signal, including: calculating, from the recorded sound waveform, the average amplitude of the speech signal representing the human voice and the average amplitude of the background noise signal in the sound signal collected over a period of time, taking the absolute values of the obtained average amplitude of the speech signal and average amplitude of the background noise signal, and then calculating the difference between the absolute value of the average amplitude of the speech signal and the absolute value of the average amplitude of the background noise signal;
performing noise reduction processing on the sound signal based on the amplitude difference, including:
if the amplitude difference S is positive and greater than or equal to a first threshold, performing noise reduction processing on the sound signal, the intensity of the noise reduction processing being proportional to the amplitude difference;
if the amplitude difference is positive, the amplitude difference is less than or equal to a second threshold T, and the amplitude of the speech signal is greater than a third threshold,
amplifying the sound signal by a preset gain A to obtain a first intermediate signal, wherein A ≥ T/S;
performing noise reduction processing on the first intermediate signal to obtain a second intermediate signal;
attenuating the second intermediate signal according to the preset gain to obtain the sound signal after noise reduction processing.

2. The output processing method for a speech signal according to claim 1, wherein performing noise reduction processing on the sound signal based on the amplitude difference comprises:
if the amplitude difference is non-positive, outputting prompt information, the prompt information being used to prompt the user to speak closer to the microphone or to increase the speaking volume.

3. The output processing method for a speech signal according to claim 1, wherein, before the speech signal and the background noise signal are identified, the method comprises:
pre-filtering the sound signal.

4. An output processing device for a speech signal, comprising:
an audio signal acquisition unit, configured to identify a speech signal and a background noise signal from a sound signal acquired in real time, including: identifying the speech and the background noise by means of a pre-stored human voice model and a pre-stored noise model, and extracting the human voice and the background noise according to the features of the models; wherein the human voice model contains speech features of the human voice, including: frequency, zero-crossing rate, short-time average energy, and short-time average amplitude;
an amplitude difference calculation unit, configured to obtain an amplitude difference between the speech signal and the background noise signal, including: calculating, from the recorded sound waveform, the average amplitude of the speech signal representing the human voice and the average amplitude of the background noise signal in the sound signal collected over a period of time, taking the absolute values of the obtained average amplitude of the speech signal and average amplitude of the background noise signal, and then calculating the difference between the absolute value of the average amplitude of the speech signal and the absolute value of the average amplitude of the background noise signal;
a processing unit, configured to perform noise reduction processing on the sound signal based on the amplitude difference, including:
if the amplitude difference S is positive and greater than or equal to a first threshold, performing noise reduction processing on the sound signal, the intensity of the noise reduction processing being proportional to the amplitude difference;
if the amplitude difference is positive, the amplitude difference is less than or equal to a second threshold T, and the amplitude of the speech signal is greater than a third threshold,
amplifying the sound signal by a preset gain A to obtain a first intermediate signal, wherein A ≥ T/S;
performing noise reduction processing on the first intermediate signal to obtain a second intermediate signal;
attenuating the second intermediate signal according to the preset gain to obtain the sound signal after noise reduction processing.

5. The output processing device for a speech signal according to claim 4, wherein the processing unit comprises:
a prompting unit, configured to output prompt information if the amplitude difference is non-positive, the prompt information being used to prompt the user to speak closer to the microphone or to increase the speaking volume.

6. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 3.

7. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 3.
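The branching recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several points are assumptions made only for illustration: "average amplitude" is read as the mean of absolute sample values; the denoiser `noise_gate` is a hypothetical placeholder, since the claims do not name a noise reduction algorithm; `k` is a hypothetical proportionality constant; the gain is set to the smallest value satisfying A ≥ T/S; the denoising strength in the second branch is taken as proportional to the amplified difference; and the sound signal is passed through unchanged when no claimed branch applies.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Denoiser = Callable[[Sequence[float], float], List[float]]


def mean_abs_amplitude(samples: Sequence[float]) -> float:
    """Mean of absolute sample values (assumed reading of 'average amplitude')."""
    return sum(abs(x) for x in samples) / len(samples) if samples else 0.0


def amplitude_difference(speech: Sequence[float], noise: Sequence[float]) -> float:
    """S = |average speech amplitude| - |average background noise amplitude|."""
    return abs(mean_abs_amplitude(speech)) - abs(mean_abs_amplitude(noise))


def noise_gate(signal: Sequence[float], strength: float) -> List[float]:
    """Hypothetical placeholder denoiser: zero samples whose magnitude is below strength."""
    return [x if abs(x) >= strength else 0.0 for x in signal]


def process(
    sound: Sequence[float],
    speech: Sequence[float],
    noise: Sequence[float],
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
    denoise: Denoiser = noise_gate,
    k: float = 0.5,  # hypothetical proportionality constant
) -> Tuple[List[float], Optional[str]]:
    """Return (processed sound, prompt); prompt is None unless the user must be prompted."""
    s = amplitude_difference(speech, noise)
    if s <= 0:
        # Claim 2: non-positive difference, so prompt the user instead of denoising.
        return list(sound), "Please move closer to the microphone or speak louder."
    if s >= first_threshold:
        # Claim 1, first branch: denoising strength proportional to S.
        return denoise(sound, k * s), None
    if s <= second_threshold and mean_abs_amplitude(speech) > third_threshold:
        # Claim 1, second branch: amplify, denoise, then attenuate by the same gain.
        gain = second_threshold / s  # smallest A satisfying A >= T/S
        first_intermediate = [gain * x for x in sound]
        # Assumption: strength follows the amplified difference gain * s.
        second_intermediate = denoise(first_intermediate, k * gain * s)
        return [x / gain for x in second_intermediate], None
    # Not covered by the claims: pass the sound through unchanged (assumption).
    return list(sound), None


speech = [0.8, -0.8, 0.8, -0.8]
noise = [0.1, -0.1]
sound = [0.8, 0.1, -0.8, -0.1]
print(process(sound, speech, noise, 0.5, 0.2, 0.25))
# ([0.8, 0.0, -0.8, 0.0], None)
```

In this sketch the first-threshold test is checked before the second-threshold test. The claims do not state how the two thresholds relate, so that ordering is a design choice of the example rather than a requirement of the method.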
CN201711104384.1A | 2017-11-10 | 2017-11-10 | A kind of output processing method and device of voice signal | Active | CN107910013B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711104384.1A, CN107910013B (en) | 2017-11-10 | 2017-11-10 | A kind of output processing method and device of voice signal

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201711104384.1A, CN107910013B (en) | 2017-11-10 | 2017-11-10 | A kind of output processing method and device of voice signal

Publications (2)

Publication Number | Publication Date
CN107910013A (en) | 2018-04-13
CN107910013B (en) | 2021-09-24

Family

ID=61844674

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201711104384.1A (Active), CN107910013B (en) | A kind of output processing method and device of voice signal | 2017-11-10 | 2017-11-10

Country Status (1)

Country | Link
CN (1) | CN107910013B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN108831500B (en)* | 2018-05-29 | 2023-04-28 | 平安科技(深圳)有限公司 | Speech enhancement method, device, computer equipment and storage medium
CN109102800A (en)* | 2018-07-26 | 2018-12-28 | 广州酷狗计算机科技有限公司 | A kind of method and apparatus that the determining lyrics show data
CN110164423B (en) | 2018-08-06 | 2023-01-20 | 腾讯科技(深圳)有限公司 | Azimuth angle estimation method, azimuth angle estimation equipment and storage medium
CN109637543B (en)* | 2018-12-12 | 2024-09-24 | 平安科技(深圳)有限公司 | Voice data processing method and device of voice card
CN111383647B (en)* | 2018-12-28 | 2022-10-25 | 展讯通信(上海)有限公司 | Voice signal processing method and device and readable storage medium
CN109639904B (en)* | 2019-01-25 | 2021-02-02 | 努比亚技术有限公司 | Mobile phone mode adjusting method, system and computer storage medium
CN111768794A (en)* | 2019-03-15 | 2020-10-13 | 上海博泰悦臻网络技术服务有限公司 | Voice noise reduction method, voice noise reduction system, equipment and storage medium
CN111796790B (en)* | 2019-04-09 | 2023-09-08 | 深圳市冠旭电子股份有限公司 | A sound effect adjustment method, device, readable storage medium and terminal equipment
CN110097884B (en)* | 2019-06-11 | 2022-05-17 | 大众问问(北京)信息科技有限公司 | A voice interaction method and device
CN112669866B (en)* | 2019-09-30 | 2025-01-28 | 广州慧睿思通科技股份有限公司 | Speech noise reduction method, system and computer storage medium based on loudness level
CN112911441A (en)* | 2021-01-18 | 2021-06-04 | 上海闻泰信息技术有限公司 | Noise reduction method, apparatus, audio device, and computer-readable storage medium
CN117795981A (en)* | 2021-08-10 | 2024-03-29 | 三星电子株式会社 | Electronic device for correcting sound signal and method for controlling electronic device
CN115727473A (en)* | 2021-08-31 | 2023-03-03 | 佛山市顺德区美的电子科技有限公司 | Air conditioning equipment control method and device and air conditioning equipment
DE102021211879B4 (en)* | 2021-10-21 | 2025-05-08 | Sivantos Pte. Ltd. | Hearing aid and method for operating such a device
CN115019836A (en)* | 2022-04-29 | 2022-09-06 | 东风汽车有限公司东风日产乘用车公司 | Vehicle prompt sound playback control method, storage medium and electronic device
CN116168719A (en)* | 2022-12-26 | 2023-05-26 | 杭州爱听科技有限公司 | Sound gain adjusting method and system based on context analysis

Citations (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101272131A (en)* | 2007-03-13 | 2008-09-24 | 瑞昱半导体股份有限公司 | Programmable gain amplifier with noise elimination function
CN101976566A (en)* | 2010-07-09 | 2011-02-16 | 瑞声声学科技(深圳)有限公司 | Voice enhancement method and device using same
US8321215B2 (en)* | 2009-11-23 | 2012-11-27 | Cambridge Silicon Radio Limited | Method and apparatus for improving intelligibility of audible speech represented by a speech signal
US8364477B2 (en)* | 2005-05-25 | 2013-01-29 | Motorola Mobility Llc | Method and apparatus for increasing speech intelligibility in noisy environments
CN104064185A (en)* | 2013-03-18 | 2014-09-24 | 联想(北京)有限公司 | Information processing method and system and electronic device
CN104103278A (en)* | 2013-04-02 | 2014-10-15 | 北京千橡网景科技发展有限公司 | Real time voice denoising method and device
CN105845151A (en)* | 2016-05-30 | 2016-08-10 | 百度在线网络技术(北京)有限公司 | Audio gain adjustment method and audio gain adjustment device applied to speech recognition front-end
CN106782586A (en)* | 2016-11-14 | 2017-05-31 | 阔地教育科技有限公司 | A kind of acoustic signal processing method and device
CN107092461A (en)* | 2017-06-01 | 2017-08-25 | 深圳天珑无线科技有限公司 | The way of recording, device and computer-readable recording medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7099821B2 (en)* | 2003-09-12 | 2006-08-29 | Softmax, Inc. | Separation of target acoustic signals in a multi-transducer arrangement
US8949120B1 (en)* | 2006-05-25 | 2015-02-03 | Audience, Inc. | Adaptive noise cancelation
CN101859568B (en)* | 2009-04-10 | 2012-05-30 | 比亚迪股份有限公司 | Method and device for eliminating voice background noise
CN104376848B (en)* | 2013-08-12 | 2018-03-23 | 展讯通信(上海)有限公司 | Audio signal processing method and device
CN104810024A (en)* | 2014-01-28 | 2015-07-29 | 上海力声特医学科技有限公司 | Double-path microphone speech noise reduction treatment method and system
CN106898360B (en)* | 2017-04-06 | 2023-08-08 | 北京地平线信息技术有限公司 | Audio signal processing method and device and electronic equipment

Also Published As

Publication number | Publication date
CN107910013A (en) | 2018-04-13

Legal Events

Date | Code | Title | Description
- | PB01 | Publication | Publication
- | SE01 | Entry into force of request for substantive examination | Entry into force of request for substantive examination
- | CB02 | Change of applicant information | Address after: Changan town in Guangdong province Dongguan 523860 usha Beach Road No. 18; Applicant after: OPPO Guangdong Mobile Communications Co.,Ltd.; Address before: Changan town in Guangdong province Dongguan 523860 usha Beach Road No. 18; Applicant before: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS Corp.,Ltd.
- | GR01 | Patent grant | Patent grant
