Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
Fig. 1 is a flowchart of an echo cancellation method according to a first embodiment of the present invention. The method is applicable to the problem of an excessive time difference between audio playing and audio acquisition during real-time audio/video communication. It may be performed by an echo cancellation device according to any embodiment of the present invention; the device may be implemented in hardware and/or software and may generally be integrated in an electronic device.
As shown in fig. 1, the echo cancellation method provided in this embodiment may include:
S110, in the real-time communication process, when the acquisition thread acquires audio acquisition data, it sends synchronization indication information to the playing thread and sends the audio acquisition data to the echo cancellation thread.
The real-time communication can be real-time audio communication or real-time audio/video communication.
The acquisition thread is the thread that acquires audio data; the playing thread is the thread that plays audio data; the echo cancellation thread is the thread that performs echo cancellation on the audio acquisition data. In this embodiment, the acquisition thread, the playing thread and the echo cancellation thread are three mutually independent threads.
The synchronization indication information is a semaphore sent by the acquisition thread to the playing thread. It instructs the playing thread to acquire audio data for playing, thereby synchronizing the acquisition thread and the playing thread.
The audio acquisition data refers to audio data collected by a sound card.
During real-time communication, the acquisition thread reads the audio acquisition data from the sound card, then sends the synchronization indication information to the playing thread and sends the read audio acquisition data to the echo cancellation thread.
S120, after receiving the synchronization indication information, the playing thread acquires audio playing data for playing and sends the audio playing data to the echo cancellation thread.
Audio play data refers to audio data received through a network.
After receiving the synchronization indication information, the playing thread reads the audio playing data from the corresponding buffer queue for playing, and at the same time sends the read audio playing data to the echo cancellation thread.
In an alternative embodiment, the synchronization indication information is sent via a blocking queue.
When the acquisition thread acquires the audio acquisition data, it writes the synchronization indication information into the blocking queue, and the playing thread reads the synchronization indication information from the blocking queue. In this way the acquisition thread sends the synchronization indication information to the playing thread.
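The blocking-queue signalling described above can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: Python's `queue.Queue` stands in for the blocking queue, the token value `SYNC`, the `None` end-of-demo sentinel, the function name `run_demo`, and the string frames are all hypothetical, and reading the sound card and playing audio are replaced by list operations.

```python
import queue
import threading

SYNC = 1  # hypothetical token standing for the synchronization indication information


def run_demo(capture_frames, buffered_play_frames):
    """Pair each captured frame with one played frame via a blocking queue."""
    sync_queue = queue.Queue()    # blocking queue carrying the sync tokens
    play_buffer = queue.Queue()   # buffer queue of audio received from the network
    for frame in buffered_play_frames:
        play_buffer.put(frame)

    captured, played = [], []

    def acquisition_thread():
        for frame in capture_frames:      # stands in for reading the sound card
            captured.append(frame)
            sync_queue.put(SYNC)          # notify the playing thread
        sync_queue.put(None)              # demo-only sentinel: no more frames

    def playing_thread():
        while True:
            token = sync_queue.get()      # blocks until the acquisition thread signals
            if token is None:
                break
            played.append(play_buffer.get())  # take exactly one frame for playing

    threads = [threading.Thread(target=acquisition_thread),
               threading.Thread(target=playing_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return captured, played


captured, played = run_demo(["c0", "c1", "c2"], ["p0", "p1", "p2"])
```

Because the playing thread blocks on the queue, it takes one frame from the buffer per captured frame and never runs ahead of the acquisition thread.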
S130, performing echo cancellation processing on the audio acquisition data according to the audio play data through an echo cancellation thread.
The audio acquisition data and the audio playing data are both sent to the echo cancellation thread. The echo cancellation thread filters the audio acquisition data with reference to the audio playing data so that the audio acquisition data no longer contains the audio components corresponding to the audio playing data, thereby performing echo cancellation on the audio acquisition data.
Further, after performing echo cancellation processing on the audio acquisition data according to the audio play data by an echo cancellation thread, the method may further include:
sending the target audio data subjected to echo cancellation processing to a network data sending thread through the echo cancellation thread; and sending the target audio data through the network data sending thread.
The target audio data refers to the audio acquisition data after echo cancellation processing.
The echo cancellation thread carries out echo cancellation processing on the audio acquisition data to obtain target audio data, and sends the target audio data to the network data sending thread, and the network data sending thread sends the received target audio data to the opposite terminal through the network.
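The hand-off from the echo cancellation thread to the network data sending thread can be sketched as below. This is a simplified illustration under stated assumptions: the acquisition and playing data arrive already paired in one queue, `cancel_echo` is a hypothetical placeholder that simply subtracts the reference (a real canceller would use an adaptive filter), and appending to a list stands in for sending over the network.

```python
import queue
import threading


def cancel_echo(captured, reference):
    # Hypothetical placeholder: assumes the echo equals the reference exactly.
    return [c - r for c, r in zip(captured, reference)]


def run_pipeline(pairs):
    """Feed (acquisition frame, playing frame) pairs through AEC, then 'send' them."""
    aec_queue = queue.Queue()    # input of the echo cancellation thread
    send_queue = queue.Queue()   # target audio data for the sending thread
    sent = []

    def echo_cancellation_thread():
        while True:
            item = aec_queue.get()
            if item is None:              # demo-only sentinel
                send_queue.put(None)
                break
            captured, reference = item
            send_queue.put(cancel_echo(captured, reference))

    def network_sending_thread():
        while True:
            frame = send_queue.get()
            if frame is None:
                break
            sent.append(frame)            # stands in for a network send

    threads = [threading.Thread(target=echo_cancellation_thread),
               threading.Thread(target=network_sending_thread)]
    for t in threads:
        t.start()
    for pair in pairs:
        aec_queue.put(pair)
    aec_queue.put(None)
    for t in threads:
        t.join()
    return sent


sent = run_pipeline([([5, 7], [1, 2]), ([3, 3], [3, 0])])
```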
In the embodiment of the invention, during real-time communication, when the acquisition thread acquires the audio acquisition data, it sends the synchronization indication information to the playing thread and sends the audio acquisition data to the echo cancellation thread. After receiving the synchronization indication information, the playing thread acquires the audio playing data for playing and sends the audio playing data to the echo cancellation thread. The echo cancellation thread then performs echo cancellation processing on the audio acquisition data according to the audio playing data. This ensures data synchronization between the acquisition thread and the playing thread, addresses the problem of an excessive time difference between audio playing and audio acquisition, and improves the accuracy of audio echo cancellation. Furthermore, sending the echo-cancelled audio acquisition data to the opposite terminal through the network data sending thread can improve call quality.
Example 2
Fig. 2 is a flowchart of an echo cancellation method according to a second embodiment of the present invention. This embodiment refines the foregoing embodiment: sending the synchronization indication information to the playing thread when the acquisition thread acquires the audio acquisition data may specifically be: each time the acquisition thread acquires audio acquisition data according to a preset period, it sends the synchronization indication information to the playing thread.
As shown in fig. 2, the echo cancellation method provided in this embodiment may include:
S210, in the real-time communication process, each time the acquisition thread acquires audio acquisition data according to a preset period, it sends synchronization indication information to the playing thread and sends the audio acquisition data to the echo cancellation thread.
The preset period is the preconfigured interval at which the acquisition thread reads audio acquisition data. It is generally on the order of milliseconds, for example 20 milliseconds.
Each time the acquisition thread acquires audio acquisition data according to the preset period, it sends the synchronization indication information to the playing thread and sends the read audio acquisition data to the echo cancellation thread.
Optionally, before the audio collection data is acquired by the collection thread according to the preset period and the synchronization indication information is sent to the playing thread, the method further includes:
when acquiring audio acquisition data, the acquisition thread sends synchronization start indication information to the playing thread; and after receiving the synchronization start indication information, the playing thread performs an initialization operation.
The synchronization start indication information refers to a semaphore sent by the acquisition thread to the playing thread and is used for indicating the playing thread to perform initialization operation so as to realize synchronization between the subsequent acquisition thread and the playing thread.
The initialization operation may be a preparation operation of the playing thread before the audio playing data is acquired, which is not limited in detail in this embodiment.
After receiving the synchronization start indication information, the playing thread immediately performs the initialization operation.
Optionally, the acquisition thread sends the synchronization start indication information to the playing thread when it acquires audio acquisition data for the first time, or after it has acquired audio acquisition data several times (for example, two or three times).
After sending the synchronization start indication information to the playing thread, the acquisition thread sends the synchronization indication information to the playing thread when acquiring the audio acquisition data every time according to a preset period.
In an alternative embodiment, the synchronization start indication information and the synchronization indication information are sent through a blocking queue. The two are distinguished by different values, for example two different integers: the integer value "1" identifies the synchronization indication information, and the integer value "0" identifies the synchronization start indication information.
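A minimal sketch of how a playing thread might distinguish the two values read from the blocking queue is given below. The integer values 0 and 1 follow the example in the text; the function name `playing_thread_events`, the event strings, and the `None` sentinel used to stop the demo are hypothetical, and the initialization and playing operations are reduced to recording an event.

```python
import queue
import threading

SYNC_START = 0  # synchronization start indication information (example value from the text)
SYNC = 1        # synchronization indication information (example value from the text)


def playing_thread_events(tokens):
    """Return the actions a playing thread would take for a token sequence."""
    blocking_queue = queue.Queue()
    events = []

    def playing_thread():
        while True:
            token = blocking_queue.get()   # blocks until a token is available
            if token is None:              # demo-only sentinel
                break
            if token == SYNC_START:
                events.append("init")      # stands in for the initialization operation
            elif token == SYNC:
                events.append("play")      # stands in for fetching and playing one frame

    worker = threading.Thread(target=playing_thread)
    worker.start()
    for token in tokens:                   # the acquisition thread's writes
        blocking_queue.put(token)
    blocking_queue.put(None)
    worker.join()
    return events


events = playing_thread_events([SYNC_START, SYNC, SYNC])
```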
S220, after receiving the synchronization indication information, the playing thread acquires audio playing data for playing and sends the audio playing data to the echo cancellation thread.
S230, performing echo cancellation processing on the audio acquisition data according to the audio play data through an echo cancellation thread.
S240, the target audio data after echo cancellation processing is sent to a network data sending thread through an echo cancellation thread.
S250, sending the target audio data through a network data sending thread.
For details not explained in this embodiment, refer to the foregoing embodiment; they are not repeated here.
In a specific embodiment, the real-time communication is real-time communication based on the Android system, for example real-time audio communication or real-time audio/video communication based on the Android system.
At present, real-time communication software on Android terminals generally relies on one of two echo cancellation modules, namely Speex and WebRTC (Web Real-Time Communication). Of the two, the echo cancellation module of WebRTC gives the better echo cancellation effect.
WebRTC is a real-time audio and video communication technology based on web browser applications. It mainly comprises a voice engine module (voice engine), a video engine module (video engine), a transport layer module (transport) and a session management module (session management). The voice engine module uses AEC (Acoustic Echo Cancellation), which provides a good foundation for echo cancellation in real-time communication on Android terminals.
The AEC module of WebRTC implements echo cancellation using an adaptive filtering algorithm, the principle of which is shown in fig. 3. Let y(n) denote the voice data collected by the far-end microphone 21. While y(n) is sent to the near-end speaker 12 for playing, it is also sent to the adaptive filter 30, which generates an echo estimate d̂(n). Let x(n) denote the useful near-end speech. The data collected by the near-end microphone 11 is then the superposition of x(n) and the echo d(n) produced by y(n), that is, x(n) + d(n). After filtering, the final near-end voice data is u(n) = x(n) + d(n) - d̂(n), which is sent to the far-end speaker 22. The term e(n) = d(n) - d̂(n), referred to as the echo cancellation error, is fed back to the adaptive filter 30. Ideally e(n) would be 0; in practice it is not, and the adaptive filter 30 automatically adjusts its filter coefficients according to the value of e(n).
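The signal relationships above can be written compactly as follows. The first three lines restate the description. The last line is only an illustrative assumption: it shows a normalized LMS (NLMS) coefficient update, one common choice of adaptive filtering rule, and the filter length L, step size μ and regularization constant δ are hypothetical symbols that do not appear in the description. The AEC module of WebRTC may adapt its coefficients differently.

```latex
\begin{aligned}
\mathbf{y}(n) &= [\,y(n),\ y(n-1),\ \dots,\ y(n-L+1)\,]^{\mathsf T} \\
\hat d(n) &= \mathbf{w}^{\mathsf T}(n)\,\mathbf{y}(n) \\
u(n) &= x(n) + d(n) - \hat d(n) = x(n) + e(n), \qquad e(n) = d(n) - \hat d(n) \\
\mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\,\frac{u(n)\,\mathbf{y}(n)}{\mathbf{y}^{\mathsf T}(n)\,\mathbf{y}(n) + \delta}
\end{aligned}
```

In this sketch the update is driven by u(n), the only signal actually available at the filter output; when there is no near-end speech, x(n) = 0 and u(n) coincides with the echo cancellation error e(n).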
Since Android is not a real-time operating system, there is a certain delay between microphone capture and speaker playback, and the delay may differ from one Android device to another. The basic principle of WebRTC's AEC is to use the data played by the speaker as a reference and cancel the already-played content from the audio collected by the microphone. If the time difference between audio playing and audio acquisition is too large, the audio playing data and the audio acquisition data referenced during echo cancellation are badly misaligned, the filter weights fail to converge during adaptive filtering, and the echo cannot be properly cancelled.
In the application development of Android real-time communication, an acquisition thread (ReadThread) is generally used for audio acquisition, a playing thread (WriteThread) is used for remote audio playing, a receiving thread (InThread) and a sending thread (OutThread) are used for receiving and sending network data, and an echo cancellation thread (AecThread) is used for performing echo cancellation operation.
Because the acquisition thread and the playing thread are two mutually independent threads, data synchronization between them cannot be guaranteed. If the time difference between audio playing and audio acquisition becomes too large at some moment, the audio playing data and the audio acquisition data referenced during echo cancellation are badly misaligned and the filter weights diverge during adaptive filtering. As a result, the echo may be cancelled only incompletely, or the cancellation may fail entirely.
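The effect of such misalignment can be illustrated with a toy simulation. The sketch below is not WebRTC's AEC: it substitutes a short normalized LMS (NLMS) filter, and the tap count, step size, synthetic echo path (the played signal delayed by one sample and halved), random test signal and 10-sample offset are all assumptions chosen for illustration. It compares the residual echo when the reference matches what was played against the residual when the reference is offset by more than the filter can span.

```python
import random


def nlms_residual(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Return mic minus an adaptively estimated echo of ref (NLMS update)."""
    weights = [0.0] * taps
    history = [0.0] * taps                    # newest reference sample first
    residual = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = m - estimate                  # output after echo subtraction
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)


rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(4000)]           # audio playing data
mic = [0.0] + [0.5 * far[n - 1] for n in range(1, len(far))]  # echo only, no near-end speech

aligned_ref = far                     # reference matches what was played
late_ref = [0.0] * 10 + far[:-10]     # reference lags playback by 10 samples

aligned_error = mean_abs(nlms_residual(mic, aligned_ref)[-500:])
misaligned_error = mean_abs(nlms_residual(mic, late_ref)[-500:])
```

With the aligned reference the residual decays towards zero, whereas with the offset reference none of the filter taps see the sample that produced the echo, so the residual stays at roughly the level of the echo itself.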
In view of this, in this embodiment, data synchronization between the acquisition thread and the playing thread is achieved through the synchronization indication information. That is, the synchronization indication information establishes an association between the acquisition thread and the playing thread, which ensures that the far-end data and the near-end data used for echo cancellation are consistent in time and effectively improves the echo cancellation effect. Referring to fig. 4, the processing flow of the echo cancellation method is as follows. The receiving thread receives network audio data, and the received data packets may be buffered and reordered. When the acquisition thread reads the audio acquisition data, it sends the synchronization indication information to the playing thread. After receiving the synchronization indication information, the playing thread reads the buffered audio playing data. The audio acquisition data read by the acquisition thread and the audio playing data read by the playing thread are sent together to the echo cancellation thread. The echo cancellation thread performs echo cancellation on the audio acquisition data according to the audio playing data and then sends the result to the sending thread, and the sending thread sends the received echo-cancelled audio data through the network.
For example, suppose the acquisition thread acquires audio data every 20 ms. Each time the acquisition thread reads 20 ms of audio data from the sound card, it sends synchronization indication information to the playing thread, notifying the playing thread to take 20 ms of audio data out of the corresponding buffer queue for playing. The 20 ms of data acquired from the sound card by the acquisition thread and the 20 ms of data taken from the buffer queue by the playing thread are then sent together to the echo cancellation thread for echo cancellation. This keeps the audio acquisition data and the audio playing data consistent in time, and avoids both the divergence of the adaptive filter caused by temporal misalignment between the data and the resulting failure to cancel echo.
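As a worked illustration of the 20 ms example, the amount of data in each frame follows directly from the audio format. The description does not specify a sample rate, sample width or channel count, so the values used below (16 kHz and 48 kHz, 16-bit samples, mono) are assumptions, and the helper names are hypothetical.

```python
def frame_samples(sample_rate_hz, period_ms):
    """Number of samples per channel in one frame of the given duration."""
    return sample_rate_hz * period_ms // 1000


def frame_bytes(sample_rate_hz, period_ms, bytes_per_sample=2, channels=1):
    """Size in bytes of one PCM frame of the given duration."""
    return frame_samples(sample_rate_hz, period_ms) * bytes_per_sample * channels


samples_16k = frame_samples(16000, 20)   # 320 samples per 20 ms at 16 kHz
bytes_16k = frame_bytes(16000, 20)       # 640 bytes for 16-bit mono
samples_48k = frame_samples(48000, 20)   # 960 samples per 20 ms at 48 kHz
```

Under these assumptions, the acquisition thread and the playing thread would each hand the echo cancellation thread a frame of the same size for every synchronization indication.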
This technical solution provides a mechanism in which the acquisition thread and the playing thread are synchronized through the synchronization indication information. In real-time communication on an Android terminal, it addresses the problem that echo cannot be cancelled when the adaptive filter diverges because the audio acquisition data and the audio playing data sent to the AEC module of WebRTC are inconsistent in time. It improves acoustic echo suppression on the Android terminal, better meets the voice requirements of the Android terminal in practical applications, and improves the call quality of the Android terminal.
Example 3
Fig. 5 is a schematic block diagram of an echo cancellation device according to a third embodiment of the present invention. This embodiment is applicable to the problem of an excessive time difference between audio playing and audio acquisition during real-time audio/video communication. The device may be implemented in software and/or hardware and may generally be integrated in an electronic device. As shown in fig. 5, the device includes: a thread synchronization indication sending module 310, a thread synchronization indication receiving module 320, and an echo cancellation module 330. Wherein:
the thread synchronization indication sending module 310 is configured to, in a real-time communication process, send synchronization indication information to a playing thread when audio acquisition data is acquired through an acquisition thread, and send the audio acquisition data to an echo cancellation thread;
the thread synchronization indication receiving module 320 is configured to, after the synchronization indication information is received through the playing thread, acquire audio playing data for playing and send the audio playing data to the echo cancellation thread;
and the echo cancellation module 330 is configured to perform echo cancellation processing on the audio acquisition data according to the audio playing data through the echo cancellation thread.
In the embodiment of the invention, during real-time communication, when the acquisition thread acquires the audio acquisition data, it sends the synchronization indication information to the playing thread and sends the audio acquisition data to the echo cancellation thread. After receiving the synchronization indication information, the playing thread acquires the audio playing data for playing and sends the audio playing data to the echo cancellation thread. The echo cancellation thread then performs echo cancellation processing on the audio acquisition data according to the audio playing data. This ensures data synchronization between the acquisition thread and the playing thread, addresses the problem of an excessive time difference between audio playing and audio acquisition, and improves the accuracy of audio echo cancellation.
Optionally, the thread synchronization indication sending module 310 is specifically configured to, in a real-time communication process, send synchronization indication information to the playing thread each time audio acquisition data is acquired through the acquisition thread according to a preset period.
Further, the device further comprises: a synchronization start indication information sending module, configured to send synchronization start indication information to the playing thread when audio acquisition data is acquired through the acquisition thread, before the synchronization indication information is sent to the playing thread according to the preset period; after receiving the synchronization start indication information, the playing thread performs an initialization operation.
Optionally, the synchronization start indication information and the synchronization indication information are sent through a blocking queue.
Further, the device further comprises: the echo cancellation audio data sending module is used for sending the target audio data after echo cancellation processing to a network data sending thread through the echo cancellation thread after echo cancellation processing is carried out on the audio acquisition data according to the audio play data through the echo cancellation thread; and sending the target audio data through the network data sending thread.
Optionally, the real-time communication is real-time communication based on the Android system.
Optionally, the real-time communication includes a real-time audio communication or a real-time audio-video communication.
The echo cancellation device provided by the embodiment of the invention can execute the echo cancellation method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example 4
Fig. 6 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. As shown in fig. 6, the electronic device includes a processor 410, a memory 420, an input device 430 and an output device 440. The electronic device may have one or more processors 410; one processor 410 is taken as an example in fig. 6. The processor 410, memory 420, input device 430 and output device 440 in the electronic device may be connected by a bus or by other means; connection by a bus is taken as an example in fig. 6.
The memory 420, as a computer-readable storage medium, is used for storing software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the echo cancellation method in the embodiment of the present invention (for example, the thread synchronization indication sending module 310, the thread synchronization indication receiving module 320 and the echo cancellation module 330 in the echo cancellation device shown in fig. 5). By running the software programs, instructions and modules stored in the memory 420, the processor 410 executes the various functional applications and data processing of the electronic device, that is, implements the echo cancellation method described above.
The memory 420 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application program required for a function, and the data storage area may store data created according to the use of the electronic device, and the like. In addition, the memory 420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 420 may further include memory remotely located relative to the processor 410, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 430 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. The output device 440 may include a display device such as a display screen.
Example 5
A fifth embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a computer processor, is configured to perform an echo cancellation method, the method comprising:
in the real-time communication process, when acquiring audio acquisition data, an acquisition thread sends synchronization indication information to a playing thread and sends the audio acquisition data to an echo cancellation thread;
after receiving the synchronization indication information, the playing thread acquires audio playing data for playing and sends the audio playing data to the echo cancellation thread;
and carrying out echo cancellation processing on the audio acquisition data according to the audio playing data through the echo cancellation thread.
Of course, the computer readable storage medium storing the computer program provided by the embodiments of the present invention is not limited to the above method operations, and may also perform the related operations in the echo cancellation method provided by any embodiment of the present invention.
From the above description of the embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by software together with the necessary general-purpose hardware, and of course also by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as a floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk or an optical disk of a computer, and comprises several instructions for causing an electronic device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the method of the embodiments of the present invention.
It should be noted that, in the embodiment of the echo cancellation device, each unit and module included are only divided according to the functional logic, but not limited to the above division, so long as the corresponding function can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.