Audio processing in a processing system
FIELD OF THE INVENTION
The invention relates to audio processing in a processing system and in particular, but not exclusively, to audio processing in a general processing system, such as a Personal Computer (PC).
BACKGROUND OF THE INVENTION
Many processing systems are arranged to provide processing in a layered structure wherein lower layers provide support and functions that can be accessed by higher layer processes. Such a layered structure typically includes an operating system layer that sits between the hardware resources of the processing system and the application layer. Thus, the individual applications may access hardware functionality by calling or executing various operating system processes. The operating system may further provide control and management functionality, such as functionality for managing resources between different applications, sharing processes between different applications etc. For example, generic processing systems, such as Personal Computers (PCs), typically include operating systems that provide a variety of functions that can be called by the applications. Such operating systems include for example Windows, Linux etc.
In many processing systems, the operating system supports an audio system that allows audio to be input and output from the processing system. Typically, the audio hardware is supported by audio drivers that are part of the operating system. In many systems, the audio hardware may vary for different processing systems and even with time for the same processing system. Furthermore, the audio hardware may be provided by different manufacturers which even may be different for the audio input and the audio output path. For example, for a PC, a user may plug in audio input hardware (e.g. a microphone and associated input circuitry) from one manufacturer to one USB port and audio output hardware (e.g. headphones and associated output circuitry) from a different manufacturer to another USB port.
In order to provide a flexible system that can support many different types of audio hardware with different characteristics and from different manufacturers, processing systems often provide audio drivers that operate as part of the operating system. These audio drivers may be selected to match the specific hardware used and are often provided by the hardware manufacturer. The audio drivers are typically installed when the associated hardware is detected and may be provided and distributed independently of the main parts of the operating system. For example, when purchasing new audio hardware, a corresponding audio driver may be provided. This may then be installed as part of the operating system. Furthermore, the audio drivers are typically not permanently included in the operating system but are only initialized when the presence of the corresponding audio hardware is detected (e.g. when hardware is plugged into a USB port). Traditionally, the operating system provides functionality for audio input and audio output drivers to communicate with each other. This is used to provide various additional functionality.
For example, audio data fed from the audio output driver to the audio input driver can be used to provide acoustic echo cancellation. In a system where a microphone of the audio input path is situated such that it can pick up sound radiated from a loudspeaker of the audio output path, an acoustic feedback is generated between the loudspeaker and the microphone. For many applications, this acoustic feedback can result in disadvantageous effects including self oscillation or annoying echoes. An acoustic echo cancellation process will seek to attenuate the loudspeaker sound component in the input signal in order to mitigate these effects.
Such acoustic echo cancellation processes are widely used and are for example essential for many voice communication applications. For example, enabled by the widespread availability of broadband internet and the maturity of Voice over Internet Protocol (VoIP) applications, PCs are now widely used for Internet telephony. An important difference between a regular telephone and a PC is the large variety in loudspeaker and microphone hardware that may be connected to, or built into, the PC. This variety causes the electro-acoustical properties of a PC used for telephony to be rather unpredictable.
PCs are often used for hands-free telephony, in particular when combined with video. For example, a microphone built into a webcam captures the near end voice, while the far end speech is reproduced by a set of PC loudspeakers.
Despite having been researched and developed for many years, hands-free telephony is still technically challenging. High speech quality and true full-duplex behavior are desired, and this requires highly efficient acoustic echo cancellation to cancel or reduce the acoustic feedback from the speaker to the microphone. Also, enhancements such as directive input microphones or speech improvement processing have been widely used to improve quality.
However, many such algorithms require information of the audio output signal simultaneously radiated from the speaker. In many operating systems, such processing is performed by the audio input driver and in order to do this the operating system provides a direct connection from the audio output path to the audio input path. Indeed, many operating systems support an audio output driver feeding audio output data directly to the audio input driver thereby allowing the audio input driver to use this data when performing the processing of the input signal. Furthermore, if the audio input and output devices are from different manufacturers, the operating system traditionally allows the manufacturer of the audio input device to install an additional driver on the audio output for the purpose of establishing such connection.
However, recently operating systems have been introduced that do not provide or indeed allow any direct connection between the output and input drivers. In particular, some new operating systems do not allow the manufacturer of an audio input device to install a driver in the audio output path unless the manufacturer is also the provider of that audio output device. This prevents the input device manufacturer from installing an audio output driver to provide such a connection. Indeed, operating systems have been introduced which intentionally prevent data from being fed from any part of the audio output path to the audio input path. Such systems are designed to provide a fully transparent and predictable audio path without any third party processing. However, in such systems, any processing of input signals that requires information of the audio output is intended to be performed by the specific audio application itself. For example, acoustic echo cancelling for a VoIP hands-free communication application will need to be performed at the application level by the application itself. However, although this may be acceptable in some scenarios, it tends to be suboptimal in many other scenarios. For example, as the application must be used with many different types of hardware from many different manufacturers, the processing is relatively limited as it cannot (easily) be optimized for the specific hardware. Accordingly, a reduced quality audio experience typically results.
Hence, an improved processing system would be advantageous and in particular a system allowing increased flexibility, increased audio quality, improved and/or facilitated audio processing, increased adaptation to hardware characteristics, facilitated operation and/or improved performance would be advantageous.
SUMMARY OF THE INVENTION
Accordingly, the invention seeks to preferably mitigate, alleviate or eliminate one or more of the above mentioned disadvantages singly or in any combination.
According to an aspect of the invention there is provided a processing system comprising: an operating system providing an interface to hardware functions for applications of an application layer, the operating system comprising: an audio input driver for receiving audio input data from a hardware audio input and providing processed audio input data to the application layer, an audio loopback function for providing audio loopback data to the application layer, the audio loopback data being audio output data for outputting on a hardware output port; and an application layer link function for receiving the audio loopback data from the audio loopback function and for providing audio loopback data to the audio input driver, wherein the audio input driver is arranged to generate the processed audio input data by processing the audio input data in response to the audio loopback data.
The invention may provide improved and/or facilitated audio processing in many scenarios. The invention may for example provide improved audio quality and/or may allow processing to both take into account characteristics of the audio output as well as characteristics of the specific hardware being used. The invention may in many embodiments allow the audio input data to be processed in accordance with the specific characteristics of the audio input hardware, and may e.g. allow this processing to be controlled or designed by the hardware manufacturer. The invention may in many embodiments allow audio input processing in closer functional proximity to the hardware thereby facilitating and/or improving this processing.
Furthermore, the invention may allow such improved audio behavior to be provided for operating systems that do not allow any direct operating system level connection between an audio output processing (or path) and an audio input processing (or path). In particular, the improved audio processing can be achieved by an audio input driver operating in an operating system that does not comprise any functionality for feeding any audio output data to the audio input driver.
The invention may allow the manufacturer of audio input devices/ hardware to provide additional and/or value added processing and especially processing that is optimized for the specific hardware/ device.
In particular, the Inventors have realized that in contrast to conventional approaches wherein the operating system is used to support application level processes, improved audio processing can be achieved by introducing an application layer function that supports operating system level processes. Specifically, the Inventors have realized that in some processing systems and for some operating systems, improved audio processing is achieved by introducing an application layer function to provide communication of audio data between two operating system level processes, such as from an audio output driver to an audio input driver.
In accordance with an optional feature of the invention, the operating system does not support operating system level audio data communication from a hardware audio output path to a hardware audio input path comprising the audio input driver.
The invention may provide improved and/or facilitated audio processing for many operating systems that do not support operating system level audio data communication from a hardware audio output path to a hardware audio input path. The processing system may specifically provide no application independent way of feeding audio output data to audio input data.
In accordance with an optional feature of the invention, the operating system is arranged to limit active drivers of an audio output path to drivers that are associated with a provider/ manufacturer of an active audio output hardware device.
The operating system may specifically prevent output audio path drivers that are not linked to a provider/ manufacturer of the active audio output hardware device from being included in the audio output path. The operating system may specifically prevent output audio path drivers that are not linked to a provider/ manufacturer of the active audio output hardware device from being installed.
The invention may provide improved and/or facilitated audio processing for many operating systems that restrict audio output path drivers to drivers that are linked to a provider/manufacturer of the audio output hardware device currently coupled to the processing system.
In accordance with an optional feature of the invention, the operating system comprises an audio output driver for a hardware audio output, and the audio loopback data corresponds to output data of the audio output driver.
This may provide improved performance in many scenarios and may in particular allow improved audio processing by the audio input driver as it may be provided with audio output data that more closely corresponds to the signal received as a result of an acoustic feedback path from the audio output device to the audio input device.
In accordance with an optional feature of the invention, the processing system is further arranged to initialize the application layer link function upon start-up of the processing system.
This may facilitate operation in many embodiments. In particular, it may provide an efficient initialization of the functionality required to support specific hardware without requiring the attachment of this to instigate an application level setup. The initialization of the application layer link function may be unconditional. Thus, in some embodiments, the application layer link function may be initialized when the processing system starts up regardless of any other parameters, conditions or characteristics. In some embodiments, the system may be arranged to initialize the application layer link function upon start-up of the operating system.
In accordance with an optional feature of the invention, the audio input driver is initialized upon detection of an attachment of a hardware audio input unit associated with the audio input driver. This may facilitate operation in many embodiments and/or may reduce resource usage and/or avoid conflict between audio input drivers.
In accordance with an optional feature of the invention, the processing system further comprises a unit for detecting an attachment of a hardware audio input unit and for initializing the application layer link function and the audio input driver in response to the attachment.
This may facilitate operation in many embodiments and/or may reduce resource usage and/or avoid conflict between audio input drivers.
In accordance with an optional feature of the invention, the application layer link function is arranged to write the audio loopback data to a memory file, and the audio input driver is arranged to read the audio loopback data from the memory file.
This may provide a particularly efficient way of providing a link between an operating system output path and an operating system input path. In particular, it may allow an easy to implement means of communicating between the application layer link function and the audio input driver. The approach may further provide audio data buffering for the link and/or may reduce the need for implementing operating system function calls.
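The memory file link described above can be illustrated with a minimal sketch, assuming a single writer (the application layer link function) and a single reader (the audio input driver) sharing a ring buffer in a memory-mapped file. The class name `LoopbackFile`, the header layout and the 4096-byte capacity are hypothetical choices made for illustration only. A real audio input driver would not be written in Python and would use whatever shared memory mechanism the operating system offers to drivers.

```python
import mmap
import struct
import tempfile

HEADER = struct.Struct("<Q")  # assumed layout: total number of bytes ever written
CAPACITY = 4096               # assumed ring buffer size in bytes


class LoopbackFile:
    """Ring buffer in a memory-mapped file shared by one writer and one reader."""

    def __init__(self, fileobj):
        # Size the file to hold the header plus the ring buffer, then map it.
        fileobj.truncate(HEADER.size + CAPACITY)
        fileobj.flush()
        self.map = mmap.mmap(fileobj.fileno(), HEADER.size + CAPACITY)

    def written(self):
        """Total number of bytes written so far (read from the header)."""
        return HEADER.unpack_from(self.map, 0)[0]

    def write(self, data):
        """Link function side: append loopback bytes, overwriting the oldest."""
        pos = self.written()
        for i, byte in enumerate(data):
            self.map[HEADER.size + (pos + i) % CAPACITY] = byte
        # Publish the new write position only after the data is in place.
        HEADER.pack_into(self.map, 0, pos + len(data))

    def read_from(self, start):
        """Driver side: return (data, new_position) for bytes written since start."""
        end = self.written()
        start = max(start, end - CAPACITY)  # skip data that has been overwritten
        data = bytes(self.map[HEADER.size + p % CAPACITY] for p in range(start, end))
        return data, end
```

In this sketch the reader keeps its own position and passes it back on every call, so the file also acts as the buffer for the link. A reader that falls more than `CAPACITY` bytes behind simply loses the oldest data.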
In accordance with an optional feature of the invention, the application layer link function is arranged to synchronize the audio loopback data to the audio input data.
This may provide improved and/or facilitated audio processing in a processing system.
In accordance with an optional feature of the invention, the audio input driver is arranged to synchronize the audio loopback data and the audio input data.
This may provide improved and/or facilitated audio processing in a processing system.
In accordance with an optional feature of the invention, the audio input driver is arranged to perform an audio echo cancelling for the audio input data in response to the audio loopback data.
This may provide improved and/or facilitated audio processing in a processing system. In particular, the invention may allow improved audio echo canceling that can be adapted and/or optimized to the specific characteristics of the audio input hardware/ device. An improved audio echo cancellation can be provided by the system thereby allowing applications to achieve improved audio quality.
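Echo cancelling in response to the loopback data generally depends on the loopback data being time-aligned with the captured input, which is the purpose of the synchronization features described above. The following is a minimal sketch of one possible way to estimate the offset, assuming both signals are available as equal-rate lists of float samples and that the offset is a whole number of samples. The function names `estimate_delay` and `align` are hypothetical, and plain cross-correlation is used here only as an illustration, not as the method prescribed by the invention.

```python
def estimate_delay(loopback, captured, max_delay):
    """Return the lag (in samples) at which captured best matches loopback."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        # Correlate the loopback block with the capture shifted by `lag`.
        n = min(len(loopback), len(captured) - lag)
        score = sum(loopback[i] * captured[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(loopback, lag):
    """Delay the loopback block by `lag` samples so it lines up with the capture."""
    return [0.0] * lag + list(loopback)
```

The exhaustive search over lags is quadratic in the block length, so a practical implementation would likely restrict `max_delay` or use a faster correlation method.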
In accordance with an optional feature of the invention, the audio input driver is arranged to perform audio beam forming processing of the audio input based on the audio loopback data.
This may provide improved and/or facilitated audio processing in a processing system. In particular, the invention may allow improved audio beam forming, stereo processing and/or noise reduction that can be adapted and/or optimized to the specific characteristics of the audio input hardware/ device.
In accordance with an optional feature of the invention, operating system functions associated with a hardware audio input are isolated from functions associated with a hardware audio output.
The invention may provide improved and/or facilitated audio processing for many operating systems that do not support operating system level audio data communication from a hardware audio output path to a hardware audio input path. The processing system may specifically provide no application independent way of feeding audio output data to audio input data. The isolation may be an operating system isolation such that the operating system provides no means for any operating system function processing audio output data to provide any audio data to any operating system function processing audio input data.
According to an aspect of the invention there is provided a sound system for a processing system including an operating system providing an interface to hardware functions for applications of an application layer, the sound system comprising: an operating system audio input driver for receiving audio input data from a hardware audio input and providing processed audio input data to the application layer; an operating system audio loopback function for providing audio loopback data to the application layer, the audio loopback data being audio output data for outputting on a hardware output port; an application layer link function for receiving the audio loopback data from the audio loopback function and for providing audio loopback data to the audio input driver, wherein the audio input driver is arranged to generate the processed audio input data by processing the audio input data in response to the audio loopback data.
According to an aspect of the invention there is provided a method of audio processing in a processing system comprising an operating system providing an interface to hardware functions for applications of an application layer, the method comprising: an audio input driver of the operating system receiving audio input data from a hardware audio input and providing processed audio input data to the application layer; an audio loopback function of the operating system providing audio loopback data to the application layer, the audio loopback data being audio output data for outputting on a hardware output port; and an application layer link function receiving the audio loopback data from the audio loopback function and providing audio loopback data to the audio input driver, wherein the audio input driver generates the processed audio input data by processing the audio input data in response to the audio loopback data.
These and other aspects, features and advantages of the invention will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will be described, by way of example only, with reference to the drawings, in which
Fig. 1 illustrates an example of an architecture of a processing system in accordance with some embodiments of the invention; and
Fig. 2 illustrates an example of elements of a processing system in accordance with embodiments of the invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
The following description focuses on embodiments of the invention applicable to a Personal Computer (PC). However, it will be appreciated that the invention is not limited to this application but may be applied to many other processing systems.
General purpose processing systems designed to be used with a variety of applications have become very popular as evidenced by e.g. the ubiquity of PCs. Such systems generally apply a layered distribution of functionality as illustrated in Fig. 1. At the lowest level, the processing system provides a hardware layer 101 comprising the hardware of the system. This hardware layer includes (semi)permanent elements of the system, such as the processing element, memory, storage (e.g. hard disk), display interface etc. In addition, the hardware layer may comprise dynamically changing hardware configurations, such as hardware peripherals. For example, PCs typically comprise interfaces, such as Universal Serial Bus (USB) interfaces, to which a hardware peripheral can be dynamically coupled.
The system furthermore comprises an operating system 103 that provides an interface to an application layer 105. The operating system comprises a variety of control, interface and management operations that ensure the proper running of the system and which provide a range of functions which can be used by applications of the application layer 105. In particular, the operating system provides an interface to the hardware functions of the hardware layer 101 for the applications of the application layer 105. Thus, in the system, the various applications do not directly access any hardware functionality (whether permanent hardware or dynamically attached peripherals). Rather, the operating system 103 provides a range of functions that can be called by the applications to generate the required hardware effect. Specifically, an audio application can generate an audio output by calling an audio output function of the operating system with the audio data to be output. The operating system 103 then processes this data and interfaces with the appropriate hardware output device. Similarly, the audio application can receive an audio input by calling an audio input function of the operating system 103 which then returns audio input data from the appropriate hardware input device.
The application layer provides services for application programs/ processes to ensure that effective communication with other application programs/processes is possible. An application program/process may be any program designed to perform a specific function directly for a user or, in some cases, for another application program/ process. Examples of such application programs/processes include word processors, database programs, web browsers, drawing/paint/image editing programs, communication programs etc. In order to achieve this purpose, the application programs/processes make use of requests for services through a defined Application Program Interface (API) provided by the operating system.
Operating systems are generally designed to be flexible and to accommodate very different configurations of hardware including audio input and output hardware functionality that may be provided by different manufacturers and which may dynamically change, e.g. when audio peripherals are attached and removed. Thus, the operating system is designed to be flexible and to adapt to the specific current hardware configuration. This is (at least partially) achieved by the use of audio input and output drivers that are installed to support the specific hardware used. These drivers are typically provided by the manufacturer and may accordingly be optimized for the specific hardware device. It will be appreciated that the lowest level interface to at least parts of the permanent hardware is often provided by a Basic Input Output System (BIOS) and that this BIOS may be considered part of the operating system 103, part of the hardware layer 101 or as being the interface between the operating system 103 and the hardware layer 101.
Fig. 2 illustrates an example of elements of a processing system that uses the layered structure of Fig. 1. Fig. 2 specifically illustrates elements of the processing system associated with an audio system of the processing system.
Fig. 2 illustrates an audio capture device 201 which in the specific example includes one or more microphones as well as associated input circuitry. The audio capture device 201 is coupled to the processing system via a suitable hardware interface/port which in the specific example is a USB interface. The data from the audio capture device 201 is fed to an audio input driver 203. The audio input driver 203 is provided by the manufacturer of the audio capture device 201 and is optimized for the specific device 201. The audio input driver 203 is arranged to process the input data from the audio capture device 201 to generate an audio input data stream. This processing may for example include level adjustment and filtering but will in the system of Fig. 2 also include complex signal processing algorithms as will be described later.
The audio input driver 203 is coupled to an o/s audio input function 205 which represents elements of the operating system (outside the audio input driver 203) that process the audio input data stream. Specifically, the o/s audio input function 205 may select or combine (mix) the inputs from a plurality of audio input capture devices, adjust the audio input signal dependent on user settings (e.g. an overall sound level setting), format the audio input data into a data stream that is compatible with the operating system interface to the application layer 105 etc.
In the processing system, an audio application 207 is executed by the processing system. The application 207 runs at the application layer level and thus interfaces to the hardware of the processing system via the operating system 103. In the specific example, the audio application 207 is specifically a VoIP communication application, such as an Internet telephony application. The audio application 207 receives the audio input data from the o/s audio input function 205. Thus, specifically the VoIP communication application receives the microphone input data from the operating system 103 after this has been processed by the operating system 103 (whether by the o/s audio input function 205 or the audio input driver 203).
The VoIP communication application may specifically proceed to communicate an audio signal to an Internet telephony application at another processing system, such as another PC coupled to the Internet. Thus, specifically, the VoIP communication application may generate an audio output stream which is fed to the operating system 103 for communication over the Internet. Similarly, the audio application 207 may receive audio data to be output to a user of the processing system. Specifically, the VoIP communication application may receive a VoIP stream from the other processing system. The VoIP stream may be received from the Internet and provided to the VoIP communication application from the operating system 103.
In order to output the audio, the audio application 207 feeds output data to an audio output path of the operating system. Specifically, in the example of Fig. 2, the audio application 207 feeds the audio output data to an o/s audio output function 209. The o/s audio output function 209 may further receive output data streams from other audio applications 211 and mix these together. In addition, the o/s audio output function 209 may represent other operating system level processing of the audio data, including e.g. reformatting of the audio data, setting of an overall sound level etc. The o/s audio output function 209 is coupled to an audio output driver 213 which generates the audio output signal for a hardware audio output device 215. In the specific example the audio output device 215 may be a loudspeaker, such as a PC loudspeaker or a loudspeaker of a USB hands-free VoIP telephone.
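The mixing and overall level setting attributed to the o/s audio output function 209 can be pictured with a minimal sketch, assuming each application delivers an equal-length block of float samples in the range -1 to 1. The function name `mix_output`, the `master_gain` parameter and the hard clipping are illustrative assumptions and are not taken from any particular operating system.

```python
def mix_output(streams, master_gain=1.0):
    """Sum equal-length sample blocks, apply an overall gain, clip to [-1, 1]."""
    # zip(*streams) pairs up the samples that are to be played at the same instant.
    mixed = [sum(samples) * master_gain for samples in zip(*streams)]
    return [max(-1.0, min(1.0, s)) for s in mixed]
```

For instance, `mix_output([[0.5, -0.5], [0.25, -0.75]])` yields `[0.75, -1.0]`, the second sample being clipped.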
The audio output driver 213 is provided by the manufacturer of the audio output device 215 and the processing of the audio output data is optimized for the specific characteristics of the hardware audio output device 215.
In many scenarios it is desirable to perform processing of input audio dependent on the audio output. For example, for hands-free operation with the VoIP telephony application of the specific example, the audio input capture device 201 may pick up the sound radiated from the hardware audio output device 215. Such an acoustic feedback path may result in the audio from the remote PC, which is being output by the hardware audio output device 215, being picked up by the audio input capture device 201 and returned to the remote PC. This will result in the user of the remote PC hearing an echo of his or her own voice, which can be very disturbing and inconvenient. Accordingly, the processing system should employ audio echo canceling that attenuates the signal components of the audio input signal that originate from the audio output device 215.
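The kind of echo canceling referred to above can be illustrated with a minimal sketch, assuming the audio output data and the microphone data are available as time-aligned lists of float samples. The sketch uses a normalized least mean squares (NLMS) adaptive filter, which is one common textbook technique and is chosen here purely for illustration. The function name `nlms_echo_cancel` and the parameters `taps`, `mu` and `eps` are hypothetical.

```python
def nlms_echo_cancel(mic, loopback, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `loopback` from `mic`."""
    weights = [0.0] * taps   # current estimate of the acoustic echo path
    history = [0.0] * taps   # most recent loopback samples, newest first
    out = []
    for m, x in zip(mic, loopback):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = m - echo_estimate  # echo-reduced sample
        # Normalizing by the input power keeps the step size stable.
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        out.append(error)
    return out
```

The sketch does not detect near end speech, so a practical canceller would typically slow or freeze the adaptation while the local user is talking.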
Traditionally, such echo canceling has been performed by the operating system forming a direct connection between the audio output path and the audio input driver. Specifically, traditional operating systems have allowed for additional drivers to be implemented in the audio output path. Such drivers for example allow the manufacturer of an audio input device to not only provide an audio input driver but also to install a driver function in the audio output path even if the audio output device is provided by another manufacturer. This driver may then cooperate with the audio input driver to provide this with information of the audio data being output. Thus, in such systems the audio input device manufacturer may install operating system level drivers in the audio output path that cooperate with the audio input driver to provide processing that takes into account the actual audio output data. This may for example allow efficient acoustic echo cancellation. Such functionality is for example provided in the popular operating system known as Windows™ XP™ from Microsoft™ Corporation in which the output path drivers are known as audio class drivers.
However, recently operating systems have been designed that do not provide or even allow such functionality. In particular, operating systems have been designed that do not allow any drivers to be installed in the audio output path by any other entity than the provider/manufacturer of the hardware audio output device being used. As the manufacturer of the audio input device is often different than the manufacturer of the audio output device this effectively prevents the audio input driver from performing any processing based on the audio output data.
Indeed, such operating systems do not support any operating system level audio data communication from the hardware audio output path to the hardware audio input path. Rather, the operating system functions that are associated with the hardware audio input are isolated from the operating system functions that are associated with the hardware audio output. Thus, the operating system functions of the audio input path and the audio output path are not able to exchange any audio data with each other. Indeed, such operating systems are designed to meet the operating system manufacturer's desire to have a fully transparent and predictable audio path without any third party processing. Indeed, the only functionality of the input and output paths that is not provided by the designer of the operating system is the audio input and output drivers that are provided by the manufacturer of the respective audio devices. An example of such an operating system is Windows™ Vista™ from Microsoft™ Corporation.
For operating systems employing such separated input and output paths, the intention is that any audio processing must be performed at the application layer level. For example, the acoustic echo cancellation for a VoIP application is intended to be performed by the VoIP application itself. However, the inventors have realized that such an approach may be disadvantageous in many scenarios. Specifically, it makes it impractical for the audio processing to be adapted and optimized for the specific input hardware device since the application is intended to be used in many different processing systems and with different hardware.
Furthermore, the inventors have realized that such disadvantages can be mitigated or reduced by introducing an application layer link function to provide a link between operating system audio functions of the audio input and audio output paths. Specifically, the system of Fig. 2 includes an operating system audio loopback function which provides audio loopback data to the application layer 105. In the specific example, the o/s audio output function 209 is not only arranged to receive audio data and process it for delivery to the audio output driver but also makes an audio output data stream available to applications of the application layer. Thus, in the example the o/s audio output function 209 includes an audio loopback function which can be called by applications and which returns audio data that is being output from the processing system. Such functionality has been proposed for operating systems (such as Windows™ Vista™) in order to allow applications to perform more advanced processing that can take into account audio processing of the audio path.
It will be appreciated that in some embodiments, the audio loopback function may be functionally located after the output driver 213. Thus, the audio data from the output path (the audio loopback data) may be retrieved following the audio output driver 213 and thus very close to the hardware layer 101. This may provide a more accurate representation of the radiated audio as it may reflect the audio processing of most of the audio path including the impact of the audio output driver 213. Furthermore, in the example, the audio loopback function is a permanent and integral part of the operating system and is implemented by the operating system provider. Thus, the output path is maintained transparent and without any drivers that are not generic or provided by the specific manufacturer of the audio output device 215.
The inventors have realized that this audio loopback data and operating system loopback function may be used differently than intended. Indeed, the inventors have realized that the function can be utilized to provide an audio data link between operating system level functions, and specifically to provide a link between the operating system audio output path and the audio input path.
Specifically, the system of Fig. 2 comprises an application layer link function 217 which receives the audio loopback data from the audio loopback function and then forwards it to the audio input driver 203. The audio input driver 203 is arranged to process the audio input data in response to the audio loopback data. For example, acoustic echo cancellation may be performed by the audio input driver 203 using audio loopback data which is provided from the operating system output path via an application layer application. Thus, in the system, the restrictions and limitations imposed by the operating system design are overcome by using the application layer to provide a link between operating system functions. Thus, in contrast to the general layered structure wherein the operating system layer provides support for the application layer, the system of Fig. 2 includes application layer functionality that supports and enables operating system level functionality. In particular, operating system level audio functionality (in terms of the processing of the audio input driver 203) is based on application layer processes.
This approach may provide substantially improved audio performance. For example, acoustic echo cancellation can be performed by the audio input driver 203 instead of (or in addition to) the audio application 207. This may provide a substantially improved echo cancellation as this can be targeted to the specific characteristics of the actual hardware device 201. For example, the echo cancellation can be performed to take into account the presence of multiple microphones as well as the physical characteristics of these (including the number of and distance between microphones). In addition, the echo cancellation may take into account e.g. the frequency response of the microphones. It will be appreciated that many methods and algorithms for performing acoustic echo cancellation will be known to the person skilled in the art and that any suitable method may be used without detracting from the invention. For example, the audio input driver 203 may scale the received audio loopback data and synchronize it to the incoming data. The synchronized and scaled audio data may then be subtracted from the input signal. As another example, the audio input driver 203 may be arranged to perform acoustic beam forming. Such acoustic beam forming can be achieved by combining input signals from a plurality of microphones after the individual signals have been appropriately scaled and phase offset. It will be appreciated that many different acoustic beam forming algorithms will be known to the skilled person and that any suitable algorithm may be used. Such acoustic beam forming requires the use of a plurality of microphones, and the exact processing and beam forming is highly dependent not only on the number of microphones in the microphone array but also on the physical arrangement of these.
For example, the actual phase offset to apply to each individual microphone signal in order to generate a beam in a desired direction is a function of the distance between the microphones in the array. Accordingly, such beam forming is not possible at the application layer which does not have knowledge of the specific hardware device used. Furthermore, such acoustic beam forming may be very useful to attenuate any acoustic feedback from the audio output device 215. For example, the acoustic beam forming may be used to generate a notch in the beam pattern in the direction of the audio output device 215. Furthermore, the beam forming algorithm is typically an adaptive algorithm that automatically adjusts the weights for the individual microphone signals, and this adaptation can effectively be achieved using the audio loopback data received from application layer link function 217. As a simple example, the weight of the individual microphone signals may be adapted such that the correlation between the audio loopback data and the combined signal is minimized. This will result in the contribution from the sound of the audio output device 215 being minimized for the audio input data, typically corresponding to a beam forming notch in the direction of the audio output device 215.
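The two ideas above can be illustrated with a minimal sketch. It is not the implementation of the audio input driver 203; it assumes a far-field source, a uniform linear array, and, for the adaptation step, only two microphones with real scalar weights and no relative delay. The function names (`steering_delays`, `null_weight`, `combine`) are hypothetical. The first function shows how the per-microphone delay (and hence the phase offset, which is 2π times the frequency times the delay) depends on the microphone spacing. The second computes, in closed form rather than by iterative adaptation, the weight for which the combined signal has zero correlation with the audio loopback data.

```python
import math


def steering_delays(num_mics, spacing_m, angle_deg, speed_of_sound=343.0):
    """Per-microphone delays in seconds for a uniform linear array.

    Assumes a far-field source and equally spaced microphones. The angle
    is measured from broadside (perpendicular to the array axis).
    """
    step = spacing_m * math.sin(math.radians(angle_deg)) / speed_of_sound
    return [m * step for m in range(num_mics)]


def dot(a, b):
    """Inner product, used here as an unnormalized correlation."""
    return sum(x * y for x, y in zip(a, b))


def null_weight(mic1, mic2, loopback):
    """Weight w for which mic1 - w * mic2 is uncorrelated with loopback."""
    denom = dot(mic2, loopback)
    if denom == 0.0:
        # The second microphone carries no loopback component, so it
        # cannot be used to cancel the loopback in the first one.
        return 0.0
    return dot(mic1, loopback) / denom


def combine(mic1, mic2, w):
    """Two-microphone combination with the weight applied to mic2."""
    return [a - w * b for a, b in zip(mic1, mic2)]
```

In practice an adaptive algorithm would update the weights continuously and per frequency band; the closed form is used here only to make the minimized quantity explicit.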
Furthermore, the acoustic echo cancellation and the beam forming processes may in many embodiments be advantageously combined. Thus, the audio input driver 203 may perform both acoustic echo cancellation and beam forming. This may provide a very efficient attenuation of the contribution from the audio output device 215 in the audio input data. Furthermore, as the effects of the two processes will impact each other, the acoustic echo cancellation and the beam forming should be performed jointly. This is possible in the system of Fig. 2 but cannot be achieved at the hardware-generic application level. Thus, the system of Fig. 2 provides improved audio quality and in particular may provide an increased attenuation of signal components resulting from the acoustic feedback between the audio output device 215 and the audio input device 201. For example, for a VoIP telephony application a substantially reduced echo effect can be achieved. Furthermore, such improved performance is achieved despite the operating system not providing or allowing any connection between the audio output path and the input path.
Indeed, in such joint beam forming and echo canceling embodiments, the beam forming may not necessarily itself be based on the audio loopback data. Indeed, the beam forming can be used to reduce other interferences in the microphone signal, like background noise or competing speech (undesired speech). For example, a null can be placed in the direction of an undesired sound source. In such cases the beam forming itself is not or only partially based on the audio loopback data. However, even in such a case, it is important that the beam forming and the echo cancellation are combined into one algorithm. One reason for this is that the beam forming process may be time variant and/or nonlinear, which would have disadvantageous effects on the echo canceller unless this is integrated with the beam former.
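The simple echo cancellation approach mentioned earlier, in which the audio loopback data is scaled, synchronized to the incoming data and then subtracted from the input signal, can be sketched as follows. This is only an illustration under strong assumptions: a single microphone, a single echo path modeled as a pure delay plus gain, and a brute-force search over candidate delays. The function name `cancel_echo` and its parameters are hypothetical; practical echo cancellers typically use adaptive filters instead.

```python
def cancel_echo(mic, loopback, max_delay):
    """Remove a delayed, scaled copy of the loopback from the mic signal.

    Tries every delay from 0 to max_delay samples, fits the least-squares
    gain for that delay, and keeps the delay with the smallest residual
    energy. Returns (delay, gain, residual).
    """
    best = None
    for delay in range(max_delay + 1):
        # Synchronize: shift the loopback by `delay` samples, zero padded.
        ref = [0.0] * delay + list(loopback[:max(len(mic) - delay, 0)])
        ref += [0.0] * (len(mic) - len(ref))
        energy = sum(r * r for r in ref)
        if energy == 0.0:
            continue
        # Scale: least-squares gain of the shifted loopback in the mic.
        gain = sum(m * r for m, r in zip(mic, ref)) / energy
        # Subtract: what remains is the echo-reduced input.
        residual = [m - gain * r for m, r in zip(mic, ref)]
        error = sum(e * e for e in residual)
        if best is None or error < best[0]:
            best = (error, delay, gain, residual)
    if best is None:
        return 0, 0.0, list(mic)  # no usable loopback data
    return best[1], best[2], best[3]
```

The search range `max_delay` would in practice be chosen from the expected latency between the output path and the input path.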
It will be appreciated that in different embodiments different methods of communicating between the application layer link function 217 and the audio input driver 203 can be used. For example, in some embodiments the application layer link function 217 may regularly call the audio input driver 203 with the audio loopback data being included as parameters of the function call. This may be efficient in many embodiments and may in particular be useful for operating systems that do not allow functions of the operating system to proactively control functions of the application layer. However, in the specific embodiment of Fig. 2, the application layer link function 217 is arranged to write the audio loopback data to a memory file. This memory file may for example be a memory mapped file or buffer stored in a Random Access Memory, solid-state memory or may for example be memory of a device, such as a hard disk. In the system, the audio input driver 203 is arranged to read the audio loopback data from the memory file. Indeed, the memory file may be any memory means in which data can be stored and it will be appreciated that it may be formatted and managed in accordance with any suitable approach.
Thus, in the system, the communication between the application layer link function 217 and the audio input driver 203 is not achieved by the application layer link function 217 calling the audio input driver 203 (or vice versa) but rather a more indirect approach is used. Specifically, the application layer link function 217 may write to the memory file whenever an appropriate amount of audio loopback data has been received. Furthermore, the audio input driver 203 may read audio loopback data from the memory file whenever it requires further data. Thus, the approach allows for an asynchronous link between the application layer function and the operating system function. In particular, it may allow the writing and reading of audio loopback data to be asynchronous and may provide a buffering function for the audio loopback data. In addition, the approach may facilitate the (reverse direction) interaction between the application layer and the operating system layer. For example, for operating systems that do not allow an operating system driver to call or otherwise control an application layer process, the approach may allow the audio input driver 203 to retain control over the path from the application layer link function 217 to the audio input driver 203. E.g. the application layer link function 217 may be arranged to only write data to the memory file if the amount of audio loopback data that is currently stored therein does not exceed a given threshold. If the threshold is exceeded, the application layer link function 217 temporarily stores the audio loopback data (or it may potentially delete data after a given time). Thus, the approach allows the audio input driver 203 to read the audio loopback data as and when it is needed while allowing the application layer link function 217 to adjust accordingly. 
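The threshold behaviour described above can be sketched as follows. The sketch is a single-process stand-in: a `bytearray` plays the role of the memory file, and the names `ThresholdWriter`, `submit`, `flush` and `driver_read` are hypothetical. Blocks that cannot be written are held in a bounded queue, and the oldest block is discarded when the queue is full; this discards by count, which is an assumption standing in for the time-based deletion mentioned in the text.

```python
from collections import deque


class ThresholdWriter:
    """Writer side of the link: holds data back while the store is full."""

    def __init__(self, shared, threshold, max_pending=16):
        self.shared = shared        # stand-in for the memory file
        self.threshold = threshold  # maximum fill level before holding back
        # Bounded queue: the oldest pending block is dropped when full.
        self.pending = deque(maxlen=max_pending)

    def submit(self, block):
        """Queue a new block of loopback data and write what is allowed."""
        self.pending.append(block)
        self.flush()

    def flush(self):
        """Move pending blocks to the store while it is below threshold."""
        while self.pending and len(self.shared) <= self.threshold:
            self.shared.extend(self.pending.popleft())


def driver_read(shared, count):
    """Reader side: consume up to `count` bytes whenever data is needed."""
    data = bytes(shared[:count])
    del shared[:count]
    return data
```

The reader never calls the writer; the writer observes the fill level, which is how the reading side effectively paces the link.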
As another example, the memory file may implement a circular buffer into which the application layer link function 217 asynchronously writes the audio loopback data. The audio input driver 203 may then asynchronously and autonomously read audio loopback data as and when it is required.
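A circular buffer of this kind can be sketched over a memory mapping. The layout below (two 64-bit counters followed by the data region) and the class name `LoopbackRing` are assumptions made for illustration, and the example uses an anonymous mapping within one process. A real writer and reader in separate processes would map the same named file and would need properly synchronized counter updates, which are omitted here.

```python
import mmap
import struct

# Hypothetical header: total bytes written, total bytes read.
HEADER = struct.Struct("<QQ")


class LoopbackRing:
    """Byte-oriented circular buffer stored in a memory mapping."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.mem = mmap.mmap(-1, HEADER.size + capacity)  # anonymous map
        HEADER.pack_into(self.mem, 0, 0, 0)

    def fill(self):
        """Number of bytes written but not yet read."""
        written, read = HEADER.unpack_from(self.mem, 0)
        return written - read

    def write(self, data):
        """Write as much of `data` as fits; return the number of bytes."""
        written, read = HEADER.unpack_from(self.mem, 0)
        free = self.capacity - (written - read)
        chunk = data[:free]
        for i, value in enumerate(chunk):
            self.mem[HEADER.size + (written + i) % self.capacity] = value
        HEADER.pack_into(self.mem, 0, written + len(chunk), read)
        return len(chunk)

    def read(self, count):
        """Read up to `count` bytes, oldest first."""
        written, read = HEADER.unpack_from(self.mem, 0)
        count = min(count, written - read)
        out = bytes(
            self.mem[HEADER.size + (read + i) % self.capacity]
            for i in range(count)
        )
        HEADER.pack_into(self.mem, 0, written, read + count)
        return out
```

Because each side only advances its own counter, the writer and the reader can run at their own pace, which matches the asynchronous operation described above.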
In many operating systems, hardware drivers for e.g. peripherals are not running continuously but are only executed when the corresponding hardware device is available to the system. Indeed, many operating systems comprise functionality for detecting that a new hardware device has been connected to the system. When such an attachment is detected, the operating system proceeds to initialize and start the corresponding driver.
In some embodiments the operating system of the system of Fig. 2 comprises such functionality for detecting that the audio input device 201 has been attached to the system. In response to this detection, the operating system proceeds to initialize and start up the audio input driver 203 thereby allowing the audio input data to be received from the audio input device 201 and forwarded to the application layer 105.
In some embodiments, the system is also arranged to initialize the application layer link function 217 in response to such a detection. This start-up or initialization may be in parallel to and/or independent of the start-up of the audio input driver 203. For example, the operating system may be arranged to start the application layer link function 217 when the presence of the audio input device 201 is detected. However, in other embodiments, the audio input driver 203 may initialize the start-up of the application layer link function 217. Specifically, as part of the start-up process of the audio input driver 203, a function call may be made to a system process that starts the application layer link function 217.
Such an approach may be efficient in many embodiments as it ensures that the application layer link function 217 is only executed when necessary. In other embodiments, the audio input driver 203 may be initialized upon detection of an attachment of the audio input device 201 but without this causing an initialization of the application layer link function 217. Indeed, in some embodiments, the application layer link function 217 is arranged to be continuously executed when the processing system is operational. In such systems, the application layer link function 217 may be initialized upon start-up of the processing system and may continuously be running as a background application. Such an approach will ensure that the application layer link function 217 is always available to support the operating system functionality when needed. Furthermore, it may facilitate implementation in many systems and for many operating systems. Specifically, many operating systems may not allow operating system drivers to initialize application layer processes and may not provide any functionality for initialization of an application layer process in response to a detection of a hardware attachment. In such systems, the approach of continuously running the application layer link function 217 as a background process may be used. In some such embodiments, the application layer link function 217 may be running in a reduced mode when the audio input driver 203 is not active. For example, the application layer link function 217 may not perform any functions except for regularly polling whether the audio input driver 203 has been loaded. When it is detected that the audio input driver 203 is active, the application layer link function 217 may proceed to retrieve audio loopback data and write it to the memory file. This may reduce complexity and resource usage.
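The reduced mode described above can be sketched as a polling loop. All three callables (`driver_loaded`, `fetch_loopback`, `write_block`) are hypothetical placeholders for platform-specific operations that the text does not specify, and the `max_steps` and `sleep` parameters exist only so the loop can be exercised in isolation.

```python
import time


def run_link(driver_loaded, fetch_loopback, write_block,
             poll_interval=0.5, sleep=time.sleep, max_steps=None):
    """Background loop of the link function.

    While the input driver is not loaded, the loop only polls. Once the
    driver is detected, each step fetches a block of loopback data and
    writes it to the memory file.
    """
    steps = 0
    while max_steps is None or steps < max_steps:
        steps += 1
        if driver_loaded():
            write_block(fetch_loopback())
        else:
            # Reduced mode: do nothing except wait for the next poll.
            sleep(poll_interval)
```

The polling interval trades start-up latency of the link against background resource usage.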
In other embodiments, the application layer link function 217 may perform the same functions regardless of whether the audio input driver 203 is launched or not. For example, the application layer link function 217 may continuously retrieve audio loopback data and write it to the memory file. This will ensure that audio loopback data is readily available to the audio input driver 203 when this starts up. Furthermore, the resource required for such functionality is typically insignificant in many processing systems.
In some embodiments, the processing system may furthermore comprise functionality for synchronizing the audio loopback data and the audio input data. Typically such synchronization comprises a sample clock synchronization between the audio loopback data and the audio input data.
In general, the sample clocks for the audio output path and the audio input path are not identical. For example, the audio input and the audio output may use different sample rates. However, even if the same sample rate is used, there are typically sample clock phase and frequency variations between the sample clocks. This may for example result in a slight frequency variation between the sample clocks of the audio input path and the audio output path. Since the audio loopback data corresponds to audio output data, the sample clock deviation will also manifest itself in a sample clock deviation between the audio input data and the loopback audio data received by the audio input driver 203.
In some embodiments, the application layer link function 217 is arranged to synchronize the audio loopback data to the audio input data. For example, the application layer link function 217 may receive information about the audio input sample rate from the o/s audio input function 203 and may compare this to the audio loopback data. The sample rate information may consist of, for example, the time stamps associated with the individual audio packets and the number of audio samples contained in each packet. From such data, the clock drift can be calculated, i.e. the difference between the input sample rate and the loopback sample rate can be determined. The application layer link function 217 may then proceed to synchronize the audio loopback data to the sample clock of the audio input data. The synchronization may for example be achieved by a re-sampling of the audio loopback data and/or the addition of a dummy sample or the deletion of an audio sample at appropriate times.
In other embodiments, the audio input driver 203 may perform a synchronization of the audio loopback data and the audio input data. The synchronization may be achieved in the same way as described for the application layer link function 217 and/or may be achieved by performing such operations to the audio input data.
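The drift estimation and re-sampling described above can be sketched as follows. The sketch assumes that each packet's time stamp marks the first sample of that packet and that linear interpolation is an acceptable re-sampler; both are illustrative assumptions, and the function names `estimate_rate` and `resample_linear` are hypothetical.

```python
def estimate_rate(packets):
    """Estimate a sample rate in Hz from (timestamp_s, num_samples) pairs.

    Packets are given oldest first. The samples of all packets except the
    last span the time between the first and the last time stamp.
    """
    if len(packets) < 2:
        raise ValueError("need at least two packets")
    elapsed = packets[-1][0] - packets[0][0]
    if elapsed <= 0:
        raise ValueError("time stamps must increase")
    samples = sum(count for _, count in packets[:-1])
    return samples / elapsed


def resample_linear(samples, ratio):
    """Re-sample by `ratio` (target rate / source rate), linearly."""
    if not samples:
        return []
    last = len(samples) - 1
    out = []
    for k in range(int(last * ratio) + 1):
        pos = min(k / ratio, last)  # position in the source signal
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

With the two rates estimated, the loopback data would be passed through `resample_linear` with `ratio` set to the input rate divided by the loopback rate, so that it follows the sample clock of the audio input data. For very small drifts, occasionally inserting or deleting a single sample, as mentioned above, is a cheaper alternative.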
It will be appreciated that the above description for clarity has described embodiments of the invention with reference to different functional units and processors. However, it will be apparent that any suitable distribution of functionality between different functional units or processors may be used without detracting from the invention. For example, functionality illustrated to be performed by separate processors or controllers may be performed by the same processor or controllers. Hence, references to specific functional units are only to be seen as references to suitable means for providing the described functionality rather than indicative of a strict logical or physical structure or organization. The invention can be implemented in any suitable form including hardware, software, firmware or any combination of these. The invention may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units and processors.
Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims. Additionally, although a feature may appear to be described in connection with particular embodiments, one skilled in the art would recognize that various features of the described embodiments may be combined in accordance with the invention. In the claims, the term comprising does not exclude the presence of other elements or steps.
Furthermore, although individually listed, a plurality of means, elements or method steps may be implemented by e.g. a single unit or processor. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. Also, the inclusion of a feature in one category of claims does not imply a limitation to this category but rather indicates that the feature is equally applicable to other claim categories as appropriate. Furthermore, the order of features in the claims does not imply any specific order in which the features must be worked and in particular the order of individual steps in a method claim does not imply that the steps must be performed in this order. Rather, the steps may be performed in any suitable order. In addition, singular references do not exclude a plurality. Thus references to "a", "an", "first", "second" etc. do not preclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.