
Rate loop processor for perceptual encoder/decoder

Info

Publication number
USRE39080E1
Authority
US
United States
Prior art keywords
coefficients
signal
coding
noise
threshold
Legal status
Expired - Lifetime
Application number
US10/218,232
Inventor
James David Johnston
Current Assignee
Nokia of America Corp
Original Assignee
Lucent Technologies Inc
Legal events
First worldwide family litigation filed (Darts-ip global patent litigation dataset).
Application filed by Lucent Technologies Inc.
Priority to US10/218,232 (USRE39080E1).
Assigned to JPMorgan Chase Bank, as Collateral Agent (security agreement; assignor: Lucent Technologies Inc.).
Priority to US11/248,622 (USRE40280E1).
Application granted.
Publication of USRE39080E1.
Assigned to Lucent Technologies Inc. (termination and release of security interest in patent rights; assignor: JPMorgan Chase Bank, N.A., formerly known as The Chase Manhattan Bank, as Administrative Agent).
Assigned to Credit Suisse AG (security interest; assignor: Alcatel-Lucent USA Inc.).
Anticipated expiration.
Assigned to Alcatel-Lucent USA Inc. (release by secured party; assignor: Credit Suisse AG).
Status: Expired - Lifetime.


Abstract

A method and apparatus for quantizing audio signals is disclosed which advantageously produces a quantized audio signal which can be encoded within an acceptable range. Advantageously, the quantizer uses a scale factor which is interpolated between a threshold based on the calculated threshold of hearing at a given frequency and the absolute threshold of hearing at the same frequency.

Description

This is a reissue application of U.S. Pat. No. 5,627,938, filed Sep. 22, 1994 as application Ser. No. 08/310,898, which is a continuation of application Ser. No. 07/844,811, filed on Mar. 2, 1992, now abandoned, which is a continuation-in-part of application Ser. No. 07/844,967, filed Feb. 28, 1992, now abandoned, which is a continuation of Ser. No. 07/292,598, filed Dec. 30, 1988, now abandoned.
CROSS-REFERENCE TO RELATED APPLICATIONS AND MATERIALS
The following U.S. patent applications, filed concurrently with the present application and assigned to the assignee of the present application, are related to the present application and each is hereby incorporated herein as if set forth in its entirety: “A METHOD AND APPARATUS FOR THE PERCEPTUAL CODING OF AUDIO SIGNALS,” by A. Ferreira and J. D. Johnston, application Ser. No. 07/844,819, now abandoned, which in turn was parent of application Ser. No. 08/334,889, allowed Jul. 11, 1996; “A METHOD AND APPARATUS FOR CODING AUDIO SIGNALS BASED ON PERCEPTUAL MODEL,” by J. D. Johnston, application Ser. No. 07/844,804, now U.S. Pat. No. 5,285,498, issued Feb. 8, 1994; and “AN ENTROPY CODER,” by J. D. Johnston and J. A. Reeds, application Ser. No. 07/844,809, now U.S. Pat. No. 5,227,788, issued Jul. 13, 1993.
FIELD OF THE INVENTION
The present invention relates to processing of signals, and more particularly, to the efficient encoding and decoding of monophonic and stereophonic audio signals, including signals representative of voice and music for storage or transmission.
BACKGROUND OF THE INVENTION
Consumer, industrial, studio and laboratory products for storing, processing and communicating high quality audio signals are in great demand. For example, so-called compact disc (“CD”) and digital audio tape (“DAT”) recordings for music have largely replaced the long-popular phonograph record and cassette tape. Likewise, recently available DAT recordings promise to provide greater flexibility and high storage density for high quality audio signals. See, also, Tan and Vermeulen, “Digital audio tape for data storage”, IEEE Spectrum, pp. 34-38 (October 1989). A demand is also arising for broadcast applications of digital technology that offer CD-like quality.
While these emerging digital techniques are capable of producing high quality signals, such performance is often achieved only at the expense of considerable data storage capacity or transmission bandwidth. Accordingly, much work has been done in an attempt to compress high quality audio signals for storage and transmission.
Most of the prior work directed to compressing signals for transmission and storage has sought to reduce the redundancies that the source of the signals places on the signal. Thus, such techniques as ADPCM, sub-band coding and transform coding described, e.g., in N. S. Jayant and P. Noll, “Digital Coding of Waveforms,” Prentice-Hall, Inc. 1984, have sought to eliminate redundancies that otherwise would exist in the source signals.
In other approaches, the irrelevant information in source signals is sought to be eliminated using techniques based on models of the human perceptual system. Such techniques are described, e.g., in E. F. Schroeder and J. J. Platte, “‘MSC’: Stereo Audio Coding with CD-Quality and 256 kBIT/SEC,” IEEE Trans. on Consumer Electronics, Vol. CE-33, No. 4, November 1987; and Johnston, Transform Coding of Audio Signals Using Noise Criteria, Vol. 6, No. 2, IEEE J.S.C.A. (February 1988).
Perceptual coding, as described, e.g., in the Johnston paper, relates to a technique for lowering required bitrates (or reapportioning available bits) or total number of bits in representing audio signals. In this form of coding, a masking threshold for unwanted signals is identified as a function of frequency of the desired signal. Then, inter alia, the coarseness of quantizing used to represent a signal component of the desired signal is selected such that the quantizing noise introduced by the coding does not rise above the noise threshold, though it may be quite near this threshold. The introduced noise is therefore masked in the perception process. While traditional signal-to-noise ratios for such perceptually coded signals may be relatively low, the quality of these signals upon decoding, as perceived by a human listener, is nevertheless high.
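The quantization rule described above can be sketched as follows. This is a minimal illustration, not the patent's procedure: the names, the uniform-quantizer model, and the noise-power formula (noise power of a uniform quantizer with step size s is approximately s^2/12) are all assumptions introduced here.

```python
import math

def quantize_band(coeffs, masking_threshold_db):
    """Quantize one band so the quantization noise power stays at or
    below the band's masking threshold (illustrative sketch only).
    Uniform quantizer: noise power ~ step^2 / 12, so choose the largest
    step whose noise power does not exceed the threshold."""
    noise_power = 10.0 ** (masking_threshold_db / 10.0)
    step = math.sqrt(12.0 * noise_power)
    return [round(c / step) for c in coeffs], step

def dequantize_band(quantized, step):
    """Reconstruct coefficient values from quantizer indices."""
    return [v * step for v in quantized]
```

A coarser threshold (more masking available) yields a larger step and fewer distinct quantizer levels, which is exactly the bit-saving mechanism the paragraph describes.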
Brandenburg et al, U.S. Pat. No. 5,040,217, issued Aug. 13, 1991, describes a system for efficiently coding and decoding high quality audio signals using such perceptual considerations. In particular, using a measure of the “noise-like” or “tone-like” quality of the input signals, the embodiments described in the latter system provide a very efficient coding for monophonic audio signals.
It is, of course, important that the coding techniques used to compress audio signals do not themselves introduce offensive components or artifacts. This is especially important when coding stereophonic audio information where coded information corresponding to one stereo channel, when decoded for reproduction, can interfere or interact with coding information corresponding to the other stereo channel. Implementation choices for coding two stereo channels include so-called “dual-mono” coders using two independent coders operating at fixed bit rates. By contrast, “joint mono” coders use two monophonic coders but share one combined bit rate, i.e., the bit rate for the two coders is constrained to be less than or equal to a fixed rate, but trade-offs can be made between the bit rates for individual coders. “Joint stereo” coders are those that attempt to use interchannel properties for the stereo pair for realizing additional coding gain.
It has been found that the independent coding of the two channels of a stereo pair, especially at low bit-rates, can lead to a number of undesirable psychoacoustic artifacts. Among them are those related to the localization of coding noise that does not match the localization of the dynamically imaged signal. Thus the human stereophonic perception process appears to add constraints to the encoding process if such mismatched localization is to be avoided. This finding is consistent with reports on binaural masking-level differences that appear to exist, at least for low frequencies, such that noise may be isolated spatially. Such binaural masking-level differences are considered to unmask a noise component that would be masked in a monophonic system. See, for example, B. C. J. Moore, “An Introduction to the Psychology of Hearing, Second Edition,” especially chapter 5, Academic Press, Orlando, Fla., 1982.
One technique for reducing psychoacoustic artifacts in the stereophonic context employs the ISO-WG11-MPEG-Audio Psychoacoustic II [ISO] Model. In this model, a second limit of signal-to-noise ratio (“SNR”) is applied to signal-to-noise ratios inside the psychoacoustic model. However, such additional SNR constraints typically require the expenditure of additional channel capacity or (in storage applications) the use of additional storage capacity, at low frequencies, while also degrading the monophonic performance of the coding.
SUMMARY OF THE INVENTION
Limitations of the prior art are overcome and a technical advance is made in a method and apparatus for coding a stereo pair of high quality audio channels in accordance with aspects of the present invention. Interchannel redundancy and irrelevancy are exploited to achieve lower bit-rates while maintaining high quality reproduction after decoding. While particularly appropriate to stereophonic coding and decoding, the advantages of the present invention may also be realized in conventional dual monophonic stereo coders.
An illustrative embodiment of the present invention employs a filter bank architecture using a Modified Discrete Cosine Transform (MDCT). In order to code the full range of signals that may be presented to the system, the illustrative embodiment advantageously uses both L/R (Left and Right) and M/S (Sum/Difference) coding, switched in both frequency and time in a signal dependent fashion. A new stereophonic noise masking model advantageously detects and avoids binaural artifacts in the coded stereophonic signal. Interchannel redundancy is exploited to provide enhanced compression without degrading audio quality.
The time behavior of both Right and Left audio channels is advantageously accurately monitored and the results used to control the temporal resolution of the coding process. Thus, in one aspect, an illustrative embodiment of the present invention provides processing of input signals in terms of either a normal MDCT window or, when signal conditions indicate, shorter windows. Further, dynamic switching between RIGHT/LEFT or SUM/DIFFERENCE coding modes is provided both in time and frequency to control unwanted binaural noise localization, to prevent the need for overcoding of SUM/DIFFERENCE signals, and to maximize the global coding gain.
A typical bitstream definition and rate control loop are described which provide useful flexibility in forming the coder output. Interchannel irrelevancies are advantageously eliminated and stereophonic noise masking improved, thereby achieving improved reproduced audio quality in jointly coded stereophonic pairs. The rate control method used in an illustrative embodiment uses an interpolation between absolute threshold and masking threshold for signals below the rate-limit of the coder, and a threshold elevation strategy under rate-limited conditions.
In accordance with an overall coder/decoder system aspect of the present invention, it proves advantageous to employ an improved Huffman-like entropy coder/decoder to further reduce the channel bit rate requirements, or storage capacity for storage applications. The noiseless compression method illustratively used employs Huffman coding along with a frequency-partitioning scheme to efficiently code the frequency samples for L, R, M and S, as may be dictated by the perceptual threshold.
The present invention provides a mechanism for determining the scale factors to be used in quantizing the audio signal (i.e., the MDCT coefficients output from the analysis filter bank) by using an approach different from the prior art, while avoiding many of the restrictions and costs of prior quantizer/rate-loops. The audio signals quantized pursuant to the present invention introduce less noise and encode into fewer bits than the prior art.
These results are obtained in an illustrative embodiment of the present invention whereby the utilized scale factor is iteratively derived by interpolating between a scale factor derived from a calculated threshold of hearing at the frequency corresponding to the frequency of the respective spectral coefficient to be quantized and a scale factor derived from the absolute threshold of hearing at said frequency, until the quantized spectral coefficients can be encoded within permissible limits.
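The iterative interpolation just described can be sketched as follows. This is a minimal sketch under stated assumptions: `rate_loop`, the caller-supplied `bits_needed` estimator, and the fixed linear interpolation schedule are illustrative inventions of this example, not the patent's exact procedure.

```python
def rate_loop(coeffs, sf_calc, sf_abs, bit_budget, bits_needed, steps=32):
    """Interpolate a scale factor between sf_calc (derived from the
    calculated masking threshold) and sf_abs (derived from the absolute
    threshold of hearing), iterating until the quantized coefficients
    can be coded within bit_budget. bits_needed(quantized) is any
    estimator of the coded size. Illustrative sketch only."""
    alpha = 0.0
    for _ in range(steps):
        sf = sf_calc * (1.0 - alpha) + sf_abs * alpha
        q = [round(c / sf) for c in coeffs]
        if bits_needed(q) <= bit_budget:
            return q, sf          # fits: stop at the finest acceptable scale factor
        alpha = min(1.0, alpha + 1.0 / steps)  # move toward the coarser end
    return q, sf                  # budget never met: return the coarsest attempt
```

Moving alpha toward 1 coarsens the quantization (larger scale factor), trading noise for bits, which mirrors the interpolation-until-encodable behavior described above.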
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 presents an illustrative prior art audio communication/storage system of a type in which aspects of the present invention find application, and provides improvement and extension.
FIG. 2 presents an illustrative perceptual audio coder (PAC) in which the advances and teachings of the present invention find application, and provide improvement and extension.
FIG. 3 shows a representation of a useful masking level difference factor used in threshold calculations.
FIG. 4 presents an illustrative analysis filter bank according to an aspect of the present invention.
FIGS. 5(a) through 5(e) illustrate the operation of various window functions.
FIG. 6 is a flow chart illustrating window switching functionality.
FIG. 7 is a block/flow diagram illustrating the overall processing of input signals to derive the output bitstream.
FIG. 8 illustrates certain threshold variations.
FIG. 9 is a flow chart representation of certain bit allocation functionality.
FIG. 10 shows bitstream organization.
FIGS. 11a through 11c illustrate certain Huffman coding operations.
FIG. 12 shows operations at a decoder that are complementary to those for an encoder.
FIG. 13 is a flowchart illustrating certain quantization operations in accordance with an aspect of the present invention.
FIGS. 14(a) through 14(g) are illustrative windows for use with the filter bank of FIG. 4.
DETAILED DESCRIPTION
1. Overview
To simplify the present disclosure, the following patents, patent applications and publications are hereby incorporated by reference in the present disclosure as if fully set forth herein: U.S. Pat. No. 5,040,217, issued Aug. 13, 1991 by K. Brandenburg et al; U.S. patent application Ser. No. 07/292,598, entitled Perceptual Coding of Audio Signals, filed Dec. 30, 1988; J. D. Johnston, Transform Coding of Audio Signals Using Perceptual Noise Criteria, IEEE Journal on Selected Areas in Communications, Vol. 6, No. 2 (February 1988); International Patent Application (PCT) WO 88/01811, filed Mar. 10, 1988; U.S. patent application Ser. No. 07/491,373, entitled Hybrid Perceptual Coding, filed Mar. 9, 1990; Brandenburg et al, Aspec: Adaptive Spectral Entropy Coding of High Quality Music Signals, AES 90th Convention (1991); Johnston, J., Estimation of Perceptual Entropy Using Noise Masking Criteria, ICASSP (1988); J. D. Johnston, Perceptual Transform Coding of Wideband Stereo Signals, ICASSP (1989); E. F. Schroeder and J. J. Platte, “‘MSC’: Stereo Audio Coding with CD-Quality and 256 kBIT/SEC,” IEEE Trans. on Consumer Electronics, Vol. CE-33, No. 4, November 1987; and Johnston, Transform Coding of Audio Signals Using Noise Criteria, Vol. 6, No. 2, IEEE J.S.C.A. (February 1988).
For clarity of explanation, the illustrative embodiment of the present invention is presented as comprising individual functional blocks (including functional blocks labeled as “processors”). The functions these blocks represent may be provided through the use of either shared or dedicated hardware, including, but not limited to, hardware capable of executing software. (Use of the term “processor” should not be construed to refer exclusively to hardware capable of executing software.) Illustrative embodiments may comprise digital signal processor (DSP) hardware, such as the AT&T DSP16 or DSP32C, and software performing the operations discussed below. Very large scale integration (VLSI) hardware embodiments of the present invention, as well as hybrid DSP/VLSI embodiments, may also be provided.
FIG. 1 is an overall block diagram of a system useful for incorporating an illustrative embodiment of the present invention. At the level shown, the system of FIG. 1 illustrates systems known in the prior art, but modifications and extensions described herein will make clear the contributions of the present invention. In FIG. 1, an analog audio signal 101 is fed into a preprocessor 102 where it is sampled (typically at 48 KHz) and converted into a digital pulse code modulation (“PCM”) signal 103 (typically 16 bits) in standard fashion. The PCM signal 103 is fed into a perceptual audio coder 104 (“PAC”) which compresses the PCM signal and outputs the compressed PAC signal to a communications channel/storage medium 106. From the communications channel/storage medium the compressed PAC signal 105 is fed into a perceptual audio decoder 108 which decompresses the compressed PAC signal and outputs a PCM signal 107 which is representative of the compressed PAC signal 105. From the perceptual audio decoder, the PCM signal 107 is fed into a post-processor 110 which creates an analog representation of the PCM signal 107.
An illustrative embodiment of the perceptual audio coder 104 is shown in block diagram form in FIG. 2. As in the case of the system illustrated in FIG. 1, the system of FIG. 2, without more, may equally describe certain prior art systems, e.g., the system disclosed in the Brandenburg, et al U.S. Pat. No. 5,040,217. However, with the extensions and modifications described herein, important new results are obtained. The perceptual audio coder of FIG. 2 may advantageously be viewed as comprising an analysis filter bank 202, a perceptual model processor 204, a quantizer/rate-loop processor 206 and an entropy encoder 208.
The filter bank 202 in FIG. 2 advantageously transforms an input audio signal in time/frequency in such manner as to provide both some measure of signal processing gain (i.e. redundancy extraction) and a mapping of the filter bank inputs in a way that is meaningful in light of the human perceptual system. Advantageously, the well-known Modified Discrete Cosine Transform (MDCT) described, e.g., in J. P. Princen and A. B. Bradley, “Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation,” IEEE Trans. ASSP, Vol. 34, No. 5, October, 1986, may be adapted to perform such transforming of the input signals.
Features of the MDCT that make it useful in the present context include its critical sampling characteristic, i.e. for every n samples into the filter bank, n samples are obtained from the filter bank. Additionally, the MDCT typically provides half-overlap, i.e. the transform length is exactly twice the length of the number of samples, n, shifted into the filterbank. The half-overlap provides a good method of dealing with the control of noise injected independently into each filter tap as well as providing a good analysis window frequency response. In addition, in the absence of quantization, the MDCT provides exact reconstruction of the input samples, subject only to a delay of an integral number of samples.
One aspect in which the MDCT is advantageously modified for use in connection with a highly efficient stereophonic audio coder is the provision of the ability to switch the length of the analysis window for signal sections which have strongly non-stationary components in such a fashion that it retains the critically sampled and exact reconstruction properties. The incorporated U.S. patent application by Ferreira and Johnston, entitled “A METHOD AND APPARATUS FOR THE PERCEPTUAL CODING OF AUDIO SIGNALS” (referred to hereinafter as the “filter bank application”), filed of even date with this application, describes a filter bank appropriate for performing the functions of element 202 in FIG. 2.
The perceptual model processor 204 shown in FIG. 2 calculates an estimate of the perceptual importance, noise masking properties, or just noticeable noise floor of the various signal components in the analysis bank. Signals representative of these quantities are then provided to other system elements to provide improved control of the filtering operations and organizing of the data to be sent to the channel or storage medium. Rather than using the critical band by critical band analysis described in J. D. Johnston, “Transform Coding of Audio Signals Using Perceptual Noise Criteria,” IEEE J. on Selected Areas in Communications, February 1988, an illustrative embodiment of the present invention advantageously uses finer frequency resolution in the calculation of thresholds. Thus instead of using an overall tonality metric as in the last-cited Johnston paper, a tonality method based on that mentioned in K. Brandenburg and J. D. Johnston, “Second Generation Perceptual Audio Coding: The Hybrid Coder,” AES 89th Convention, 1990, provides a tonality estimate that varies over frequency, thus providing a better fit for complex signals.
The psychoacoustic analysis performed in the perceptual model processor 204 provides a noise threshold for the L (Left), R (Right), M (Sum) and S (Difference) channels, as may be appropriate, for both the normal MDCT window and the shorter windows. Use of the shorter windows is advantageously controlled entirely by the psychoacoustic model processor.
In operation, an illustrative embodiment of the perceptual model processor 204 evaluates thresholds for the left and right channels, denoted THRl and THRr. The two thresholds are then compared in each of the illustrative 35 coder frequency partitions (56 partitions in the case of an active window-switched block). In each partition where the two thresholds vary between left and right by less than some amount, typically 2 dB, the coder is switched into M/S mode. That is, the left signal for that band of frequencies is replaced by M=(L+R)/2, and the right signal is replaced by S=(L−R)/2. The actual amount of difference that triggers the last-mentioned substitution will vary with bitrate constraints and other system parameters.
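The per-partition switching decision described above can be sketched as follows. The function name and list-based representation are illustrative assumptions; only the decision rule (switch to M/S where the L and R thresholds differ by less than about 2 dB) and the M=(L+R)/2, S=(L−R)/2 substitutions come from the text.

```python
def ms_switch(left, right, thr_l_db, thr_r_db, max_diff_db=2.0):
    """Decide L/R vs M/S coding per frequency partition (sketch).
    left/right hold one representative value per partition, and
    thr_l_db/thr_r_db hold the corresponding thresholds in dB."""
    out = []
    for l, r, tl, tr in zip(left, right, thr_l_db, thr_r_db):
        if abs(tl - tr) < max_diff_db:
            # thresholds nearly equal: code the partition as M/S
            out.append(('MS', (l + r) / 2.0, (l - r) / 2.0))
        else:
            # thresholds differ: keep independent L/R coding
            out.append(('LR', l, r))
    return out
```

In a real coder the decision would be made per partition of MDCT coefficients (35, or 56 for window-switched blocks, per the text) rather than per scalar value.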
The same threshold calculation used for L and R thresholds is also used for M and S thresholds, with the threshold calculated on the actual M and S signals. First, the basic thresholds, denoted BTHRm and BTHRs, are calculated. Then, the following steps are used to calculate the stereo masking contribution of the M and S signals.
1. An additional factor is calculated for each of the M and S thresholds. This factor, called MLDm and MLDs, is calculated by multiplying the spread signal energy (as derived, e.g., in J. D. Johnston, “Transform Coding of Audio Signals Using Perceptual Noise Criteria,” IEEE J. on Selected Areas in Communications, February 1988; K. Brandenburg and J. D. Johnston, “Second Generation Perceptual Audio Coding: The Hybrid Coder,” AES 89th Convention, 1990; and Brandenburg, et al U.S. Pat. No. 5,040,217) by a masking level difference factor shown illustratively in FIG. 3. This calculates a second level of detectability of noise across frequency in the M and S channels, based on the masking level differences shown in various sources.
2. The actual threshold for M (THRm) is calculated as THRm=max(BTHRm, min(BTHRs, MLDs)), and the threshold for S is calculated as THRs=max(BTHRs, min(BTHRm, MLDm)).
In effect, the MLD signal substitutes for the BTHR signal in cases where there is a chance of stereo unmasking. It is not necessary to consider the issue of M and S threshold depression due to unequal L and R thresholds, because of the fact that L and R thresholds are known to be equal.
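The threshold formulas of step 2 translate directly into code. This sketch applies them per frequency partition; the list-based representation is an assumption of the example.

```python
def stereo_thresholds(bthr_m, bthr_s, mld_m, mld_s):
    """Step 2 of the stereo masking calculation, per partition:
    THRm = max(BTHRm, min(BTHRs, MLDs))
    THRs = max(BTHRs, min(BTHRm, MLDm))
    The min() substitutes the MLD-derived level for the basic threshold
    of the opposite channel when stereo unmasking is possible."""
    thr_m = [max(bm, min(bs, ms)) for bm, bs, ms in zip(bthr_m, bthr_s, mld_s)]
    thr_s = [max(bs, min(bm, mm)) for bm, bs, mm in zip(bthr_m, bthr_s, mld_m)]
    return thr_m, thr_s
```

Note the cross-coupling: the M threshold is limited by the S channel's basic threshold and MLD, and vice versa, which is how the model guards against binaurally unmasked noise.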
The quantizer/rate loop processor 206 used in the illustrative coder of FIG. 2 takes the outputs from the analysis bank and the perceptual model, and allocates bits and noise, and controls other system parameters, so as to meet the required bit rate for the given application. In some example coders this may consist of nothing more than quantization so that the just noticeable difference of the perceptual model is never exceeded, with no (explicit) attention to bit rate; in some coders this may be a complex set of iteration loops that adjusts distortion and bitrate in order to achieve a balance between bit rate and coding noise. Also desirably performed by the rate loop processor 206, and described in the rate loop application, is the function of receiving information from the quantized analyzed signal and any requisite side information, and inserting synchronization and framing information. Again, these same functions are broadly described in the incorporated Brandenburg, et al, U.S. Pat. No. 5,040,217.
Entropy encoder 208 is used to achieve a further noiseless compression in cooperation with the rate loop processor 206. In particular, entropy encoder 208, in accordance with another aspect of the present invention, advantageously receives inputs including a quantized audio signal output from quantizer/rate loop 206, performs a lossless encoding on the quantized audio signal, and outputs a compressed audio signal to the communications channel/storage medium 106.
Illustrative entropy encoder 208 advantageously comprises a novel variation of the minimum-redundancy Huffman coding technique to encode each quantized audio signal. The Huffman codes are described, e.g., in D. A. Huffman, “A Method for the Construction of Minimum Redundancy Codes”, Proc. IRE, 40: 1098-1101 (1952) and T. M. Cover and J. A. Thomas, Elements of Information Theory, pp. 92-101 (1991). The useful adaptations of the Huffman codes advantageously used in the context of the coder of FIG. 2 are described in more detail in the incorporated U.S. patent application by J. D. Johnston and J. Reeds (hereinafter the “entropy coder application”) filed of even date with the present application and assigned to the assignee of this application. Those skilled in the data communications arts will readily perceive how to implement alternative embodiments of entropy encoder 208 using other noiseless data compression techniques, including the well-known Lempel-Ziv compression methods.
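For reference, the classical minimum-redundancy Huffman construction cited above can be sketched in a few lines. This is textbook Huffman coding, not the patent's adapted variant or its frequency-partitioning scheme.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a minimum-redundancy (Huffman) prefix code from a sample
    of symbols. Classical construction; illustrative sketch only."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): '0'}
    # heap entries: [weight, tiebreaker, partial codebook]
    heap = [[w, i, {s: ''}] for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # two least-frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + b for s, b in c1.items()}
        merged.update({s: '1' + b for s, b in c2.items()})
        heapq.heappush(heap, [w1 + w2, count, merged])
        count += 1
    return heap[0][2]
```

More frequent quantizer outputs receive shorter codewords, which is the source of the "noiseless" (lossless) compression gain.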
The use of each of the elements shown in FIG. 2 will be described in greater detail in the context of the overall system functionality; details of operation will be provided for the perceptual model processor 204.
2.1. The Analysis Filter Bank
The analysis filter bank 202 of the perceptual audio coder 104 receives as input pulse code modulated (“PCM”) digital audio signals (typically 16-bit signals sampled at 48 KHz), and outputs a representation of the input signal which identifies the individual frequency components of the input signal. Specifically, an output of the analysis filter bank 202 comprises a Modified Discrete Cosine Transform (“MDCT”) of the input signal. See, J. Princen et al, “Sub-band Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation,” IEEE ICASSP, pp. 2161-2164 (1987).
An illustrative analysis filter bank 202 according to one aspect of the present invention is presented in FIG. 4. Analysis filter bank 202 comprises an input signal buffer 302, a window multiplier 304, a window memory 306, an FFT processor 308, an MDCT processor 310, a concatenator 311, a delay memory 312 and a data selector 314.
The analysis filter bank 202 operates on frames. A frame is conveniently chosen as the 2N PCM input audio signal samples held by input signal buffer 302. As stated above, each PCM input audio signal sample is represented by M bits. Illustratively, N=512 and M=16.
Input signal buffer 302 comprises two sections: a first section comprising N samples in buffer locations 1 to N, and a second section comprising N samples in buffer locations N+1 to 2N. Each frame to be coded by the perceptual audio coder 104 is defined by shifting N consecutive samples of the input audio signal into the input signal buffer 302. Older samples are located at higher buffer locations than newer samples.
Assuming that, at a given time, the input signal buffer 302 contains a frame of 2N audio signal samples, the succeeding frame is obtained by (1) shifting the N audio signal samples in buffer locations 1 to N into buffer locations N+1 to 2N, respectively (the previous audio signal samples in locations N+1 to 2N may be either overwritten or deleted), and (2) shifting into the input signal buffer 302, at buffer locations 1 to N, N new audio signal samples from preprocessor 102. Therefore, it can be seen that consecutive frames contain N samples in common: the first of the consecutive frames having the common samples in buffer locations 1 to N, and the second of the consecutive frames having the common samples in buffer locations N+1 to 2N. Analysis filter bank 202 is a critically sampled system (i.e., for every N audio signal samples received by the input signal buffer 302, the analysis filter bank 202 outputs a vector of N scalers to the quantizer/rate-loop 206).
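The buffer update just described (new samples enter locations 1 to N, the previous low half moves to locations N+1 to 2N, so consecutive frames overlap by N samples) can be sketched as:

```python
def next_frame(buffer, new_samples):
    """Produce the succeeding 2N-sample frame from the current one.
    Index 0 here corresponds to buffer location 1 in the text (newest
    samples in the low half, oldest in the high half). Sketch only."""
    n = len(new_samples)
    assert len(buffer) == 2 * n
    # old locations 1..N become locations N+1..2N; new samples fill 1..N
    return list(new_samples) + buffer[:n]
```

The 50% frame overlap this produces is exactly what the half-overlap MDCT described earlier requires.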
Each frame of the input audio signal is provided to the window multiplier 304 by the input signal buffer 302 so that the window multiplier 304 may apply seven distinct data windows to the frame.
Each data window is a vector of scalers called “coefficients”. While all seven of the data windows have 2N coefficients (i.e., the same number as there are audio signal samples in the frame), four of the seven only have N/2 non-zero coefficients (i.e., one-fourth the number of audio signal samples in the frame). As is discussed below, the data window coefficients may be advantageously chosen to reduce the perceptual entropy of the output of the MDCT processor 310.
The information for the data window coefficients is stored in the window memory 306. The window memory 306 may illustratively comprise a random access memory (“RAM”), read only memory (“ROM”), or other magnetic or optical media. Drawings of the seven illustrative data windows, as applied by window multiplier 304, are presented in FIG. 14, as are typical vectors of coefficients for each of the seven data windows. As may be seen in FIG. 14, some of the data window coefficients may be equal to zero.
Keeping in mind that the data window is a vector of 2N scalers and that the audio signal frame is also a vector of 2N scalers, the data window coefficients are applied to the audio signal frame scalers through point-to-point multiplication (i.e., the first audio signal frame scaler is multiplied by the first data window coefficient, the second audio signal frame scaler is multiplied by the second data window coefficient, etc.). Window multiplier 304 may therefore comprise seven microprocessors operating in parallel, each performing 2N multiplications in order to apply one of the seven data windows to the audio signal frame held by the input signal buffer 302. The output of the window multiplier 304 is seven vectors of 2N scalers, to be referred to as “windowed frame vectors”.
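The point-to-point multiplication can be sketched as follows. The sine window used here is a common MDCT analysis window chosen purely for illustration; the patent's seven actual windows are those drawn in FIG. 14.

```python
import math

def sine_window(two_n):
    """A common MDCT analysis window (sine window), used here only as a
    stand-in for one of the patent's seven data windows."""
    return [math.sin(math.pi * (i + 0.5) / two_n) for i in range(two_n)]

def apply_window(frame, window):
    """Point-to-point multiplication of frame samples by window
    coefficients, producing one windowed frame vector."""
    assert len(frame) == len(window)
    return [x * w for x, w in zip(frame, window)]
```

In the architecture described above, this multiplication would be performed seven times per frame, once per data window.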
The seven windowed frame vectors are provided by window multiplier 304 to FFT processor 308. The FFT processor 308 performs an odd-frequency FFT on each of the seven windowed frame vectors. The odd-frequency FFT is a Discrete Fourier Transform evaluated at the frequencies k·fN/(2N)
where k=1,3,5, . . . ,2N, and fN equals one half the sampling rate. The illustrative FFT processor 308 may comprise seven conventional decimation-in-time FFT processors operating in parallel, each operating on a different windowed frame vector. An output of the FFT processor 308 is seven vectors of 2N complex elements, to be referred to collectively as “FFT vectors”.
FFT processor 308 provides the seven FFT vectors to both the perceptual model processor 204 and the MDCT processor 310. The perceptual model processor 204 uses the FFT vectors to direct the operation of the data selector 314 and the quantizer/rate-loop processor 206. Details regarding the operation of data selector 314 and perceptual model processor 204 are presented below.
MDCT processor 310 performs an MDCT based on the real components of each of the seven FFT vectors received from FFT processor 308. MDCT processor 310 may comprise seven microprocessors operating in parallel. Each such microprocessor determines one of the seven “MDCT vectors” of N real scalars based on one of the seven respective FFT vectors. For each FFT vector, F(k), the resulting MDCT vector, X(k), is formed as follows:

X(k) = Re[F(k)]·cos[π(2k+1)(1+N)/(4N)],  1 ≤ k ≤ N.
The procedure need run k only to N, not 2N, because of redundancy in the result. To wit, for N<k≦2N:
X(k)=−X(2N−k).
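A sketch of this construction (assuming a 0-indexed list holding Re[F(k)] for k = 1 … N; names are illustrative):

```python
import math

def mdct_from_fft(f_real, N):
    """Form the N-point MDCT vector X(k) = Re[F(k)]*cos(pi*(2k+1)*(1+N)/(4N))
    for 1 <= k <= N; the values for N < k <= 2N are redundant, following
    from the antisymmetry X(k) = -X(2N - k)."""
    return [f_real[k - 1] * math.cos(math.pi * (2 * k + 1) * (1 + N) / (4 * N))
            for k in range(1, N + 1)]
```

Only N of the 2N output points are computed, which is what keeps the filter bank critically sampled.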
MDCT processor 310 provides the seven MDCT vectors to concatenator 311 and delay memory 312.
As discussed above with reference to window multiplier 304, four of the seven data windows have N/2 non-zero coefficients (see FIGS. 4c-f). This means that four of the windowed frame vectors contain only N/2 non-zero values. Therefore, the non-zero values of these four vectors may be concatenated into a single vector of length 2N by concatenator 311 upon output from MDCT processor 310. The resulting concatenation of these vectors is handled as a single vector for subsequent purposes. Thus, delay memory 312 is presented with four MDCT vectors, rather than seven.
Delay memory 312 receives the four MDCT vectors from MDCT processor 310 and concatenator 311 for the purpose of providing temporary storage. Delay memory 312 provides a delay of one audio signal frame (as defined by input signal buffer 302) on the flow of the four MDCT vectors through the filter bank 202. The delay is provided by (i) storing the two most recent consecutive sets of MDCT vectors representing consecutive audio signal frames and (ii) presenting as input to data selector 314 the older of the consecutive sets of vectors. Delay memory 312 may comprise random access memory (RAM) of size:
M×2×4×N
where 2 is the number of consecutive sets of vectors, 4 is the number of vectors in a set, N is the number of elements in an MDCT vector, and M is the number of bits used to represent an MDCT vector element.
Data selector 314 selects one of the four MDCT vectors provided by delay memory 312 to be output from the filter bank 202 to quantizer/rate-loop 206. As mentioned above, the perceptual model processor 204 directs the operation of data selector 314 based on the FFT vectors provided by the FFT processor 308. Due to the operation of delay memory 312, the seven FFT vectors provided to the perceptual model processor 204 and the four MDCT vectors concurrently provided to data selector 314 are not based on the same audio input frame, but rather on two consecutive input signal frames: the MDCT vectors are based on the earlier of the frames, and the FFT vectors on the later of the frames. Thus, the selection of a specific MDCT vector is based on information contained in the next successive audio signal frame. The criteria according to which the perceptual model processor 204 directs the selection of an MDCT vector are described in Section 2.2, below.
For purposes of an illustrative stereo embodiment, the above analysis filter bank 202 is provided for each of the left and right channels.
2.2. The Perceptual Model Processor
A perceptual coder achieves success in reducing the number of bits required to accurately represent high quality audio signals, in part, by introducing noise associated with quantization of information bearing signals, such as the MDCT information from the filter bank 202. The goal is, of course, to introduce this noise in an imperceptible or benign way. This noise shaping is primarily a frequency analysis instrument, so it is convenient to convert a signal into a spectral representation (e.g., the MDCT vectors provided by filter bank 202), to compute the shape and amount of the noise that will be masked by these signals, and to inject the noise by quantizing the spectral values. These and other basic operations are represented in the structure of the perceptual coder shown in FIG. 2.
The perceptual model processor 204 of the perceptual audio coder 104 illustratively receives its input from the analysis filter bank 202 which operates on successive frames. The perceptual model processor inputs then typically comprise seven Fast Fourier Transform (FFT) vectors from the analysis filter bank 202. These are the outputs of the FFT processor 308 in the form of seven vectors of 2N complex elements, each corresponding to one of the windowed frame vectors.
In order to mask the quantization noise by the signal, one must consider the spectral contents of the signal and the duration of a particular spectral pattern of the signal. These two aspects are related to masking in the frequency domain, where signal and noise are approximately steady state given the integration period of the hearing system, and also to masking in the time domain, where signal and noise are subjected to different cochlear filters. The shape and length of these filters are frequency dependent.
Masking in the frequency domain is described by the concept of simultaneous masking. Masking in the time domain is characterized by the concepts of premasking and postmasking. These concepts are extensively explained in the literature; see, for example, E. Zwicker and H. Fastl, “Psychoacoustics: Facts and Models,” Springer-Verlag, 1990. To make these concepts useful to perceptual coding, they are embodied in different ways.
Simultaneous masking is evaluated by using perceptual noise shaping models. Given the spectral contents of the signal and its description in terms of noise-like or tone-like behavior, these models produce a hypothetical masking threshold that rules the quantization level of each spectral component. This noise shaping represents the maximum amount of noise that may be introduced in the original signal without causing any perceptible difference. A measure called the PERCEPTUAL ENTROPY (PE) uses this hypothetical masking threshold to estimate the theoretical lower bound of the bitrate for transparent encoding. J. D. Johnston, “Estimation of Perceptual Entropy Using Noise Masking Criteria,” ICASSP, 1989.
Premasking characterizes the (in)audibility of a noise that starts some time before the masker signal which is louder than the noise. The noise amplitude must be more attenuated as the delay increases. This attenuation level is also frequency dependent. If the noise is the quantization noise attenuated by the first half of the synthesis window, experimental evidence indicates the maximum acceptable delay to be about 1 millisecond.
This problem is very sensitive and can conflict directly with achieving a good coding gain. Assuming stationary conditions (which is a false premise), the coding gain is bigger for larger transforms, but the quantization error spreads to the beginning of the reconstructed time segment. So, if a transform length of 1024 points is used, with a digital signal sampled at a rate of 48000 Hz, the noise will appear at most 21 milliseconds before the signal. This scenario is particularly critical when the signal takes the form of a sharp transient in the time domain, commonly known as an “attack”. In this case the quantization noise is audible before the attack. The effect is known as pre-echo.
Thus, a fixed length filter bank is not a good perceptual solution nor a good signal processing solution for non-stationary regions of the signal. It will be shown later that a possible way to circumvent this problem is to improve the temporal resolution of the coder by reducing the analysis/synthesis window length. This is implemented as a window switching mechanism when conditions of attack are detected. In this way, the coding gain achieved by using a long analysis/synthesis window will be affected only when such detection occurs, with a consequent need to switch to a shorter analysis/synthesis window.
Postmasking characterizes the (in)audibility of a noise when it remains after the cessation of a stronger masker signal. In this case the acceptable delays are on the order of 20 milliseconds. Given that the longest transformed time segment lasts 21 milliseconds (1024 samples), no special care is needed to handle this situation.
WINDOW SWITCHING
The PERCEPTUAL ENTROPY (PE) measure of a particular transform segment gives the theoretical lower bound of bits/sample to code that segment transparently. Due to its memory properties, which are related to premasking protection, this measure shows a significant increase of the PE value relative to its value for the previous segment when situations of strong non-stationarity of the signal (e.g., an attack) are present. This important property is used to activate the window switching mechanism in order to reduce pre-echo. This window switching mechanism is not a new strategy, having been used, e.g., in the ASPEC coder, described in the ISO/MPEG Audio Coding Report, 1990, but the decision technique behind it is new, using the PE information to accurately localize the non-stationarity and to define the right moment to operate the switch.
Two basic window lengths are used: 1024 samples and 256 samples. The former corresponds to a segment duration of about 21 milliseconds and the latter to a segment duration of about 5 milliseconds. Short windows are associated in sets of 4 to represent as much spectral data as a large window (but they represent a “different” number of temporal samples). In order to make the transition from large to short windows and vice-versa, it proves convenient to use two more types of windows. A START window makes the transition from large (regular) to short windows and a STOP window makes the opposite transition, as shown in FIG. 5b. See the above-cited Princen reference for useful information on this subject. Both windows are 1024 samples wide. They are useful to keep the system critically sampled and also to guarantee the time aliasing cancellation process in the transition region.
In order to exploit interchannel redundancy and irrelevancy, the same type of window is used for RIGHT and LEFT channels in each segment.
The stationarity behavior of the signal is monitored at two levels: first by large regular windows, then, if necessary, by short windows. Accordingly, the PE of the large (regular) window is calculated for every segment while the PEs of short windows are calculated only when needed. However, the tonality information for both types is updated for every segment in order to follow the continuous variation of the signal.
Unless stated otherwise, a segment involves 1024 samples which is the length of a large regular window.
The diagram of FIG. 5a represents all the monitoring possibilities when the segment from the point N/2 till the point 3N/2 is being analysed. Related to the diagram of FIG. 5 is the flowchart of FIG. 6, which describes the monitoring sequence and decision technique. We need to keep in buffer three halves of a segment in order to be able to insert a START window prior to a sequence of short windows when necessary. FIGS. 5a-e explicitly consider the 50% overlap between successive segments.
The process begins by analysing a “new” segment with 512 new temporal samples (the remaining 512 samples belong to the previous segment). As shown in FIG. 6, the PE of this new segment and the differential PE to the previous segment are calculated (601). If the latter value reaches a predefined threshold (602), then the existence of a non-stationarity inside the current segment is declared and details are obtained by processing four short windows with positions as represented in FIG. 5a. The PE value of each short window is calculated (603), resulting in the ordered sequence: PE1, PE2, PE3 and PE4. From these values, the exact beginning of the strong non-stationarity of the signal is deduced. Only five locations are possible, identified in FIG. 5a as L1, L2, L3, L4 and L5. As will become evident, if the non-stationarity had occurred somewhere from the point N/2 till the point 15N/16, that situation would have been detected in the previous segment. It follows that the PE1 value does not contain relevant information about the stationarity of the current segment. The average PE of the short windows is compared with the PE of the large window of the same segment (605). A smaller PE reveals a more efficient coding situation. Thus, if the former value is not smaller than the latter, then we assume that we are facing a degenerate situation and the window switching process is aborted.
It has been observed that for short windows the information about stationarity lies more in the PE value itself than in its differential to the PE value of the preceding window. Accordingly, the first window that has a PE value larger than a predefined threshold is detected. PE2 is identified with location L1, PE3 with L2 and PE4 with location L3. In any of these cases, a START window (608) is placed before the current segment, which will be coded with short windows. A STOP window is needed to complete the process (616). There are, however, two possibilities. If the identified location where the strong non-stationarity of the signal begins is L1 or L2, then it is well inside the short window sequence, no coding artifacts result and the coding sequence is depicted in FIG.5b. If the location is L3 (612), then, in the worst situation, the non-stationarity may begin very close to the right edge of the last short window. Previous results have consistently shown that placing a STOP window, in coding conditions, in these circumstances significantly degrades the reconstruction of the signal at this switching point. For this reason, another set of four short windows is placed before a STOP window (614). The resulting coding sequence is represented in FIG.5e.
If none of the short PEs is above the threshold, the remaining possibilities are L4 or L5. In this case, the problem lies beyond the reach of the short window sequence and the first segment in the buffer may be immediately coded using a regular large window.
To identify the correct location, another short window must be processed. It is represented in FIG. 5a by a dotted curve and its PE value, PE1n+1, is also computed. As is easily recognized, this short window already belongs to the next segment. If PE1n+1 is above the threshold (611), then the location is L4 and, as depicted in FIG. 5c, a START window (613) may be followed by a STOP window (615). In this case the spread of the quantization noise will be limited to the length of a short window, and a better coding gain is achieved. In the rare situation of the location being L5, the coding is done according to the sequence of FIG. 5d. The way to prove that this is the right solution in this case is by confirming that PE2n+1 will be above the threshold. PE2n+1 is the PE of the short window (not represented in FIG. 5) immediately following the window identified with PE1n+1.
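The decision flow described above may be sketched as follows (a simplified, illustrative rendering of the flowchart logic; function, parameter, and return-label names are assumptions, not from the specification):

```python
def window_decision(pe_short, pe_large, pe1_next_above, threshold):
    """Sketch of the short-window localization logic of FIG. 6.
    pe_short holds (PE2, PE3, PE4); PE1 carries no relevant information
    about the current segment. pe1_next_above indicates whether PE1 of the
    next segment exceeds the threshold."""
    # Degenerate case (605): short windows are no more efficient.
    if sum(pe_short) / len(pe_short) >= pe_large:
        return 'abort'
    # First short window whose PE exceeds the threshold locates the attack:
    # PE2 -> L1, PE3 -> L2, PE4 -> L3.
    for pe, location in zip(pe_short, ('L1', 'L2', 'L3')):
        if pe > threshold:
            # L1/L2: START, short windows, STOP (FIG. 5b);
            # L3: a second set of four short windows before the STOP (FIG. 5e).
            return 'FIG.5b' if location in ('L1', 'L2') else 'FIG.5e'
    # None above threshold: the non-stationarity is at L4 or L5 (611).
    return 'FIG.5c' if pe1_next_above else 'FIG.5d'
```

The returned labels simply name the coding sequences of FIGS. 5b-e chosen by the corresponding branches of the flowchart.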
As mentioned before, for each segment, RIGHT and LEFT channels use the same type of analysis/synthesis window. This means that a switch is done for both channels when at least one channel requires it.
It has been observed that for low bitrate applications the solution ofFIG. 5c, although representing a good local psychoacoustic solution, demands an unreasonably large number of bits that may adversely affect the coding quality of subsequent segments. For this reason, that coding solution may eventually be inhibited.
It is also evident that the details of the reconstructed signal when short windows are used are closer to the original signal than when only regular large windows are used. This is so because the attack is basically a wide bandwidth signal and may only be considered stationary for very short periods of time. Since short windows have a greater temporal resolution than large windows, they are able to follow and reproduce with more fidelity the varying pattern of the spectrum. In other words, this is the difference between a more precise local (in time) quantization of the signal and a global (in frequency) quantization of the signal.
The final masking threshold of the stereophonic coder is calculated using a combination of monophonic and stereophonic thresholds. While the monophonic threshold is computed independently for each channel, the stereophonic one considers both channels.
The independent masking threshold for the RIGHT or the LEFT channel is computed using a psychoacoustic model that includes an expression for tone masking noise and noise masking tone. The latter is used as a conservative approximation for a noise masking noise expression. The monophonic threshold is calculated using the same procedure as previous work. In particular, a tonality measure considers the evolution of the power and the phase of each frequency coefficient across the last three segments to identify the signal as being more tone-like or noise-like. Accordingly, each psychoacoustic expression is more or less weighted than the other. These expressions, found in the literature, were updated for better performance. They are defined as:

TMNdB = 19.5 + 18.0·bark/26.0

NMTdB = 6.56 − 3.06·bark/26.0
where bark is the frequency in Bark scale. The scale is related to what we may call the cochlear filters or critical bands which, in turn, are identified with constant length segments of the basilar membrane. The final threshold is adjusted to consider absolute thresholds of masking and also to consider a partial premasking protection.
A brief description of the complete monophonic threshold calculation follows. Some terminology must be introduced in order to simplify the description of the operations involved.
The spectrum of each segment is organized in three different ways, each one following a different purpose.
1. First, it may be organized in partitions. Each partition has a single Bark value associated with it. These partitions provide a resolution of approximately either one MDCT line or ⅓ of a critical band, whichever is wider. At low frequencies a single line of the MDCT will constitute a coder partition. At high frequencies, many lines will be combined into one coder partition. In this case the associated Bark value is the median Bark point of the partition. This partitioning of the spectrum is necessary to ensure an acceptable resolution for the spreading function. As will be shown later, this function represents the masking influence among neighboring critical bands.
2. Secondly, the spectrum may be organized in bands. Bands are defined by a parameter file. Each band groups a number of spectral lines that are associated with a single scale factor that results from the final masking threshold vector.
3. Finally, the spectrum may also be organized in sections. It will be shown later that sections involve an integer number of bands and represent a region of the spectrum coded with the same Huffman code book.
Three indices for data values are used. These are:
    • ω→indicates that the calculation is indexed by frequency in the MDCT line domain.
    • b→indicates that the calculation is indexed in the threshold calculation partition domain. In the case where we do a convolution or sum in that domain, bb will be used as the summation variable.
    • n→indicates that the calculation is indexed in the coder band domain.
Additionally some symbols are also used:
    • 1. The index of the calculation partition, b.
    • 2. The lowest frequency line in the partition, ωlowb.
    • 3. The highest frequency line in the partition, ωhighb.
    • 4. The median bark value of the partition, bvalb.
    • 5. The value for tone masking noise (in dB) for the partition, TMNb.
    • 6. The value for noise masking tone (in dB) for the partition, NMTb.
Several points in the following description refer to the “spreading function”. It is calculated by the following method:
tmpx=1.05(j−i),
Where i is the bark value of the signal being spread, j the bark value of the band being spread into, and tmpx is a temporary variable.
x = 8·minimum((tmpx−0.5)² − 2(tmpx−0.5), 0)
Where x is a temporary variable, and minimum(a,b) is a function returning the more negative of a or b.
tmpy = 15.811389 + 7.5(tmpx+0.474) − 17.5(1.0 + (tmpx+0.474)²)^0.5
where tmpy is another temporary variable.

if (tmpy < −100) then sprdngf(i,j) = 0; else sprdngf(i,j) = 10^((x+tmpy)/10).
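The method above may be transcribed directly (a sketch; the function and argument names are illustrative):

```python
import math

def sprdngf(i, j):
    """Spreading function: masking influence of the partition at bark value i
    on the partition at bark value j, per the piecewise expression above."""
    tmpx = 1.05 * (j - i)
    x = 8.0 * min((tmpx - 0.5) ** 2 - 2.0 * (tmpx - 0.5), 0.0)
    tmpy = (15.811389 + 7.5 * (tmpx + 0.474)
            - 17.5 * math.sqrt(1.0 + (tmpx + 0.474) ** 2))
    if tmpy < -100.0:
        return 0.0
    return 10.0 ** ((x + tmpy) / 10.0)
```

Note that the function is asymmetric: masking spreads more strongly upward in frequency (j > i) than downward, and partitions far apart contribute nothing.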
Steps in Threshold Calculation
The following steps are the steps necessary for calculating the SMRn used in the coder.
    • 1. Concatenate 512 new samples of the input signal to form another 1024-sample segment. Please refer to FIG. 5a.
    • 2. Calculate the complex spectrum of the input signal using the O-FFT as described in 2.0 and using a sine window.
    • 3. Calculate a predicted r and φ.
The polar representation of the transform is calculated. rω and φω represent the magnitude and phase components of a spectral line of the transformed segment.
A predicted magnitude, {circumflex over (r)}ω, and phase {circumflex over (φ)}ω, are calculated from the preceding two threshold calculation blocks' r and φ:
r̂ω = 2rω(t−1) − rω(t−2)

φ̂ω = 2φω(t−1) − φω(t−2)

where t represents the current block number, t−1 indexes the previous block's data, and t−2 indexes the data from the threshold calculation block before that.
    • 4. Calculate the unpredictability measure cω. The unpredictability measure is:

cω = [(rω cos φω − r̂ω cos φ̂ω)² + (rω sin φω − r̂ω sin φ̂ω)²]^0.5/(rω + abs(r̂ω))
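Steps 3 and 4 together may be sketched as follows (names are illustrative; `cmath.rect` forms r·e^{iφ}, so the numerator is the Euclidean distance between actual and predicted spectral lines in the complex plane):

```python
import cmath

def unpredictability(r_prev2, phi_prev2, r_prev1, phi_prev1, r_cur, phi_cur):
    """Unpredictability measure c_w: normalized distance between the actual
    spectral line and its linear prediction from the two previous blocks."""
    r_hat = 2.0 * r_prev1 - r_prev2          # predicted magnitude (step 3)
    phi_hat = 2.0 * phi_prev1 - phi_prev2    # predicted phase (step 3)
    actual = cmath.rect(r_cur, phi_cur)
    predicted = cmath.rect(r_hat, phi_hat)   # r_hat may be negative
    return abs(actual - predicted) / (r_cur + abs(r_hat))
```

By the triangle inequality the measure lies between 0 (perfectly predictable, tone-like) and 1 (completely unpredictable, noise-like).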
    • 5. Calculate the energy and unpredictability in the threshold calculation partitions.
The energy in each partition, eb, is: eb = Σ rω² (summed over ω from ωlowb to ωhighb)
and the weighted unpredictability, cb, is: cb = Σ rω²·cω (summed over ω from ωlowb to ωhighb)
    • 6. Convolve the partitioned energy and unpredictability with the spreading function.

ecbb = Σ ebb·sprdngf(bvalbb, bvalb) (summed over bb from 1 to bmax)

ctb = Σ cbb·sprdngf(bvalbb, bvalb) (summed over bb from 1 to bmax)
Because ctb is weighted by the signal energy, it must be renormalized to cbb: cbb = ctb/ecbb
At the same time, due to the non-normalized nature of the spreading function, ecbb should be renormalized and the normalized energy enb calculated: enb = ecbb·rnormb
The normalization coefficient, rnormb, is: rnormb = 1/[Σ sprdngf(bvalbb, bvalb) (summed over bb from 0 to bmax)]
    • 7. Convert cbb to tbb.
      tbb = −0.299 − 0.43 log_e(cbb)
    •  Each tbb is limited to the range of 0 ≦ tbb ≦ 1.
    • 8. Calculate the required SNR in each partition.

TMNb = 19.5 + 18.0·bvalb/26.0

NMTb = 6.56 − 3.06·bvalb/26.0
where TMNb is the tone masking noise in dB and NMTb is the noise masking tone value in dB.
The required signal to noise ratio, SNRb, is:
SNRb = tbb·TMNb + (1−tbb)·NMTb
    • 9. Calculate the power ratio.
The power ratio, bcb, is: bcb = 10^(−SNRb/10)
    • 10. Calculation of actual energy threshold, nbb.
      nbb = enb·bcb
    • 11. Spread the threshold energy over MDCT lines, yielding nbω: nbω = nbb/(ωhighb − ωlowb + 1)
    • 12. Include absolute thresholds, yielding the final energy threshold of audibility, thrω
      thrω = max(nbω, absthrω).
The dB values of absthr shown in the “Absolute Threshold Tables” are relative to the level that a sine wave of ±½ lsb has in the MDCT used for threshold calculation. The dB values must be converted into the energy domain after considering the MDCT normalization actually used.
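Steps 7 through 10 for a single partition can be sketched as follows (a minimal illustration under the formulas above, assuming the logarithm in step 7 is the natural log; function names are not from the specification):

```python
import math

def required_snr_db(tb, bval):
    """Step 8: interpolate between tone-masking-noise and noise-masking-tone
    according to the tonality index tb of the partition at bark value bval."""
    tmn = 19.5 + bval * 18.0 / 26.0   # tone masking noise, dB
    nmt = 6.56 - bval * 3.06 / 26.0   # noise masking tone, dB
    return tb * tmn + (1.0 - tb) * nmt

def partition_threshold(en, cb, bval):
    """Steps 7-10: convert the normalized, spread unpredictability cb into a
    tonality index, then into the actual energy threshold nb = en * bc."""
    tb = -0.299 - 0.43 * math.log(cb)   # step 7 (natural log assumed)
    tb = min(max(tb, 0.0), 1.0)         # limit to 0 <= tb <= 1
    snr = required_snr_db(tb, bval)     # step 8: required SNR, dB
    bc = 10.0 ** (-snr / 10.0)          # step 9: power ratio
    return en * bc                      # step 10
```

A small cb (predictable, tone-like partition) yields a larger required SNR and hence a lower masking threshold than a noise-like partition with the same energy.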
    • 13. Pre-echo control
    • 14. Calculate the signal to mask ratios, SMRn.
The table of “Bands of the Coder” shows
    • 1. The index, n, of the band.
    • 2. The upper index, ωhighn, of the band n. The lower index, ωlown, is computed from the previous band as ωhighn−1 + 1.
To further classify each band, another variable is created. The width index, widthn, will assume a value widthn=1 if n is a perceptually narrow band, and widthn=0 if n is a perceptually wide band. The former case occurs if

bvalωhighn − bvalωlown < bandlength

where bandlength is a parameter set in the initialization routine. Otherwise the latter case is assumed.
Then, if (widthn=1), the noise level in the coder band, nbandn, is calculated as:

nbandn = [Σ thrω (summed over ω from ωlown to ωhighn)]/(ωhighn − ωlown + 1),
else,
nbandn=minimum(thrωlown, . . . ,thrωhighn)
Where, in this case, minimum(a, . . . ,z) is a function returning the most negative or smallest positive argument of the arguments a . . . z.
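The per-band noise level computation may be sketched as follows (list-based; names are illustrative, with thr and bval indexed by MDCT line):

```python
def noise_level_in_band(thr, lo, hi, bval, bandlength):
    """Noise level nband_n for a coder band covering MDCT lines lo..hi
    (inclusive): narrow bands (width index 1) average the threshold, wide
    bands take the minimum, which is the conservative choice."""
    lines = thr[lo:hi + 1]
    narrow = (bval[hi] - bval[lo]) < bandlength   # width_n = 1
    if narrow:
        return sum(lines) / len(lines)
    return min(lines)
```

Averaging is safe only when the band is perceptually narrow; over a wide band the minimum guards against the threshold dipping below the average within the band.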
The ratios to be sent to the decoder, SMRn, are calculated as:

SMRn = 10·log10([12.0·nbandn]^0.5/minimum(absthr))
It is important to emphasize that since the tonality measure is the output of a spectrum analysis process, the analysis window has a sine form for all the cases of large or short segments. In particular, when a segment is chosen to be coded as a START or STOP window, its tonality information is obtained considering a sine window; the remaining operations, e.g. the threshold calculation and the quantization of the coefficients, consider the spectrum obtained with the appropriate window.
STEREOPHONIC THRESHOLD
The stereophonic threshold has several goals. It is known that most of the time the two channels sound “alike”. Thus, some correlation exists that may be converted into coding gain. Looking into the temporal representation of the two channels, this correlation is not obvious. However, the spectral representation has a number of interesting features that may advantageously be exploited. In fact, a very practical and useful possibility is to create a new basis to represent the two channels. This basis involves two orthogonal vectors, the vector SUM and the vector DIFFERENCE, defined by the following linear combination:

[SUM]       [1  1] [RIGHT]
[DIF] = 1/2·[1 −1]·[LEFT]
These vectors, which have the length of the window being used, are generated in the frequency domain since the transform process is by definition a linear operation. This has the advantage of simplifying the computational load.
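Per spectral line, the change of basis amounts to the following (a sketch; the 1/2 scaling follows the matrix above, and the list representation is illustrative):

```python
def sum_difference(right, left):
    """Form the SUM and DIFFERENCE channels from spectral coefficients of
    the RIGHT and LEFT channels; applied in the frequency domain, where the
    linearity of the transform makes the two orderings equivalent."""
    s = [(r + l) / 2.0 for r, l in zip(right, left)]
    d = [(r - l) / 2.0 for r, l in zip(right, left)]
    return s, d
```

When the two channels are highly correlated the DIFFERENCE coefficients are near zero, concentrating the energy in SUM, which is the source of the coding gain.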
The first goal is to have a more decorrelated representation of the two signals. The concentration of most of the energy in one of these new channels is a consequence of the redundancy that exists between the RIGHT and LEFT channels and, on average, always leads to a coding gain.
A second goal is to correlate the quantization noise of the RIGHT and LEFT channels and control the localization of the noise, or the unmasking effect. This problem arises if the RIGHT and LEFT channels are quantized and coded independently. The concept is exemplified by the following context: suppose that the masking threshold for a particular signal has been calculated. Two situations may then be created. First, we add to the signal an amount of noise that corresponds to the threshold. If we present this same signal with this same noise to the two ears, then the noise is masked. However, if we add an amount of noise that corresponds to the threshold to the signal and present this combination to one ear, and do the same operation for the other ear but with noise uncorrelated with the previous one, then the noise is not masked. In order to achieve masking again, the noise at both ears must be reduced by a level given by the masking level differences (MLD).
The unmasking problem may be generalized to the following form: the quantization noise is not masked if it does not follow the localization of the masking signal. Hence, in particular, we may have two limit cases: center localization of the signal with unmasking more noticeable on the sides of the listener and side localization of the signal with unmasking more noticeable on the center line.
The new vectors SUM and DIFFERENCE are very convenient because they express the signal localized on the center and also on both sides of the listener. Also, they enable control of the quantization noise with center and side image. Thus, the unmasking problem is solved by controlling the protection level for the MLD through these vectors. Based on some psychoacoustic information and other experiments and results, the MLD protection is particularly critical from very low frequencies up to about 3 kHz. It appears to depend only on the signal power and not on its tonality properties. The following expression for the MLD proved to give good results:

MLDdB(i) = 25.5·[cos(π·b(i)/32.0)]²
where i is the partition index of the spectrum (see [7]), and b(i) is the bark frequency of the center of the partition i. This expression is only valid for b(i) ≦ 16.0, i.e., for frequencies below 3 kHz. The expression for the MLD threshold is given by:

THRMLD(i) = C(i)·10^(−MLDdB(i)/10)
C(i) is the spread signal energy on the basilar membrane, corresponding only to the partition i.
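The two MLD expressions combine as follows (a sketch assuming C(i), here spread_energy, is given; names are illustrative):

```python
import math

def mld_threshold(spread_energy, bark):
    """MLD protection threshold for a partition centered at the given bark
    frequency; the MLD expression is valid only for bark <= 16.0 (~3 kHz)."""
    mld_db = 25.5 * math.cos(math.pi * bark / 32.0) ** 2
    return spread_energy * 10.0 ** (-mld_db / 10.0)
```

The protection is strongest at the lowest frequencies (the threshold is pushed about 25.5 dB below the spread energy at bark 0) and fades to nothing by bark 16.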
A third and last goal is to take advantage of a particular stereophonic signal image to extract irrelevance from directions of the signal that are masked by that image. In principle, this is done only when the stereo image is strongly defined in one direction, in order not to compromise the richness of the stereo signal. Based on the vectors SUM and DIFFERENCE, this goal is implemented by applying the following two dual principles:
    • 1. If there is a strong depression of the signal (and hence of the noise) on both sides of the listener, then an increase of the noise on the middle line (center image) is perceptually tolerated. The upper bound is the side noise.
    • 2. If there is a strong localization of the signal (and hence of the noise) on the middle line, then an increase of the (correlated) noise on both sides is perceptually tolerated. The upper bound is the center noise.
However, any increase of the noise level must be corrected by the MLD threshold.
According to these goals, the final stereophonic threshold is computed as follows. First, the thresholds for channels SUM and DIFFERENCE are calculated using the monophonic models for noise-masking-tone and tone-masking-noise. The procedure is exactly the one presented in pages 25 and 26. At this point we have the actual energy threshold per band, nbb, for both channels. For convenience, we call them THRnSUM and THRnDIF, respectively, for the channel SUM and the channel DIFFERENCE.
Secondly, the MLD thresholds for both channels, i.e., THRnMLD,SUM and THRnMLD,DIF, are also calculated by:

THRnMLD,SUM = enb,SUM·10^(−MLDndB/10)

THRnMLD,DIF = enb,DIF·10^(−MLDndB/10)
The MLD protection and the stereo irrelevance are considered by computing:
nthrSUM=MAX[THRnSUM, MIN(THRnDIF, THRnMLD,DIF)]
nthrDIF=MAX[THRnDIF, MIN(THRnSUM, THRnMLD,SUM)]
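The two expressions above, applied per band, may be sketched as (a direct transcription; names are assumptions):

```python
def stereo_thresholds(thr_sum, thr_dif, thr_mld_sum, thr_mld_dif):
    """Combine, for one band, the monophonic thresholds of the SUM and
    DIFFERENCE channels with their MLD thresholds: the MIN term applies the
    MLD protection, and the MAX term exploits stereo irrelevance."""
    nthr_sum = max(thr_sum, min(thr_dif, thr_mld_dif))
    nthr_dif = max(thr_dif, min(thr_sum, thr_mld_sum))
    return nthr_sum, nthr_dif
```

Each channel's threshold may thus be raised up to the (MLD-protected) threshold of the other channel, but never lowered below its own monophonic threshold.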
After these operations, the remaining steps after the 11th, as presented in 3.2, are also taken for both channels. In essence, these last thresholds are further adjusted to consider the absolute threshold and also a partial premasking protection. It must be noticed that this premasking protection was simply adopted from the monophonic case. It considers a monaural time resolution of about 2 milliseconds. However, the binaural time resolution is as accurate as 6 microseconds! Conveniently coding stereo signals with a relevant stereo image based on interchannel time differences is a subject that needs further investigation.
STEREOPHONIC CODER
The simplified structure of the stereophonic coder allows for the encoding of the stereo signals, which are subsequently decoded by the stereophonic decoder presented in FIG. 12. For each segment of data being analysed, detailed information about the independent and relative behavior of both signal channels may be available through the information given by large and short transforms. This information is used according to the number of steps needed to code a particular segment. These steps involve essentially the selection of the analysis window, the definition on a band basis of the coding mode (R/L or S/D), the quantization (704) and Huffman coding (705) of the coefficients (708) and scale factors (707) and, finally, the bitstream composing (706) with a bit stream organization as depicted in FIG. 10.
Coding Mode Selection
When a new segment is read, the tonality updating for large and short analysis windows is done. Monophonic thresholds and the PE values are calculated according to the technique described previously. This gives the first decision about the type of window to be used for both channels.
Once the window sequence is chosen, an orthogonal coding decision is considered: the choice between independent coding of the channels, mode RIGHT/LEFT (R/L), or joint coding using the SUM and DIFFERENCE channels (S/D). This decision is taken on a per-band basis, based on the assumption that binaural perception is a function of the output of the same critical bands at the two ears. If the thresholds at the two channels are very different, then there is no need for MLD protection and the signals will not be more decorrelated if the SUM and DIFFERENCE channels are considered. If the signals are such that they generate a stereo image, then MLD protection must be activated and additional gains may be exploited by choosing the S/D coding mode. A convenient way to detect this latter situation is to compare the monophonic thresholds of the RIGHT and LEFT channels. If the thresholds in a particular band do not differ by more than a predefined value, e.g. 2 dB, then the S/D coding mode is chosen; otherwise the independent mode R/L is assumed. Associated with each band is a one-bit flag that specifies the coding mode of that band and that must be transmitted to the decoder as side information. From now on it is called a coding mode flag.
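The per-band decision can be sketched as follows. This is an illustrative sketch rather than the patent's implementation: thresholds are assumed to be linear energies, and the function name and the 2 dB default margin mirror the example value in the text:

```python
import math

def coding_mode_flags(thr_right, thr_left, margin_db=2.0):
    """One coding mode flag per band: True -> S/D joint coding, False -> R/L.

    thr_right/thr_left: monophonic masking thresholds per band, as linear
    energies.  Bands whose thresholds differ by no more than margin_db
    (the text's illustrative 2 dB) are coded in S/D mode.
    """
    return [abs(10.0 * math.log10(tr / tl)) <= margin_db
            for tr, tl in zip(thr_right, thr_left)]
```

Each boolean corresponds to the one-bit coding mode flag that is sent to the decoder as side information.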
The coding mode decision is adaptive in time since for the same band it may differ for subsequent segments, and is also adaptive in frequency since for the same segment, the coding mode for subsequent bands may be different. An illustration of a coding decision is given in FIG.13. This illustration is valid for long and also short segments.
At this point it is clear that, since the window switching mechanism involves only monophonic measures, the maximum number of PE measures per segment is 10 (2 channels × [1 large window + 4 short windows]). However, the maximum number of thresholds that we may need to compute per segment is 20, and therefore 20 tonality measures must always be updated per segment (4 channels × [1 large window + 4 short windows]).
Bitrate Adjustment
It was previously said that the decisions for window switching and for coding mode selection are orthogonal in the sense that they do not depend on each other. Also independent of these decisions is the final step of the coding process, which involves quantization, Huffman coding and bitstream composing: i.e., there is no feedback path. This has the advantage of reducing the whole coding delay to a minimum value (1024/48000 = 21.3 milliseconds) and of avoiding instabilities due to unorthodox coding situations.
The quantization process affects both spectral coefficients and scale factors. Spectral coefficients are clustered in bands, each band having the same step size, or scale factor. Each step size is directly computed from the masking threshold corresponding to its band. The quantized values, which are integers, are then converted to variable-word-length Huffman codes. The total number of bits to code the segment, including the additional fields of the bitstream, is computed. Since the bitrate must be kept constant, the quantization process must be iterated until that number of bits is within predefined limits. After the number of bits needed to code the whole segment using the basic masking threshold is computed, the degree of adjustment is dictated by a buffer control unit. This control unit shares the deficit or credit of additional bits among several segments, according to the needs of each one.
The technique of the bitrate adjustment routine is represented by the flowchart of FIG. 9. After the total number of bits available to the current segment is computed, an iterative procedure tries to find a factor α such that, if all the initial thresholds are multiplied by this factor, the final total number of bits is smaller than, and within an error δ of, the available number of bits. Even if the approximation curve is so hostile that α is not found within the maximum number of iterations, one acceptable solution is always available.
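The search for α can be sketched as a bisection. This is a hypothetical sketch: bits_for stands in for the full quantize-and-Huffman-code bit count for a given threshold multiplier and is assumed to decrease monotonically as the thresholds grow, and the doubling step is one simple way to bracket the solution:

```python
def adjust_threshold_factor(bits_for, target, delta, max_iter=16):
    """Bisection for the threshold multiplier alpha (cf. the FIG. 9 loop).

    bits_for(alpha): bits needed when every initial threshold is scaled
    by alpha (assumed monotonically decreasing in alpha).  Seeks an alpha
    whose bit count is below target and within delta of it; the best
    admissible alpha seen so far is always kept as a fallback.
    """
    # Bracket the solution: grow the interval until few enough bits are needed.
    lo, hi = 1.0, 1.0
    while bits_for(hi) > target:
        lo, hi = hi, hi * 2.0
    best = hi
    # Bisect, updating the best admissible solution at each iteration.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        bits = bits_for(mid)
        if bits <= target:
            best, hi = mid, mid
            if target - bits <= delta:   # within the error delta: done
                break
        else:
            lo = mid
    return best
```

Keeping the best admissible solution at every step is what guarantees that one acceptable solution is always available, even when the iteration budget runs out.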
The main steps of this routine are depicted in FIG. 7 and FIG. 9 as follows. First, an interval including the solution is found. Then, a loop seeks to rapidly converge to the best solution. At each iteration, the best solution is updated. Thus, the total number of bits to represent the present whole segment (710) using the basic masking threshold is evaluated. Next, the total number of bits available to be used by the current segment is computed based on the current buffer status from the buffer control (703). A comparison (903) is made between the total number of bits available in the buffer and the calculated total number of bits to represent the current whole segment. If the required number of bits is less than the available number of bits in the buffer, a further comparison is made to determine if the final total number of bits required is within an error factor of the available number of bits (904). If within the error factor, the total number of bits required to represent the current whole segment are transmitted (916) to the entropy encoder (208). If not within the error factor, an evaluation is done based upon the number of bits required to represent the whole segment at the absolute threshold values (905). If the required number of bits to represent the whole segment at the absolute threshold values is less than the total number of bits available (906), they are transmitted (916) to the entropy encoder (208).
If at this point, neither the basic masking threshold nor absolute thresholds have provided an acceptable bit representation of the whole segment, an iterative procedure (as shown in 907 through 915) is employed to establish the interpolation factor used as a multiplier and discussed previously. If successful, the iterative procedure will establish a bit representation of the whole segment which is within the buffer limit and associated error factor. Otherwise, after reaching a maximum number of iterations (908), the iterative process will return the last best approximation (915) of the whole segment as output (916).
In order to use the same procedure for segments coded with large and short windows, in the latter case the coefficients of the 4 short windows are clustered by concatenating homologous bands. Scale factors are clustered in the same way.
The bitrate adjustment routine (704) calls another routine that computes the total number of bits needed to represent all the Huffman-coded words (705) (coefficients and scale factors). This latter routine partitions the spectrum according to the amplitude distribution of the coefficients. The goal is to assign predefined Huffman code books to sections of the spectrum. Each section groups a variable number of bands and its coefficients are Huffman coded with a convenient book. The limits of each section and the reference of its code book must be sent to the decoder as side information.
The spectrum partitioning is done using a minimum-cost strategy. The main steps are as follows. First, all possible sections are defined - the limit is one section per band - each one having the code book that best matches the amplitude distribution of the coefficients within that section. As the beginning and the end of the whole spectrum are known, if K is the number of sections there are K−1 separators between them. The price of eliminating each separator is computed. The separator with the lowest price is eliminated (initial prices may be negative). Prices are recomputed before the next iteration. This process is repeated until a maximum allowable number of sections is obtained and the smallest price to eliminate another separator is higher than a predefined value.
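The separator-elimination loop might look like the sketch below. This is an illustrative reading of the strategy, not the patent's code: cost is a hypothetical stand-in for the bits needed to Huffman-code a section with its best-matching book, and the price of removing a separator is the merged section's cost minus the two split costs:

```python
def partition_spectrum(n_bands, cost, max_sections, price_limit):
    """Greedy minimum-cost spectrum partitioning into code-book sections.

    cost(start, end): bits to code bands [start, end) with the book that
    best matches that section.  Starts from one section per band and
    repeatedly eliminates the cheapest separator, stopping once the
    section count is allowable and no elimination is cheap enough.
    """
    sections = [(i, i + 1) for i in range(n_bands)]
    while len(sections) > 1:
        # Price of removing separator i (may be negative, e.g. from
        # per-section side-information overhead that merging removes).
        prices = [cost(sections[i][0], sections[i + 1][1])
                  - cost(*sections[i]) - cost(*sections[i + 1])
                  for i in range(len(sections) - 1)]
        i = min(range(len(prices)), key=prices.__getitem__)
        if len(sections) <= max_sections and prices[i] > price_limit:
            break
        sections[i:i + 2] = [(sections[i][0], sections[i + 1][1])]
    return sections
```

With a cost model that charges a fixed overhead per section, every price is negative and the loop merges everything; a cost model that grows with section width stops merging as soon as a merge would cost bits.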
Aspects of the processing accomplished by quantizer/rate-loop 206 in FIG. 2 will now be presented. In the prior art, rate-loop mechanisms have contained assumptions related to the monophonic case. With the shift from monophonic to stereophonic perceptual coders, the demands placed upon the rate-loop are increased.
The inputs to quantizer/rate-loop 206 in FIG. 2 comprise spectral coefficients (i.e., the MDCT coefficients) derived by analysis filter bank 202, and outputs of perceptual model 204, including calculated thresholds corresponding to the spectral coefficients.
Quantizer/rate-loop 206 quantizes the spectral information based, in part, on the calculated thresholds and the absolute thresholds of hearing, and in doing so provides a bitstream to entropy encoder 208. The bitstream includes signals divided into three parts: (1) a first part containing the standardized side information; (2) a second part containing the scaling factors for the 35 or 56 bands and additional side information used for so-called adaptive-window switching, when used (the length of this part can vary depending on information in the first part); and (3) a third part comprising the quantized spectral coefficients.
A “utilized scale factor”, Δ, is iteratively derived by interpolating between a calculated scale factor and a scale factor derived from the absolute threshold of hearing at the frequency corresponding to the frequency of the respective spectral coefficient to be quantized until the quantized spectral coefficients can be encoded within permissible limits.
An illustrative embodiment of the present invention can be seen in FIG. 13. As shown at 1301, quantizer/rate-loop receives a spectral coefficient, Cf, and an energy threshold, E, corresponding to that spectral coefficient. A “threshold scale factor”, Δ0, is calculated by:

Δ0 = √(12E)
An “absolute scale factor”, ΔA, is also calculated based upon the absolute threshold of hearing (i.e., the quietest sound that can be heard at the frequency corresponding to the scale factor). Advantageously, an interpolation constant, α, and interpolation bounds αhigh and αlow are initialized to aid in the adjustment of the utilized scale factor.
    • αhigh=1
    • αlow=0
    • α=αhigh
Next, as shown in 1305, the utilized scale factor is determined from:
Δ = Δ0^α × ΔA^(1−α)
Next, as shown in 1307, the utilized scale factor is itself quantized, because the utilized scale factor as computed above is not discrete but is advantageously discrete when transmitted and used.
Δ=Q−1(Q(Δ))
Next, as shown in 1309, the spectral coefficient is quantized using the utilized scale factor to create a “quantized spectral coefficient” Q(Cf, Δ):

Q(Cf, Δ) = NINT(Cf / Δ)
where “NINT” is the nearest integer function. Because quantizer/rate loop 206 must transmit both the quantized spectral coefficient and the utilized scale factor, a cost, C, is calculated which is associated with how many bits it will take to transmit them both. As shown at 1311,
C = FOO(Q(Cf, Δ), Q(Δ))
where FOO is a function which, depending on the specific embodiment, can be easily determined by persons having ordinary skill in the art of data communications. As shown in 1313, the cost, C, is tested to determine whether it is in a permissible range PR. When the cost is within the permissible range, Q(Cf, Δ) and Q(Δ) are transmitted to entropy coder 208.
Advantageously, and depending on the relationship of the cost C to the permissible range PR, the interpolation constant and bounds are adjusted until the utilized scale factor yields a quantized spectral coefficient which has a cost within the permissible range. Illustratively, as shown in FIG. 13 at 1313, the interpolation bounds are manipulated to produce a binary search. Specifically,
when C>PR, αhigh=α,
alternately,
when C<PR, αlow=α.
In either case, a new interpolation constant is calculated by:

α = (αlow + αhigh) / 2
The process then continues at 1305 iteratively until C comes within the permissible range PR.
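The loop from 1301 through 1315 can be sketched as follows. This is an illustrative sketch under stated assumptions: the step sizes are taken as Δ0 = √(12E), with ΔA defined analogously from the absolute-threshold energy; bits_for and the bit budget are hypothetical stand-ins for the cost function FOO and the permissible range PR; and the quantization of Δ itself (block 1307) is omitted for brevity:

```python
import math

def rate_loop(coeffs, mask_thr, abs_thr, bits_for, budget, max_iter=32):
    """Binary search on the interpolation constant alpha (cf. FIG. 13).

    coeffs    -- spectral coefficients Cf
    mask_thr  -- calculated energy thresholds E (one per coefficient)
    abs_thr   -- absolute-threshold energies (one per coefficient)
    bits_for  -- cost C: bits to code the quantized coefficients
    budget    -- permissible range PR as (lo, hi)
    """
    d0 = [math.sqrt(12.0 * e) for e in mask_thr]   # threshold scale factors
    da = [math.sqrt(12.0 * e) for e in abs_thr]    # absolute scale factors
    a_lo, a_hi = 0.0, 1.0
    a = a_hi
    for _ in range(max_iter):
        deltas = [x ** a * y ** (1.0 - a) for x, y in zip(d0, da)]
        q = [int(round(c / d)) for c, d in zip(coeffs, deltas)]  # NINT(Cf/D)
        cost = bits_for(q)
        if budget[0] <= cost <= budget[1]:
            break                      # cost C within permissible range PR
        if cost > budget[1]:
            a_hi = a                   # C > PR: move alpha_high down
        else:
            a_lo = a                   # C < PR: move alpha_low up
        a = 0.5 * (a_lo + a_hi)        # new interpolation constant
    return q, deltas
```

The update rule follows the text's binary search; the sketch simply assumes the bit cost moves monotonically with α so that the search converges.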
STEREOPHONIC DECODER
The stereophonic decoder has a very simple structure, as shown in FIG. 12. Its main functions are reading the incoming bitstream (1202), decoding all the data (1203), and inverse quantization and reconstruction of the RIGHT and LEFT channels (1204). Thus, the decoder performs operations complementary to those of the encoder depicted in FIG. 7, such as operations complementary to quantization (704) and Huffman coding (705).
Illustrative embodiments may comprise digital signal processor (DSP) hardware, such as the AT&T DSP16 or DSP32C, and software performing the operations of the present invention discussed above. Very large scale integration (VLSI) hardware embodiments of the present invention, as well as hybrid DSP/VLSI embodiments, may also be provided. For example, an AT&T DSP16 may be employed to perform the operations of the rate loop processor depicted in FIG. 13. The DSP could receive the spectral coefficients and energy thresholds (1301) and perform the calculations of blocks 1303 and 1305 as described on page 31. Further, the DSP could calculate the utilized scale factor according to the equation given on page 32 and depicted in block 1305. The quantization blocks 1307 and 1308 can be carried out as described on page 32. Finally, the DSP may perform the cost calculation (1311) and comparison (1313) associated with quantization. The cost calculation is described on page 32 and illustrated further in FIG. 9. In this way, the interpolation factor may be adjusted (1315) according to the analysis carried out within the DSP or similar hardware embodiments. It is to be understood that the above-described embodiments are merely illustrative of the principles of this invention. Other arrangements may be derived by those skilled in the art without departing from the spirit and scope of the invention.

Claims (4)

4. A decoder for decoding a set of frequency coefficients representing an audio signal, the decoder comprising:
(a) means for receiving the set of coefficients, the set of frequency coefficients having been encoded by:
(1) converting a time domain representation of the audio signal into a frequency domain representation of the audio signal comprising the set of frequency coefficients;
(2) calculating a masking threshold based upon the set of frequency coefficients;
(3) using a rate loop processor in an iterative fashion to determine a set of quantization step size coefficients needed to encode the set of frequency coefficients, said set of quantization step size coefficients determined by using the masking threshold and an absolute hearing threshold; and
(4) coding the set of frequency coefficients based upon the set of quantization step size coefficients; and
(b) means for converting the set of coefficients to a time domain signal.
US10/218,232 | 1988-12-30 | 2002-08-13 | Rate loop processor for perceptual encoder/decoder | Expired - Lifetime | USRE39080E1 (en)

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
US10/218,232 (USRE39080E1) | 1988-12-30 | 2002-08-13 | Rate loop processor for perceptual encoder/decoder
US11/248,622 (USRE40280E1) | 1988-12-30 | 2005-10-12 | Rate loop processor for perceptual encoder/decoder

Applications Claiming Priority (5)

Application Number | Priority Date | Filing Date | Title
US29259888A | 1988-12-10 | 1988-12-10 |
US84496792A | 1992-02-28 | 1992-02-28 |
US84481192A | 1992-03-02 | 1992-03-02 |
US08/310,898 (US5627938A) | 1992-03-02 | 1994-09-22 | Rate loop processor for perceptual encoder/decoder
US10/218,232 (USRE39080E1) | 1988-12-30 | 2002-08-13 | Rate loop processor for perceptual encoder/decoder

Related Parent Applications (2)

Application Number | Title | Priority Date | Filing Date
US84481192A | Continuation | 1988-12-30 | 1992-03-02
US08/310,898 (US5627938A) | Reissue: Rate loop processor for perceptual encoder/decoder | 1988-12-30 | 1994-09-22

Related Child Applications (1)

Application Number | Title | Priority Date | Filing Date
US08/310,898 (US5627938A) | Continuation: Rate loop processor for perceptual encoder/decoder | 1988-12-30 | 1994-09-22

Publications (1)

Publication Number | Publication Date
USRE39080E1 (en) | 2006-04-25

Family

ID=25293693

Family Applications (2)

Application Number | Title | Priority Date | Filing Date
US08/310,898 (US5627938A), Ceased | Rate loop processor for perceptual encoder/decoder | 1988-12-30 | 1994-09-22
US10/218,232 (USRE39080E1), Expired - Lifetime | Rate loop processor for perceptual encoder/decoder | 1988-12-30 | 2002-08-13

Family Applications Before (1)

Application Number | Title | Priority Date | Filing Date
US08/310,898 (US5627938A), Ceased | Rate loop processor for perceptual encoder/decoder | 1988-12-30 | 1994-09-22

Country Status (5)

Country | Link
US (2) | US5627938A (en)
EP (1) | EP0559348A3 (en)
JP (1) | JP3263168B2 (en)
KR (1) | KR970007663B1 (en)
CA (1) | CA2090160C (en)

Cited By (52)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US20040158456A1 (en)*2003-01-232004-08-12Vinod PrakashSystem, method, and apparatus for fast quantization in perceptual audio coders
US20050125098A1 (en)*2003-12-092005-06-09Yulun WangProtocol for a remotely controlled videoconferencing robot
US20050144017A1 (en)*2003-09-152005-06-30Stmicroelectronics Asia Pacific Pte LtdDevice and process for encoding audio data
US20070198130A1 (en)*2006-02-222007-08-23Yulun WangGraphical interface for a remote presence system
US20090006081A1 (en)*2007-06-272009-01-01Samsung Electronics Co., Ltd.Method, medium and apparatus for encoding and/or decoding signal
US8836751B2 (en)2011-11-082014-09-16Intouch Technologies, Inc.Tele-presence system with a user interface that displays different communication links
US8849679B2 (en)2006-06-152014-09-30Intouch Technologies, Inc.Remote controlled robot system that provides medical images
US8849680B2 (en)2009-01-292014-09-30Intouch Technologies, Inc.Documentation through a remote presence robot
US8897920B2 (en)2009-04-172014-11-25Intouch Technologies, Inc.Tele-presence robot system with software modularity, projector and laser pointer
US8902278B2 (en)2012-04-112014-12-02Intouch Technologies, Inc.Systems and methods for visualizing and managing telepresence devices in healthcare networks
US8965579B2 (en)2011-01-282015-02-24Intouch TechnologiesInterfacing with a mobile telepresence robot
US8983174B2 (en)2004-07-132015-03-17Intouch Technologies, Inc.Mobile robot with a head-based movement mapping scheme
US8996165B2 (en)2008-10-212015-03-31Intouch Technologies, Inc.Telepresence robot with a camera boom
US9089972B2 (en)2010-03-042015-07-28Intouch Technologies, Inc.Remote presence system including a cart that supports a robot face and an overhead camera
US9098611B2 (en)2012-11-262015-08-04Intouch Technologies, Inc.Enhanced video interaction for a user interface of a telepresence network
US9138891B2 (en)2008-11-252015-09-22Intouch Technologies, Inc.Server connectivity control for tele-presence robot
US9160783B2 (en)2007-05-092015-10-13Intouch Technologies, Inc.Robot system that operates through a network firewall
US9174342B2 (en)2012-05-222015-11-03Intouch Technologies, Inc.Social behavior rules for a medical telepresence robot
US9185487B2 (en)2006-01-302015-11-10Audience, Inc.System and method for providing noise suppression utilizing null processing noise subtraction
US9193065B2 (en)2008-07-102015-11-24Intouch Technologies, Inc.Docking system for a tele-presence robot
US9198728B2 (en)2005-09-302015-12-01Intouch Technologies, Inc.Multi-camera mobile teleconferencing platform
US9251313B2 (en)2012-04-112016-02-02Intouch Technologies, Inc.Systems and methods for visualizing and managing telepresence devices in healthcare networks
US9264664B2 (en)2010-12-032016-02-16Intouch Technologies, Inc.Systems and methods for dynamic bandwidth allocation
US9323250B2 (en)2011-01-282016-04-26Intouch Technologies, Inc.Time-dependent navigation of telepresence robots
US9361021B2 (en)2012-05-222016-06-07Irobot CorporationGraphical user interfaces including touchpad driving interfaces for telemedicine devices
US9381654B2 (en)2008-11-252016-07-05Intouch Technologies, Inc.Server connectivity control for tele-presence robot
US9429934B2 (en)2008-09-182016-08-30Intouch Technologies, Inc.Mobile videoconferencing robot system with network adaptive driving
US9558755B1 (en)*2010-05-202017-01-31Knowles Electronics, LlcNoise suppression assisted automatic speech recognition
US9602765B2 (en)2009-08-262017-03-21Intouch Technologies, Inc.Portable remote presence robot
US9616576B2 (en)2008-04-172017-04-11Intouch Technologies, Inc.Mobile tele-presence system with a microphone system
US9640194B1 (en)2012-10-042017-05-02Knowles Electronics, LlcNoise suppression for speech processing based on machine-learning mask estimation
US9668048B2 (en)2015-01-302017-05-30Knowles Electronics, LlcContextual switching of microphones
US9699554B1 (en)2010-04-212017-07-04Knowles Electronics, LlcAdaptive signal equalization
US9799330B2 (en)2014-08-282017-10-24Knowles Electronics, LlcMulti-sourced noise suppression
US9838784B2 (en)2009-12-022017-12-05Knowles Electronics, LlcDirectional audio capture
US9842192B2 (en)2008-07-112017-12-12Intouch Technologies, Inc.Tele-presence robot system with multi-cast features
US9849593B2 (en)2002-07-252017-12-26Intouch Technologies, Inc.Medical tele-robotic system with a master remote station with an arbitrator
US9978388B2 (en)2014-09-122018-05-22Knowles Electronics, LlcSystems and methods for restoration of speech components
US9974612B2 (en)2011-05-192018-05-22Intouch Technologies, Inc.Enhanced diagnostics for a telepresence robot
US10343283B2 (en)2010-05-242019-07-09Intouch Technologies, Inc.Telepresence robot system that can be accessed by a cellular phone
US10471588B2 (en)2008-04-142019-11-12Intouch Technologies, Inc.Robotic based health care system
US10769739B2 (en)2011-04-252020-09-08Intouch Technologies, Inc.Systems and methods for management of information among medical providers and facilities
US10808882B2 (en)2010-05-262020-10-20Intouch Technologies, Inc.Tele-robotic system with a robot face placed on a chair
US10875182B2 (en)2008-03-202020-12-29Teladoc Health, Inc.Remote presence system mounted to operating room hardware
US11154981B2 (en)2010-02-042021-10-26Teladoc Health, Inc.Robot user interface for telepresence robot system
US11389064B2 (en)2018-04-272022-07-19Teladoc Health, Inc.Telehealth cart that supports a removable tablet with seamless audio/video switching
US11399153B2 (en)2009-08-262022-07-26Teladoc Health, Inc.Portable telepresence apparatus
US11636944B2 (en)2017-08-252023-04-25Teladoc Health, Inc.Connectivity infrastructure for a telehealth platform
US11742094B2 (en)2017-07-252023-08-29Teladoc Health, Inc.Modular telehealth cart with thermal imaging and touch screen user interface
US11862302B2 (en)2017-04-242024-01-02Teladoc Health, Inc.Automated transcription and documentation of tele-health encounters
US12093036B2 (en)2011-01-212024-09-17Teladoc Health, Inc.Telerobotic system with a dual application screen presentation
US12224059B2 (en)2011-02-162025-02-11Teladoc Health, Inc.Systems and methods for network-based counseling

Families Citing this family (112)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
USRE40280E1 (en)1988-12-302008-04-29Lucent Technologies Inc.Rate loop processor for perceptual encoder/decoder
EP0559348A3 (en)1992-03-021993-11-03AT&T Corp.Rate control loop processor for perceptual encoder/decoder
JP3125543B2 (en)*1993-11-292001-01-22ソニー株式会社 Signal encoding method and apparatus, signal decoding method and apparatus, and recording medium
KR960003628B1 (en)*1993-12-061996-03-20Lg전자주식회사 Method and apparatus for encoding / decoding digital signal
PL177808B1 (en)*1994-03-312000-01-31Arbitron CoApparatus for and methods of including codes into audio signals and decoding such codes
KR970011727B1 (en)*1994-11-091997-07-14Daewoo Electronics Co LtdApparatus for encoding of the audio signal
EP0721257B1 (en)*1995-01-092005-03-30Daewoo Electronics CorporationBit allocation for multichannel audio coder based on perceptual entropy
CN1110955C (en)*1995-02-132003-06-04大宇电子株式会社Apparatus for adaptively encoding input digital audio signals from plurality of channels
DE19505435C1 (en)*1995-02-171995-12-07Fraunhofer Ges ForschungTonality evaluation system for audio signal
US5956674A (en)*1995-12-011999-09-21Digital Theater Systems, Inc.Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
DE19628292B4 (en)1996-07-122007-08-02Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method for coding and decoding stereo audio spectral values
WO1998046045A1 (en)*1997-04-101998-10-15Sony CorporationEncoding method and device, decoding method and device, and recording medium
DE19730130C2 (en)1997-07-142002-02-28Fraunhofer Ges Forschung Method for coding an audio signal
US6263312B1 (en)1997-10-032001-07-17Alaris, Inc.Audio compression and decompression employing subband decomposition of residual signal and distortion reduction
US5913191A (en)*1997-10-171999-06-15Dolby Laboratories Licensing CorporationFrame-based audio coding with additional filterbank to suppress aliasing artifacts at frame boundaries
US6091773A (en)*1997-11-122000-07-18Sydorenko; Mark R.Data compression method and apparatus
US6037987A (en)*1997-12-312000-03-14Sarnoff CorporationApparatus and method for selecting a rate and distortion based coding mode for a coding system
US6161088A (en)*1998-06-262000-12-12Texas Instruments IncorporatedMethod and system for encoding a digital audio signal
US6128593A (en)*1998-08-042000-10-03Sony CorporationSystem and method for implementing a refined psycho-acoustic modeler
GB9819920D0 (en)*1998-09-111998-11-04Nds LtdAudio encoding system
US7103065B1 (en)*1998-10-302006-09-05Broadcom CorporationData packet fragmentation in a cable modem system
US6760316B1 (en)*1998-10-302004-07-06Broadcom CorporationMethod and apparatus for the synchronization of multiple cable modem termination system devices
ATE412289T1 (en)*1998-10-302008-11-15Broadcom Corp CABLE MODEM SYSTEM
US6961314B1 (en)1998-10-302005-11-01Broadcom CorporationBurst receiver for cable modem system
US6240379B1 (en)*1998-12-242001-05-29Sony CorporationSystem and method for preventing artifacts in an audio data encoder device
JP3739959B2 (en)*1999-03-232006-01-25株式会社リコー Digital audio signal encoding apparatus, digital audio signal encoding method, and medium on which digital audio signal encoding program is recorded
US6363338B1 (en)*1999-04-122002-03-26Dolby Laboratories Licensing CorporationQuantization in perceptual audio coders with compensation for synthesis filter noise spreading
GB2349054A (en)*1999-04-162000-10-18Nds LtdDigital audio signal encoders
EP1063851B1 (en)*1999-06-222007-08-01Victor Company Of Japan, Ltd.Apparatus and method of encoding moving picture signal
JP3762579B2 (en)1999-08-052006-04-05株式会社リコー Digital audio signal encoding apparatus, digital audio signal encoding method, and medium on which digital audio signal encoding program is recorded
JP2001094433A (en)1999-09-172001-04-06Matsushita Electric Ind Co Ltd Subband encoding / decoding method
JP2001099718A (en)*1999-09-302001-04-13Ando Electric Co LtdData processing device of wavemeter and its data processing method
US6499010B1 (en)*2000-01-042002-12-24Agere Systems Inc.Perceptual audio coder bit allocation scheme providing improved perceptual quality consistency
TW499672B (en)*2000-02-182002-08-21Intervideo IncFast convergence method for bit allocation stage of MPEG audio layer 3 encoders
BR0110724A (en)*2000-05-152003-03-11Unilever Nv Concentrated non-aqueous liquid detergent composition, and process for the preparation of a concentrated liquid detergent composition
JP4021124B2 (en)*2000-05-302007-12-12株式会社リコー Digital acoustic signal encoding apparatus, method and recording medium
US6678647B1 (en)*2000-06-022004-01-13Agere Systems Inc.Perceptual coding of audio signals using cascaded filterbanks for performing irrelevancy reduction and redundancy reduction with different spectral/temporal resolution
US7110953B1 (en)*2000-06-022006-09-19Agere Systems Inc.Perceptual coding of audio signals using separated irrelevancy reduction and redundancy reduction
US6778953B1 (en)*2000-06-022004-08-17Agere Systems Inc.Method and apparatus for representing masked thresholds in a perceptual audio coder
GB0115952D0 (en)*2001-06-292001-08-22IbmA scheduling method and system for controlling execution of processes
US6732071B2 (en)*2001-09-272004-05-04Intel CorporationMethod, apparatus, and system for efficient rate control in audio encoding
EP1437713A4 (en)*2001-10-032006-07-26Sony CorpEncoding apparatus and method; decoding apparatus and method and recording medium recording apparatus and method
US6934677B2 (en)*2001-12-142005-08-23Microsoft CorporationQuantization matrices based on critical band pattern information for digital audio wherein quantization bands differ from critical bands
US7240001B2 (en)*2001-12-142007-07-03Microsoft CorporationQuality improvement techniques in an audio encoder
US8401084B2 (en)*2002-04-012013-03-19Broadcom CorporationSystem and method for multi-row decoding of video with dependent rows
US20030215013A1 (en)*2002-04-102003-11-20Budnikov Dmitry N.Audio encoder with adaptive short window grouping
US7299190B2 (en)*2002-09-042007-11-20Microsoft CorporationQuantization and inverse quantization for audio
JP4676140B2 (en)2002-09-042011-04-27マイクロソフト コーポレーション Audio quantization and inverse quantization
ES2334934T3 (en)*2002-09-042010-03-17Microsoft Corporation ENTROPY CODIFICATION BY ADAPTATION OF CODIFICATION BETWEEN LEVEL MODES AND SUCCESSION AND LEVEL LENGTH.
US7502743B2 (en)2002-09-042009-03-10Microsoft CorporationMulti-channel audio encoding and decoding with multi-channel transform selection
EP1559101A4 (en)*2002-11-072006-01-25Samsung Electronics Co Ltd METHOD AND APPARATUS FOR MPEG AUDIO CODING
KR100908117B1 (en)*2002-12-162009-07-16삼성전자주식회사 Audio coding method, decoding method, encoding apparatus and decoding apparatus which can adjust the bit rate
US6996763B2 (en)*2003-01-102006-02-07Qualcomm IncorporatedOperation of a forward link acknowledgement channel for the reverse link data
US20040160922A1 (en)*2003-02-182004-08-19Sanjiv NandaMethod and apparatus for controlling data rate of a reverse link in a communication system
US7155236B2 (en)2003-02-182006-12-26Qualcomm IncorporatedScheduled and autonomous transmission and acknowledgement
US7286846B2 (en)*2003-02-182007-10-23Qualcomm, IncorporatedSystems and methods for performing outer loop power control in wireless communication systems
US7505780B2 (en)*2003-02-182009-03-17Qualcomm IncorporatedOuter-loop power control for wireless communication systems
US8391249B2 (en)*2003-02-182013-03-05Qualcomm IncorporatedCode division multiplexing commands on a code division multiplexed channel
US8150407B2 (en)*2003-02-182012-04-03Qualcomm IncorporatedSystem and method for scheduling transmissions in a wireless communication system
US7660282B2 (en)*2003-02-182010-02-09Qualcomm IncorporatedCongestion control in a wireless data network
US8023950B2 (en)2003-02-182011-09-20Qualcomm IncorporatedSystems and methods for using selectable frame durations in a wireless communication system
US8081598B2 (en)*2003-02-182011-12-20Qualcomm IncorporatedOuter-loop power control for wireless communication systems
US7215930B2 (en)*2003-03-062007-05-08Qualcomm, IncorporatedMethod and apparatus for providing uplink signal-to-noise ratio (SNR) estimation in a wireless communication
US8705588B2 (en)2003-03-062014-04-22Qualcomm IncorporatedSystems and methods for using code space in spread-spectrum communications
US8477592B2 (en)*2003-05-142013-07-02Qualcomm IncorporatedInterference and noise estimation in an OFDM system
JP4212591B2 (en)2003-06-302009-01-21富士通株式会社 Audio encoding device
US8489949B2 (en)*2003-08-052013-07-16Qualcomm IncorporatedCombining grant, acknowledgement, and rate control commands
WO2005027096A1 (en)2003-09-152005-03-24Zakrytoe Aktsionernoe Obschestvo IntelMethod and apparatus for encoding audio
US7460990B2 (en)*2004-01-232008-12-02Microsoft CorporationEfficient coding of digital media spectral data using wide-sense perceptual similarity
US6980933B2 (en)*2004-01-272005-12-27Dolby Laboratories Licensing CorporationCoding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
DE102004009949B4 (en)*2004-03-012006-03-09Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Device and method for determining an estimated value
JP2007004050A (en)*2005-06-272007-01-11Nippon Hoso Kyokai <Nhk> Stereo signal encoding apparatus and encoding program
US7693709B2 (en)*2005-07-152010-04-06Microsoft CorporationReordering coefficients for waveform coding or decoding
US7562021B2 (en)*2005-07-152009-07-14Microsoft CorporationModification of codewords in dictionary used for efficient coding of digital media spectral data
US7539612B2 (en)2005-07-152009-05-26Microsoft CorporationCoding and decoding scale factor information
US7599840B2 (en)*2005-07-152009-10-06Microsoft CorporationSelectively using multiple entropy models in adaptive coding and decoding
US7630882B2 (en)*2005-07-152009-12-08Microsoft CorporationFrequency segmentation to obtain bands for efficient coding of digital media
US7684981B2 (en)*2005-07-152010-03-23Microsoft CorporationPrediction of spectral coefficients in waveform coding and decoding
US8225392B2 (en)*2005-07-152012-07-17Microsoft CorporationImmunizing HTML browsers and extensions from known vulnerabilities
US7565018B2 (en)*2005-08-122009-07-21Microsoft CorporationAdaptive coding and decoding of wide-range coefficients
US7933337B2 (en)*2005-08-122011-04-26Microsoft CorporationPrediction of transform coefficients for image compression
KR100979624B1 (en)2005-09-052010-09-01Fujitsu LimitedAudio encoding apparatus and audio encoding method
BRPI0617447A2 (en)2005-10-142012-04-17Matsushita Electric Industrial Co., Ltd transform encoder and transform coding method
US7953604B2 (en)*2006-01-202011-05-31Microsoft CorporationShape and scale parameters for extended-band frequency coding
US7831434B2 (en)2006-01-202010-11-09Microsoft CorporationComplex-transform channel coding with extended-band frequency coding
US8190425B2 (en)*2006-01-202012-05-29Microsoft CorporationComplex cross-correlation parameters for multi-channel audio
US7835904B2 (en)*2006-03-032010-11-16Microsoft Corp.Perceptual, scalable audio compression
FR2898443A1 (en)*2006-03-132007-09-14France Telecom AUDIO SOURCE SIGNAL ENCODING METHOD, ENCODING DEVICE, DECODING METHOD, DECODING DEVICE, SIGNAL, CORRESPONDING COMPUTER PROGRAM PRODUCTS
WO2007116809A1 (en)*2006-03-312007-10-18Matsushita Electric Industrial Co., Ltd.Stereo audio encoding device, stereo audio decoding device, and method thereof
US8184710B2 (en)*2007-02-212012-05-22Microsoft CorporationAdaptive truncation of transform coefficient data in a transform-based digital media codec
US7761290B2 (en)2007-06-152010-07-20Microsoft CorporationFlexible frequency and time partitioning in perceptual transform coding of audio
US8046214B2 (en)*2007-06-222011-10-25Microsoft CorporationLow complexity decoder for complex transform coding of multi-channel sound
US7885819B2 (en)*2007-06-292011-02-08Microsoft CorporationBitstream syntax for multi-process audio decoding
ES2375192T3 (en)2007-08-272012-02-27Telefonaktiebolaget L M Ericsson (Publ)Improved transform coding of speech and audio signals
US8249883B2 (en)*2007-10-262012-08-21Microsoft CorporationChannel extension coding for multi-channel source
US20090144054A1 (en)*2007-11-302009-06-04Kabushiki Kaisha ToshibaEmbedded system to perform frame switching
US8179974B2 (en)2008-05-022012-05-15Microsoft CorporationMulti-level representation of reordered transform coefficients
US8325800B2 (en)2008-05-072012-12-04Microsoft CorporationEncoding streaming media as a high bit rate layer, a low bit rate layer, and one or more intermediate bit rate layers
US8379851B2 (en)2008-05-122013-02-19Microsoft CorporationOptimized client side rate control and indexed file layout for streaming media
US7925774B2 (en)2008-05-302011-04-12Microsoft CorporationMedia streaming using an index file
US8290782B2 (en)*2008-07-242012-10-16Dts, Inc.Compression of audio scale-factors by two-dimensional transformation
US8406307B2 (en)2008-08-222013-03-26Microsoft CorporationEntropy coding/decoding of hierarchically organized data
US8913668B2 (en)*2008-09-292014-12-16Microsoft CorporationPerceptual mechanism for the selection of residues in video coders
US8457194B2 (en)*2008-09-292013-06-04Microsoft CorporationProcessing real-time video
US8265140B2 (en)2008-09-302012-09-11Microsoft CorporationFine-grained client-side control of scalable media delivery
CN101853663B (en)*2009-03-302012-05-23华为技术有限公司Bit allocation method, encoding device and decoding device
US9055374B2 (en)*2009-06-242015-06-09Arizona Board Of Regents For And On Behalf Of Arizona State UniversityMethod and system for determining an auditory pattern of an audio segment
JP5539992B2 (en)*2009-08-202014-07-02Thomson LicensingRate control device, rate control method, and rate control program
JP5774191B2 (en)*2011-03-212015-09-09テレフオンアクチーボラゲット エル エム エリクソン(パブル) Method and apparatus for attenuating dominant frequencies in an audio signal
EP2689419B1 (en)*2011-03-212015-03-04Telefonaktiebolaget L M Ericsson (PUBL)Method and arrangement for damping dominant frequencies in an audio signal
EP2707875A4 (en)2011-05-132015-03-25Samsung Electronics Co Ltd NOISE FILLING AND AUDIO DECODING
ES2657039T3 (en)*2012-10-012018-03-01Nippon Telegraph And Telephone Corporation Coding method, coding device, program, and recording medium

Citations (49)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US3989897A (en)1974-10-251976-11-02Carver R WMethod and apparatus for reducing noise content in audio signals
US4216354A (en)1977-12-231980-08-05International Business Machines CorporationProcess for compressing data relative to voice signals and device applying said process
US4349698A (en)1979-06-191982-09-14Victor Company Of Japan, LimitedAudio signal translation with no delay elements
US4356349A (en)1980-03-121982-10-26Trod Nossel Recording Studios, Inc.Acoustic image enhancing method and apparatus
US4516258A (en)1982-06-301985-05-07At&T Bell LaboratoriesBit allocation generator for adaptive transform coder
US4535472A (en)1982-11-051985-08-13At&T Bell LaboratoriesAdaptive bit allocator
EP0193143A2 (en)1985-02-271986-09-03TELEFUNKEN Fernseh und Rundfunk GmbHAudio signal transmission method
US4646061A (en)1985-03-131987-02-24Racal Data Communications Inc.Data communication with modified Huffman coding
JPS637023A (en)1986-06-271988-01-12Thomson Consumer Electronics Sales GmbHMethod of audio signal transmission
US4790016A (en)1985-11-141988-12-06Gte Laboratories IncorporatedAdaptive method and apparatus for coding speech
US4803727A (en)1986-11-241989-02-07British Telecommunications Public Limited CompanyTransmission system
JPS6450695A (en)1987-08-211989-02-27Tamura Electric Works LtdTelephone exchange
US4821260A (en)1986-12-171989-04-11Deutsche Thomson-Brandt GmbhTransmission system
US4860313A (en)1986-09-211989-08-22Eci Telecom Ltd.Adaptive differential pulse code modulation (ADPCM) systems
US4860360A (en)1987-04-061989-08-22Gte Laboratories IncorporatedMethod of evaluating speech
US4881267A (en)1987-05-141989-11-14Nec CorporationEncoder of a multi-pulse type capable of optimizing the number of excitation pulses and quantization level
US4896362A (en)1987-04-271990-01-23U.S. Philips CorporationSystem for subband coding of a digital audio signal
US4912763A (en)1986-10-301990-03-27International Business Machines CorporationProcess for multirate encoding signals and device for implementing said process
US4914701A (en)1984-12-201990-04-03Gte Laboratories IncorporatedMethod and apparatus for encoding speech
EP0376553A2 (en)1988-12-301990-07-04AT&T Corp.Perceptual coding of audio signals
US4941152A (en)1985-09-031990-07-10International Business Machines Corp.Signal coding process and system for implementing said process
US4945567A (en)1984-03-061990-07-31Nec CorporationMethod and apparatus for speech-band signal coding
US4949383A (en)1984-08-241990-08-14British Telecommunications Public Limited CompanyFrequency domain speech coding
US4953214A (en)1987-07-211990-08-28Matsushita Electric Industrial Co., Ltd.Signal encoding and decoding method and device
US4972484A (en)1986-11-211990-11-20Bayerische Rundfunkwerbung GmbhMethod of transmitting or storing masked sub-band coded audio signals
US5014318A (en)1988-02-251991-05-07Fraunhofer Gesellschaft Zur Forderung Der Angewandten Forschung E. V.Apparatus for checking audio signal processing systems
US5040217A (en)1989-10-181991-08-13At&T Bell LaboratoriesPerceptual coding of audio signals
US5079547A (en)1990-02-281992-01-07Victor Company Of Japan, Ltd.Method of orthogonal transform coding/decoding
US5109417A (en)1989-01-271992-04-28Dolby Laboratories Licensing CorporationLow bit rate transform coder, decoder, and encoder/decoder for high-quality audio
US5151941A (en)1989-09-301992-09-29Sony CorporationDigital signal encoding apparatus
US5185800A (en)1989-10-131993-02-09Centre National D'etudes Des TelecommunicationsBit allocation device for transformed digital audio broadcasting signals with adaptive quantization based on psychoauditive criterion
US5218435A (en)1991-02-201993-06-08Massachusetts Institute Of TechnologyDigital advanced television systems
US5227788A (en)1992-03-021993-07-13At&T Bell LaboratoriesMethod and apparatus for two-component signal compression
US5230038A (en)1989-01-271993-07-20Fielder Louis DLow bit rate transform coder, decoder, and encoder/decoder for high-quality audio
US5235671A (en)1990-10-151993-08-10Gte Laboratories IncorporatedDynamic bit allocation subband excited transform coding method and apparatus
EP0559383A1 (en)1992-03-021993-09-08AT&T Corp.A method and apparatus for coding audio signals based on perceptual model
US5274740A (en)1991-01-081993-12-28Dolby Laboratories Licensing CorporationDecoder for variable number of channel presentation of multidimensional sound fields
US5297236A (en)1989-01-271994-03-22Dolby Laboratories Licensing CorporationLow computational-complexity digital filter bank for encoder, decoder, and encoder/decoder
US5341457A (en)1988-12-301994-08-23At&T Bell LaboratoriesPerceptual coding of audio signals
US5357594A (en)1989-01-271994-10-18Dolby Laboratories Licensing CorporationEncoding and decoding using specially designed pairs of analysis and synthesis windows
US5394473A (en)1990-04-121995-02-28Dolby Laboratories Licensing CorporationAdaptive-block-length, adaptive-transform, and adaptive-window transform coder, decoder, and encoder/decoder for high-quality audio
US5479562A (en)1989-01-271995-12-26Dolby Laboratories Licensing CorporationMethod and apparatus for encoding and decoding audio information
US5583962A (en)1991-01-081996-12-10Dolby Laboratories Licensing CorporationEncoder/decoder for multidimensional sound fields
US5592584A (en)1992-03-021997-01-07Lucent Technologies Inc.Method and apparatus for two-component signal compression
US5627938A (en)1992-03-021997-05-06Lucent Technologies Inc.Rate loop processor for perceptual encoder/decoder
EP0446037B1 (en)1990-03-091997-10-08AT&T Corp.Hybrid perceptual audio coding
US5752225A (en)1989-01-271998-05-12Dolby Laboratories Licensing CorporationMethod and apparatus for split-band encoding and split-band decoding of audio information using adaptive bit allocation to adjacent subbands
JP2796673B2 (en)1986-08-291998-09-10Karl-Heinz BrandenburgDigital coding method
US5924060A (en)1986-08-291999-07-13Brandenburg; Karl HeinzDigital coding process for transmission or storage of acoustical signals by transforming of scanning values into spectral coefficients

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
SE458532B (en)*1987-03-251989-04-10Sandvik Ab TOOLS WITH HEAVY METAL TIP DETERMINED TO ROTABLE IN A CARAVAN

Patent Citations (55)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US3989897A (en)1974-10-251976-11-02Carver R WMethod and apparatus for reducing noise content in audio signals
US4216354A (en)1977-12-231980-08-05International Business Machines CorporationProcess for compressing data relative to voice signals and device applying said process
US4349698A (en)1979-06-191982-09-14Victor Company Of Japan, LimitedAudio signal translation with no delay elements
US4356349A (en)1980-03-121982-10-26Trod Nossel Recording Studios, Inc.Acoustic image enhancing method and apparatus
US4516258A (en)1982-06-301985-05-07At&T Bell LaboratoriesBit allocation generator for adaptive transform coder
US4535472A (en)1982-11-051985-08-13At&T Bell LaboratoriesAdaptive bit allocator
US4945567A (en)1984-03-061990-07-31Nec CorporationMethod and apparatus for speech-band signal coding
US4949383A (en)1984-08-241990-08-14British Telecommunications Public Limited CompanyFrequency domain speech coding
US4914701A (en)1984-12-201990-04-03Gte Laboratories IncorporatedMethod and apparatus for encoding speech
EP0193143A2 (en)1985-02-271986-09-03TELEFUNKEN Fernseh und Rundfunk GmbHAudio signal transmission method
US4646061A (en)1985-03-131987-02-24Racal Data Communications Inc.Data communication with modified Huffman coding
US4941152A (en)1985-09-031990-07-10International Business Machines Corp.Signal coding process and system for implementing said process
US4790016A (en)1985-11-141988-12-06Gte Laboratories IncorporatedAdaptive method and apparatus for coding speech
JP2792853B2 (en)1986-06-271998-09-03Thomson Consumer Electronics Sales GmbHAudio signal transmission method and apparatus
JPS637023A (en)1986-06-271988-01-12Thomson Consumer Electronics Sales GmbHMethod of audio signal transmission
US5924060A (en)1986-08-291999-07-13Brandenburg; Karl HeinzDigital coding process for transmission or storage of acoustical signals by transforming of scanning values into spectral coefficients
JP2796673B2 (en)1986-08-291998-09-10Karl-Heinz BrandenburgDigital coding method
US4860313A (en)1986-09-211989-08-22Eci Telecom Ltd.Adaptive differential pulse code modulation (ADPCM) systems
US4912763A (en)1986-10-301990-03-27International Business Machines CorporationProcess for multirate encoding signals and device for implementing said process
US4972484A (en)1986-11-211990-11-20Bayerische Rundfunkwerbung GmbhMethod of transmitting or storing masked sub-band coded audio signals
US4803727A (en)1986-11-241989-02-07British Telecommunications Public Limited CompanyTransmission system
US4821260A (en)1986-12-171989-04-11Deutsche Thomson-Brandt GmbhTransmission system
US4860360A (en)1987-04-061989-08-22Gte Laboratories IncorporatedMethod of evaluating speech
US4896362A (en)1987-04-271990-01-23U.S. Philips CorporationSystem for subband coding of a digital audio signal
US4881267A (en)1987-05-141989-11-14Nec CorporationEncoder of a multi-pulse type capable of optimizing the number of excitation pulses and quantization level
US4953214A (en)1987-07-211990-08-28Matsushita Electric Industrial Co., Ltd.Signal encoding and decoding method and device
JPS6450695A (en)1987-08-211989-02-27Tamura Electric Works LtdTelephone exchange
US5014318A (en)1988-02-251991-05-07Fraunhofer Gesellschaft Zur Forderung Der Angewandten Forschung E. V.Apparatus for checking audio signal processing systems
EP0376553A2 (en)1988-12-301990-07-04AT&T Corp.Perceptual coding of audio signals
US5535300A (en)1988-12-301996-07-09At&T Corp.Perceptual coding of audio signals using entropy coding and/or multiple power spectra
US5341457A (en)1988-12-301994-08-23At&T Bell LaboratoriesPerceptual coding of audio signals
US5297236A (en)1989-01-271994-03-22Dolby Laboratories Licensing CorporationLow computational-complexity digital filter bank for encoder, decoder, and encoder/decoder
US5357594A (en)1989-01-271994-10-18Dolby Laboratories Licensing CorporationEncoding and decoding using specially designed pairs of analysis and synthesis windows
US5230038A (en)1989-01-271993-07-20Fielder Louis DLow bit rate transform coder, decoder, and encoder/decoder for high-quality audio
US5109417A (en)1989-01-271992-04-28Dolby Laboratories Licensing CorporationLow bit rate transform coder, decoder, and encoder/decoder for high-quality audio
US5752225A (en)1989-01-271998-05-12Dolby Laboratories Licensing CorporationMethod and apparatus for split-band encoding and split-band decoding of audio information using adaptive bit allocation to adjacent subbands
US5479562A (en)1989-01-271995-12-26Dolby Laboratories Licensing CorporationMethod and apparatus for encoding and decoding audio information
US5151941A (en)1989-09-301992-09-29Sony CorporationDigital signal encoding apparatus
US5185800A (en)1989-10-131993-02-09Centre National D'etudes Des TelecommunicationsBit allocation device for transformed digital audio broadcasting signals with adaptive quantization based on psychoauditive criterion
US5040217A (en)1989-10-181991-08-13At&T Bell LaboratoriesPerceptual coding of audio signals
USRE36714E (en)1989-10-182000-05-23Lucent Technologies Inc.Perceptual coding of audio signals
US5079547A (en)1990-02-281992-01-07Victor Company Of Japan, Ltd.Method of orthogonal transform coding/decoding
EP0446037B1 (en)1990-03-091997-10-08AT&T Corp.Hybrid perceptual audio coding
US5394473A (en)1990-04-121995-02-28Dolby Laboratories Licensing CorporationAdaptive-block-length, adaptive-transform, and adaptive-window transform coder, decoder, and encoder/decoder for high-quality audio
US5235671A (en)1990-10-151993-08-10Gte Laboratories IncorporatedDynamic bit allocation subband excited transform coding method and apparatus
US5400433A (en)1991-01-081995-03-21Dolby Laboratories Licensing CorporationDecoder for variable-number of channel presentation of multidimensional sound fields
US5274740A (en)1991-01-081993-12-28Dolby Laboratories Licensing CorporationDecoder for variable number of channel presentation of multidimensional sound fields
US5583962A (en)1991-01-081996-12-10Dolby Laboratories Licensing CorporationEncoder/decoder for multidimensional sound fields
US5633981A (en)1991-01-081997-05-27Dolby Laboratories Licensing CorporationMethod and apparatus for adjusting dynamic range and gain in an encoder/decoder for multidimensional sound fields
US5218435A (en)1991-02-201993-06-08Massachusetts Institute Of TechnologyDigital advanced television systems
US5285498A (en)1992-03-021994-02-08At&T Bell LaboratoriesMethod and apparatus for coding audio signals based on perceptual model
US5627938A (en)1992-03-021997-05-06Lucent Technologies Inc.Rate loop processor for perceptual encoder/decoder
US5592584A (en)1992-03-021997-01-07Lucent Technologies Inc.Method and apparatus for two-component signal compression
US5227788A (en)1992-03-021993-07-13At&T Bell LaboratoriesMethod and apparatus for two-component signal compression
EP0559383A1 (en)1992-03-021993-09-08AT&T Corp.A method and apparatus for coding audio signals based on perceptual model

Non-Patent Citations (68)

* Cited by examiner, † Cited by third party
Title
"FX/FORTRAN Programmer's Handbook", Alliant Computer Systems Corp., Jul. 1988.
"Aspec: Adaptive Spectral Entropy Coding of High Quality Music Signals", AES 90th Convention, 1991.
H. Fletcher, "Auditory Patterns", Reviews of Modern Physics, vol. 12, pp. 47-65, 1940.
AT&T Bell Laboratories et al., "ASPEC," ISO-MPEG Audio Coding Submission, submitted Oct. 18, 1989, amended Dec. 11, 1989, revised Jun. 18, 1990.
B. Scharf, Foundations of Modern Auditory Theory, edited by Jerry V. Tobias, Chapter 5, Academic Press, N.Y., N.Y., pp. 157-202, 1970.
Brandenburg et al., "A Digital Signal Processor for Real Time Adaptive Transform Coding of Audio Signals Up To 20 kHz Bandwidth," IEEE Int'l Conf on Circuits and Computers, 1982, pp. 474-477.
Brandenburg, K., "A Contribution on the Methods of and Evaluation of the Quality of High-quality Music Coding," The Department of Engineering at the University of Erlangen-Nuremberg, Doctor of Engineering Thesis, pp. 1-199, Erlangen University Library, Erlangen, Germany, Jan., 1989.
Brandenburg, K., "High quality sound coding at 2.5 bit/sample," 84th Convention, AES (Audio Engineering Society) Mar. 1-4, 1988, Paris, Preprint 2582 (D-2), 8 pages, Audio Engineering Society, New York, 1988.
Brandenburg, K., et al., "Low Bit Rate Coding of High-Quality Digital Audio: Algorithms and Evaluation of Quality," AES Conference, May 14-17, 1989, Toronto, Canada, pp. 1-25, Audio Engineering Society, New York.
Brandenburg, K., et al., "OCF: Coding High Quality Audio with Data Rates of 64kbit/sec," AES Convention, Nov. 3-6, 1988, Los Angeles, California, 12 pages, Audio Engineering Society, New York.
Brandenburg, K., et al., "Transmission of High Quality Audio Signals with Bit Rates in the Range of 64-144 KBIT/SEC," ITG-Conference Proceedings, Information Technology Society, pp. 217-222, VDE-Verlag GmbH, Berlin, Nov., 1988.
Brandenburg, K.-H., Langenbucher, G.C., Shram, H., Seitzer, D.; A Digital Signal Processor For Real Time Adaptive Transform Coding Of Audio Signals Up To 20 kHz Bandwidth, IEEE Int'l Conf. On Circuits and Computers, 1982, pp. 474-477.
Crochiere, R.E., Tribolet, J.M.; Frequency Domain Techniques For Speech Coding, J. Acoust. Soc. Of Amer., Dec. 1979, pp. 1642-1646.
Crochiere, R.E.; Sub-Band Coding, The Bell System Technical Journal, vol. 60, No. 7, Sep. 1981, pp. 1633-1653.
D.E. Knuth, et al., "The Art of Computer Programming", 2nd Ed., vol. 2, Reading, Mass, pp. 274-275, 1981.
Dolby's Preliminary Invalidity Contentions, Dolby Laboratories Inc. and Dolby Laboratories Licensing Corporation v. Lucent Technologies Inc. and Lucent Technologies Guardian I LLC, United States District Court, Northern District of California, San Jose Division, Apr. 3, 2003.
Dolby's Supplemental Invalidity Contentions, Dolby Laboratories Inc. and Dolby Laboratories Licensing Corporation vs. Lucent Technologies Inc. and Lucent Technologies Guardian I LLC, United States District Court, Northern District of California, San Jose Division, Jan. 7, 2004.
E. F. Schroeder, et al., "MSC: Stereo Audio Coding With CD-Quality and 256 kBIT/SEC", IEEE Transactions on Consumer Electronics, vol., CE-33, No. 4, pp. 512-519, Nov. 1987.
E. Tan, et al., "Digital Audio Tape for Data Storage", IEEE Spectrum, pp. 34-38, Oct. 1989.
E. Zwicker, et al., Absolute and Masked Thresholds of Continuous Sounds, pp. 65-81 of "The Ear As A Communication Receiver" (Original German edition "Das Ohr als Nachrichtenempfänger", Second Rev. ed. 1967).
Ernst Eberlein, et al., Psychoacoustically Based Measuring Device For Optimizing Data Reduction Methods, Tonmeistertagung '88, pp. 552-564, Nov. 19, 1988.
Flanagan, J.L., Schroeder, M.R., Atal, B.S., Crochiere, R.E., Jayant, N.S., Tribolet. J.M.; Speech Coding, IEEE Transactions on Communications, vol. COM-27, No. 4, Apr. 1979, pp. 710-736.
G. Theile, et al., "Low Bit-Rate Coding of High-Quality Audio Signals An Introduction to the MASCAM System" AES 7th International Conf., EBU Review—Technical, No. 230, pp. 158-209, Aug., 1988.
G. Theile, et al., "Low Bit-Rate Coding of High-Quality Audio Signals" AES 82nd Convention, pp. 1-31, Mar., 1987.
Grauel, C.; Sub-Band Coding With Adaptive Bit Allocation, Signal Processing 2, North-Holland Publishing Co., 1980, pp. 23-30.
H.G. Musmann, "The ISO Audio Coding Standard", Globecom '90, vol. 1(3), Dec. 1990, N.Y., pp. 511-517.
Heron, C.D., Crochiere, R.E., Cox, R.V.; A 32-Band Sub-band/Transform Coder Incorporating Vector Quantization For Dynamic Bit Allocation, Proceedings IEEE ICASSP, 1983, pp. 1276-1279.
J. Princen et al., "Subband Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation", IEEE ICASSP, pp. 2161-2164, 1987.
J. Princen, et al., "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation", IEEE ICASSP, vol. AASP-34, No. 5, pp. 1153-1161, 1986.
J.D. Johnston, "Estimation of Perceptual Entropy Using Noise Masking Criteria", IEEE ICASSP, pp. 2524-2527, 1989.
J.D. Johnston, "Perceptual Transform Coding of Wideband Stereo Signals", IEEE ICASSP, pp. 1993-1996, 1989.
J.D. Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria", IEEE Journal On Selected Areas In Communications, vol. 6, No. 2, pp. 314-323, Feb. 1988.
Jayant, N.S., Noll, P.; Digital Coding of Waveforms, Prentice-Hall, 1984, pp. 56-58.
Jetzt, "Critical Distance Measurements on Rooms From the Sound Energy Spectrum Response", Journal of the Acoustical Society of America, vol. 65, pp. 1204-1211, 1979.
Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria," IEEE J. Selected Areas in Comm., vol. 6, No. 2, Feb. 1988, pp. 314-323.
K. Brandenburg, "OCF-A New Coding Algorithm For High Quality Sound Signals", IEEE ICASSP, pp. 141-144, 1987.
K. Brandenburg, "Aspec Coding", AES 10th International Conf., pp. 81-90, Sep. 1991.
K. Brandenburg, "Evaluation Of Quality For Audio Encoding At Low Bit Rates", AES 82nd Convention, pp. 1-11, Mar., 1987.
K. Brandenburg, "Second Generation Perceptual Audio Coding: The Hybrid Coder", AES 89th Convention, 1990.
Krahé, "Neues Quellencodierungsverfahren für qualitativ hochwertige, digital Audiosignale," University Duisburg, Nov. 1985.
Krahe, D.; New Source Coding Method For High Quality Digital Audio Signals, University Duisburg Nov., 1985 (Original and English Translation).
Krasner, M.A. Digital Encoding of Speech and Audio Signals Based on the Perceptual Requirements of the Auditory System, Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, Massachusetts Institute of Technology, May 4, 1979.
Krasner, M.A., Digital Encoding of Speech and Audio Signals Based on the Perceptual Requirements of the Auditory System, MIT Lincoln Laboratory, Technical Report 535, Jun. 18, 1979.
M.R. Schroeder, et al., "Optimizing Digital Speech Coders by Exploiting Masking Properties of the Human Ear", Journal of Acoustical Society of America, vol. 66 (6), pp. 1647-1652, Dec. 1979.
N.S. Jayant, et al., "Digital Coding of Waveforms-Principles and Applications to Speech and Video", Chapter 12, Transform Coding, 1987.
Press et al., "Numerical Recipes," Cambridge University Press, 1986, pp. 77-92, 240-247 and 595.
Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T.; Numerical Recipes, Cambridge University Press, 1986, pp. 77-92, 240-247 and 595.
R. G. van der Waal, et al., "Subband Coding of Stereophonic Digital Audio Signals", IEEE, pp. 3601-3604, Jul., 1991.
R. P. Hellman, "Asymmetry of Masking Between Noise and Tone", Perception and Psychophysics II, pp. 241-246, 1972.
R.N.J. Veldhuis, et al., "Subband Coding of Digital Audio Signals", Philips Journal of Research, vol. 44, Nos. 2, 3, pp. 329-343, Jul. 1989.
Ramstad, T.A., Sub-band Coder With A Simple Adaptive Bit-Allocation Algorithm: A Possible Candidate for Digital Mobile Telephony? Proceedings IEEE ICASSP, 1982, vol. 1, pp. 203-207.
Seitzer, et al., "Digital Coding of High Quality Audio," Proceedings Advanced Computer Technology, Reliable Systems and Applications, 5th Annual European Computer Conference, Bologna, May 13-16, 1991, pp. 148-154, IEEE Computer Society Press, Los Alamitos, California, 1991.
Terhardt, E., Stoll, G., Seewann, M., Algorithm For Extraction Of Pitch And Pitch Salience From Complex Tonal Signals, J. Acoust. Soc. Am., 71(3), Mar. 1982, pp. 679-688.
Tribolet et al., "Frequency Domain Coding of Speech," IEEE Trans. on Acoust., Speech and Sig. Proc., Oct. 1979, pp. 512-530.
Tribolet, J.M., Crochiere, R. E., Frequency Domain Coding of Speech, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-27, No. 5, Oct. 1979, pp. 512-530.
Zelinski, R., Noll, P.; Adaptive Transform Coding of Speech Signals, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, Aug. 1977, pp. 299-309.
Zwicker, "Psychoakustik," 1982, pp. 31-53.*
Zwicker, E., Terhardt, E.; Facts And Models In Hearing, Springer-Verlag 1974, pp. 251-257.

Cited By (101)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US9849593B2 (en) | 2002-07-25 | 2017-12-26 | Intouch Technologies, Inc. | Medical tele-robotic system with a master remote station with an arbitrator
US10315312B2 (en) | 2002-07-25 | 2019-06-11 | Intouch Technologies, Inc. | Medical tele-robotic system with a master remote station with an arbitrator
US20040158456A1 (en)* | 2003-01-23 | 2004-08-12 | Vinod Prakash | System, method, and apparatus for fast quantization in perceptual audio coders
US7650277B2 (en)* | 2003-01-23 | 2010-01-19 | Ittiam Systems (P) Ltd. | System, method, and apparatus for fast quantization in perceptual audio coders
US7725323B2 (en)* | 2003-09-15 | 2010-05-25 | Stmicroelectronics Asia Pacific Pte. Ltd. | Device and process for encoding audio data
US20050144017A1 (en)* | 2003-09-15 | 2005-06-30 | Stmicroelectronics Asia Pacific Pte Ltd | Device and process for encoding audio data
US7813836B2 (en) | 2003-12-09 | 2010-10-12 | Intouch Technologies, Inc. | Protocol for a remotely controlled videoconferencing robot
US9956690B2 (en) | 2003-12-09 | 2018-05-01 | Intouch Technologies, Inc. | Protocol for a remotely controlled videoconferencing robot
US10882190B2 (en) | 2003-12-09 | 2021-01-05 | Teladoc Health, Inc. | Protocol for a remotely controlled videoconferencing robot
US9375843B2 (en) | 2003-12-09 | 2016-06-28 | Intouch Technologies, Inc. | Protocol for a remotely controlled videoconferencing robot
US20050125098A1 (en)* | 2003-12-09 | 2005-06-09 | Yulun Wang | Protocol for a remotely controlled videoconferencing robot
US9766624B2 (en) | 2004-07-13 | 2017-09-19 | Intouch Technologies, Inc. | Mobile robot with a head-based movement mapping scheme
US10241507B2 (en) | 2004-07-13 | 2019-03-26 | Intouch Technologies, Inc. | Mobile robot with a head-based movement mapping scheme
US8983174B2 (en) | 2004-07-13 | 2015-03-17 | Intouch Technologies, Inc. | Mobile robot with a head-based movement mapping scheme
US10259119B2 (en) | 2005-09-30 | 2019-04-16 | Intouch Technologies, Inc. | Multi-camera mobile teleconferencing platform
US9198728B2 (en) | 2005-09-30 | 2015-12-01 | Intouch Technologies, Inc. | Multi-camera mobile teleconferencing platform
US9185487B2 (en) | 2006-01-30 | 2015-11-10 | Audience, Inc. | System and method for providing noise suppression utilizing null processing noise subtraction
US7769492B2 (en) | 2006-02-22 | 2010-08-03 | Intouch Technologies, Inc. | Graphical interface for a remote presence system
US20070198130A1 (en)* | 2006-02-22 | 2007-08-23 | Yulun Wang | Graphical interface for a remote presence system
US8849679B2 (en) | 2006-06-15 | 2014-09-30 | Intouch Technologies, Inc. | Remote controlled robot system that provides medical images
US10682763B2 (en) | 2007-05-09 | 2020-06-16 | Intouch Technologies, Inc. | Robot system that operates through a network firewall
US9160783B2 (en) | 2007-05-09 | 2015-10-13 | Intouch Technologies, Inc. | Robot system that operates through a network firewall
US20090006081A1 (en)* | 2007-06-27 | 2009-01-01 | Samsung Electronics Co., Ltd. | Method, medium and apparatus for encoding and/or decoding signal
US11787060B2 (en) | 2008-03-20 | 2023-10-17 | Teladoc Health, Inc. | Remote presence system mounted to operating room hardware
US10875182B2 (en) | 2008-03-20 | 2020-12-29 | Teladoc Health, Inc. | Remote presence system mounted to operating room hardware
US11472021B2 (en) | 2008-04-14 | 2022-10-18 | Teladoc Health, Inc. | Robotic based health care system
US10471588B2 (en) | 2008-04-14 | 2019-11-12 | Intouch Technologies, Inc. | Robotic based health care system
US9616576B2 (en) | 2008-04-17 | 2017-04-11 | Intouch Technologies, Inc. | Mobile tele-presence system with a microphone system
US10493631B2 (en) | 2008-07-10 | 2019-12-03 | Intouch Technologies, Inc. | Docking system for a tele-presence robot
US9193065B2 (en) | 2008-07-10 | 2015-11-24 | Intouch Technologies, Inc. | Docking system for a tele-presence robot
US10878960B2 (en) | 2008-07-11 | 2020-12-29 | Teladoc Health, Inc. | Tele-presence robot system with multi-cast features
US9842192B2 (en) | 2008-07-11 | 2017-12-12 | Intouch Technologies, Inc. | Tele-presence robot system with multi-cast features
US9429934B2 (en) | 2008-09-18 | 2016-08-30 | Intouch Technologies, Inc. | Mobile videoconferencing robot system with network adaptive driving
US8996165B2 (en) | 2008-10-21 | 2015-03-31 | Intouch Technologies, Inc. | Telepresence robot with a camera boom
US10875183B2 (en) | 2008-11-25 | 2020-12-29 | Teladoc Health, Inc. | Server connectivity control for tele-presence robot
US9381654B2 (en) | 2008-11-25 | 2016-07-05 | Intouch Technologies, Inc. | Server connectivity control for tele-presence robot
US9138891B2 (en) | 2008-11-25 | 2015-09-22 | Intouch Technologies, Inc. | Server connectivity control for tele-presence robot
US10059000B2 (en) | 2008-11-25 | 2018-08-28 | Intouch Technologies, Inc. | Server connectivity control for a tele-presence robot
US12138808B2 (en) | 2008-11-25 | 2024-11-12 | Teladoc Health, Inc. | Server connectivity control for tele-presence robots
US8849680B2 (en) | 2009-01-29 | 2014-09-30 | Intouch Technologies, Inc. | Documentation through a remote presence robot
US8897920B2 (en) | 2009-04-17 | 2014-11-25 | Intouch Technologies, Inc. | Tele-presence robot system with software modularity, projector and laser pointer
US10969766B2 (en) | 2009-04-17 | 2021-04-06 | Teladoc Health, Inc. | Tele-presence robot system with software modularity, projector and laser pointer
US10404939B2 (en) | 2009-08-26 | 2019-09-03 | Intouch Technologies, Inc. | Portable remote presence robot
US9602765B2 (en) | 2009-08-26 | 2017-03-21 | Intouch Technologies, Inc. | Portable remote presence robot
US10911715B2 (en) | 2009-08-26 | 2021-02-02 | Teladoc Health, Inc. | Portable remote presence robot
US11399153B2 (en) | 2009-08-26 | 2022-07-26 | Teladoc Health, Inc. | Portable telepresence apparatus
US9838784B2 (en) | 2009-12-02 | 2017-12-05 | Knowles Electronics, Llc | Directional audio capture
US11154981B2 (en) | 2010-02-04 | 2021-10-26 | Teladoc Health, Inc. | Robot user interface for telepresence robot system
US10887545B2 (en) | 2010-03-04 | 2021-01-05 | Teladoc Health, Inc. | Remote presence system including a cart that supports a robot face and an overhead camera
US9089972B2 (en) | 2010-03-04 | 2015-07-28 | Intouch Technologies, Inc. | Remote presence system including a cart that supports a robot face and an overhead camera
US11798683B2 (en) | 2010-03-04 | 2023-10-24 | Teladoc Health, Inc. | Remote presence system including a cart that supports a robot face and an overhead camera
US9699554B1 (en) | 2010-04-21 | 2017-07-04 | Knowles Electronics, Llc | Adaptive signal equalization
US9558755B1 (en)* | 2010-05-20 | 2017-01-31 | Knowles Electronics, Llc | Noise suppression assisted automatic speech recognition
US11389962B2 (en) | 2010-05-24 | 2022-07-19 | Teladoc Health, Inc. | Telepresence robot system that can be accessed by a cellular phone
US10343283B2 (en) | 2010-05-24 | 2019-07-09 | Intouch Technologies, Inc. | Telepresence robot system that can be accessed by a cellular phone
US10808882B2 (en) | 2010-05-26 | 2020-10-20 | Intouch Technologies, Inc. | Tele-robotic system with a robot face placed on a chair
US9264664B2 (en) | 2010-12-03 | 2016-02-16 | Intouch Technologies, Inc. | Systems and methods for dynamic bandwidth allocation
US10218748B2 (en) | 2010-12-03 | 2019-02-26 | Intouch Technologies, Inc. | Systems and methods for dynamic bandwidth allocation
US12093036B2 (en) | 2011-01-21 | 2024-09-17 | Teladoc Health, Inc. | Telerobotic system with a dual application screen presentation
US9785149B2 (en) | 2011-01-28 | 2017-10-10 | Intouch Technologies, Inc. | Time-dependent navigation of telepresence robots
US9469030B2 (en) | 2011-01-28 | 2016-10-18 | Intouch Technologies | Interfacing with a mobile telepresence robot
US9323250B2 (en) | 2011-01-28 | 2016-04-26 | Intouch Technologies, Inc. | Time-dependent navigation of telepresence robots
US11289192B2 (en) | 2011-01-28 | 2022-03-29 | Intouch Technologies, Inc. | Interfacing with a mobile telepresence robot
US10399223B2 (en) | 2011-01-28 | 2019-09-03 | Intouch Technologies, Inc. | Interfacing with a mobile telepresence robot
US10591921B2 (en) | 2011-01-28 | 2020-03-17 | Intouch Technologies, Inc. | Time-dependent navigation of telepresence robots
US11468983B2 (en) | 2011-01-28 | 2022-10-11 | Teladoc Health, Inc. | Time-dependent navigation of telepresence robots
US8965579B2 (en) | 2011-01-28 | 2015-02-24 | Intouch Technologies | Interfacing with a mobile telepresence robot
US12224059B2 (en) | 2011-02-16 | 2025-02-11 | Teladoc Health, Inc. | Systems and methods for network-based counseling
US10769739B2 (en) | 2011-04-25 | 2020-09-08 | Intouch Technologies, Inc. | Systems and methods for management of information among medical providers and facilities
US9974612B2 (en) | 2011-05-19 | 2018-05-22 | Intouch Technologies, Inc. | Enhanced diagnostics for a telepresence robot
US8836751B2 (en) | 2011-11-08 | 2014-09-16 | Intouch Technologies, Inc. | Tele-presence system with a user interface that displays different communication links
US9715337B2 (en) | 2011-11-08 | 2017-07-25 | Intouch Technologies, Inc. | Tele-presence system with a user interface that displays different communication links
US10331323B2 (en) | 2011-11-08 | 2019-06-25 | Intouch Technologies, Inc. | Tele-presence system with a user interface that displays different communication links
US8902278B2 (en) | 2012-04-11 | 2014-12-02 | Intouch Technologies, Inc. | Systems and methods for visualizing and managing telepresence devices in healthcare networks
US10762170B2 (en) | 2012-04-11 | 2020-09-01 | Intouch Technologies, Inc. | Systems and methods for visualizing patient and telepresence device statistics in a healthcare network
US9251313B2 (en) | 2012-04-11 | 2016-02-02 | Intouch Technologies, Inc. | Systems and methods for visualizing and managing telepresence devices in healthcare networks
US11205510B2 (en) | 2012-04-11 | 2021-12-21 | Teladoc Health, Inc. | Systems and methods for visualizing and managing telepresence devices in healthcare networks
US11515049B2 (en) | 2012-05-22 | 2022-11-29 | Teladoc Health, Inc. | Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US10780582B2 (en) | 2012-05-22 | 2020-09-22 | Intouch Technologies, Inc. | Social behavior rules for a medical telepresence robot
US10061896B2 (en) | 2012-05-22 | 2018-08-28 | Intouch Technologies, Inc. | Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US9361021B2 (en) | 2012-05-22 | 2016-06-07 | Irobot Corporation | Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US10603792B2 (en) | 2012-05-22 | 2020-03-31 | Intouch Technologies, Inc. | Clinical workflows utilizing autonomous and semiautonomous telemedicine devices
US10658083B2 (en) | 2012-05-22 | 2020-05-19 | Intouch Technologies, Inc. | Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US9776327B2 (en) | 2012-05-22 | 2017-10-03 | Intouch Technologies, Inc. | Social behavior rules for a medical telepresence robot
US11628571B2 (en) | 2012-05-22 | 2023-04-18 | Teladoc Health, Inc. | Social behavior rules for a medical telepresence robot
US10328576B2 (en) | 2012-05-22 | 2019-06-25 | Intouch Technologies, Inc. | Social behavior rules for a medical telepresence robot
US9174342B2 (en) | 2012-05-22 | 2015-11-03 | Intouch Technologies, Inc. | Social behavior rules for a medical telepresence robot
US11453126B2 (en) | 2012-05-22 | 2022-09-27 | Teladoc Health, Inc. | Clinical workflows utilizing autonomous and semi-autonomous telemedicine devices
US10892052B2 (en) | 2012-05-22 | 2021-01-12 | Intouch Technologies, Inc. | Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US9640194B1 (en) | 2012-10-04 | 2017-05-02 | Knowles Electronics, Llc | Noise suppression for speech processing based on machine-learning mask estimation
US9098611B2 (en) | 2012-11-26 | 2015-08-04 | Intouch Technologies, Inc. | Enhanced video interaction for a user interface of a telepresence network
US11910128B2 (en) | 2012-11-26 | 2024-02-20 | Teladoc Health, Inc. | Enhanced video interaction for a user interface of a telepresence network
US10924708B2 (en) | 2012-11-26 | 2021-02-16 | Teladoc Health, Inc. | Enhanced video interaction for a user interface of a telepresence network
US10334205B2 (en) | 2012-11-26 | 2019-06-25 | Intouch Technologies, Inc. | Enhanced video interaction for a user interface of a telepresence network
US9799330B2 (en) | 2014-08-28 | 2017-10-24 | Knowles Electronics, Llc | Multi-sourced noise suppression
US9978388B2 (en) | 2014-09-12 | 2018-05-22 | Knowles Electronics, Llc | Systems and methods for restoration of speech components
US9668048B2 (en) | 2015-01-30 | 2017-05-30 | Knowles Electronics, Llc | Contextual switching of microphones
US11862302B2 (en) | 2017-04-24 | 2024-01-02 | Teladoc Health, Inc. | Automated transcription and documentation of tele-health encounters
US11742094B2 (en) | 2017-07-25 | 2023-08-29 | Teladoc Health, Inc. | Modular telehealth cart with thermal imaging and touch screen user interface
US11636944B2 (en) | 2017-08-25 | 2023-04-25 | Teladoc Health, Inc. | Connectivity infrastructure for a telehealth platform
US11389064B2 (en) | 2018-04-27 | 2022-07-19 | Teladoc Health, Inc. | Telehealth cart that supports a removable tablet with seamless audio/video switching

Also Published As

Publication number | Publication date
CA2090160A1 (en) | 1993-09-03
JP3263168B2 (en) | 2002-03-04
EP0559348A3 (en) | 1993-11-03
KR970007663B1 (en) | 1997-05-15
JPH0651795A (en) | 1994-02-25
KR930020412A (en) | 1993-10-19
EP0559348A2 (en) | 1993-09-08
US5627938A (en) | 1997-05-06
CA2090160C (en) | 1998-10-06

Similar Documents

Publication | Publication Date | Title
USRE39080E1 (en) | Rate loop processor for perceptual encoder/decoder
EP0559383B1 (en) | Method for coding mode selection for stereophonic audio signals utilizing perceptual models
EP0564089B1 (en) | A method and apparatus for the perceptual coding of audio signals
EP0709004B1 (en) | Hybrid adaptive allocation for audio encoder and decoder
US5488665A (en) | Multi-channel perceptual audio compression system with encoding mode switching among matrixed channels
US5301255A (en) | Audio signal subband encoder
KR100209870B1 (en) | Perceptual coding of audio signals
CA2197128C (en) | Enhanced joint stereo coding method using temporal envelope shaping
USRE42949E1 (en) | Stereophonic audio signal decompression switching to monaural audio signal
EP0799531B1 (en) | Method and apparatus for applying waveform prediction to subbands of a perceptual coding system
US5581654A (en) | Method and apparatus for information encoding and decoding
KR100556505B1 (en) | Reproducing and recording apparatus, decoding apparatus, recording apparatus, reproducing and recording method, decoding method and recording method
EP0775389B1 (en) | Encoding system and encoding method for encoding a digital signal having at least a first and a second digital signal component
US5758316A (en) | Methods and apparatus for information encoding and decoding based upon tonal components of plural channels
EP0717518A2 (en) | High efficiency audio encoding method and apparatus
USRE40280E1 (en) | Rate loop processor for perceptual encoder/decoder
EP1046239B1 (en) | Method and apparatus for phase estimation in a transform coder for high quality audio
JPH09102742A (en) | Encoding method and device, decoding method and device and recording medium
Noll et al. | ISO/MPEG audio coding
JP3513879B2 (en) | Information encoding method and information decoding method

Legal Events

Date | Code | Title | Description
AS | Assignment

Owner name:JPMORGAN CHASE BANK, AS COLLATERAL AGENT, TEXAS

Free format text:SECURITY AGREEMENT;ASSIGNOR:LUCENT TECHNOLOGIES INC.;REEL/FRAME:014416/0873

Effective date:20030528

FEPP | Fee payment procedure

Free format text:PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS | Assignment

Owner name:LUCENT TECHNOLOGIES INC., NEW JERSEY

Free format text:TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:JPMORGAN CHASE BANK, N.A. (FORMERLY KNOWN AS THE CHASE MANHATTAN BANK), AS ADMINISTRATIVE AGENT;REEL/FRAME:018597/0051

Effective date:20061130

AS | Assignment

Owner name:CREDIT SUISSE AG, NEW YORK

Free format text:SECURITY INTEREST;ASSIGNOR:ALCATEL-LUCENT USA INC.;REEL/FRAME:030510/0627

Effective date:20130130

AS | Assignment

Owner name:ALCATEL-LUCENT USA INC., NEW JERSEY

Free format text:RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG;REEL/FRAME:033950/0001

Effective date:20140819

