Background
It is often desirable to transmit low-to-high-speed data signals over voice channels (e.g., telephone channels, radio channels, and television channels), which carry analog audio signals and/or music signals. These data signals may be used to convey, for example, the serial number, name, copyright information, royalty account numbers, and false or true indications of the songs played. These data signals may also be used to identify particular programs and/or program sources. Programs may include television programs, radio programs, laser video discs, tapes, interactive programs and/or games, and so forth; program sources may include program creators, networks, local stations, syndicators, cable companies, etc.; the broadcast of the program may include the delivery of the program over the air, cable, satellite, or confined to the home, video cassette recorder, optical disc drive, computer, etc.
These data signals are called ancillary codes. When ancillary codes are used to identify programs and/or program sources, they are detected by program monitoring systems to confirm the broadcast of selected programs, by audience statistics systems to account for viewers' viewing interests, and by similar systems.
In a program monitoring system responsive to ancillary codes in programs, the ancillary codes inserted in the program signals are in the form of identification codes identifying the respective broadcast programs. Thus, while monitoring the broadcast of the program, the program monitoring system detects the identification code to confirm that the encoded program is broadcast. The program monitoring system also typically determines the geographic area in which the programs are broadcast, the number of times the programs are broadcast, and the stations, cables, and channels on which the programs are broadcast.
In audience statistics systems that employ ancillary codes, the ancillary codes are typically added to the channels to which a receiver can be tuned. When an ancillary code is present in the output signal of the receiver, the channel to which the receiver is tuned and the program identification code can be identified. Unique ancillary codes may likewise be added by the program source to some or all of the programs broadcast to the households.
When adding an ancillary code to a program signal, it must be done in such a way that the ancillary code is not perceptible to the viewer of the program. Various techniques are currently available to achieve this imperceptibility.
One common technique for adding data to the voice channel involves transmitting the data in an under-utilized portion of the spectrum below and/or above the available voice band of the telephone line so that the data is not perceptible to a listener. The data is whitened by spread spectrum, and low interference is maintained.
An example of a technique for placing information in the low-frequency region of the voice band is shown in U.S. patent 4,425,661 to Moses et al. Another technique, described in U.S. patent 4,672,605 to Hustig et al., involves the use of spread spectrum signals with most of their energy in the high audio region and above the voice band. A further technique, described in U.S. patent 4,425,642 to Moses et al., involves spread spectrum processing of data across the entire channel spectrum such that the spectral energy of the data possesses pseudo-random noise characteristics that, when added to the voice channel, cause only an imperceptible increase in white noise.
While the above-described systems generally serve the specific purposes for which they were designed, they suffer from certain drawbacks inherent in the use of spread spectrum processing. Specifically, the use of spread spectrum whitening techniques alone results in very low data throughput rates on the voice channel because of the large spreading gain that must be achieved. In addition, while these techniques make limited use of some of the "masking" characteristics of the audio signal that is transmitted with the data, as will be discussed below, they do not take full advantage of these characteristics, thereby limiting the processing gain that could otherwise be achieved.
Other techniques that enable the transmission of voice and data in one channel include: (1) using a start pulse generated by bringing a sub-band to a zero energy level and then digitizing the sound within a short period that follows as a sequence number, and (2) using a sub-band to carry digital information by forcing the sub-band energy either to zero or to its actual level to generate "mark" and "space" (i.e., "1" and "0"). The main disadvantages of the former technique are poor noise immunity and impracticality in situations where many bytes of data must be stored and processed. The major drawbacks of the latter technique likewise include poor noise immunity and very low data throughput.
Thomas et al, U.S. patent 5,425,100, discloses a multi-stage encoding system that includes a plurality of encoders, each encoder associated with a different stage in a multi-stage propagated signal distribution system. The disclosure of Thomas et al in U.S. patent 5,425,100 is incorporated herein by reference in its entirety.
A commonly used "AMOL" system is taught by Haselwood et al. in U.S. Pat. No. 4,025,851, the disclosure of which is incorporated herein by reference in its entirety. This system adds an ancillary code in the form of a source identification code to selected horizontal lines during the vertical blanking interval of the broadcast television signal. Monitoring devices located in selected areas throughout the United States confirm the programs being broadcast by detecting the source identification codes. Each monitoring device stores the detected source identification codes along with the time of detection and the detected channel for later retrieval.
U.S. patent 5,243,423 to DeJean et al. teaches an audience statistics and program monitoring system in which an auxiliary signal is transmitted in raster image lines of a broadcast television signal. In order to reduce the perceptibility of the auxiliary signal, the image lines conveying the auxiliary signal are varied in a pseudo-random sequence. Alternatively, the auxiliary signal may be modulated at a relatively low modulation level by converting it to a spread spectrum auxiliary signal. The encoded broadcast program is then identified by decoding the ancillary code at a receiver that is in close proximity to the monitored receiver.
The application of digital data compression methods to signals has a large impact on the effectiveness of the above coding methods. For example, some video compression schemes eliminate the vertical blanking period. Therefore, this video signal compression eliminates the ancillary codes injected during the vertical blanking interval. Digitization also removes spread spectrum ancillary codes and other signals that rely on small signal amplitude concealment. In addition, compression algorithms that "clip" high-end frequencies remove ancillary codes transmitted in the high-frequency portion of the video signal band.
Although adding ancillary codes to the normally visible portion of the active video signal can in most cases prevent their removal by compression schemes, and although adding ancillary codes at frequencies where the video signal has low energy density increases the likelihood that the ancillary codes will not be perceived even though they are added to the active video signal, the ancillary codes may still be perceived under certain conditions. For example, if the intensity of the luminance signal modulated onto the picture (i.e., luminance) carrier, or the intensity of the color signal modulated onto the chrominance subcarrier, is less than the intensity of an ancillary code modulated onto a frequency between the picture carrier and the chrominance subcarrier, the ancillary code is not masked by the chrominance subcarrier and the picture carrier of the video signal. The ancillary code may then have sufficient relative amplitude to appear as noise in the picture.
It is known in the art that every audio signal has a masking effect that masks sound distortion occurring together with the signal. Thus, any distortion or noise introduced into the transmission channel will be masked by the audio signal itself, provided it is properly distributed or shaped. This masking may be partial or complete, resulting either in improved quality compared to a system without noise shaping or in a near-perfect signal quality equivalent to that of a signal without noise. In either case, this "masking" is due to the inability of human perception to distinguish two signal components, one belonging to the audio signal and one to the noise, at the same spectral, temporal or spatial location. A significant consequence of this limitation is that the listener's perception of noise can be zero even if the signal-to-noise ratio is at a measurable level. Ideally, the noise level at every point in the audio signal space is exactly at the level of just-noticeable distortion, a limit commonly referred to as the "perceptual entropy envelope".
The main purpose of noise shaping is therefore to shape the noise advantageously in time or frequency, so that as much of the noise as possible is masked by the audio signal itself, thereby minimizing the perceptibility of the distortion. See Nikil Jayant et al., "Signal Compression Based on Models of Human Perception," 81 Proc. IEEE 1385 (1993). Schematic diagrams of time-frequency domain masking are shown in figs. 1a-1c, where a short sinusoidal signal 10 produces a masking threshold 12. See John G. Beerends and Jan A. Stemerdink, "A Perceptual Audio Quality Measure Based on a Psychoacoustic Sound Representation," 40 J. Audio Engineering Soc'y 963, 966 (1992).
"Perceptual coding" techniques employing the above principles are currently used in signal compression and are based on three types of masking: frequency domain, time domain, and noise level. The basic principle of frequency domain masking is that when strong signals are present in an audio band, lower-level signals at nearby frequencies are masked from perception by a listener. Time domain masking is based on the fact that certain types of noise and sound are not perceived immediately before and after large signal transients. Noise masking exploits the fact that a relatively high wideband noise level is not noticeable if it occurs simultaneously with various types of strong signals.
Perceptual coding forms the basis of precision audio sub-band coding (PASC) and of other coding techniques used to compress audio signals for the MiniDisc (MD) and Digital Compact Cassette (DCC) formats. In particular, such compression algorithms take advantage of the fact that certain signals in an audio channel will be masked by other, stronger signals, and remove those masked signals so that the remaining signals can be compressed into a channel of lower bit rate.
Another disadvantage of the prior art for simultaneously transmitting data and audio signals is that if the signals are transmitted over a channel that implements a lossy compression algorithm, such as an MPEG compression algorithm, the data, or at least a portion thereof, is removed because most such compression algorithms divide the sound channel into a plurality of sub-bands and then encode and transmit only the strongest signal within each sub-band. Regardless of which of the foregoing techniques is employed, the data may not always be the strongest signal in the sub-band, so it may not be possible to transmit all portions of the data. In addition, for spread spectrum techniques, even if the data is assumed to be the strongest signal in one or two sub-bands, the information contained in these sub-bands would only contain a small portion of the total information carried by the data, and may not be useful, since the information is spread across the entire signal spectrum.
Therefore, there is a need for a system for simultaneously transmitting ancillary codes and audio signals that takes advantage of perceptual coding techniques and is capable of transmitting ancillary codes over lossy compressed channels.
Detailed description of the preferred embodiments
Fig. 1a-1c are schematic representations of time domain and frequency domain masking of sound distortion, in which a short sinusoidal sound signal 10 has a masking threshold or perceptual entropy envelope 12.
Fig. 2 is a schematic block diagram of an encoder 202 implementing features of the present invention for encoding, using perceptual coding techniques, an ancillary code to be transmitted simultaneously with an audio signal over a sound channel (not shown), such as a television transmission channel. The encoder 202 includes a multi-layer artificial neural network (NN) 204, which monitors the audio signal via an audio input 206a for "opportunities" to insert the ancillary codes at times, frequencies and amplitudes at which they are not perceived by the human ear. In other words, the NN 204 determines the "perceptual entropy envelope" of the sound channel. As described above, the "perceptual entropy envelope" is a three-dimensional (time, frequency, amplitude) map of the optimal masking function of the sound channel. Those skilled in the art will appreciate that a neural network (e.g., NN 204) contains a combination of simple computational units that can be "trained" to perform a particular transformation task between input data and output data. As used herein, the term "neural network" also includes all necessary preprocessing circuitry, such as filters, timing circuits, and the like. The transformation of the neural network is achieved after a long initial training period, in which input data and output data satisfying the transformation task are provided to the NN 204. In this embodiment, the input to the NN 204 comprises an audio signal segment, and the desired output is the auditory noise masking threshold (i.e., the perceptual entropy envelope) produced by that audio signal segment. In this way, the NN 204 is trained to extract perceptually significant features from the audio signal at the audio input 206a, which features are related to the perceptual entropy envelope produced by successive frames of the input audio signal.
The rules by which the NN 204 performs its transformation function are stored in a ROM 205, which in this embodiment is a socketed chip so that it can easily be upgraded later.
The NN204 controls a clock control circuit 208a, a level control circuit 208b, and a burst timing circuit 208c for purposes that will be described in more detail later. As will be described in detail later, the ancillary codes will be encoded under the control of the NN as one or more whitened direct sequence spread spectrum signals and/or narrowband FSK ancillary codes for combination with the audio signal at a time, frequency and amplitude such that the ancillary codes are masked by the audio signal.
The digital ancillary code, which contains the serial number and other identification numbers, is generated by the control computer 210 and input to the encoder 202, preferably through an RS-232-C interface 212, although it should be understood that any number of different types of interfaces may be used. For example, the ancillary codes generated by the control computer 210 may be numbers identifying a program, a source of the program, a radio or television broadcast network, or a local radio or television station, or numbers encoded on a compact disc to identify a particular program, actor, or song. The ancillary codes output from the control computer 210 are input to a preprocessing circuit 213, which includes a block encoder 214 and a bit interleaving circuit 216. The block encoder 214 encodes the ancillary codes to enable error detection and correction when they are received at the decoder (fig. 3). The bit interleaving circuit 216 enables the encoded codes to withstand bursts of errors in the transmission path. Exemplary apparatus and methods for performing such block coding and bit interleaving are described in U.S. patent 4,672,605 to Hustig et al., which is hereby incorporated by reference. The ancillary codes output from the preprocessing circuit 213 are stored in each of three random access memories (RAMs) 218a, 218b and 218c for use by a wideband spread spectrum encoder 220, a band-limited spread spectrum encoder 222 and an FSK burst encoder 224, respectively, for purposes to be described hereinafter.
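The benefit of bit interleaving after block coding can be illustrated with a minimal sketch. It assumes a simple row/column block interleaver and four hypothetical 7-bit codewords; the text does not specify the interleaver or block code actually used, so the function names and dimensions below are illustrative only.

```python
def interleave(bits, rows, cols):
    """Write the bits row by row, read them out column by column."""
    if len(bits) != rows * cols:
        raise ValueError("bit count must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(bits, rows, cols):
    """Inverse of interleave(): restore the original row-by-row order."""
    if len(bits) != rows * cols:
        raise ValueError("bit count must equal rows * cols")
    out = [0] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = bits[i]
            i += 1
    return out


ROWS, COLS = 4, 7                      # 4 codewords of 7 bits each (assumed)
codeword_bits = [0] * (ROWS * COLS)    # all-zero codewords make errors easy to count
sent = interleave(codeword_bits, ROWS, COLS)

received = list(sent)
for pos in range(6, 10):               # a burst of 4 consecutive channel errors
    received[pos] ^= 1

restored = deinterleave(received, ROWS, COLS)
errors_per_codeword = [
    sum(restored[r * COLS:(r + 1) * COLS]) for r in range(ROWS)
]
print(errors_per_codeword)
```

After de-interleaving, the four-bit burst is spread so that each codeword carries a single error, which is the kind of error a modest block code can correct.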
The wideband spread spectrum encoder 220 encodes the ancillary code into a wideband direct sequence spread spectrum signal with a processing gain and level that the NN 204 determines from the noise masking opportunities in the audio signal. In particular, the NN 204 dynamically determines the noise masking perceptual entropy envelope in order to control the spread spectrum processing gain (i.e., the ratio of the pseudorandom code rate to the data rate) and the signal level of the wideband pseudorandom noise transmission output from the encoder 220. The ancillary code stored in RAM 218a is input to a modulo-2 encoder 228 and mixed with the synchronous PN code from a PN code generator 230 to form a direct sequence signal. In a preferred embodiment, the modulo-2 encoder 228 is constructed using exclusive-OR logic gates. The direct sequence signal output from the modulo-2 encoder 228 is input to a header signal generator 232, which adds a PN code header signal to each frame of the direct sequence signal in accordance with the synchronization and timing signals from a synchronization and timing circuit 234 to improve acquisition of the ancillary code at the decoder (fig. 3).
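The modulo-2 mixing of the ancillary code with a PN code can be sketched as follows. This is a minimal illustration, not the circuit of fig. 2: the 7-stage shift register, its feedback taps, the seed and the figure of 16 chips per bit are all assumptions chosen for the example.

```python
def lfsr_pn(length, seed=0b1010101):
    """Generate `length` PN chips from a 7-stage linear feedback shift register.

    The feedback taps and the seed are illustrative assumptions.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    chips = []
    for _ in range(length):
        chips.append(state & 1)
        feedback = (state ^ (state >> 1)) & 1
        state = (state >> 1) | (feedback << 6)
    return chips


def spread(data_bits, pn_chips, chips_per_bit):
    """Modulo-2 add (XOR) each data bit with `chips_per_bit` PN chips."""
    return [
        bit ^ pn_chips[i * chips_per_bit + k]
        for i, bit in enumerate(data_bits)
        for k in range(chips_per_bit)
    ]


def despread(chips, pn_chips, chips_per_bit):
    """XOR with the same PN chips, then majority-vote each group of chips."""
    bits = []
    for start in range(0, len(chips), chips_per_bit):
        ones = sum(
            chips[start + k] ^ pn_chips[start + k] for k in range(chips_per_bit)
        )
        bits.append(1 if 2 * ones > chips_per_bit else 0)
    return bits


CHIPS_PER_BIT = 16                     # PN chips sent per data bit (assumed)
code_bits = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical ancillary code
pn = lfsr_pn(len(code_bits) * CHIPS_PER_BIT)
tx = spread(code_bits, pn, CHIPS_PER_BIT)
rx = despread(tx, pn, CHIPS_PER_BIT)
```

Raising the number of chips per bit increases the tolerance to chip errors at the cost of data rate, which is the trade-off that the processing gain control adjusts.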
As shown in fig. 2, the synchronization and timing circuit 234 is controlled by a signal from the burst timing circuit 208c. The direct sequence signal output from the header signal generator 232 has a spectrum that is fairly flat over the bandwidth of the channel, as is typical of a direct sequence signal. Once the PN code header signal has been added to each frame or segment of the ancillary code, the resulting wideband spread spectrum code is output to the adder 235 through a variable attenuator 236, which sets the transmission level of the signal in accordance with the control signal from the level control circuit 208b. The level control circuit 208b is in turn controlled by a signal from the NN 204.
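As a rough illustration of the level control, the attenuation needed to place a code below a given masking threshold can be computed in decibels. The 3 dB safety margin, the function names and the example levels are assumptions made for this sketch; the text does not state how the level control circuit derives its setting.

```python
def required_attenuation_db(code_level_db, masking_threshold_db, margin_db=3.0):
    """Attenuation (dB) that puts the code `margin_db` below the threshold.

    Returns 0.0 when the code is already quiet enough.
    """
    target_db = masking_threshold_db - margin_db
    return max(0.0, code_level_db - target_db)


def attenuate(samples, attenuation_db):
    """Scale the samples by the linear gain equivalent to `attenuation_db`."""
    gain = 10.0 ** (-attenuation_db / 20.0)
    return [gain * s for s in samples]


# Hypothetical levels: code at -10 dB, masking threshold at -30 dB.
atten = required_attenuation_db(code_level_db=-10.0, masking_threshold_db=-30.0)
quiet_code = attenuate([1.0, -0.5, 0.25], atten)
```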
The band-limited spread spectrum encoder 222 is similar to the wideband spread spectrum encoder 220, except that it encodes the ancillary code into a band-limited, rather than a wideband, direct sequence spread spectrum signal based on the noise masking and frequency masking opportunities in the sound channel as determined by the NN 204. As in the encoder 220, the ancillary code stored in RAM 218b is input to a modulo-2 encoder 238 and mixed with the PN code from a PN code generator 240 to form a direct sequence signal. The direct sequence signal output from the modulo-2 encoder 238 is input to a header signal generator 242, which adds a PN code header signal to each frame of the direct sequence signal in accordance with the synchronization and timing signals from a synchronization and timing circuit 244. As shown in fig. 2, the synchronization and timing circuit 244 is controlled by the burst timing circuit 208c. Like the signal generated by the encoder 220, the direct sequence signal output from the header signal generator 242 has a spectrum that is fairly flat across the channel bandwidth. Once the PN code header signal has been added to the direct sequence signal, the signal is output to a multiplier 246, where it is multiplied by a signal from a synchronous clock 248 having a high ratio of clock frequency to PN code frequency. In this way, the signal may be translated in frequency to a selected frequency, which is preferably located at the center of a selected sub-band of the sound channel. The signal output from the multiplier 246 is then band-limited by a bandpass filter 250, which confines the energy of the direct sequence signal to the selected sub-band. The resulting band-limited spread spectrum ancillary code is output to the adder 235 through a variable attenuator 252, which controls the amplitude of the transmitted ancillary code under the control of the level control circuit 208b.
The FSK burst encoder 224 encodes the ancillary code as a narrowband signal associated with both time masking and frequency masking opportunities. The ancillary codes stored in RAM 218c are input to a header signal generator 254, which adds a header to each frame of data to facilitate acquisition of the data at the decoder (fig. 3). The ancillary codes are then input to an FSK encoder 256 and a bandpass filter 258. The FSK encoder 256 FSK-modulates the code, while the bandpass filter 258 limits the bandwidth of the encoded signal, concentrating the signal energy in the selected sub-band. As shown in fig. 2, the header signal generator 254 and the FSK encoder 256 are controlled by a signal from the burst timing circuit 208c.
The resulting FSK ancillary code is then output to the adder 235 through a variable attenuator 260, which controls the amplitude of the transmitted signal under the control of the level control circuit 208b. It should be appreciated that the FSK ancillary code output from the encoder 224 may be continuous with a dynamically varying level, or may be pulsed, with the pulses triggered by temporal masking opportunities as determined by the NN 204. However, as will be described in more detail, in particular implementations in which the ancillary codes must pass through lossy compression according to known compression algorithms (e.g., MPEG), the signal must be transmitted in burst mode so that it can withstand such compression.
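A minimal sketch of a narrowband FSK burst is given here to make the modulation step concrete. The mark and space frequencies, bit duration, sample rate and decay constant are assumed values for illustration only, and the exponentially decaying envelope is one simple way to model a burst that fades under a temporal masking threshold.

```python
import math


def fsk_burst(bits, f_space, f_mark, bit_duration, sample_rate, decay_time):
    """Return samples of a continuous-phase FSK burst with a decaying envelope."""
    n_per_bit = int(round(bit_duration * sample_rate))
    samples = []
    phase = 0.0
    for i, bit in enumerate(bits):
        freq = f_mark if bit else f_space
        for k in range(n_per_bit):
            t = (i * n_per_bit + k) / sample_rate
            phase += 2.0 * math.pi * freq / sample_rate   # keep phase continuous
            samples.append(math.exp(-t / decay_time) * math.sin(phase))
    return samples


# Four bits of 25 ms each give a 100 ms burst (all values assumed).
burst = fsk_burst(
    bits=[1, 0, 1, 1],
    f_space=3000.0,
    f_mark=3200.0,
    bit_duration=0.025,
    sample_rate=16000,
    decay_time=0.03,
)
```

Keeping the two tones close together keeps the burst inside one narrow sub-band, while the envelope ensures that the burst energy falls as the masking transient fades.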
The wideband spread spectrum code, band-limited spread spectrum code and FSK ancillary code output from the encoders 220, 222 and 224, respectively, are combined by the adder 235 with the audio signal from terminal 206a to form a composite signal that is output at terminal 206b to the audio channel. The composite signal may also be recorded on a suitable recording medium, such as a CD, and then transmitted when the CD is played back. In a preferred embodiment, shown in fig. 2, before being input to the adder 235 the audio signal is input to a device such as a digital signal processor (DSP) 260a, which attenuates the audio signal level in certain sub-bands under control of the signal from the level control circuit 208b. Such attenuation is required when the NN 204 initiates an FSK burst or band-limited spread spectrum transmission and subsequently detects unwanted burst energy in that sub-band of the audio signal that would interfere with the data transmission. The composite signal is also input to a verification circuit 261, which contains a channel simulator 262 and a typical receiver 264. The channel simulator 262 adds noise to the composite signal, degrading it slightly more than would normally be the case during transmission over the actual audio channel. The receiver 264 verifies that the data contained in the composite signal can be successfully decoded and transmits a verification signal to the NN 204.
It is clear that if perceptual coding techniques such as those described above are used to encode the ancillary codes to be transmitted, perceptual compression schemes such as MPEG and PASC will most likely remove the data from the composite signal before or during transmission. Therefore, to address this problem, the NN 204 must be trained not only to listen to the channel for opportunities in which the ancillary code may be imperceptibly transmitted, but also to compensate for the particular compression scheme to be encountered.
For example, one well-known and widely employed compression scheme divides the audio band into 32 sub-bands. Relying on frequency domain masking, and to some extent time domain masking, only the strongest signal in each sub-band is encoded and transmitted, on the assumption that the remaining signals in the sub-band are masked by the stronger signal and would not be heard anyway. In this case, to ensure that the ancillary code is transmitted, the NN 204 must be trained to "listen" for opportunities to transmit the ancillary code as an FSK burst signal at times when the ancillary code is the strongest signal in a particular sub-band and its transmission is masked by a strong wideband transient (time domain masking) in nearby sub-bands.
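The effect of such a scheme on embedded data can be shown with a toy model. The sketch below assumes 32 equal-width sub-bands over a 20 kHz band and represents the signal as a list of (frequency, amplitude) components; real codecs are far more elaborate, and all numeric values are hypothetical.

```python
def strongest_per_subband(components, num_subbands=32, bandwidth_hz=20000.0):
    """Keep only the strongest (frequency, amplitude) component per sub-band."""
    width = bandwidth_hz / num_subbands
    kept = {}
    for freq, amp in components:
        band = min(int(freq // width), num_subbands - 1)
        if band not in kept or amp > kept[band][1]:
            kept[band] = (freq, amp)
    return sorted(kept.values())


# Hypothetical audio components; 2900 Hz and 3000 Hz share one sub-band.
audio = [(440.0, 1.0), (2900.0, 0.8), (9000.0, 0.3)]
weak_data = (3000.0, 0.1)     # weaker than the audio in its sub-band
strong_data = (3000.0, 0.9)   # strongest component in its sub-band

survives_weak = weak_data in strongest_per_subband(audio + [weak_data])
survives_strong = strong_data in strongest_per_subband(audio + [strong_data])
print(survives_weak, survives_strong)
```

In this model the data tone reaches the output only when it dominates its sub-band, which is why the burst has to be timed to coincide with masking from neighbouring sub-bands.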
In the above embodiment, the one or more sub-bands carrying the ancillary code may be pre-selected, although this may not be optimal. For example, a first sub-band may be selected to carry an ID code identifying the television network carrying the audio signal, a second sub-band may be selected to carry an ID code identifying the distributor of the audio signal, and a third sub-band may be selected to carry an ID code identifying the local station sending the audio signal. In a preferred embodiment, to maximize data throughput while keeping errors within limits, data transmissions are generated in a "partial response" mode, meaning that the ancillary codes are transmitted at a bit rate faster than what is normally considered optimal for ensuring clear decoding at the decoder (fig. 3), with the result that the data received at the decoder constitute a "fuzzy logic" set. Although transmission in partial response mode is generally not optimal, it is needed to ensure that the transmission of the ancillary code is fast enough to fit into the narrow sub-band. As will be described, the errors produced by the partial response mode are corrected at each decoder by a neural network (fig. 3) trained in pattern recognition to determine the identity of the ancillary code.
Fig. 3 is a schematic block diagram of a decoder 300 implementing the features of the present invention for retrieving an ancillary code encoded by the encoder 202 and transmitted over an audio channel. The decoder 300 receives, at an audio input 302, the composite signal transmitted over the audio channel (not shown). The received signal is input to a bandpass filter 304, whose parameters are defined by the pass band of the sound channel, to filter out all unwanted frequencies. The signal output from the filter 304 is input to a signal preprocessor 305, which includes an automatic gain controller (AGC) 306, an equalizer 308, and an analog-to-digital (A/D) converter 310. The AGC 306 maintains the amplitude of the signal within an acceptable range. The equalizer 308 compensates for known phase and amplitude distortions in the signal path. The A/D converter 310 converts the signal to digital form for processing. The digital signal output from the preprocessor 305 is input to a receiver synchronization circuit 312 and an FSK signal processing circuit 314.
The receiver synchronization circuit 312 performs synchronization acquisition of the wideband and/or band-limited spread spectrum signals in quadrature phase using an iterative phase stepping process described below. A header PN code generator 316 generates the same header PN code as is generated at the header signal generators 232, 242 (fig. 2); this code is modulo-2 mixed in quadrature phase with the signal output from the preprocessor 305 in a wide dynamic range (i.e., 18 to 24 bit resolution) digital signal processor (DSP) 318. In the depicted embodiment, the DSP 318 contains four exclusive-OR gates 318a-318d. The four signals output from the DSP 318 are input to a lock detection circuit 320, which detects when the phase of the ancillary code is locked to the phase of the header PN code from the generator 316. A signal indicating whether phase lock has been detected is input to a phase shift circuit 322 and a digital phase-locked loop (PLL) 324. As long as the signal output from the circuit 320 indicates that the phase is not locked, the phase shift circuit 322 continues to shift the phase of the header PN code from the generator 316 until the circuit 320 detects phase lock. It should be noted that lock typically occurs in bursts, when the received data is of the highest quality. For this reason, the PLL 324 acts as a flywheel to maintain clock phase synchronization between lock bursts.
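The iterative phase stepping can be sketched in a simplified, single-phase form: the local header code is shifted one chip at a time until it agrees with the received chips closely enough to declare lock. The 15-chip header, the 0.9 lock threshold and the function names are assumptions for illustration; the quadrature processing and the PLL flywheel are not modelled.

```python
def agreement(received, reference):
    """Fraction of chip positions at which the two sequences agree."""
    matches = sum(1 for a, b in zip(received, reference) if a == b)
    return matches / len(reference)


def acquire(received, header, lock_threshold=0.9):
    """Step the phase of the local header until lock; return the shift or None."""
    n = len(header)
    for shift in range(n):
        local = header[shift:] + header[:shift]   # header advanced by `shift` chips
        if agreement(received[:n], local) >= lock_threshold:
            return shift
    return None


# Hypothetical 15-chip maximal-length sequence used as the header PN code.
HEADER = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1]

received = HEADER[6:] + HEADER[:6]    # header arriving 6 chips out of phase
received_noisy = list(received)
received_noisy[2] ^= 1                # one chip corrupted in the channel

lock_clean = acquire(received, HEADER)
lock_noisy = acquire(received_noisy, HEADER)
no_lock = acquire([0] * 15, HEADER)
```

Because a maximal-length sequence agrees with shifted copies of itself in only about half of its positions, a high threshold separates the true phase from all others even with a corrupted chip.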
The timing signal generated by the PLL 324 is fed to a PN code generator 326 of a decoder circuit 328. The PN code is modulo-2 mixed with the signal output from the preprocessor 305 by an exclusive-OR gate 330 to recover the ancillary code containing the ID number. The output of the exclusive-OR gate 330 is typically a fuzzy logic set because, as previously described, the data is typically transmitted in partial response mode. The output signal from the exclusive-OR gate 330 is input to a neural network (NN) 332. In a preferred embodiment, the NN 332 comprises a "back-propagation perceptron" that performs block decoding, bit de-interleaving, and acquisition confirmation using pattern and feature recognition techniques. Such pattern and feature recognition techniques, and back-propagation perceptrons that perform them, are well known in the art and will not be described in detail.
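To illustrate what a "fuzzy" despreader output might look like, the sketch below computes a soft value per data bit and classifies it as 1, 0 or ambiguous. A simple vote threshold stands in here for the trained neural network, which is a deliberate simplification; the PN pattern, the 8 chips per bit and the 0.25 margin are assumed values.

```python
def soft_despread(chips, pn, chips_per_bit):
    """Per data bit, return the fraction of despread chips that vote for 1."""
    soft = []
    for start in range(0, len(chips), chips_per_bit):
        votes = sum(
            chips[start + k] ^ pn[(start + k) % len(pn)]
            for k in range(chips_per_bit)
        )
        soft.append(votes / chips_per_bit)
    return soft


def classify(soft_value, margin=0.25):
    """Map a soft value to 1, 0, or None when it is too close to call."""
    if soft_value >= 1.0 - margin:
        return 1
    if soft_value <= margin:
        return 0
    return None


PN = [1, 0, 1, 1, 0, 0, 1, 0]        # hypothetical 8-chip PN pattern
BITS = [1, 0, 1]
chips = [b ^ PN[k] for b in BITS for k in range(8)]

chips[0] ^= 1                         # light damage to the first bit
for k in range(16, 20):
    chips[k] ^= 1                     # heavy damage to the third bit

soft_values = soft_despread(chips, PN, 8)
decisions = [classify(v) for v in soft_values]
print(soft_values, decisions)
```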
Once acquisition of the ancillary code has been confirmed by the NN 332 through pattern recognition, this fact is reported to the lock detection circuit 320 as confirmation that the lock is valid. A time stamp generated by a time-of-day clock 333 is added to the decoded ancillary code, and the decoded ancillary code is output from the NN 332 after a considerable delay of, for example, 10 seconds. Alternatively, the signal output from the NN 332 may simply indicate that the decoded ancillary code is the same as the previous ancillary code, that the decoded ancillary code is ambiguous, or that the decoded ancillary code differs from the previous one, in which case the new decoded ancillary code is output as described above. The ancillary codes output from the NN 332 are held in a data storage unit (DSU) 334, the contents of which are transferred at regular intervals by suitable means to a central processing unit 336. The central processing unit 336 processes the retrieved ancillary codes and times so that they are available for use in, for example, radio and television surveys, program broadcast verification, and music royalty tracking applications, as described below.
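The compact reporting alternative can be sketched as a small function that returns "same", "ambiguous" or "new", attaching a time stamp only to new codes. The record layout, the use of None for an unresolved code and the function name are assumptions made for this illustration.

```python
from datetime import datetime, timezone


def report(decoded, previous, now=None):
    """Return (status, record) for a decoded code.

    `decoded` is None when the code could not be resolved. A record with a
    time stamp is produced only when the code differs from the previous one.
    """
    if decoded is None:
        return ("ambiguous", None)
    if decoded == previous:
        return ("same", None)
    stamp = now if now is not None else datetime.now(timezone.utc)
    return ("new", {"code": decoded, "time": stamp.isoformat()})


# Hypothetical ID codes and a fixed clock value for the example.
fixed_time = datetime(1995, 1, 1, tzinfo=timezone.utc)
status_new, record = report(7, previous=5, now=fixed_time)
status_same, _ = report(5, previous=5)
status_ambiguous, _ = report(None, previous=5)
```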
Referring again to the FSK signal processing circuit 314, to decode ancillary codes conveyed as FSK burst signals, the signal output from the preprocessor 305 is input to a bandpass filter 336, similar to the filter 258, and then to an FSK decoder 338, which decodes the ancillary codes. The output of the FSK decoder 338 is also a fuzzy logic set, since the data is transmitted in partial response mode. The fuzzy logic signal output from the FSK decoder 338 is input to the NN 332, which processes it in the same manner as the signal input from the exclusive-OR gate 330.
Figs. 4a-4e depict various frequency and time profiles of typical ancillary codes and audio signals used and produced by the system of the present invention. Fig. 4a plots the energy, as a function of frequency, of an audio signal 400, such as that received at the audio input 206a. Fig. 4b is a voltage-versus-time graph of a portion 410 of the audio signal 400 within a selected sub-band 402 (fig. 4a). Fig. 4c is an energy-versus-time graph of the portion 410 of the audio signal within the sub-band 402. Also shown in fig. 4c are the temporal masking threshold 420 and the perceptual entropy envelope 422 of the audio signal portion 410. It will be appreciated that the audio signal portion 410 will mask signals having signal energy below its perceptual entropy envelope 422. Fig. 4d shows an ancillary code 430, such as that encoded by the encoder 202, which is transmitted with and masked by the audio signal portion 410. It should be noted that the ancillary code 430 appears as a 100 millisecond signal burst that decays exponentially. Fig. 4e is a time profile of a composite signal 440, such as that output from the encoder 202 via the audio output 206b, comprising the audio signal portion 410 and the ancillary code 430.
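One common way to decode FSK, shown here as a sketch rather than as the circuit of fig. 3, is to compare the energy at the mark and space frequencies in each bit period using the Goertzel algorithm. The frequencies, sample rate and frame length are assumed values; the ratio of mark energy to total energy serves as a soft ("fuzzy") value between 0 and 1.

```python
import math


def tone_energy(samples, freq, sample_rate):
    """Goertzel algorithm: squared magnitude of `samples` at `freq`."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def fsk_soft_decode(samples, f_space, f_mark, n_per_bit, sample_rate):
    """Return one soft value per bit period: near 1 for mark, near 0 for space."""
    soft = []
    for start in range(0, len(samples) - n_per_bit + 1, n_per_bit):
        frame = samples[start:start + n_per_bit]
        e_mark = tone_energy(frame, f_mark, sample_rate)
        e_space = tone_energy(frame, f_space, sample_rate)
        total = e_mark + e_space
        soft.append(0.5 if total == 0 else e_mark / total)
    return soft


def tone(freq, n, sample_rate):
    """Generate n samples of a unit-amplitude sine at `freq`."""
    return [math.sin(2.0 * math.pi * freq * k / sample_rate) for k in range(n)]


RATE, N, F_SPACE, F_MARK = 16000, 400, 3000.0, 3200.0   # assumed parameters
sent_bits = [1, 0, 0, 1]
signal = []
for b in sent_bits:
    signal += tone(F_MARK if b else F_SPACE, N, RATE)

soft = fsk_soft_decode(signal, F_SPACE, F_MARK, N, RATE)
hard = [1 if v > 0.5 else 0 for v in soft]
```

With noise or a shortened bit period the soft values drift away from 0 and 1, and it is these intermediate values that a pattern-recognition stage would have to resolve.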
In operation, it is contemplated that the above-described invention may be used for several purposes, including program broadcast verification, television and radio surveys, and music royalty tracking. For example, in television and radio survey applications, the audio signal at terminal 206a comprises a programming signal, and the ancillary codes comprise ID codes identifying local stations, broadcasters, distributors, certain programming and advertising, and the like. The ancillary code may be encoded as described above, transmitted with the audio signal of the television or radio, and received at a decoder in the residence of the television viewer or listener or at some central location. In a program broadcast verification application, the decoded ID code is then used to verify the broadcast of a particular program in the corresponding time slot. In an audience measurement application, the decoded ID code may be used to determine the particular program or station selected at any given time. In a music royalty application, it is contemplated that the encoded ID code may be recorded on a CD so that, when the CD is played, an ancillary code containing an ID code identifying the musical programming is transmitted along with the audio signal recorded thereon. Further, the encoded ancillary code may be received and decoded with decoder 300 at various strategically placed decoding locations. Typically, this technique will be used to collect data for collecting royalties or for compiling rankings such as the "Billboard Top 100".
In another embodiment, the first NN 204 shown in fig. 2 may be replaced by an analog hardwired NN, as shown in figs. 5a-5h. Specifically, figs. 5a-5h contain schematic block diagrams of an analog NN embodying this aspect of the present invention. As shown in fig. 5a, the audio signal is connected to pin 7 of the 16-pin interface connector 510 via an additional connector (not shown) and is input to the filter bank 520, as shown in fig. 5b. In addition, the audio signal at pin 7 of the connector 510 is input to the DSP 260a (fig. 2) and the neural network 204 (fig. 2) through the terminal 206a (fig. 2).
Referring to fig. 5b, the filter bank 520 includes four filters 522, 524, 526 and 528 that divide the input audio signal into four sub-bands centered at 1.5kHz, 2.0kHz, 2.77kHz (the transmission sub-band) and 4.0kHz, respectively. These sub-band audio signals are then input to a filter circuit 530. As shown in fig. 5c, the filter circuit 530 comprises four full-wave rectifier circuits 532, 534, 536 and 538. The rectified audio signals are output from the rectifier circuits 532, 534, 536 and 538 to the threshold detector circuit 540, which, as shown in fig. 5d, comprises twelve LM339 threshold detectors 542a-542l. The threshold detector circuit 540 detects three thresholds for each sub-band, which are established by voltage dividers 544 and 546 and a variable threshold circuit 548. The signals output from the threshold detectors 542 are used by circuitry that implements certain transmission opportunity rules, as will be described with reference to figs. 5d-5f.
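The rectify-and-compare stage described above can be illustrated in software. The following is a minimal sketch only, not the hardwired circuit itself: it assumes the four sub-band signals have already been separated by a filter bank, uses the peak of the rectified signal as the level fed to the comparators, and uses hypothetical threshold voltages (`THRESHOLDS_V`), since the text does not give the values set by the voltage dividers.

```python
# Sketch of full-wave rectification followed by three threshold
# comparisons per sub-band (4 sub-bands x 3 thresholds = 12 outputs).
# THRESHOLDS_V and the use of the rectified peak are assumptions.

SUBBANDS_KHZ = (1.5, 2.0, 2.77, 4.0)   # center frequencies from the text
THRESHOLDS_V = (0.5, 1.5, 2.75)        # hypothetical low/mid/high levels


def full_wave_rectify(samples):
    """Return the absolute value of every sample."""
    return [abs(s) for s in samples]


def peak_level(samples):
    """Peak of the rectified signal, used here as the comparator input."""
    return max(full_wave_rectify(samples))


def threshold_outputs(band_samples, thresholds=THRESHOLDS_V):
    """Map each sub-band to a tuple of comparator outputs, one per threshold."""
    outputs = {}
    for band in SUBBANDS_KHZ:
        level = peak_level(band_samples[band])
        outputs[band] = tuple(level > t for t in thresholds)
    return outputs
```

Calling `threshold_outputs` with a dictionary keyed by the four center frequencies yields twelve boolean outputs, mirroring the twelve threshold detectors.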
In particular, the main purpose of the "rule-based" analog NN shown in figs. 5a-5h is to transmit data packets in a television audio channel using temporal masking. The transmission sub-band is centered at 2.77kHz and covers two MPEG Layer II bands. The ancillary code is transmitted immediately after an audio signal surge, in accordance with transmission rules implemented in the hardwired circuit.
In particular, fig. 5e depicts a circuit 550 for sending a low level transmission and a circuit 552 for sending a high level transmission. For a low level transmission, a surge envelope detector 554 with a 2 millisecond time constant and 30 mVRMS sensitivity (4 cycles minimum) operates on the full-wave rectified signal input at pin 1 and produces a logic "1" during an audio signal surge. The signal output on pin 4 of detector 554 is input to end-of-surge detector 556, which generates a 1 millisecond pulse on output pin 12 at the end of the surge. The signal output on pin 12 of detector 556 is input to disable/transmit circuit 558, which disables the low level transmission if (i) the surge is not long enough (i.e., 10 milliseconds minimum), as determined by surge length circuit 559, (ii) the time elapsed since the last transmission of any type (low or high) is less than 450 milliseconds, as determined by timing circuit 560, or (iii) the clap/laugh circuit (fig. 5g) is enabled and detects laughter or clapping. If none of these disabling conditions applies, a "GO = LOW LEVEL" signal is transmitted to pin 11 of connector 510 (fig. 5a). The "GO = LOW LEVEL" signal is also transmitted to the DSP 260a (fig. 2), the clock control circuit 208a (fig. 2), the level control circuit 208b (fig. 2), and the burst timing circuit 208c (fig. 2).
Similarly, for a high level transmission, a surge envelope detector 562 with a 2 millisecond time constant and 60 mVRMS sensitivity (4 cycles minimum) operates on the full-wave rectified signal input at pin 1 and produces a logic "1" at output pin 4 during an audio signal surge. The signal output on pin 4 of detector 562 is input to end-of-surge detector 564, which produces a 1 millisecond pulse on output pin 12 at the end of the surge. The signal output on pin 12 of detector 564 is input to disable/transmit circuit 566, which disables the high level transmission if (i) the surge is not long enough (i.e., 5 milliseconds minimum), as determined by the surge length circuit 568 activated by the dip switch 569, (ii) the time elapsed since the last transmission of any type (low or high) is less than 450 milliseconds, as determined by the circuit 560, or (iii) the clap/laugh circuit (fig. 5g) is activated and laughter or clapping is detected. If none of these disabling conditions applies, a "GO = HIGH LEVEL" signal is transmitted to pin 10 of connector 510 (fig. 5a). The "GO = HIGH LEVEL" signal is also communicated to the DSP 260a (fig. 2), the clock control circuit 208a (fig. 2), the level control circuit 208b (fig. 2), and the burst timing circuit 208c (fig. 2).
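The low level and high level gating described in the two preceding paragraphs follows the same pattern with different minimum surge lengths. The following is a rough software sketch of that decision logic, not a description of the analog circuitry. The function and parameter names are hypothetical; only the 10 millisecond, 5 millisecond, and 450 millisecond figures come from the text.

```python
# Sketch of the disable/transmit decision for low and high level
# transmissions. Names are illustrative; timing values are from the text.
from dataclasses import dataclass

MIN_GAP_MS = 450.0  # minimum time since the last transmission of any type


@dataclass(frozen=True)
class LevelRule:
    name: str
    min_surge_ms: float


LOW = LevelRule("LOW LEVEL", 10.0)
HIGH = LevelRule("HIGH LEVEL", 5.0)


def go_signal(rule, surge_ms, ms_since_last_tx,
              detector_enabled, disturbance_detected):
    """Return the GO signal name, or None if any disabling condition holds."""
    if surge_ms < rule.min_surge_ms:
        return None                      # (i) surge too short
    if ms_since_last_tx < MIN_GAP_MS:
        return None                      # (ii) too soon after last transmission
    if detector_enabled and disturbance_detected:
        return None                      # (iii) detector circuit of fig. 5g fired
    return f"GO = {rule.name}"
```

For example, an 8 millisecond surge is long enough for a high level transmission but too short for a low level one, so `go_signal(HIGH, 8, 500, False, False)` returns a GO signal while `go_signal(LOW, 8, 500, False, False)` returns `None`.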
Referring to fig. 5f, a one second transmit circuit 570 initiates a high level transmission if the following rule is satisfied. First, the fade detector 572 must detect a 100 ms fade below 35 VRMS. If such a fade is detected, a one second time period is established by one second timing circuit 573. At the beginning of each one second time period, timing circuit 573 causes D-type flip-flop 574 to be set. Flip-flop 574 is reset by any type of transmission. While flip-flop 574 is set, a high level transmission is initiated if a high level surge is detected in the sub-band centered at 1.5kHz, in the sub-band centered at 2.0kHz, or in the sub-band centered at 4.0kHz. Circuit 570 may be reconfigured by replacing OR gates 575a and 575b with AND gates, so that all (rather than only one) of the above conditions must be met. The initiation of the transmission causes one-shot circuit 576 to generate a 1 millisecond pulse to circuit 552 (fig. 5e), which in turn generates a high level transmit signal to connector 510.
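The one second rule can be pictured as a small state machine: a fade arms a flag, any transmission clears it, and while it is armed a high level surge in the other sub-bands triggers a transmission. The sketch below is an illustrative software analogy under that reading, with hypothetical class and method names; the `require_all` option stands in for the described replacement of the OR gates with AND gates.

```python
# Sketch of the one second transmit rule as a state machine.
# Class and method names are illustrative assumptions.

OTHER_BANDS_KHZ = (1.5, 2.0, 4.0)  # sub-bands examined for high level surges


class OneSecondRule:
    def __init__(self, require_all=False):
        self.armed = False            # plays the role of the D-type flip-flop
        self.require_all = require_all

    def on_period_start(self):
        """Called at the start of each one second period after a fade."""
        self.armed = True

    def on_transmission(self):
        """Any type of transmission resets the flip-flop."""
        self.armed = False

    def should_transmit(self, high_surge):
        """high_surge maps each sub-band (kHz) to True if a surge is present."""
        if not self.armed:
            return False
        combine = all if self.require_all else any
        return combine(high_surge[band] for band in OTHER_BANDS_KHZ)
```

With the default OR behavior a surge in any one of the three sub-bands suffices; constructing the rule with `require_all=True` demands a surge in all three.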
A clap/laugh circuit 577 is shown in fig. 5g. The clap/laugh circuit 577 is activated by a dip switch 579 (fig. 5d). When the circuit 577 is enabled and the dip switches 579a, 579b, and 579c (fig. 5d) are ON, a level above the minimum threshold on the sub-bands centered at 1.5kHz, 2.0kHz, and 4.0kHz produces a logic "1" at the output of the envelope detectors 580a, 580b, and 580c. The envelope detectors each have a time constant of 2 milliseconds. Likewise, when the switches 579a, 579b, and 579c are OFF, a level above the intermediate threshold on the sub-bands centered at 1.5kHz, 2.0kHz, and 4.0kHz produces a logic "1" at the output of the envelope detectors 580a, 580b, and 580c.
The outputs of envelope detectors 580a, 580b, and 580c are input to AND gate 582. The output of the AND gate 582 is in turn input to a variable threshold circuit 548 (fig. 5d). A logic "1" from the AND gate 582 moves the variable threshold toward 2.75V, while a logic "0" moves the variable threshold toward 0.5V. The variable threshold voltage is compared at detector 542c (fig. 5d) with the voltage of the sub-band centered at 2.77kHz. A sub-band voltage of 100 mVRMS on the voice channel is equal to 2.8Vpk at this point. A monostable circuit 584 having a 2 millisecond time constant generates a logic "1" signal, which serves as a "NoGo" signal, when the sub-band voltage exceeds the variable threshold voltage. The electronic switch provides a connection between pin 1 (A) and pin 2 (B) when the signal on pin 13 (C) is at a logic "1".
Referring to fig. 5h, a five second transmit circuit 586 is configured to increase the likelihood of a transmission when there has been no transmission for a period of five seconds. This function is enabled by setting the dip switch 587 to the ON position. The timing circuit 588 starts a five second period in response to any transmission. If there is no transmission for five seconds, the timer circuit 590 generates a 1 millisecond pulse. D-type flip-flop 592 prevents multiple five second transmissions from occurring before it is reset by either a high level transmission or a low level transmission. Setting the dip switch 593 to ON disables this rule. The signal that initiates the transmission is combined with the one second transmit circuit signal at NAND gate 575c (fig. 5f), and a high level transmission is generated.
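The five second rule can likewise be sketched as a timer plus a latch. The following is a minimal illustration with hypothetical names. It assumes that the forced transmission itself does not restart the timer unless the caller reports it through `on_transmission`, which is one possible reading of the text rather than a confirmed detail.

```python
# Sketch of the five second transmit rule: fire once after five seconds
# of silence, then stay quiet until a high or low transmission resets it.
# Names and the polling interface are illustrative assumptions.


class FiveSecondRule:
    TIMEOUT_S = 5.0

    def __init__(self, enabled=True):
        self.enabled = enabled   # stands in for the enabling dip switch
        self.last_tx_s = 0.0     # time of the most recent transmission
        self.fired = False       # latch that blocks repeated firing

    def on_transmission(self, now_s):
        """A high or low transmission restarts the timer and clears the latch."""
        self.last_tx_s = now_s
        self.fired = False

    def poll(self, now_s):
        """Return True exactly once when five seconds pass with no transmission."""
        if not self.enabled or self.fired:
            return False
        if now_s - self.last_tx_s >= self.TIMEOUT_S:
            self.fired = True
            return True
        return False
```

Polling at 4.9 seconds after a transmission returns `False`, polling at 5.0 seconds returns `True`, and later polls return `False` until another transmission is reported.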
Referring to figs. 2 and 5a-5h, the switching circuits identified by reference numerals 205a-205j form part of the ROM 205 (fig. 2) and set the rules that are executed by the neural network. Accordingly, the outputs of the switching circuits 205a-205j are electrically connected as inputs to the neural network 204 (fig. 2).
In summary, figs. 5a-5h implement the following rules: (1) any high or low level surge that permits transmission of the ancillary code must be longer than a predetermined minimum length; (2) a predetermined minimum amount of time must have elapsed since the last transmission; (3) no transmission is allowed within a predetermined time after laughter or clapping; (4) no high level transmission is allowed unless there is a fade lasting a predetermined time and falling below a predetermined level; (5) if no high level or low level transmission has occurred for 5 seconds, one transmission is allowed.
The present invention may be used for program confirmation and/or viewer recording. When it is used for program confirmation, a monitoring station is placed at a location where signals are received from one or more radio and/or television stations and/or other transmitters. The monitoring station monitors the transmitted signals for ancillary codes embedded in them and uses these ancillary codes to identify, directly or indirectly, the program containing the ancillary codes, the source of the broadcast signal, or both. This information is then reported to interested parties, who use it to confirm that the program containing the ancillary code was broadcast or transmitted. For example, an advertiser may confirm that its commercial was broadcast at the selected time and on the selected channel that it paid for. As another example, an artist whose royalties depend on the number of times a program, song, etc. is broadcast may confirm the number of plays reported in a royalty statement.
Examples of viewer recording according to the present invention are shown in figs. 6 and 7. As shown in figs. 6 and 7, the television viewer recording system 600 (fig. 7) records the viewing habits of a number of statistically selected households. The television viewer recording system 600 includes a home statistics apparatus 604 located within a statistically selected home. The home statistics apparatus 604 may include an audience composition determination apparatus 606 (hereinafter referred to as "people stats 606"). The people stats 606 enable viewers to indicate their presence with a remote control 608 and/or a key switch 610. Alternatively (or in addition), a personal tag 612 may be worn by a viewer and may periodically transmit identification information to people stats 606. Each viewer in a household may have a personal tag 612 that sends unique information identifying that viewer. In addition to (or instead of) receiving information from the remote control 608, the key switch 610 and/or the personal tags 612, people stats 606 may include infrared cameras and computer image processing systems (not shown) to passively identify viewers in the viewing audience without active participation by the identified viewers. Such systems are disclosed in U.S. patent Nos. 4,858,000 and 5,031,228 and in U.S. patent application No. 07/992,383, filed on 12/15/1992. Thus, people stats 606 identify the viewers in the audience. It is desirable, but not necessary, to place people stats 606 close to the television being monitored, such as television 614.
Although viewer recording may be limited to television viewing within the home, viewing or tuning that takes place outside the home may clearly also be recorded. To this end, a portable statistics device 616 is provided. The portable statistics device 616, which may be referred to as a personal statistics device, may be carried by a household member when away from the home. The portable statistics device 616 is capable of recording the programs or stations tuned by televisions in its vicinity. The portable statistics device 616 may also be used with a portable television 618.
As shown in fig. 7, the television viewer recording system 600 generally includes a home statistics apparatus 604 that is installed in each of a plurality of statistically selected households and receives signals from one or more signal sources 620. The viewer recording system 600 also includes a central office device 622, located at a central office, that collects data from the home statistics apparatus 604 and from external program recording sources, as represented by arrow 626. The central office device 622 processes the data collected from the home statistics apparatus 604 and/or from the external program recording sources to generate audience record reports.
Although fig. 7 schematically depicts the program signal source 620 as a broadcast transmitting antenna whose program signals are received by antennas 628 in the statistically selected homes, it should be understood that program signals may be distributed by a variety of means, such as coaxial cable, fiber optic cable, satellite, rental video tape, video disc, and the like. Additionally, while fig. 7 depicts a television program signal being distributed to a television receiver 614 in a statistically selected household, it will be appreciated that the present invention, as discussed below, is equally applicable to radio signals or to any other video or audio source, such as a cassette tape, CD, etc.
The home statistics apparatus 604 of the television viewer recording system 600 preferably includes a data storage and communication processor 630 in communication with a communication processor 634 of the central office apparatus 622 via a public switched telephone network 632.
The home statistics apparatus 604 also includes a tuning recording device 636 for each monitored television 614. Each tuning recording device 636 includes one or more sensors 638, a signal preprocessing circuit 640, a home code reader 642, and a home signature extractor 644. There are many types of sensors, any of which may be used as sensor 638. For example, one type of sensor 638 is physically connected to the audio circuitry of the monitored television 614. However, the preferred sensor for sensor 638 is a non-invasive sensor, such as a microphone or a magnetic transducer. A microphone or the like may be mounted near the monitored television to pick up the sound emanating from its speaker in a non-intrusive manner. Because the installation is non-intrusive, the monitored television 614 need not be opened in order to electrically connect the sensor 638 to it, which avoids the annoyance that would otherwise result.
Because the microphone used as sensor 638 may also pick up other sounds in the surrounding environment, a second microphone 646 may be installed so that it picks up more background noise and less sound from the speaker of the monitored television 614. The output of the second microphone 646 is used by the signal preprocessing circuit 640 to at least partially remove the background noise. This is done by the well-known technique of amplitude matching the signals from microphones 638 and 646 and then subtracting the signal from one of the microphones 638, 646 from the signal from the other. In addition, the signal preprocessing circuit 640 may employ an input filter that passes, for example, only sound signals in the 300Hz to 3000Hz passband, thereby eliminating traffic noise and artifacts introduced by the response characteristics of household appliances or devices. Another example of a non-invasive sensor that may be used as sensor 638 is an inductive audio pickup operating in association with the audio output circuitry of the monitored television 614.
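The amplitude matching and subtraction step can be illustrated numerically. The sketch below is one possible realization, not necessarily the one intended by the text: it assumes time-aligned sample sequences from the two microphones and chooses the matching gain by least squares, which is an assumption introduced here for illustration. The function names are hypothetical.

```python
# Sketch of two-microphone background noise removal: scale the background
# signal to match its amplitude in the primary signal, then subtract.
# The least-squares gain is an illustrative choice of amplitude matching.


def matching_gain(primary, background):
    """Gain that best matches the background signal to the primary signal."""
    energy = sum(b * b for b in background)
    if energy == 0.0:
        return 0.0  # silent background microphone: nothing to subtract
    return sum(p * b for p, b in zip(primary, background)) / energy


def subtract_background(primary, background):
    """Remove the amplitude-matched background from the primary signal."""
    gain = matching_gain(primary, background)
    return [p - gain * b for p, b in zip(primary, background)]
```

If the primary microphone picks up the program sound plus half the amplitude of the noise seen by the second microphone, and the program sound is uncorrelated with the noise over the block, the computed gain is 0.5 and the subtraction recovers the program sound.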
The sensor 638 is configured to acquire at least a portion of the program signal corresponding to the program or station selected for viewing on the television 614. The portions of the program signal acquired by the sensor 638 are preprocessed as required by the signal preprocessing circuit 640. The signal preprocessing circuit 640 provides the preprocessed program signals simultaneously to a home code reader 642 and to a home signature extractor 644. The home code reader 642 attempts to locate and read ancillary codes from the program signals corresponding to the one or more programs or stations selected by viewers in the statistically selected home. The home signature extractor 644 generates program signatures from those program signals whenever the home code reader 642 does not find an ancillary code.
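The division of labor between the code reader and the signature extractor amounts to a simple fallback. The sketch below shows only that control flow. The `read_code` and `extract_signature` callables are hypothetical placeholders supplied by the caller, and nothing here represents an actual code reading or signature algorithm.

```python
# Sketch of the code-first, signature-fallback identification flow.
# read_code and extract_signature are caller-supplied placeholders.


def identify_segment(segment, read_code, extract_signature):
    """Return ("code", value) if a code is found, else ("signature", value)."""
    code = read_code(segment)
    if code is not None:
        return ("code", code)
    # No ancillary code was found: fall back to a program signature.
    return ("signature", extract_signature(segment))
```

A record tagged `"code"` can be used directly to identify the program or station, whereas a record tagged `"signature"` still has to be compared against reference signatures.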
The home code reader 642 may be any of the readers similar to those described above with reference to FIG. 3. The ancillary code may be in any form so long as the associated program and/or station is uniquely identified by the code. Also, as pointed out by Thomas et al in U.S. patent 5,425,100, the ancillary code may comprise a plurality of portions, each portion containing unique source information, such that the information in each portion represents a selected one of a plurality of allocation levels for the associated program.
Because the ancillary codes can carry all the information needed to identify broadcast transmissions, and because code readers are well known, viewer recording systems employing coded program transmissions are economically very attractive. In addition, the code reader reading the auxiliary code may be equipped with a suitable checking algorithm or the like, so that the number of errors occurring in accurately reading the auxiliary code (e.g., the multilevel code described by Thomas et al in U.S. Pat. No. 5,425,100) may be very small.
A problem with systems that rely solely on ancillary codes is that not all programs or stations carry a usable ancillary code. Therefore, it is preferable to further include a signature extractor for extracting signatures from the program signal. These signatures may be used when no ancillary code is included in the program being viewed. Thus, in addition to the home code reader 642, the home statistics apparatus 604 also includes a home signature extractor 644, which can collect signatures from received program signals in which no ancillary code can be read. These signatures are unique to the program signals from which they are extracted and can therefore be used to identify the program or station being viewed. The home signature extractor 644 may be of the type disclosed in U.S. patent No. 4,697,209 to Kiewit et al., which is incorporated herein by reference.
The data storage and communication processor 630 selectively stores the ancillary codes read by the home code reader 642 and/or the signatures extracted by the home signature extractor 644. It should be noted that if an ancillary code is only partially readable by the home code reader 642, the data storage and communication processor 630 may also store the portion of the ancillary code that was read (e.g., a segment or partial segment of a multi-level code) for use in the viewer recording system 600.
Where a portable statistics device 616 is used, it may be similar to the home statistics apparatus 604 and may have one or more sensors 638 for use inside or outside a statistically selected home. The data it produces is temporarily stored in random access memory 648 so that it may be transferred from time to time to the data storage and communication processor 630 via interface circuitry 650 (e.g., a first modem in the portable statistics device 616) and corresponding interface circuitry 652 (e.g., a second modem associated with the data storage and communication processor 630). Data may be communicated between interface circuits 650 and 652 via direct electrical connections, radio frequency transmissions, pulsed infrared signaling, and the like, as is well known in the art. The portable statistics device 616 also includes signal preprocessing circuitry 640, a code reader 642, and a signature extractor 644.
If the program or station cannot be identified from the ancillary code, either because no ancillary code is present or because the ancillary code is unclear, the program signature extracted by the home signature extractor 644 of the home statistics apparatus 604 may be compared with reference signatures previously extracted by the home statistics apparatus 604, by the central office device 622, or by reference signature extraction apparatus located at one or more local reference signature extraction sites, or the like. The comparison of the program signature with the reference signatures may be made at the home statistics apparatus 604, at the central office device 622, or at a reference signature extraction site, and the results of the comparison are used to identify the program or station being viewed.
Additionally, the means for extracting the reference signatures may comprise a program replication device for each received channel, as described in U.S. patent 4,677,466. The program replication device generates a copy of the monitored program and stores the copy in memory so that it can later be retrieved by the central office computer 654 of the central office device 622. An operator may then view the program on the multimedia terminal 656 to identify the unencoded program. The multimedia terminal 656 may include an image display and a speaker. Although the program replication device may be a VCR system as taught in U.S. patent 4,677,466, it is preferably a signal compression device that produces digital copies of the monitored program. The digital copy may thus be transmitted over the public switched telephone network, if necessary, so that the compressed data may be used to reproduce a facsimile of at least a portion of the unencoded program. The operator may view the facsimile at the terminal 656 to identify the unencoded program.
Various compression methods known in the art may be used to transmit digital copies of the monitored program or station. The video signal may be compressed, for example, in accordance with the method described in the report "Application of compressed videos to Image Compression" by W.R. The audio signal may be compressed according to the method described in the paper by Stautner presented at the 93rd Audio Engineering Society Convention (October 1-4, 1992). However, other suitable compression techniques may be employed.
Alternatively, the home signature extractor 644 may be a home channel and/or station detector that identifies the selected channels and/or stations. In that case, the household members' selection of a channel and/or station may be used when no ancillary code is included in the program being viewed. Thus, in addition to the home ancillary code reader 642, a home channel and/or station detector may be included in the home statistics apparatus 604 so that the channels and/or stations selected by household members can be determined and collected when the ancillary code cannot be detected.
When a home channel and/or station detector is used in place of the home signature extractor 644 and a household member performs a control action using the remote control 608, the signal from the remote control 608 is received both by the television receiver and by one of the sensors 638 of the tuning recording device 636. Thus, if the home ancillary code reader 642 is unable to locate and/or read a valid ancillary code from a program signal corresponding to a program or station selected by one or more members of the household, the channel and/or station detected by the home channel and/or station detector 644 may instead provide information about the viewing habits of the members of the household.
Additionally, if the home ancillary code reader 642 is unable to locate and/or read a valid ancillary code from a program signal corresponding to a program or station selected by one or more members of the household, the tuning recording device 636 may be arranged to prompt those members to enter the selected channel and/or station using an input means such as the remote control 608, the key switch 610 of the people stats 606, a voice recognition sensor, etc. The prompt may be provided by the television receiver 614, using on-screen information, or by a transducer or display 658. Such a transducer or display 658 may provide an audio signal, synthesized voice information from a speaker, an image display, or flashing information on an LED, CRT, or LCD, etc. The response to the prompt may be received by an appropriate one of the sensors 638 or by another microphone and stored for eventual transmission to the central office device 622.
It is understood that the present invention may take many forms and embodiments. The above description of embodiments is non-limiting, and variations may be made without departing from the scope and spirit of the invention. For example, the encoder 200 may be more numerous than the transcoders 220 and 224, particularly if it is known whether the ancillary code will or will not be subject to lossy compression prior to its transmission. In addition, the functions of the transcoders 220, 222, 224, the receive synchronization circuit 312, and the decoder circuit 33 of the FSK decoder 314 may be performed by a digital signal processor, if desired. In addition, the term "broadcast" or "broadcasting" as used herein means the transfer of signals between two or more points. Examples include signal transmission between broadcasters, between two cable stations, between a broadcaster or cable station and a residential, commercial or industrial facility, and between a VCR or other tape drive, a cassette tape drive, a magnetic disk drive, an optical disk drive, a computer or a solid state player and a receiver or other display. The transfer may take place over the air, over cable, over satellite links, or through a conductive medium, etc.
Although embodiments of the present invention have been described above, various modifications, alterations and substitutions may be made to those embodiments, and in some instances some features of the present invention may be employed without a corresponding use of other features. Additionally, as used in the claims, an audio signal source may include a television program, radio channel and/or television channel, song, CD, laser disc, tape, computer program, interactive program, game, program originator, network, local station, syndicator, cable company, and the like. Furthermore, the present invention may also be used in a signal talent system that determines the channel tuned by a tuner. The following claims should therefore be construed broadly and in a manner consistent with the scope of the invention.