AES3 is a standard for the exchange of digital audio signals between professional audio devices. An AES3 signal can carry two channels of pulse-code-modulated digital audio over several transmission media, including balanced lines, unbalanced lines, and optical fiber.[1]
AES3 was jointly developed by the Audio Engineering Society (AES) and the European Broadcasting Union (EBU) and so is also known as AES/EBU. The standard was first published in 1985 and was revised in 1992 and 2003. AES3 has been incorporated into the International Electrotechnical Commission's standard IEC 60958, and is available in a consumer-grade variant known as S/PDIF.
The development of standards for digital audio interconnect for both professional and domestic audio equipment began in the late 1970s[2] in a joint effort between the Audio Engineering Society and the European Broadcasting Union, and culminated in the publication of AES3 in 1985. The AES3 standard was revised in 1992 and 2003 and is published in AES and EBU versions.[1] Early on, the standard was frequently known as AES/EBU.
Variants using different physical connections are specified in IEC 60958. These are essentially consumer versions of AES3 for use within the domestic high fidelity environment using connectors more commonly found in the consumer market. These variants are commonly known as S/PDIF.
IEC 60958 (formerly IEC 958) is the International Electrotechnical Commission's standard on digital audio interfaces. It reproduces the AES3 professional digital audio interconnect standard and the consumer version of the same, S/PDIF.
The standard consists of several parts:
AES-2id is an information document published by the Audio Engineering Society[3] that gives guidelines for the use of the AES3 interface (AES Recommended Practice for Digital Audio Engineering, Serial transmission format for two-channel linearly represented digital audio data). It also describes related standards used in conjunction with AES3, such as AES11. The full AES-2id document can be downloaded as a PDF file from the standards section of the Audio Engineering Society web site.[4]
The AES3 standard parallels part 4 of the international standard IEC 60958. Of the physical interconnection types defined by IEC 60958, two are in common use.
Type I connections use balanced, three-conductor, 110-ohm twisted pair cabling with XLR connectors. Type I connections are most often used in professional installations and are considered the standard connector for AES3. The hardware interface is usually implemented using RS-422 line drivers and receivers.
Direction | Cable end | Device end |
---|---|---|
Input | XLR male plug | XLR female jack |
Output | XLR female plug | XLR male jack |
IEC 60958 Type II defines an unbalanced electrical or optical interface for consumer electronics applications. The precursor of the IEC 60958 Type II specification was the Sony/Philips Digital Interface, or S/PDIF. Both were based on the original AES/EBU work. S/PDIF and AES3 are interchangeable at the protocol level, but at the physical level, they specify different electrical signalling levels and impedances, which may be significant in some applications.
AES/EBU signals can also be carried over unbalanced 75-ohm coaxial cable with BNC connectors. The unbalanced version can cover much longer distances than the balanced version, which is limited to a maximum of 150 meters.[5] The AES-3id standard defines a 75-ohm BNC electrical variant of AES3. This uses the same cabling, patching and infrastructure as analogue or digital video, and is thus common in the broadcast industry.
AES3 was designed primarily to support stereo PCM encoded audio in either DAT format at 48 kHz or CD format at 44.1 kHz. No attempt was made to use a carrier able to support both rates; instead, AES3 allows the data to be run at any rate, and encodes the clock and the data together using biphase mark code (BMC).
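As an illustration of how biphase mark code combines clock and data, the following minimal sketch encodes each data bit as two half-cells: the line level always toggles at the start of a bit, and toggles again in the middle of the bit only when the bit is 1. The function names and the `level` parameter (the assumed line state before the first bit) are illustrative and are not taken from the standard.

```python
def bmc_encode(bits, level=0):
    """Encode data bits as biphase mark code, two half-cells per bit.

    `level` is the assumed line state (0 or 1) before the first bit.
    """
    cells = []
    for bit in bits:
        level ^= 1              # transition at every bit boundary (the clock)
        cells.append(level)
        if bit:
            level ^= 1          # extra mid-bit transition encodes a 1
        cells.append(level)
    return cells


def bmc_decode(cells):
    """Recover data bits: a bit is 1 when its two half-cells differ."""
    return [int(a != b) for a, b in zip(cells[0::2], cells[1::2])]


if __name__ == "__main__":
    data = [1, 0, 0, 1, 1, 0]
    encoded = bmc_encode(data)
    print(encoded)
    print(bmc_decode(encoded) == data)
```

Because the data are carried by the presence or absence of transitions rather than by absolute levels, the decoder in this sketch gives the same result if the whole signal is inverted.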
Each bit occupies one time slot. Each audio sample (of up to 24 bits) is combined with four flag bits and a synchronisation preamble, which is four time slots long, to make a subframe of 32 time slots. The 32 time slots of each subframe are assigned as follows:
Time slot | Name | Description |
---|---|---|
0–3 | Preamble | A synchronisation preamble (biphase mark code violation) for audio blocks, frames, and subframes. |
4–7 | Auxiliary sample (optional) | A low-quality auxiliary channel used as specified in the channel status word, notably for producer talkback or recording studio-to-studio communication. |
8–27, or 4–27 | Audio sample | One sample stored with most significant bit (MSB) last. If the auxiliary sample is used, bits 4–7 are not included. Data with smaller sample bit depths always have MSB at bit 27 and are zero-extended towards the least significant bit (LSB). |
28 | Validity (V) | Unset if the audio data are correct and suitable for D/A conversion. During the presence of defective samples, the receiving equipment may be instructed to mute its output. It is used by most CD players to indicate that concealment rather than error correction is taking place. |
29 | User data (U) | Forms a serial data stream for each channel (with 1 bit per frame), with a format specified in the channel status word. |
30 | Channel status (C) | Bits from each frame of an audio block are collated giving a 192-bit channel status word. Its structure depends on whether AES3 or S/PDIF is used. |
31 | Parity (P) | Even parity bit for detection of errors in data transmission. Excludes the preamble; bits 4–31 have an even number of ones. |
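The slot assignment in the table can be sketched as a small packing routine. This is a simplified illustration rather than a reference implementation: it ignores the optional auxiliary sample, treats the sample as an unsigned code word, and uses a hypothetical function name and argument names. It returns only time slots 4 to 31, since the preamble is not made of ordinary data bits.

```python
def pack_subframe(sample, v=0, u=0, c=0, bits=24):
    """Return time slots 4..31 of one subframe as a list of 28 bits.

    `sample` is an unsigned code word `bits` wide (at most 24).
    Index 0 of the result is time slot 4; index 27 is time slot 31.
    """
    if not 1 <= bits <= 24:
        raise ValueError("sample width must be between 1 and 24 bits")
    sample &= (1 << bits) - 1
    # Left-justify so the MSB lands in time slot 27; unused LSBs stay zero.
    word = sample << (24 - bits)
    # LSB first: time slot 4 carries the least significant bit.
    slots = [(word >> i) & 1 for i in range(24)]
    slots += [v & 1, u & 1, c & 1]       # time slots 28, 29, 30
    slots.append(sum(slots) & 1)         # slot 31: even parity over slots 4..31
    return slots


if __name__ == "__main__":
    sub = pack_subframe(0x8000, bits=16)
    print(sub)                 # only the slot-27 bit and the parity bit are set
    print(sum(sub) % 2 == 0)   # parity check passes
```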
Two subframes (A and B, normally used for left and right audio channels) make a frame. Frames contain 64 bit periods and are produced once per audio sample period. At the highest level, each 192 consecutive frames are grouped into an audio block. While samples repeat each frame time, metadata is only transmitted once per audio block. At 48 kHz sample rate, there are 250 audio blocks per second, and 3,072,000 time slots per second supported by a 6.144 MHz biphase clock.[6]
The synchronisation preamble is specially coded to identify the subframe and its position within the audio block. Preambles are not normal BMC-encoded data bits, although they do still have zero DC bias.
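The figures quoted for 48 kHz follow directly from the frame structure, and the same arithmetic applies to other sample rates. The short sketch below reproduces it; the function name and dictionary keys are illustrative only.

```python
SLOTS_PER_SUBFRAME = 32
SUBFRAMES_PER_FRAME = 2
FRAMES_PER_BLOCK = 192


def aes3_rates(sample_rate_hz):
    """Derive block rate, time-slot rate and biphase clock from the sample rate."""
    slots_per_frame = SLOTS_PER_SUBFRAME * SUBFRAMES_PER_FRAME   # 64
    time_slots = sample_rate_hz * slots_per_frame                # one frame per sample period
    return {
        "blocks_per_second": sample_rate_hz / FRAMES_PER_BLOCK,
        "time_slots_per_second": time_slots,
        "biphase_clock_hz": 2 * time_slots,   # two half-cells per time slot
    }


if __name__ == "__main__":
    for rate in (48_000, 44_100):
        print(rate, aes3_rates(rate))
```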
Three preambles are possible. They are called X, Y and Z in the AES3 standard, and M, W and B in IEC 958 (an AES extension).
The 8-bit preambles are transmitted in the time allocated to the first four time slots of each subframe (time slots 0 to 3). Any of the three marks the beginning of a subframe. X or Z marks the beginning of a frame, and Z marks the beginning of an audio block.
    | 0 | 1 | 2 | 3 |  | 0 | 1 | 2 | 3 | Time slots
     _____       _            _____   _
    /     \_____/ \_/  \_____/     \_/ \ Preamble X
     _____     _              ___   ___
    /     \___/ \___/  \_____/   \_/   \ Preamble Y
     _____   _                _   _____
    /     \_/ \_____/  \_____/ \_/     \ Preamble Z
     ___     ___            ___     ___
    /   \___/   \___/  \___/   \___/   \ All 0 bits BMC encoded
     _   _   _   _        _   _   _   _
    / \_/ \_/ \_/ \_/  \_/ \_/ \_/ \_/ \ All 1 bits BMC encoded
    | 0 | 1 | 2 | 3 |  | 0 | 1 | 2 | 3 | Time slots
In two-channel AES3, the preambles form a pattern of ZYXYXYXY..., but it is straightforward to extend this structure to additional channels (more subframes per frame), each with a Y preamble, as is done in the MADI protocol.
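The preamble rules can be illustrated with a short sketch. The half-cell patterns below are read off the waveform diagram (starting from a low line level, and inverted when the preceding level is high); the function names and the `channels` parameter are hypothetical and only serve the illustration.

```python
# Half-cell patterns for a preamble that follows a low line level,
# as read from the waveform diagram; 1 = high, 0 = low.
PREAMBLE_CELLS = {"X": "11100010", "Y": "11100100", "Z": "11101000"}
FRAMES_PER_BLOCK = 192


def preamble_cells(name, prev_level=0):
    """Return the 8 half-cells of a preamble, inverted if the line was high."""
    cells = [int(ch) for ch in PREAMBLE_CELLS[name]]
    return [1 - cell for cell in cells] if prev_level else cells


def preamble_sequence(frames, channels=2):
    """List the preamble of every subframe for the given number of frames."""
    sequence = []
    for frame in range(frames):
        # Z opens an audio block, X opens any other frame.
        sequence.append("Z" if frame % FRAMES_PER_BLOCK == 0 else "X")
        # Every further subframe in the frame carries a Y preamble.
        sequence.extend(["Y"] * (channels - 1))
    return sequence


if __name__ == "__main__":
    print("".join(preamble_sequence(4)))
    print(preamble_cells("Z", prev_level=1))
```

Each pattern spends four half-cells high and four low, which is consistent with the zero DC bias noted above, and each begins with a run of three equal half-cells, which cannot occur in normal BMC data.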
There is one channel status bit in each subframe, a total of 192 bits or 24 bytes for each channel in each block. Between the AES3 and S/PDIF standards, the contents of the 192-bit channel status word differ significantly, although they agree that the first channel status bit distinguishes between the two. In the case of AES3, the standard describes, in detail, the function of each bit.[1]
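Collating the channel status bits of one block into 24 bytes might look like the sketch below. The bit order within each byte (first received bit stored as bit 0, the least significant bit) is an assumption made for this illustration, and the function name is hypothetical.

```python
def collate_channel_status(c_bits):
    """Pack the 192 C bits of one channel (one per frame, in block order) into 24 bytes.

    Assumption: the first bit received for each byte is stored as bit 0 (LSB).
    """
    if len(c_bits) != 192:
        raise ValueError("an audio block carries exactly 192 channel status bits")
    status = bytearray(24)
    for index, bit in enumerate(c_bits):
        status[index // 8] |= (bit & 1) << (index % 8)
    return bytes(status)


if __name__ == "__main__":
    bits = [0] * 192
    bits[0] = 1                      # first channel status bit of the block
    status = collate_channel_status(bits)
    print(status.hex())
    print("first status bit:", status[0] & 1)
```

In this layout the first channel status bit, which distinguishes AES3 from S/PDIF, is available as `status[0] & 1`.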
SMPTE timecode data can be embedded within AES3 signals. It can be used for synchronization and for logging and identifying audio content. It is embedded as a 32-bit binary word in bytes 18 to 21 of the channel status data.[8]
The AES11 standard provides information on the synchronization of digital audio structures.[9]
The AES52 standard describes how to insert unique identifiers into an AES3 bit stream.[10]
SMPTE 2110-31 defines how to encapsulate an AES3 data stream in Real-time Transport Protocol packets for transmission over an IP network using the SMPTE 2110 IP based multicast framework.[11]
SMPTE 302M-2007 defines how to encapsulate an AES3 data stream in an MPEG transport stream for television applications.[12]
AES3 digital audio format can also be carried over an Asynchronous Transfer Mode network. The standard for packing AES3 frames into ATM cells is AES47.
In 1977, stimulated by the growing need for standards in digital audio, the AES Digital Audio Standards Committee was formed.
Bytes 18 to 21, Bits 0 to 7: Time of day sample address code. Value (each Byte): 32-bit binary value representing the first sample of current block. LSBs are transmitted first. Default value shall be logic "0". Note: This is the time-of-day laid down during the source encoding of the signal and shall remain unchanged during subsequent operations. A value of all zeros for the binary sample address code shall, for the purposes of transcoding to real time, or to time codes in particular, be taken as midnight (i.e., 00 h, 00 min, 00 s, 00 frame). Transcoding of the binary number to any conventional time code requires accurate sampling frequency information to provide the sample accurate time.
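The transcoding described above can be sketched as follows. The sketch assumes a 24-byte channel status block in which byte 18 holds the least significant byte of the sample address (one reading of "LSBs are transmitted first") and assumes the sampling frequency is known exactly; the function names are hypothetical.

```python
def sample_address(status):
    """Read the 32-bit time-of-day sample address from bytes 18 to 21.

    Assumption: byte 18 is the least significant byte.
    """
    if len(status) != 24:
        raise ValueError("channel status block must be 24 bytes long")
    return int.from_bytes(status[18:22], "little")


def address_to_time(address, sample_rate_hz):
    """Convert a sample address to (hours, minutes, seconds, samples) since midnight."""
    total_seconds, samples = divmod(address, sample_rate_hz)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds, samples


if __name__ == "__main__":
    status = bytearray(24)
    # Hypothetical block stamped 01:01:01 plus 5 samples at 48 kHz.
    status[18:22] = (48_000 * 3661 + 5).to_bytes(4, "little")
    print(address_to_time(sample_address(bytes(status)), 48_000))
```

An all-zero address maps to midnight, as the note requires, and the leftover sample count would still need to be converted to frames for any particular time code format.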