Following the terms of the noisy-channel coding theorem, the channel capacity of a given channel is the highest information rate (in units of information per unit time) that can be achieved with arbitrarily small error probability.[1][2]
Information theory, developed by Claude E. Shannon in 1948, defines the notion of channel capacity and provides a mathematical model by which it may be computed. The key result states that the capacity of the channel, as defined above, is given by the maximum of the mutual information between the input and output of the channel, where the maximization is with respect to the input distribution.[3]
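For example (a standard textbook illustration, not drawn from the text above), the definition specializes to the binary symmetric channel with crossover probability $p$ as follows:

$$C = \sup_{p_X(x)} I(X;Y), \qquad C_{\mathrm{BSC}} = 1 - H_b(p), \qquad H_b(p) = -p\log_2 p - (1-p)\log_2(1-p).$$

A BSC with $p = 0.1$ therefore has capacity $1 - H_b(0.1) \approx 0.531$ bits per channel use, achieved by the uniform input distribution.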
The notion of channel capacity has been central to the development of modern wireline and wireless communication systems, with the advent of novel error correction coding mechanisms that achieve performance very close to the limits promised by channel capacity.
Channel capacity is additive over independent channels.[4] This means that using two independent channels in a combined manner provides the same theoretical capacity as using them independently. More formally, let $p_1$ and $p_2$ be two independent channels modelled as above; $p_1$ having an input alphabet $\mathcal{X}_1$ and an output alphabet $\mathcal{Y}_1$. Idem for $p_2$ with $\mathcal{X}_2$ and $\mathcal{Y}_2$. We define the product channel $p_1 \times p_2$ as

$$\forall (x_1, x_2) \in (\mathcal{X}_1, \mathcal{X}_2),\ (y_1, y_2) \in (\mathcal{Y}_1, \mathcal{Y}_2),\qquad (p_1 \times p_2)\big((y_1, y_2) \mid (x_1, x_2)\big) = p_1(y_1 \mid x_1)\, p_2(y_2 \mid x_2).$$
This theorem states:

$$C(p_1 \times p_2) = C(p_1) + C(p_2).$$
Proof
We first show that $C(p_1 \times p_2) \geq C(p_1) + C(p_2)$.
Let $X_1$ and $X_2$ be two independent random variables. Let $Y_1$ be a random variable corresponding to the output of $X_1$ through the channel $p_1$, and $Y_2$ for $X_2$ through $p_2$.
By definition, $C(p_1 \times p_2) = \sup_{p_{X_1, X_2}} I(X_1, X_2 : Y_1, Y_2)$.
Since $X_1$ and $X_2$ are independent, as well as $p_1$ and $p_2$, $(X_1, Y_1)$ is independent of $(X_2, Y_2)$. We can apply the following property of mutual information: $I(X_1, X_2 : Y_1, Y_2) = I(X_1 : Y_1) + I(X_2 : Y_2)$.
For now we only need to find a distribution $p_{X_1, X_2}$ such that $I(X_1, X_2 : Y_1, Y_2) \geq I(X_1 : Y_1) + I(X_2 : Y_2)$. In fact, $\pi_1$ and $\pi_2$, two probability distributions for $X_1$ and $X_2$ achieving $C(p_1)$ and $C(p_2)$, suffice:

$$C(p_1 \times p_2) \geq I(X_1, X_2 : Y_1, Y_2) = I(X_1 : Y_1) + I(X_2 : Y_2) = C(p_1) + C(p_2),$$

i.e. $C(p_1 \times p_2) \geq C(p_1) + C(p_2)$.
Now let us show that $C(p_1 \times p_2) \leq C(p_1) + C(p_2)$.
Let $\pi_{12}$ be some distribution for the channel $p_1 \times p_2$ defining $(X_1, X_2)$ and the corresponding output $(Y_1, Y_2)$. Let $\mathcal{X}_1$ be the alphabet of $X_1$, $\mathcal{Y}_1$ for $Y_1$, and analogously $\mathcal{X}_2$ and $\mathcal{Y}_2$. Because the channels are independent, the transition probability factors as $p(y_1, y_2 \mid x_1, x_2) = p_1(y_1 \mid x_1)\, p_2(y_2 \mid x_2)$, so $H(Y_1, Y_2 \mid X_1, X_2) = H(Y_1 \mid X_1) + H(Y_2 \mid X_2)$, while subadditivity of entropy gives $H(Y_1, Y_2) \leq H(Y_1) + H(Y_2)$. Hence

$$I(X_1, X_2 : Y_1, Y_2) \leq I(X_1 : Y_1) + I(X_2 : Y_2) \leq C(p_1) + C(p_2)$$

for every input distribution $\pi_{12}$; taking the supremum over $\pi_{12}$ yields $C(p_1 \times p_2) \leq C(p_1) + C(p_2)$, which completes the proof.
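The additivity theorem can also be checked numerically (an illustrative sketch, not part of the original proof; the channel parameters are arbitrary). A random search over input distributions of the product of two binary symmetric channels approaches, and never exceeds, the sum of the individual capacities:

```python
import numpy as np

def mutual_information(p_x, W):
    """I(X;Y) in bits for input distribution p_x and channel matrix W[x, y] = P(y|x)."""
    p_xy = p_x[:, None] * W                    # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                     # output marginal
    mask = p_xy > 0                            # skip zero-probability cells
    return float((p_xy[mask] * np.log2(p_xy[mask] / np.outer(p_x, p_y)[mask])).sum())

def bsc(p):
    return np.array([[1 - p, p], [p, 1 - p]])

W1, W2 = bsc(0.1), bsc(0.25)
W12 = np.kron(W1, W2)                          # product channel: 4 inputs, 4 outputs

grid = np.linspace(0.01, 0.99, 99)
c1 = max(mutual_information(np.array([q, 1 - q]), W1) for q in grid)
c2 = max(mutual_information(np.array([q, 1 - q]), W2) for q in grid)

rng = np.random.default_rng(0)
c12 = max(mutual_information(rng.dirichlet(np.ones(4)), W12) for _ in range(20_000))

print(c1 + c2, c12)   # c12 approaches c1 + c2 from below, never exceeding it
```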
If G is an undirected graph, it can be used to define a communications channel in which the symbols are the graph vertices, and two codewords may be confused with each other if their symbols in each position are equal or adjacent. The computational complexity of finding the Shannon capacity of such a channel remains open, but it can be upper bounded by another important graph invariant, the Lovász number.[5]
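For illustration (a minimal sketch, not from the article itself), the Lovász number is the value of a small semidefinite program, written below with the cvxpy library; an SDP-capable solver such as SCS is assumed to be installed. For the 5-cycle $C_5$ it returns $\vartheta(C_5) = \sqrt{5} \approx 2.236$, which Lovász proved is also the Shannon capacity of $C_5$.

```python
import cvxpy as cp

def lovasz_number(n, edges):
    """Lovász theta: maximize <J, X> s.t. tr(X) = 1, X_ij = 0 on edges, X PSD."""
    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0, cp.trace(X) == 1]
    constraints += [X[i, j] == 0 for i, j in edges]
    problem = cp.Problem(cp.Maximize(cp.sum(X)), constraints)
    problem.solve()
    return problem.value

# 5-cycle: theta = sqrt(5) ≈ 2.236, matching the Shannon capacity of C5
print(lovasz_number(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]))
```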
The noisy-channel coding theorem states that for any error probability ε > 0 and for any transmission rate R less than the channel capacity C, there is an encoding and decoding scheme transmitting data at rate R whose error probability is less than ε, for a sufficiently large block length. Also, for any rate greater than the channel capacity, the probability of error at the receiver goes to 0.5 as the block length goes to infinity.
C is measured in bits per second if the logarithm is taken in base 2, or nats per second if the natural logarithm is used, assuming B is in hertz; the signal and noise powers S and N are expressed in a linear power unit (like watts or volts²). Since S/N figures are often cited in dB, a conversion may be needed. For example, a signal-to-noise ratio of 30 dB corresponds to a linear power ratio of $10^{30/10} = 10^3 = 1000$.
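In code, the dB conversion and the resulting Shannon–Hartley capacity might look as follows (a minimal sketch; the 3 kHz telephone-channel bandwidth is an illustrative value, not taken from the text):

```python
import math

def awgn_capacity(bandwidth_hz, snr_db):
    """Shannon–Hartley theorem: C = B * log2(1 + S/N), in bits per second."""
    snr_linear = 10 ** (snr_db / 10)      # e.g. 30 dB -> a power ratio of 1000
    return bandwidth_hz * math.log2(1 + snr_linear)

# A 3 kHz channel at 30 dB SNR supports roughly 29.9 kbit/s.
print(awgn_capacity(3_000, 30))
```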
To determine the channel capacity, it is necessary to find the capacity-achieving distribution and evaluate the mutual information. Research has mostly focused on studying additive noise channels under certain power constraints and noise distributions, as analytical methods are not feasible in the majority of other scenarios. Hence, alternative approaches such as investigation of the input support,[6] relaxations,[7] and capacity bounds[8] have been proposed in the literature.
The capacity of a discrete memoryless channel can be computed using the Blahut–Arimoto algorithm.
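A compact implementation might look like the following (a sketch of the standard alternating-maximization updates, assuming the channel is given as a row-stochastic transition matrix; not a reference implementation):

```python
import numpy as np

def blahut_arimoto(W, tol=1e-10, max_iter=10_000):
    """Capacity (in bits) and optimal input distribution for a DMC.

    W[x, y] = P(y | x) is the row-stochastic channel transition matrix.
    """
    r = np.full(W.shape[0], 1.0 / W.shape[0])     # start from the uniform input
    for _ in range(max_iter):
        q = r[:, None] * W                        # joint p(x, y)
        q /= q.sum(axis=0, keepdims=True)         # posterior q(x | y)
        with np.errstate(divide='ignore'):
            logq = np.where(W > 0, np.log(q), 0.0)
        r_new = np.exp((W * logq).sum(axis=1))    # r(x) ∝ exp(Σ_y W(y|x) log q(x|y))
        r_new /= r_new.sum()
        if np.abs(r_new - r).max() < tol:
            r = r_new
            break
        r = r_new
    p_y = r @ W                                   # output marginal
    ratio = np.where(W > 0, W / p_y, 1.0)         # avoid log(0) where W = 0
    capacity = (r[:, None] * W * np.log2(ratio)).sum()
    return capacity, r

# Binary symmetric channel with crossover 0.1: C = 1 - H_b(0.1) ≈ 0.531 bits/use
p = 0.1
C, r = blahut_arimoto(np.array([[1 - p, p], [p, 1 - p]]))
print(C, r)    # ≈ 0.531, [0.5, 0.5]
```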
Deep learning can be used to estimate the channel capacity. In fact, the channel capacity and the capacity-achieving distribution of any discrete-time continuous memoryless vector channel can be obtained using CORTICAL,[9] a cooperative framework inspired by generative adversarial networks. CORTICAL consists of two cooperative networks: a generator that learns to sample from the capacity-achieving input distribution, and a discriminator that learns to distinguish paired from unpaired channel input–output samples and thereby produces the capacity estimate.
This section[10] focuses on the single-antenna, point-to-point scenario. For channel capacity in systems with multiple antennas, see the article on MIMO.
Figure: AWGN channel capacity with the power-limited regime and bandwidth-limited regime indicated. Here, $\bar{P}/N_0 = 1$; $B$ and $C$ can be scaled proportionally for other values.
If the average received power is $\bar{P}$ [W], the total bandwidth is $B$ in hertz, and the noise power spectral density is $N_0$ [W/Hz], the AWGN channel capacity is

$$C_{\text{AWGN}} = B \log_2\!\left(1 + \frac{\bar{P}}{N_0 B}\right)\ \text{[bits/s]},$$

where $\frac{\bar{P}}{N_0 B}$ is the received signal-to-noise ratio (SNR). This result is known as the Shannon–Hartley theorem.[11]
When the SNR is large (SNR ≫ 0 dB), the capacity $C \approx B \log_2 \frac{\bar{P}}{N_0 B}$ is logarithmic in power and approximately linear in bandwidth. This is called the bandwidth-limited regime.
When the SNR is small (SNR ≪ 0 dB), the capacity $C \approx \frac{\bar{P}}{N_0 \ln 2}$ is linear in power but insensitive to bandwidth. This is called the power-limited regime.
The bandwidth-limited regime and power-limited regime are illustrated in the figure.
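The two regimes are easy to see numerically (an illustrative sketch; the fixed $\bar{P}/N_0 = 1$ matches the figure, while the bandwidth sweep is arbitrary):

```python
import numpy as np

P_over_N0 = 1.0                             # received power / noise PSD, fixed
for B in [0.01, 0.1, 1.0, 10.0, 100.0]:     # bandwidth sweep, in Hz
    C = B * np.log2(1 + P_over_N0 / B)      # exact AWGN capacity, bits/s
    print(f"B = {B:7.2f} Hz  ->  C = {C:.4f} bit/s")

# As B grows, C saturates at (P/N0) * log2(e) ≈ 1.4427 bit/s: the power-limited regime.
print(P_over_N0 * np.log2(np.e))
```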
In a slow-fading channel, where the coherence time is greater than the latency requirement, there is no definite capacity as the maximum rate of reliable communications supported by the channel, $\log_2(1 + |h|^2 \mathrm{SNR})$, depends on the random channel gain $|h|^2$, which is unknown to the transmitter. If the transmitter encodes data at rate $R$ [bits/s/Hz], there is a non-zero probability that the decoding error probability cannot be made arbitrarily small,

$$p_{out} = \mathbb{P}\big(\log_2(1 + |h|^2 \mathrm{SNR}) < R\big),$$

in which case the system is said to be in outage. With a non-zero probability that the channel is in deep fade, the capacity of the slow-fading channel in the strict sense is zero. However, it is possible to determine the largest value of $R$ such that the outage probability $p_{out}$ is less than $\epsilon$. This value is known as the $\epsilon$-outage capacity.
In a fast-fading channel, where the latency requirement is greater than the coherence time and the codeword length spans many coherence periods, one can average over many independent channel fades by coding over a large number of coherence time intervals. Thus, it is possible to achieve a reliable rate of communication of $\mathbb{E}\big(\log_2(1 + |h|^2 \mathrm{SNR})\big)$ [bits/s/Hz] and it is meaningful to speak of this value as the capacity of the fast-fading channel.
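Both quantities are straightforward to estimate by Monte Carlo simulation (an illustrative sketch assuming Rayleigh fading, so that $|h|^2$ is exponentially distributed; the 10 dB SNR and 1% outage target are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
snr = 10.0                                  # average SNR, linear (10 dB)
h2 = rng.exponential(size=1_000_000)        # |h|^2 under Rayleigh fading
rates = np.log2(1 + h2 * snr)               # instantaneous rate, bits/s/Hz

ergodic = rates.mean()                      # fast-fading (ergodic) capacity
eps = 0.01                                  # outage target
r_out = np.quantile(rates, eps)             # largest R with P(rate < R) <= eps
print(f"ergodic: {ergodic:.3f} bit/s/Hz, {eps:.0%}-outage: {r_out:.3f} bit/s/Hz")
```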
Feedback capacity is the greatest rate at which information can be reliably transmitted, per unit time, over a point-to-point communication channel in which the receiver feeds back the channel outputs to the transmitter. Information-theoretic analysis of communication systems that incorporate feedback is more complicated and challenging than without feedback. Possibly, this was the reason C. E. Shannon chose feedback as the subject of the first Shannon Lecture, delivered at the 1973 IEEE International Symposium on Information Theory in Ashkelon, Israel.
The feedback capacity is characterized by the maximum of the directed information between the channel inputs and the channel outputs, where the maximization is with respect to the causal conditioning of the input given the output. Directed information was coined by James Massey[12] in 1990, who showed that it is an upper bound on feedback capacity. For memoryless channels, Shannon showed[13] that feedback does not increase the capacity, and the feedback capacity coincides with the channel capacity characterized by the mutual information between the input and the output. A closed-form expression for the feedback capacity is known only for several examples, such as the trapdoor channel[14] and the Ising channel.[15][16] For some other channels, it is characterized through constant-size optimization problems, such as the binary erasure channel with a no-consecutive-ones input constraint[17] and the NOST channel.[18]
The basic mathematical model for a communication system is the following:
Figure: Communication with feedback.
Here is the formal definition of each element (where the only difference with respect to the nonfeedback capacity is the encoder definition):
$W$ is the message to be transmitted, taken in an alphabet $\mathcal{W}$;
$X_i$ is the channel input symbol ($X^n$ is a sequence of $n$ symbols) taken in an alphabet $\mathcal{X}$;
$Y_i$ is the channel output symbol ($Y^n$ is a sequence of $n$ symbols) taken in an alphabet $\mathcal{Y}$;
$\hat{W}$ is the estimate of the transmitted message;
$f_i : \mathcal{W} \times \mathcal{Y}^{i-1} \to \mathcal{X}$ is the encoding function at time $i$, for a block of length $n$;
$g : \mathcal{Y}^n \to \mathcal{W}$ is the decoding function for a block of length $n$.
That is, for each time $i$ there exists a feedback of the previous output $Y_{i-1}$ such that the encoder has access to all previous outputs $Y^{i-1}$. A $(2^{nR}, n)$ code is a pair of encoding and decoding mappings with $\hat{W} = g(Y^n)$, and $W$ is uniformly distributed. A rate $R$ is said to be achievable if there exists a sequence of codes $(2^{nR}, n)$ such that the average probability of error $P_e^{(n)} = \Pr(\hat{W} \neq W)$ tends to zero as $n \to \infty$.
The feedback capacity is denoted by $C_{\text{feedback}}$, and is defined as the supremum over all achievable rates.
When the Gaussian noise is colored, the channel has memory. Consider for instance the simple case of an autoregressive noise process $z_i = z_{i-1}/2 + w_i$, where $w_i \sim \mathcal{N}(0, 1)$ is an i.i.d. process.
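A few lines of simulation make the memory visible (an illustrative sketch using the coefficient $1/2$ from the example above):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
w = rng.normal(size=n)             # i.i.d. N(0, 1) innovations
z = np.zeros(n)
for i in range(1, n):
    z[i] = z[i - 1] / 2 + w[i]     # AR(1) colored Gaussian noise

# Lag-1 autocorrelation ≈ 0.5: successive noise samples are correlated,
# so the channel has memory.
print(np.corrcoef(z[:-1], z[1:])[0, 1])
```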
The feedback capacity is difficult to compute in the general case. If the channel is discrete, some solution techniques related to control theory and Markov decision processes can be applied.
^ Saleem Bhatti. "Channel capacity". Lecture notes for M.Sc. Data Communication Networks and Distributed Systems D51 – Basic Communications and Networks. Archived from the original on 2007-08-21.
^ Cover, Thomas M.; Thomas, Joy A. (2006). "Chapter 7: Channel Capacity". Elements of Information Theory (Second ed.). Wiley-Interscience. pp. 206–207. ISBN 978-0-471-24195-9.
^ Huang, J.; Meyn, S. P. (2005). "Characterization and Computation of Optimal Distributions for Channel Coding". IEEE Transactions on Information Theory. 51 (7): 2336–2351. doi:10.1109/TIT.2005.850108. ISSN 0018-9448. S2CID 2560689.
^ McKellips, A. L. (2004). "Simple tight bounds on capacity for the peak-limited discrete-time channel". International Symposium on Information Theory, 2004. ISIT 2004. Proceedings. IEEE. p. 348. doi:10.1109/ISIT.2004.1365385. ISBN 978-0-7803-8280-0. S2CID 41462226.
^ Shannon, C. (September 1956). "The zero error capacity of a noisy channel". IEEE Transactions on Information Theory. 2 (3): 8–19. doi:10.1109/TIT.1956.1056798.
^ Elishco, Ohad; Permuter, Haim (September 2014). "Capacity and Coding for the Ising Channel With Feedback". IEEE Transactions on Information Theory. 60 (9): 5138–5149. arXiv:1205.4674. doi:10.1109/TIT.2014.2331951. S2CID 9761759.
^ Aharoni, Ziv; Sabag, Oron; Permuter, Haim H. (September 2022). "Feedback Capacity of Ising Channels With Large Alphabet via Reinforcement Learning". IEEE Transactions on Information Theory. 68 (9): 5637–5656. doi:10.1109/TIT.2022.3168729. S2CID 248306743.
^ Sabag, Oron; Permuter, Haim H.; Kashyap, Navin (2016). "The Feedback Capacity of the Binary Erasure Channel With a No-Consecutive-Ones Input Constraint". IEEE Transactions on Information Theory. 62 (1): 8–22. doi:10.1109/TIT.2015.2495239.
^ Shemuel, Eli; Sabag, Oron; Permuter, Haim H. (2022). "The Feedback Capacity of Noisy Output Is the State (NOST) Channels". IEEE Transactions on Information Theory. 68 (8): 5044–5059. arXiv:2107.07164. doi:10.1109/TIT.2022.3165538.
^ Permuter, Haim Henry; Weissman, Tsachy; Goldsmith, Andrea J. (February 2009). "Finite State Channels With Time-Invariant Deterministic Feedback". IEEE Transactions on Information Theory. 55 (2): 644–662. arXiv:cs/0608070. doi:10.1109/TIT.2008.2009849. S2CID 13178.