The Nyquist–Shannon sampling theorem is an essential principle for digital signal processing linking the frequency range of a signal and the sample rate required to avoid a type of distortion called aliasing. The theorem states that the sample rate must be at least twice the bandwidth of the signal to avoid aliasing. In practice, it is used to select band-limiting filters to keep aliasing below an acceptable amount when an analog signal is sampled or when sample rates are changed within a digital signal processing function.

The Nyquist–Shannon sampling theorem is a theorem in the field of signal processing which serves as a fundamental bridge between continuous-time signals and discrete-time signals. It establishes a sufficient condition for a sample rate that permits a discrete sequence of samples to capture all the information from a continuous-time signal of finite bandwidth.
Strictly speaking, the theorem only applies to a class of mathematical functions having a Fourier transform that is zero outside of a finite region of frequencies. Intuitively we expect that when one reduces a continuous function to a discrete sequence and interpolates back to a continuous function, the fidelity of the result depends on the density (or sample rate) of the original samples. The sampling theorem introduces the concept of a sample rate that is sufficient for perfect fidelity for the class of functions that are band-limited to a given bandwidth, such that no actual information is lost in the sampling process. It expresses the sufficient sample rate in terms of the bandwidth for the class of functions. The theorem also leads to a formula for perfectly reconstructing the original continuous-time function from the samples.
Perfect reconstruction may still be possible when the sample-rate criterion is not satisfied, provided other constraints on the signal are known (see § Sampling of non-baseband signals below and compressed sensing). In some cases (when the sample-rate criterion is not satisfied), utilizing additional constraints allows for approximate reconstructions. The fidelity of these reconstructions can be verified and quantified utilizing Bochner's theorem.[1]
The name Nyquist–Shannon sampling theorem honours Harry Nyquist and Claude Shannon, but the theorem was also previously discovered by E. T. Whittaker (published in 1915), and Shannon cited Whittaker's paper in his work. The theorem is thus also known by the names Whittaker–Shannon sampling theorem (or simply Whittaker–Shannon) and Whittaker–Nyquist–Shannon, and may also be referred to as the cardinal theorem of interpolation.
Sampling is a process of converting a signal (for example, a function of continuous time or space) into a sequence of values (a function of discrete time or space). Shannon's version of the theorem states:[2]
Theorem — If a function $x(t)$ contains no frequencies higher than $B$ hertz, then it can be completely determined from its ordinates at a sequence of points spaced less than $\tfrac{1}{2B}$ seconds apart.
A sufficient sample-rate is therefore anything larger than $2B$ samples per second. Equivalently, for a given sample rate $f_s$, perfect reconstruction is guaranteed possible for a bandlimit $B < f_s/2$.
When the bandlimit is too high (or there is no bandlimit), the reconstruction exhibits imperfections known as aliasing. Modern statements of the theorem are sometimes careful to explicitly state that $x(t)$ must contain no sinusoidal component at exactly frequency $B$, or that $B$ must be strictly less than one half the sample rate. The threshold $2B$ is called the Nyquist rate and is an attribute of the continuous-time input $x(t)$ to be sampled. The sample rate must exceed the Nyquist rate for the samples to suffice to represent $x(t)$. The threshold $f_s/2$ is called the Nyquist frequency and is an attribute of the sampling equipment. All meaningful frequency components of the properly sampled $x(t)$ exist below the Nyquist frequency. The condition described by these inequalities is called the Nyquist criterion, or sometimes the Raabe condition. The theorem is also applicable to functions of other domains, such as space, in the case of a digitized image. The only change, in the case of other domains, is the units of measure attributed to $t$, $f_s$, and $B$.
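As a numerical illustration of aliasing (the tone frequencies and sample rate below are arbitrary choices, not taken from the sources above), the following Python sketch shows that a tone above the Nyquist frequency produces exactly the same samples as a lower-frequency alias:

```python
import numpy as np

fs = 10.0                     # sample rate (Hz); the Nyquist frequency is fs/2 = 5 Hz
n = np.arange(50)
t = n / fs

low = np.cos(2 * np.pi * 3.0 * t)    # 3 Hz tone: below the Nyquist frequency
high = np.cos(2 * np.pi * 7.0 * t)   # 7 Hz tone: above the Nyquist frequency (10 - 7 = 3 Hz alias)

print(np.allclose(low, high))        # True: the two tones are indistinguishable from these samples
```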

The symbol $T = 1/f_s$ is customarily used to represent the interval between adjacent samples and is called the sample period or sampling interval. The samples of function $x(t)$ are commonly denoted by $x[n] = T\,x(nT)$[3] (alternatively $x_n$ in older signal processing literature), for all integer values of $n$. The multiplier $T$ is a result of the transition from continuous time to discrete time (see Discrete-time Fourier transform § Relation to Fourier Transform), and it is needed to preserve the energy of the signal as $T$ varies.
A mathematically ideal way to interpolate the sequence involves the use of sinc functions. Each sample in the sequence is replaced by a sinc function, centered on the time axis at the original location of the sample, $nT$, with the amplitude of the sinc function scaled to the sample value, $x(nT)$. Subsequently, the sinc functions are summed into a continuous function. A mathematically equivalent method uses the Dirac comb and proceeds by convolving one sinc function with a series of Dirac delta pulses, weighted by the sample values. Neither method is numerically practical. Instead, some type of approximation of the sinc functions, finite in length, is used. The imperfections attributable to the approximation are known as interpolation error.
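A minimal sketch of such sinc interpolation using a finite block of samples (the tone, sample rate, and evaluation grid are illustrative choices; truncating the infinite sum leaves a small residual error):

```python
import numpy as np

fs = 20.0                            # sample rate (Hz), comfortably above 2*B
T = 1.0 / fs
n = np.arange(-200, 201)             # a finite block of samples; truncation of the
                                     # infinite sum leaves a small residual error
x_samples = np.cos(2 * np.pi * 3.0 * n * T)   # a 3 Hz tone, so B = 3 Hz < fs/2

def sinc_interp(t, samples, n, T):
    """Evaluate sum_n x[n] * sinc((t - n*T)/T) at the times in t."""
    # np.sinc(x) = sin(pi*x)/(pi*x), matching the normalized sinc of the formula.
    return np.array([np.sum(samples * np.sinc((ti - n * T) / T)) for ti in t])

t = np.linspace(-1.0, 1.0, 1001)     # evaluate well inside the sampled block
x_true = np.cos(2 * np.pi * 3.0 * t)
x_rec = sinc_interp(t, x_samples, n, T)
print(f"max reconstruction error: {np.max(np.abs(x_rec - x_true)):.1e}")
```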
Practical digital-to-analog converters produce neither scaled and delayed sinc functions, nor ideal Dirac pulses. Instead they produce a piecewise-constant sequence of scaled and delayed rectangular pulses (the zero-order hold), usually followed by a lowpass filter (called an "anti-imaging filter") to remove spurious high-frequency replicas (images) of the original baseband signal.
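The following rough sketch (not a model of any particular converter) emulates a zero-order hold by sample repetition and then applies an anti-imaging lowpass filter; the oversampling factor and filter design are arbitrary choices:

```python
import numpy as np
from scipy import signal

fs = 100.0                 # original sample rate (Hz)
L = 16                     # oversampling factor used to emulate the "analog" output rate
t = np.arange(0, 1, 1 / fs)
x = np.cos(2 * np.pi * 5 * t)          # a 5 Hz tone, well below the Nyquist frequency

zoh = np.repeat(x, L)                  # piecewise-constant (zero-order-hold) waveform
# Anti-imaging filter: lowpass with cutoff at the original Nyquist frequency fs/2,
# designed at the emulated output rate L*fs.
taps = signal.firwin(numtaps=301, cutoff=fs / 2, fs=L * fs)
smoothed = np.convolve(zoh, taps, mode="same")

# Compare against the ideal tone, allowing for the half-sample delay of the zero-order hold.
t_hi = np.arange(zoh.size) / (L * fs)
ideal = np.cos(2 * np.pi * 5 * (t_hi - 0.5 / fs))
err = np.max(np.abs(smoothed[300:-300] - ideal[300:-300]))
print(f"max deviation from the ideal tone (away from the edges): {err:.3f}")
```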

When $x(t)$ is a function with a Fourier transform $X(f)$:
$$X(f) \triangleq \int_{-\infty}^{\infty} x(t)\, e^{-i 2\pi f t}\, dt,$$
then the samples, $x(nT)$, of $x(t)$ are sufficient to create a periodic summation of $X(f)$ (see Discrete-time Fourier transform § Relation to Fourier Transform):
$$X_{1/T}(f) \triangleq \sum_{k=-\infty}^{\infty} X\!\left(f - \frac{k}{T}\right) = \sum_{n=-\infty}^{\infty} T\, x(nT)\, e^{-i 2\pi n T f} \qquad \text{(Eq.1)}$$

which is a periodic function and its equivalent representation as a Fourier series, whose coefficients are $x[n] = T\,x(nT)$. This function is also known as the discrete-time Fourier transform (DTFT) of the sample sequence.
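Eq.1 can be checked numerically. The sketch below uses an arbitrarily chosen test signal, $x(t)=\mathrm{sinc}^2(t)$, whose Fourier transform is the triangle $X(f)=\max(0,\,1-|f|)$, and compares the periodic summation of $X$ with the Fourier series built from the samples $T\,x(nT)$:

```python
import numpy as np

def X(f):
    return np.maximum(0.0, 1.0 - np.abs(f))      # Fourier transform of sinc(t)**2

T = 0.7                                          # sample interval, chosen so the copies overlap
n = np.arange(-20000, 20001)
samples = np.sinc(n * T) ** 2                    # x(nT); np.sinc is the normalized sinc

f = np.linspace(-2.0, 2.0, 9)
lhs = sum(X(f - k / T) for k in range(-10, 11))                  # periodic summation of X
rhs = np.array([np.sum(T * samples * np.exp(-2j * np.pi * n * T * fi)) for fi in f])

print(np.allclose(lhs, rhs.real, atol=1e-4), np.max(np.abs(rhs.imag)) < 1e-6)
```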
As depicted, copies of $X(f)$ are shifted by multiples of the sampling rate $f_s = 1/T$ and combined by addition. For a band-limited function ($X(f) = 0$ for all $|f| \ge B$) and sufficiently large $f_s$, it is possible for the copies to remain distinct from each other. But if the Nyquist criterion is not satisfied, adjacent copies overlap, and it is not possible in general to discern an unambiguous $X(f)$. Any frequency component above $f_s/2$ is indistinguishable from a lower-frequency component, called an alias, associated with one of the copies. In such cases, the customary interpolation techniques produce the alias, rather than the original component. When the sample-rate is pre-determined by other considerations (such as an industry standard), $x(t)$ is usually filtered to reduce its high frequencies to acceptable levels before it is sampled. The type of filter required is a lowpass filter, and in this application it is called an anti-aliasing filter.


When there is no overlap of the copies (also known as "images") of $X(f)$, the $k = 0$ term of Eq.1 can be recovered by the product:
$$X(f) = H(f) \cdot X_{1/T}(f),$$
where:
$$H(f) \triangleq \begin{cases} 1 & |f| < B \\ 0 & |f| > f_s - B. \end{cases}$$
The sampling theorem is proved since $X(f)$ uniquely determines $x(t)$.
All that remains is to derive the formula for reconstruction. $H(f)$ need not be precisely defined in the region $[B,\ f_s - B]$ because $X_{1/T}(f)$ is zero in that region. However, the worst case is when $B = f_s/2$, the Nyquist frequency. A function that is sufficient for that and all less severe cases is:
$$H(f) = \mathrm{rect}\!\left(\frac{f}{f_s}\right) = \begin{cases} 1 & |f| < \frac{f_s}{2} \\ 0 & |f| > \frac{f_s}{2}, \end{cases}$$
where $\mathrm{rect}(\cdot)$ is the rectangular function. Therefore:
$$X(f) = \mathrm{rect}(Tf) \cdot X_{1/T}(f) = \sum_{n=-\infty}^{\infty} x(nT)\cdot T\,\mathrm{rect}(Tf)\, e^{-i 2\pi n T f}.$$
The inverse transform of both sides produces the Whittaker–Shannon interpolation formula:
$$x(t) = \sum_{n=-\infty}^{\infty} x(nT)\cdot \mathrm{sinc}\!\left(\frac{t - nT}{T}\right),$$
which shows how the samples, $x(nT)$, can be combined to reconstruct $x(t)$.
Poisson shows that the Fourier series in Eq.1 produces the periodic summation of $X(f)$, regardless of $f_s$ and $B$. Shannon, however, only derives the series coefficients for the case $f_s = 2B$. Virtually quoting Shannon's original paper:
Let $X(\omega)$ be the spectrum of $x(t)$, assumed to be zero outside the band $\left|\tfrac{\omega}{2\pi}\right| < B$. If we let $t = \tfrac{n}{2B}$, where $n$ is any positive or negative integer, we obtain:
$$x\!\left(\tfrac{n}{2B}\right) = \frac{1}{2\pi}\int_{-2\pi B}^{2\pi B} X(\omega)\, e^{i\omega\frac{n}{2B}}\, d\omega. \qquad \text{(Eq.2)}$$
On the left are the values of $x(t)$ at the sampling points. The integral on the right is essentially the $n$th coefficient in a Fourier-series expansion of $X(\omega)$, taking the interval $-B$ to $B$ as a fundamental period. So the sample values $x(n/2B)$ determine the Fourier coefficients of $X(\omega)$, hence $X(\omega)$ itself, and therefore $x(t)$.
Shannon's proof of the theorem is complete at that point, but he goes on to discuss reconstruction via sinc functions, what we now call the Whittaker–Shannon interpolation formula as discussed above. He does not derive or prove the properties of the sinc function, as the Fourier pair relationship between the rect (the rectangular function) and sinc functions was well known by that time.[4]
Let $x_n$ be the $n$th sample. Then the function $x(t)$ is represented by:
$$x(t) = \sum_{n=-\infty}^{\infty} x_n\, \frac{\sin\!\big(\pi(2Bt - n)\big)}{\pi(2Bt - n)}.$$
As in the other proof, the existence of the Fourier transform of the original signal is assumed, so the proof does not say whether the sampling theorem extends to bandlimited stationary random processes.


The sampling theorem is usually formulated for functions of a single variable. Consequently, the theorem is directly applicable to time-dependent signals and is normally formulated in that context. However, the sampling theorem can be extended in a straightforward way to functions of arbitrarily many variables. Grayscale images, for example, are often represented as two-dimensional arrays (or matrices) of real numbers representing the relative intensities of pixels (picture elements) located at the intersections of row and column sample locations. As a result, images require two independent variables, or indices, to specify each pixel uniquely—one for the row, and one for the column.
Color images typically consist of a composite of three separate grayscale images, one to represent each of the three primary colors—red, green, and blue, or RGB for short. Other colorspaces using 3-vectors for colors include HSV, CIELAB, XYZ, etc. Some colorspaces such as cyan, magenta, yellow, and black (CMYK) may represent color by four dimensions. All of these are treated as vector-valued functions over a two-dimensional sampled domain.
Similar to one-dimensional discrete-time signals, images can also suffer from aliasing if the sampling resolution, or pixel density, is inadequate. For example, a digital photograph of a striped shirt with high spatial frequencies (in other words, the distance between the stripes is small) can cause aliasing of the shirt when it is sampled by the camera's image sensor. The aliasing appears as a moiré pattern. The "solution" in this case is to sample more densely in the spatial domain, by moving closer to the shirt or using a higher-resolution sensor, or to optically blur the image before acquiring it with the sensor using an optical low-pass filter.
Another example is shown here in the brick patterns. The top image shows the effects when the sampling theorem's condition is not satisfied. When software rescales an image (the same process that creates the thumbnail shown in the lower image) it, in effect, runs the image through a low-pass filter first and then downsamples the image to result in a smaller image that does not exhibit the moiré pattern. The top image is what happens when the image is downsampled without low-pass filtering: aliasing results.
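The effect can be reproduced with a synthetic test pattern. In the sketch below (stripe frequency, decimation factor, and blur radius are arbitrary choices), naive decimation of a fine stripe pattern leaves strong spurious, moiré-like contrast, while low-pass filtering before decimation suppresses it:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

yy, xx = np.mgrid[0:512, 0:512]
stripes = 0.5 + 0.5 * np.sin(2 * np.pi * xx * 0.45)   # ~0.45 cycles/pixel: near the pixel Nyquist limit

step = 8                                              # downsample by 8 in each direction
naive = stripes[::step, ::step]                       # no prefilter: the stripes alias
prefiltered = gaussian_filter(stripes, sigma=step)[::step, ::step]

# The aliased image retains large spurious contrast; the prefiltered one is nearly flat.
print(naive.std(), prefiltered.std())
```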
The sampling theorem applies to camera systems, where the scene and lens constitute an analog spatial signal source, and the image sensor is a spatial sampling device. Each of these components is characterized by a modulation transfer function (MTF), representing the precise resolution (spatial bandwidth) available in that component. Effects of aliasing or blurring can occur when the lens MTF and sensor MTF are mismatched. When the optical image sampled by the sensor contains spatial frequencies higher than the sensor's Nyquist frequency, the undersampling produces aliasing unless those frequencies are attenuated before sampling. The finite area of the sampling spot (the size of the pixel sensor) provides some spatial anti-aliasing; when it is not large enough, a separate anti-aliasing filter (optical low-pass filter) may be included in a camera system to reduce the MTF of the optical image. Instead of requiring an optical filter, the graphics processing unit of smartphone cameras performs digital signal processing to remove aliasing with a digital filter. Digital filters also apply sharpening to amplify the contrast from the lens at high spatial frequencies, which otherwise falls off rapidly at diffraction limits.
The sampling theorem also applies to post-processing of digital images, such as upsampling and downsampling. Effects of aliasing, blurring, and sharpening may be adjusted with digital filtering implemented in software, which necessarily follows the same theoretical principles.

To illustrate the necessity of $f_s > 2B$, consider the family of sinusoids generated by different values of $\theta$ in this formula:
$$x(t) = \frac{\cos(2\pi B t + \theta)}{\cos\theta} = \cos(2\pi B t) - \sin(2\pi B t)\tan\theta, \qquad -\pi/2 < \theta < \pi/2.$$
With $f_s = 2B$, or equivalently $T = 1/(2B)$, the samples are given by:
$$x(nT) = \cos(\pi n) - \underbrace{\sin(\pi n)}_{0}\,\tan\theta = (-1)^n,$$
regardless of the value of $\theta$. That sort of ambiguity is the reason for the strict inequality of the sampling theorem's condition.
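A short numerical check of this ambiguity (the phase values below are arbitrary) confirms that sampling at exactly $f_s = 2B$ yields the samples $(-1)^n$ for every $\theta$:

```python
import numpy as np

B = 5.0
fs = 2 * B                    # sampling at exactly the Nyquist rate
n = np.arange(20)
t = n / fs

for theta in (0.0, 0.4, 1.0, -0.9):
    x = np.cos(2 * np.pi * B * t + theta) / np.cos(theta)
    print(theta, np.allclose(x, (-1.0) ** n))   # True for each theta
```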
As discussed by Shannon:[2]
A similar result is true if the band does not start at zero frequency but at some higher value, and can be proved by a linear translation (corresponding physically to single-sideband modulation) of the zero-frequency case. In this case the elementary pulse is obtained from $\sin(x)/x$ by single-side-band modulation.
That is, a sufficient no-loss condition for sampling signals that do not have baseband components exists that involves the width of the non-zero frequency interval as opposed to its highest frequency component. See sampling for more details and examples.
For example, in order to sample FM radio signals in the frequency range of 100–102 MHz, it is not necessary to sample at 204 MHz (twice the upper frequency), but rather it is sufficient to sample at 4 MHz (twice the width of the frequency interval). (Reconstruction is not usually the goal with sampled IF or RF signals. Rather, the sample sequence can be treated as ordinary samples of the signal frequency-shifted to near baseband, and digital demodulation can proceed on that basis.)
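A rough numerical sketch of this example (with an arbitrarily chosen tone inside the band) shows the effect: a 101.3 MHz tone sampled at only 4 MHz appears at 1.3 MHz in the sampled spectrum, i.e. the band is shifted to near baseband rather than destroyed:

```python
import numpy as np

fs = 4e6                      # sample rate: twice the 2 MHz band width, not twice 102 MHz
f_tone = 101.3e6              # a tone inside the 100-102 MHz band
N = 4096
n = np.arange(N)
x = np.cos(2 * np.pi * f_tone * n / fs)

spectrum = np.abs(np.fft.rfft(x * np.hanning(N)))
f_apparent = np.argmax(spectrum) * fs / N
print(f"apparent frequency: {f_apparent / 1e6:.2f} MHz")   # ~1.30 MHz
```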
Using the bandpass condition, where $X(f) = 0$ for all $f$ outside the open band of frequencies
$$\left(\frac{N}{2} f_s,\ \frac{N+1}{2} f_s\right)$$
for some nonnegative integer $N$ and sampling frequency $f_s$, it is possible to find an interpolation that reproduces the signal. Note that there may be several combinations of $N$ and $f_s$ that work, including the normal baseband condition as the case $N = 0$. The corresponding interpolation filter to be convolved with the samples is the impulse response of an ideal "brick-wall" bandpass filter (as opposed to the ideal brick-wall lowpass filter used above) with cutoffs at the upper and lower edges of the specified band, which is the difference between a pair of lowpass impulse responses:
$$(N+1)\,\mathrm{sinc}\!\left(\frac{(N+1)t}{T}\right) - N\,\mathrm{sinc}\!\left(\frac{N t}{T}\right).$$
This function is 1 at $t = 0$ and zero at any other multiple of $T$ (as well as at other times, if $N > 0$).
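These properties of the bandpass interpolation kernel can be verified directly (the values of $N$ and $T$ below are illustrative):

```python
import numpy as np

N, T = 3, 1.0

def h(t):
    # np.sinc is the normalized sinc, matching the kernel above
    return (N + 1) * np.sinc((N + 1) * t / T) - N * np.sinc(N * t / T)

print(h(0.0))                                   # 1.0
print(np.allclose(h(T * np.arange(1, 8)), 0))   # True: zero at the other multiples of T
```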
Other generalizations, for example to signals occupying multiple non-contiguous bands, are possible as well. Even the most generalized form of the sampling theorem does not have a provably true converse. That is, one cannot conclude that information is necessarily lost just because the conditions of the sampling theorem are not satisfied; from an engineering perspective, however, it is generally safe to assume that if the sampling theorem is not satisfied then information will most likely be lost.
The sampling theory of Shannon can be generalized for the case of nonuniform sampling, that is, samples not taken equally spaced in time. The Shannon sampling theory for non-uniform sampling states that a band-limited signal can be perfectly reconstructed from its samples if the average sampling rate satisfies the Nyquist condition.[5] Therefore, although uniformly spaced samples may result in easier reconstruction algorithms, uniform spacing is not a necessary condition for perfect reconstruction.
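One simple (and by no means unique) way to illustrate reconstruction from nonuniform samples is a least-squares fit of a band-limited model to jittered samples taken at an average rate above the Nyquist rate; the test signal, jitter model, and grid below are arbitrary choices, not a construction from the cited theory:

```python
import numpy as np

rng = np.random.default_rng(1)
B = 2.0                                   # band limit (Hz)
T = 1.0 / (2.5 * B)                       # nominal spacing: average rate 2.5*B > 2*B
n = np.arange(-100, 101)
t_samp = n * T + rng.uniform(-0.3 * T, 0.3 * T, size=n.size)   # jittered (nonuniform) times

def x(t):                                  # a band-limited test signal (components below 2 Hz)
    return np.cos(2 * np.pi * 1.7 * t) + 0.5 * np.sin(2 * np.pi * 0.6 * t)

# Model x(t) as a sum of sincs on a uniform grid at the Nyquist spacing 1/(2B) and fit
# the coefficients to the nonuniform samples by least squares.
grid = np.arange(-80, 81) / (2 * B)
A = np.sinc(2 * B * (t_samp[:, None] - grid[None, :]))
coeffs, *_ = np.linalg.lstsq(A, x(t_samp), rcond=None)

t_eval = np.linspace(-5, 5, 400)           # evaluate well inside the sampled interval
x_rec = np.sinc(2 * B * (t_eval[:, None] - grid[None, :])) @ coeffs
print(f"max error on the interior: {np.max(np.abs(x_rec - x(t_eval))):.2e}")
```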
The general theory for non-baseband and nonuniform samples was developed in 1967 by Henry Landau.[6] He proved that the average sampling rate (uniform or otherwise) must be at least twice the occupied bandwidth of the signal, assuming it is a priori known what portion of the spectrum was occupied.
In the late 1990s, this work was partially extended to cover signals for which the amount of occupied bandwidth is known but the actual occupied portion of the spectrum is unknown.[7] In the 2000s, a complete theory was developed (see the section Sampling below the Nyquist rate under additional restrictions below) using compressed sensing. In particular, the theory, using signal processing language, is described in a 2009 paper by Mishali and Eldar.[8] They show, among other things, that if the frequency locations are unknown, then it is necessary to sample at least at twice the Nyquist criterion; in other words, you must pay at least a factor of 2 for not knowing the location of the spectrum. Note that minimum sampling requirements do not necessarily guarantee stability.
The Nyquist–Shannon sampling theorem provides a sufficient condition for the sampling and reconstruction of a band-limited signal. When reconstruction is done via the Whittaker–Shannon interpolation formula, the Nyquist criterion is also a necessary condition to avoid aliasing, in the sense that if samples are taken at a slower rate than twice the band limit, then there are some signals that will not be correctly reconstructed. However, if further restrictions are imposed on the signal, then the Nyquist criterion may no longer be a necessary condition.
A non-trivial example of exploiting extra assumptions about the signal is given by the recent field of compressed sensing, which allows for full reconstruction with a sub-Nyquist sampling rate. Specifically, this applies to signals that are sparse (or compressible) in some domain. As an example, compressed sensing deals with signals that may have a low overall bandwidth (say, the effective bandwidth $EB$) but whose frequency locations are unknown, rather than all together in a single band, so that the passband technique does not apply. In other words, the frequency spectrum is sparse. Traditionally, the necessary sampling rate is thus $2B$. Using compressed sensing techniques, the signal could be perfectly reconstructed if it is sampled at a rate slightly lower than $2EB$. With this approach, reconstruction is no longer given by a formula, but instead by the solution to a linear optimization program.
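A toy basis-pursuit sketch of this idea (a generic $\ell_1$ recovery of a spectrally sparse discrete signal, not the specific sub-Nyquist architecture described by Mishali and Eldar) is shown below; all sizes are illustrative:

```python
import numpy as np
from scipy.fft import idct
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, k, m = 128, 4, 48                     # signal length, sparsity, number of samples

c_true = np.zeros(n)
c_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)   # sparse DCT coefficients
Psi = idct(np.eye(n), axis=0, norm="ortho")                    # x = Psi @ c
x = Psi @ c_true

idx = np.sort(rng.choice(n, m, replace=False))                 # random sample locations
A, y = Psi[idx, :], x[idx]

# Basis pursuit:  minimize ||c||_1  subject to  A c = y,
# written as a linear program with c = u - v, u >= 0, v >= 0.
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=[(0, None)] * (2 * n), method="highs")
c_rec = res.x[:n] - res.x[n:]

print(f"recovery error: {np.max(np.abs(c_rec - c_true)):.2e}")
```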
Another example where sub-Nyquist sampling is optimal arises under the additional constraint that the samples are quantized in an optimal manner, as in a combined system of sampling and optimal lossy compression.[9] This setting is relevant in cases where the joint effect of sampling and quantization is to be considered, and can provide a lower bound for the minimal reconstruction error that can be attained in sampling and quantizing a random signal. For stationary Gaussian random signals, this lower bound is usually attained at a sub-Nyquist sampling rate, indicating that sub-Nyquist sampling is optimal for this signal model under optimal quantization.[10]
The sampling theorem was implied by the work of Harry Nyquist in 1928,[11] in which he showed that up to $2B$ independent pulse samples could be sent through a system of bandwidth $B$; but he did not explicitly consider the problem of sampling and reconstruction of continuous signals. About the same time, Karl Küpfmüller showed a similar result[12] and discussed the sinc-function impulse response of a band-limiting filter, via its integral, the step-response sine integral; this bandlimiting and reconstruction filter that is so central to the sampling theorem is sometimes referred to as a Küpfmüller filter (but seldom so in English).
The sampling theorem, essentially a dual of Nyquist's result, was proved by Claude E. Shannon.[2] Edmund Taylor Whittaker published similar results in 1915,[13] as did his son John Macnaghten Whittaker in 1935,[14] and Dennis Gabor in 1946 ("Theory of communication").
In 1948 and 1949, Claude E. Shannon published the two revolutionary articles in which he founded information theory.[15][16][2] In Shannon's "A Mathematical Theory of Communication", the sampling theorem is formulated as "Theorem 13": Let $f(t)$ contain no frequencies over $W$. Then
$$f(t) = \sum_{n=-\infty}^{\infty} X_n \frac{\sin \pi(2Wt - n)}{\pi(2Wt - n)},$$
where $X_n = f\!\left(\frac{n}{2W}\right)$.
It was not until these articles were published that the theorem known as "Shannon's sampling theorem" became common property among communication engineers, although Shannon himself writes that this is a fact which is common knowledge in the communication art.[B] A few lines further on, however, he adds: "but in spite of its evident importance, [it] seems not to have appeared explicitly in the literature of communication theory". Although his sampling theorem was published at the end of the 1940s, Shannon had derived it as early as 1940.[17]
Others who have independently discovered or played roles in the development of the sampling theorem have been discussed in several historical articles, for example, by Jerri[18] and by Lüke.[19] For example, Lüke points out that Herbert Raabe, an assistant to Küpfmüller, proved the theorem in his 1939 Ph.D. dissertation; the term Raabe condition came to be associated with the criterion for unambiguous representation (sampling rate greater than twice the bandwidth). Meijering[20] mentions several other discoverers and names in a paragraph and pair of footnotes:
As pointed out by Higgins, the sampling theorem should really be considered in two parts, as done above: the first stating the fact that a bandlimited function is completely determined by its samples, the second describing how to reconstruct the function using its samples. Both parts of the sampling theorem were given in a somewhat different form by J. M. Whittaker and before him also by Ogura. They were probably not aware of the fact that the first part of the theorem had been stated as early as 1897 by Borel.[Meijering 1] As we have seen, Borel also used around that time what became known as the cardinal series. However, he appears not to have made the link. In later years it became known that the sampling theorem had been presented before Shannon to the Russian communication community by Kotel'nikov. In more implicit, verbal form, it had also been described in the German literature by Raabe. Several authors have mentioned that Someya introduced the theorem in the Japanese literature parallel to Shannon. In the English literature, Weston introduced it independently of Shannon around the same time.[Meijering 2]
- ^Several authors, following Black, have claimed that this first part of the sampling theorem was stated even earlier by Cauchy, in a paper published in 1841. However, the paper of Cauchy does not contain such a statement, as has been pointed out by Higgins.
- ^As a consequence of the discovery of the several independent introductions of the sampling theorem, people started to refer to the theorem by including the names of the aforementioned authors, resulting in such catchphrases as "the Whittaker–Kotel'nikov–Shannon (WKS) sampling theorem" or even "the Whittaker–Kotel'nikov–Raabe–Shannon–Someya sampling theorem". To avoid confusion, perhaps the best thing to do is to refer to it as the sampling theorem, "rather than trying to find a title that does justice to all claimants".
— Eric Meijering, "A Chronology of Interpolation From Ancient Astronomy to Modern Signal and Image Processing" (citations omitted)
In the Russian literature it is known as Kotelnikov's theorem, named after Vladimir Kotelnikov, who discovered it in 1933.[21]
Exactly how, when, or why Harry Nyquist had his name attached to the sampling theorem remains obscure. The term Nyquist Sampling Theorem (capitalized thus) appeared as early as 1959 in a book from his former employer, Bell Labs,[22] and appeared again in 1963,[23] and not capitalized in 1965.[24] It had been called the Shannon Sampling Theorem as early as 1954,[25] but also just the sampling theorem by several other books in the early 1950s.
In 1958, Blackman and Tukey cited Nyquist's 1928 article as a reference for the sampling theorem of information theory,[26] even though that article does not treat sampling and reconstruction of continuous signals as others did. Their glossary of terms includes these entries:
- Sampling theorem (of information theory)
- Nyquist's result that equi-spaced data, with two or more points per cycle of highest frequency, allows reconstruction of band-limited functions. (See Cardinal theorem.)
- Cardinal theorem (of interpolation theory)
- A precise statement of the conditions under which values given at a doubly infinite set of equally spaced points can be interpolated to yield a continuous band-limited function with the aid of the function
Exactly what "Nyquist's result" they are referring to remains mysterious.
When Shannon stated and proved the sampling theorem in his 1949 article, according to Meijering,[20] "he referred to the critical sampling interval as the Nyquist interval corresponding to the band $W$, in recognition of Nyquist's discovery of the fundamental importance of this interval in connection with telegraphy". This explains Nyquist's name on the critical interval, but not on the theorem.
Similarly, Nyquist's name was attached to Nyquist rate in 1953 by Harold S. Black:
If the essential frequency range is limited to $B$ cycles per second, $2B$ was given by Nyquist as the maximum number of code elements per second that could be unambiguously resolved, assuming the peak interference is less than half a quantum step. This rate is generally referred to as signaling at the Nyquist rate, and $1/(2B)$ has been termed a Nyquist interval.
— Harold Black, Modulation Theory[27] (bold added for emphasis; italics as in the original)
According to the Oxford English Dictionary, this may be the origin of the term Nyquist rate. In Black's usage, it is not a sampling rate, but a signaling rate.