The Statistics standard library module contains basic statistics functionality.
Statistics.std - Function
std(itr; corrected::Bool=true, mean=nothing[, dims])
Compute the sample standard deviation of collection itr.
The algorithm returns an estimator of the generative distribution's standard deviation under the assumption that each entry of itr is a sample drawn from the same unknown distribution, with the samples uncorrelated. For arrays, this computation is equivalent to calculating sqrt(sum((itr .- mean(itr)).^2) / (length(itr) - 1)). If corrected is true, then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n is the number of elements in itr.
If itr is an AbstractArray, dims can be provided to compute the standard deviation over dimensions.
A pre-computed mean may be provided. When dims is specified, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).
If the array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if the array contains both). Use the skipmissing function to omit missing entries and compute the standard deviation of non-missing values.
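The scaling rule and the pre-computed mean option described above can be checked with a small sketch. The data values are made up for illustration; only functions exported by Statistics are used.

```julia
using Statistics

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative data, mean is 5.0
n = length(x)

# Default (corrected=true): the sum of squared deviations is divided by n - 1.
s_corrected = std(x)
manual_corrected = sqrt(sum((x .- mean(x)) .^ 2) / (n - 1))

# corrected=false: the sum of squared deviations is divided by n.
s_uncorrected = std(x; corrected=false)
manual_uncorrected = sqrt(sum((x .- mean(x)) .^ 2) / n)

# Column-wise standard deviation, reusing a pre-computed mean whose shape
# matches mean(A, dims=1), i.e. a 1×2 matrix.
A = [1.0 2.0; 3.0 6.0; 5.0 10.0]
col_means = mean(A, dims=1)
s_cols = std(A; mean=col_means, dims=1)
```

Passing the mean explicitly only saves recomputing it; the result should match std(A, dims=1).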
Statistics.stdm - Function
stdm(itr, mean; corrected::Bool=true[, dims])
Compute the sample standard deviation of collection itr, with known mean(s) mean.
The algorithm returns an estimator of the generative distribution's standard deviation under the assumption that each entry of itr is a sample drawn from the same unknown distribution, with the samples uncorrelated. For arrays, this computation is equivalent to calculating sqrt(sum((itr .- mean).^2) / (length(itr) - 1)), where mean is the supplied value. If corrected is true, then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n is the number of elements in itr.
If itr is an AbstractArray, dims can be provided to compute the standard deviation over dimensions. In that case, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).
If the array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if the array contains both). Use the skipmissing function to omit missing entries and compute the standard deviation of non-missing values.
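As a minimal sketch of the shape requirement when dims is given, the following computes row-wise standard deviations from row means. The matrix is illustrative only.

```julia
using Statistics

A = [1.0 2.0; 3.0 6.0]

# The supplied means must have the same shape as mean(A, dims=2): a 2×1 matrix.
row_means = mean(A, dims=2)

# Row-wise sample standard deviation around the supplied means.
s_rows = stdm(A, row_means; dims=2)
```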
Statistics.var - Function
var(itr; corrected::Bool=true, mean=nothing[, dims])
Compute the sample variance of collection itr.
The algorithm returns an estimator of the generative distribution's variance under the assumption that each entry of itr is a sample drawn from the same unknown distribution, with the samples uncorrelated. For arrays, this computation is equivalent to calculating sum((itr .- mean(itr)).^2) / (length(itr) - 1). If corrected is true, then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n is the number of elements in itr.
If itr is an AbstractArray, dims can be provided to compute the variance over dimensions.
A pre-computed mean may be provided. When dims is specified, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).
If the array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if the array contains both). Use the skipmissing function to omit missing entries and compute the variance of non-missing values.
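The propagation rules for NaN and missing can be illustrated with a short sketch; the vectors below are made-up inputs.

```julia
using Statistics

v_nan     = [1.0, NaN, 3.0]
v_missing = [1.0, missing, 3.0]
v_both    = [1.0, NaN, missing]

nan_result     = var(v_nan)       # NaN propagates
missing_result = var(v_missing)   # missing propagates
both_result    = var(v_both)      # missing takes precedence over NaN

# skipmissing drops the missing entry, leaving the variance of [1.0, 3.0].
clean_result = var(skipmissing(v_missing))
```

Note that skipmissing removes only missing values; NaN entries would still have to be filtered separately, for example with filter(!isnan, v).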
Statistics.varm - Function
varm(itr, mean; dims, corrected::Bool=true)
Compute the sample variance of collection itr, with known mean(s) mean.
The algorithm returns an estimator of the generative distribution's variance under the assumption that each entry of itr is a sample drawn from the same unknown distribution, with the samples uncorrelated. For arrays, this computation is equivalent to calculating sum((itr .- mean).^2) / (length(itr) - 1), where mean is the supplied value. If corrected is true, then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n is the number of elements in itr.
If itr is an AbstractArray, dims can be provided to compute the variance over dimensions. In that case, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).
If the array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if the array contains both). Use the skipmissing function to omit missing entries and compute the variance of non-missing values.
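A small sketch of what "known mean" means in practice: the deviations are taken around the supplied value rather than around mean(itr). The value known_mean below is a hypothetical choice, deliberately different from the sample mean of 2.5.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]
known_mean = 2.0   # hypothetical known mean, not equal to mean(x)

# Variance around the supplied mean, scaled by n - 1 by default.
v = varm(x, known_mean)
manual = sum((x .- known_mean) .^ 2) / (length(x) - 1)   # (1 + 0 + 1 + 4) / 3

# stdm is the square root of the same quantity.
s = stdm(x, known_mean)
```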
Statistics.cor - Function
cor(x::AbstractVector)
Return the number one.
cor(X::AbstractMatrix; dims::Int=1)
Compute the Pearson correlation matrix of the matrix X along the dimension dims.
cor(x::AbstractVector, y::AbstractVector)
Compute the Pearson correlation between the vectors x and y.
cor(X::AbstractVecOrMat, Y::AbstractVecOrMat; dims=1)
Compute the Pearson correlation between the vectors or matrices X and Y along the dimension dims.
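The methods above can be exercised with a brief sketch. The data are illustrative, and the manual expression uses the usual relation between Pearson correlation, covariance and standard deviation.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.5]

# Correlation between two vectors, and the same value from cov and std.
r = cor(x, y)
manual = cov(x, y) / (std(x) * std(y))

# Correlation matrix of a 4×3 matrix: with dims=1 (the default) each column
# is treated as a variable, so the result is 3×3.
X = hcat(x, y, -x)
R = cor(X)

# The single-vector method returns one.
self_cor = cor(x)
```

Here R[1, 3] should be close to -1, since the third column is the negation of the first.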
Statistics.cov - Function
cov(x::AbstractVector; corrected::Bool=true)
Compute the variance of the vector x. If corrected is true (the default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n = length(x).
cov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)
Compute the covariance matrix of the matrix X along the dimension dims. If corrected is true (the default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n = size(X, dims).
cov(x::AbstractVector, y::AbstractVector; corrected::Bool=true)
Compute the covariance between the vectors x and y. If corrected is true (the default), computes $\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar x) (y_i-\bar y)^*$, where $*$ denotes the complex conjugate and n = length(x) = length(y). If corrected is false, computes $\frac{1}{n}\sum_{i=1}^n (x_i-\bar x) (y_i-\bar y)^*$.
cov(X::AbstractVecOrMat, Y::AbstractVecOrMat; dims::Int=1, corrected::Bool=true)
Compute the covariance between the vectors or matrices X and Y along the dimension dims. If corrected is true (the default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n = size(X, dims) = size(Y, dims).
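The two-vector formula and the dims keyword can be checked with a minimal sketch on made-up real-valued data (for real inputs the complex conjugate has no effect).

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 6.0]
n = length(x)

# Corrected covariance: products of deviations divided by n - 1.
c = cov(x, y)
manual = sum((x .- mean(x)) .* conj.(y .- mean(y))) / (n - 1)

# Uncorrected covariance: the same sum divided by n.
c_uncorrected = cov(x, y; corrected=false)

# Covariance matrix with variables in columns (dims=1, the default) ...
X = [x y]
C_cols = cov(X)

# ... and with variables in rows, using the transposed data and dims=2.
C_rows = cov(permutedims(X); dims=2)
```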
Statistics.mean! - Function
mean!(r, v)
Compute the mean of v over the singleton dimensions of r, and write results to r.
Examples
    julia> using Statistics

    julia> v = [1 2; 3 4]
    2×2 Matrix{Int64}:
     1  2
     3  4

    julia> mean!([1., 1.], v)
    2-element Vector{Float64}:
     1.5
     3.5

    julia> mean!([1. 1.], v)
    1×2 Matrix{Float64}:
     2.0  3.0
Statistics.mean - Function
mean(itr)
Compute the mean of all elements in a collection.
If itr contains NaN or missing values, the result is also NaN or missing (missing takes precedence if itr contains both). Use the skipmissing function to omit missing entries and compute the mean of non-missing values.
Examples
    julia> using Statistics

    julia> mean(1:20)
    10.5

    julia> mean([1, missing, 3])
    missing

    julia> mean(skipmissing([1, missing, 3]))
    2.0
mean(f, itr)
Apply the function f to each element of collection itr and take the mean.
    julia> using Statistics

    julia> mean(√, [1, 2, 3])
    1.3820881233139908

    julia> mean([√1, √2, √3])
    1.3820881233139908
mean(f, A::AbstractArray; dims)
Apply the function f to each element of array A and take the mean over dimensions dims.
This method requires at least Julia 1.3.
    julia> using Statistics

    julia> mean(√, [1, 2, 3])
    1.3820881233139908

    julia> mean([√1, √2, √3])
    1.3820881233139908

    julia> mean(√, [1 2 3; 4 5 6], dims=2)
    2×1 Matrix{Float64}:
     1.3820881233139908
     2.2285192400943226
mean(A::AbstractArray; dims)
Compute the mean of an array over the given dimensions.
mean for empty arrays requires at least Julia 1.1.
Examples
    julia> using Statistics

    julia> A = [1 2; 3 4]
    2×2 Matrix{Int64}:
     1  2
     3  4

    julia> mean(A, dims=1)
    1×2 Matrix{Float64}:
     2.0  3.0

    julia> mean(A, dims=2)
    2×1 Matrix{Float64}:
     1.5
     3.5
Statistics.median! - Function
median!(v)
Like median, but may overwrite the input vector.
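A minimal sketch of the practical consequence: because the input may be reordered, keep a copy if the original order matters. The vector below is illustrative.

```julia
using Statistics

v = [5, 1, 4, 2, 3]
original = copy(v)   # keep the original order, since median! may reorder v

m = median!(v)

# v still holds the same elements, but their order is no longer guaranteed.
same_elements = sort(v) == sort(original)
```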
Statistics.median - Function
median(itr)
Compute the median of all elements in a collection. For an even number of elements no exact median element exists, so the result is equivalent to calculating the mean of the two median elements.
If itr contains NaN or missing values, the result is also NaN or missing (missing takes precedence if itr contains both). Use the skipmissing function to omit missing entries and compute the median of non-missing values.
Examples
    julia> using Statistics

    julia> median([1, 2, 3])
    2.0

    julia> median([1, 2, 3, 4])
    2.5

    julia> median([1, 2, missing, 4])
    missing

    julia> median(skipmissing([1, 2, missing, 4]))
    2.0
median(A::AbstractArray; dims)
Compute the median of an array along the given dimensions.
Examples
    julia> using Statistics

    julia> median([1 2; 3 4], dims=1)
    1×2 Matrix{Float64}:
     2.0  3.0
Statistics.middle - Function
middle(x)
Compute the middle of a scalar value, which is equivalent to x itself, but of the type of middle(x, x) for consistency.
middle(x, y)
Compute the middle of two numbers x and y, which is equivalent in both value and type to computing their mean ((x + y) / 2).
middle(a::AbstractArray)
Compute the middle of an array a, which consists of finding its extrema and then computing their mean.
    julia> using Statistics

    julia> middle(1:10)
    5.5

    julia> a = [1,2,3.6,10.9]
    4-element Vector{Float64}:
      1.0
      2.0
      3.6
     10.9

    julia> middle(a)
    5.95
Statistics.quantile! - Function
quantile!([q::AbstractArray, ] v::AbstractVector, p; sorted=false, alpha::Real=1.0, beta::Real=alpha)
Compute the quantile(s) of a vector v at a specified probability or vector or tuple of probabilities p on the interval [0,1]. If p is a vector, an optional output array q may also be specified. (If not provided, a new output array is created.) The keyword argument sorted indicates whether v can be assumed to be sorted; if false (the default), then the elements of v will be partially sorted in-place.
Sample quantiles are defined by Q(p) = (1-γ)*x[j] + γ*x[j+1], where x[j] is the j-th order statistic of v, j = floor(n*p + m), m = alpha + p*(1 - alpha - beta) and γ = n*p + m - j.
By default (alpha = beta = 1), quantiles are computed via linear interpolation between the points ((k-1)/(n-1), x[k]), for k = 1:n where n = length(v). This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R and NumPy default.
The keyword arguments alpha and beta correspond to the same parameters in Hyndman and Fan. Setting them to different values makes it possible to calculate quantiles with any of the methods 4-9 defined in that paper:
Def. 4: alpha=0, beta=1
Def. 5: alpha=0.5, beta=0.5 (MATLAB default)
Def. 6: alpha=0, beta=0 (Excel PERCENTILE.EXC, Python default, Stata altdef)
Def. 7: alpha=1, beta=1 (Julia, R and NumPy default, Excel PERCENTILE and PERCENTILE.INC, Python 'inclusive')
Def. 8: alpha=1/3, beta=1/3
Def. 9: alpha=3/8, beta=3/8
An ArgumentError is thrown if v contains NaN or missing values.
References
Hyndman, R.J. and Fan, Y. (1996) "Sample Quantiles in Statistical Packages", The American Statistician, Vol. 50, No. 4, pp. 361-365
Quantile on Wikipedia details the different quantile definitions
Examples
    julia> using Statistics

    julia> x = [3, 2, 1];

    julia> quantile!(x, 0.5)
    2.0

    julia> x
    3-element Vector{Int64}:
     1
     2
     3

    julia> y = zeros(3);

    julia> quantile!(y, x, [0.1, 0.5, 0.9]) === y
    true

    julia> y
    3-element Vector{Float64}:
     1.2000000000000002
     2.0
     2.8000000000000003
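The definition Q(p) = (1-γ)*x[j] + γ*x[j+1] can be written out directly as a rough sketch. The helper quantile_sketch is hypothetical and not part of Statistics; it assumes a real-valued vector with at least two elements, and it clamps j to 1:n-1 and γ to [0, 1] so that p = 0 and p = 1 stay in bounds, which is an assumption added on top of the formula as stated.

```julia
using Statistics

# Hypothetical helper (not part of Statistics): evaluates the sample quantile
# formula on a sorted copy of v for a single probability p.
function quantile_sketch(v, p; alpha=1.0, beta=alpha)
    x = sort(v)                       # x[j] is the j-th order statistic
    n = length(x)
    m = alpha + p * (1 - alpha - beta)
    position = n * p + m
    # Assumption: clamp the index and weight so the edges p = 0 and p = 1 work.
    j = clamp(floor(Int, position), 1, n - 1)
    γ = clamp(position - j, 0.0, 1.0)
    return (1 - γ) * x[j] + γ * x[j + 1]
end

v = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6]    # illustrative data

# Default parameters (alpha = beta = 1).
sketch_default  = quantile_sketch(v, 0.3)
library_default = quantile(v, 0.3)

# alpha = beta = 0.5.
sketch_half  = quantile_sketch(v, 0.3; alpha=0.5, beta=0.5)
library_half = quantile(v, 0.3; alpha=0.5, beta=0.5)
```

On this data the sketch should agree with quantile up to floating-point rounding for each alpha/beta pair listed above.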
Statistics.quantile - Function
quantile(itr, p; sorted=false, alpha::Real=1.0, beta::Real=alpha)
Compute the quantile(s) of a collection itr at a specified probability or vector or tuple of probabilities p on the interval [0,1]. The keyword argument sorted indicates whether itr can be assumed to be sorted.
Sample quantiles are defined by Q(p) = (1-γ)*x[j] + γ*x[j+1], where x[j] is the j-th order statistic of itr, j = floor(n*p + m), m = alpha + p*(1 - alpha - beta) and γ = n*p + m - j.
By default (alpha = beta = 1), quantiles are computed via linear interpolation between the points ((k-1)/(n-1), x[k]), for k = 1:n where n = length(itr). This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R and NumPy default.
The keyword arguments alpha and beta correspond to the same parameters in Hyndman and Fan. Setting them to different values makes it possible to calculate quantiles with any of the methods 4-9 defined in that paper:
Def. 4: alpha=0, beta=1
Def. 5: alpha=0.5, beta=0.5 (MATLAB default)
Def. 6: alpha=0, beta=0 (Excel PERCENTILE.EXC, Python default, Stata altdef)
Def. 7: alpha=1, beta=1 (Julia, R and NumPy default, Excel PERCENTILE and PERCENTILE.INC, Python 'inclusive')
Def. 8: alpha=1/3, beta=1/3
Def. 9: alpha=3/8, beta=3/8
An ArgumentError is thrown if itr contains NaN or missing values. Use the skipmissing function to omit missing entries and compute the quantiles of non-missing values.
References
Hyndman, R.J. and Fan, Y. (1996) "Sample Quantiles in Statistical Packages", The American Statistician, Vol. 50, No. 4, pp. 361-365
Quantile on Wikipedia details the different quantile definitions
Examples
    julia> using Statistics

    julia> quantile(0:20, 0.5)
    10.0

    julia> quantile(0:20, [0.1, 0.5, 0.9])
    3-element Vector{Float64}:
      2.0
     10.0
     18.000000000000004

    julia> quantile(skipmissing([1, 10, missing]), 0.5)
    5.5
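A few further call shapes, as a sketch on made-up data: a vector versus a tuple of probabilities, pre-sorted input, and a non-default definition selected through alpha and beta.

```julia
using Statistics

data = [7, 1, 5, 3, 9]

# A vector of probabilities gives a vector; a tuple gives a tuple.
q_vec = quantile(data, [0.25, 0.75])
q_tup = quantile(data, (0.25, 0.75))

# If the input is already sorted, sorted=true tells quantile it can rely on that.
q_sorted = quantile(sort(data), 0.5; sorted=true)

# Definition 6 of Hyndman and Fan (alpha = beta = 0) on the same data.
q_def6 = quantile(data, 0.25; alpha=0, beta=0)
```

On this data the default definition places the 0.25 quantile at the second order statistic, while Definition 6 interpolates halfway between the first and second.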
This document was generated with Documenter.jl version 1.8.0 on Wednesday 9 July 2025. Using Julia version 1.11.6.