Statistics

The Statistics standard library module contains basic statistics functionality.

Statistics.std —Function

std(itr; corrected::Bool=true, mean=nothing[, dims])

Compute the sample standard deviation of collection itr.

The algorithm returns an estimator of the generative distribution's standard deviation under the assumption that each entry of itr is a sample drawn from the same unknown distribution, with the samples uncorrelated. For arrays, this computation is equivalent to calculating sqrt(sum((itr .- mean(itr)).^2) / (length(itr) - 1)). If corrected is true, then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, with n the number of elements in itr.

If itr is an AbstractArray, dims can be provided to compute the standard deviation over dimensions.

A pre-computed mean may be provided. When dims is specified, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).

Note

If array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if array contains both). Use the skipmissing function to omit missing entries and compute the standard deviation of non-missing values.
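As a rough illustration of the scaling described above, the following sketch compares std against the explicit formula for both settings of corrected. The data and variable names are made up for the example.

```julia
using Statistics

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative data
n = length(x)

# Default (corrected=true): the sum of squared deviations is divided by n - 1.
s_corrected = std(x)
manual_corrected = sqrt(sum((x .- mean(x)).^2) / (n - 1))

# corrected=false: the sum is divided by n instead.
s_uncorrected = std(x; corrected=false)
manual_uncorrected = sqrt(sum((x .- mean(x)).^2) / n)

# A pre-computed mean can be supplied through the mean keyword.
s_with_mean = std(x; mean=mean(x))

# For an array, dims selects the dimension(s) to reduce over.
A = [1.0 2.0; 3.0 6.0]
col_std = std(A; dims=1)   # 1×2 matrix, one value per column
```

The corrected and uncorrected results differ only by the factor sqrt(n / (n - 1)), so the gap shrinks as the collection grows.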

Statistics.stdm —Function

stdm(itr, mean; corrected::Bool=true[, dims])

Compute the sample standard deviation of collection itr, with known mean(s) mean.

If itr is an AbstractArray, dims can be provided to compute the standard deviation over dimensions. In that case, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).

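Note

If array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if array contains both). Use the skipmissing function to omit missing entries and compute the standard deviation of non-missing values.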

Statistics.var —Function

var(itr; corrected::Bool=true, mean=nothing[, dims])

Compute the sample variance of collection itr.

The algorithm returns an estimator of the generative distribution's variance under the assumption that each entry of itr is a sample drawn from the same unknown distribution, with the samples uncorrelated. For arrays, this computation is equivalent to calculating sum((itr .- mean(itr)).^2) / (length(itr) - 1). If corrected is true, then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n is the number of elements in itr.

If itr is an AbstractArray, dims can be provided to compute the variance over dimensions.

A pre-computed mean may be provided. When dims is specified, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).
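The shape requirement on a pre-computed mean is easiest to see with a small matrix. This is a minimal sketch with made-up data, reducing over the second dimension so that the mean array has one entry per row.

```julia
using Statistics

A = [1.0 2.0 3.0;
     4.0 6.0 8.0]                 # illustrative 2×3 matrix

# mean over dims=2 yields a 2×1 matrix, the shape var expects for `mean`.
row_means = mean(A; dims=2)

# Variance of each row, reusing the pre-computed row means.
row_vars = var(A; dims=2, mean=row_means)

# The same values computed without supplying the mean.
row_vars_direct = var(A; dims=2)
```

Supplying the mean only saves recomputation; the result should agree with the call that omits it.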

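Note

If array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if array contains both). Use the skipmissing function to omit missing entries and compute the variance of non-missing values.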

Statistics.varm —Function

varm(itr, mean; dims, corrected::Bool=true)

Compute the sample variance of collection itr, with known mean(s) mean.

If itr is an AbstractArray, dims can be provided to compute the variance over dimensions. In that case, mean must be an array with the same shape as mean(itr, dims=dims) (additional trailing singleton dimensions are allowed).
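A short sketch of the known-mean variants follows. The value μ is an assumed, externally known mean chosen for illustration, and the data are made up.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]   # illustrative data
μ = 2.5                    # assumed known mean

v_known = varm(x, μ)                       # scaled with n - 1 by default
v_known_uncorr = varm(x, μ; corrected=false)
s_known = stdm(x, μ)                       # square root of the variance above

# With dims, the mean must have the shape of mean(A, dims=dims).
A = [1.0 2.0; 3.0 6.0]
col_means = mean(A; dims=1)                # 1×2 matrix
col_vars = varm(A, col_means; dims=1)
```

Because μ here equals the sample mean of x, these calls agree with var(x) and std(x); with a different known mean the results would generally differ.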

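Note

If array contains NaN or missing values, the result is also NaN or missing (missing takes precedence if array contains both). Use the skipmissing function to omit missing entries and compute the variance of non-missing values.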

Statistics.cor —Function

cor(x::AbstractVector)

Return the number one.

cor(X::AbstractMatrix; dims::Int=1)

Compute the Pearson correlation matrix of the matrix X along the dimension dims.

cor(x::AbstractVector, y::AbstractVector)

Compute the Pearson correlation between the vectors x and y.

cor(X::AbstractVecOrMat, Y::AbstractVecOrMat; dims=1)

Compute the Pearson correlation between the vectors or matrices X and Y along the dimension dims.
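The vector and matrix methods can be tried on a small example. The vectors below are made up so that the expected correlations are easy to reason about: y increases linearly with x and z decreases linearly with x.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]    # increasing linear function of x
z = [8.0, 6.0, 4.0, 2.0]    # decreasing linear function of x

r_xy = cor(x, y)            # close to 1
r_xz = cor(x, z)            # close to -1

# With dims=1 (the default), each column of the matrix is one variable.
X = hcat(x, y, z)
C = cor(X)                  # 3×3 correlation matrix
```

Entry C[i, j] holds the correlation between columns i and j, so the matrix is symmetric with ones on the diagonal.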

Statistics.cov —Function

cov(x::AbstractVector; corrected::Bool=true)

Compute the variance of the vector x. If corrected is true (the default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n = length(x).

cov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)

Compute the covariance matrix of the matrix X along the dimension dims. If corrected is true (the default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n = size(X, dims).

cov(x::AbstractVector, y::AbstractVector; corrected::Bool=true)

Compute the covariance between the vectors x and y. If corrected is true (the default), computes $\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar x) (y_i-\bar y)^*$ where $*$ denotes the complex conjugate and n = length(x) = length(y). If corrected is false, computes $\frac{1}{n}\sum_{i=1}^n (x_i-\bar x) (y_i-\bar y)^*$.

cov(X::AbstractVecOrMat, Y::AbstractVecOrMat; dims::Int=1, corrected::Bool=true)

Compute the covariance between the vectors or matrices X and Y along the dimension dims. If corrected is true (the default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false, where n = size(X, dims) = size(Y, dims).
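The two-vector formula above can be checked by hand on a small case. This is a sketch with made-up real-valued data; conj is a no-op for real numbers and is kept only to mirror the formula.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]   # illustrative data
y = [1.0, 3.0, 2.0, 6.0]
n = length(x)

# Direct evaluation of the corrected formula: divide by n - 1.
manual = sum((x .- mean(x)) .* conj.(y .- mean(y))) / (n - 1)

c = cov(x, y)                               # corrected (default)
c_uncorrected = cov(x, y; corrected=false)  # divide by n instead

# The single-vector method is the variance of x.
c_self = cov(x)
```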

Statistics.mean! —Function

mean!(r, v)

Compute the mean of v over the singleton dimensions of r, and write results to r.

Examples

julia> using Statistics

julia> v = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> mean!([1., 1.], v)
2-element Vector{Float64}:
 1.5
 3.5

julia> mean!([1. 1.], v)
1×2 Matrix{Float64}:
 2.0  3.0

Statistics.mean —Function

mean(itr)

Compute the mean of all elements in a collection.

Note

If itr contains NaN or missing values, the result is also NaN or missing (missing takes precedence if array contains both). Use the skipmissing function to omit missing entries and compute the mean of non-missing values.

Examples

julia> using Statistics

julia> mean(1:20)
10.5

julia> mean([1, missing, 3])
missing

julia> mean(skipmissing([1, missing, 3]))
2.0

mean(f, itr)

Apply the function f to each element of collection itr and take the mean.

julia> using Statistics

julia> mean(√, [1, 2, 3])
1.3820881233139908

julia> mean([√1, √2, √3])
1.3820881233139908

mean(f, A::AbstractArray; dims)

Apply the function f to each element of array A and take the mean over dimensions dims.

Julia 1.3

This method requires at least Julia 1.3.

julia> using Statistics

julia> mean(√, [1, 2, 3])
1.3820881233139908

julia> mean([√1, √2, √3])
1.3820881233139908

julia> mean(√, [1 2 3; 4 5 6], dims=2)
2×1 Matrix{Float64}:
 1.3820881233139908
 2.2285192400943226

mean(A::AbstractArray; dims)

Compute the mean of an array over the given dimensions.

Julia 1.1

mean for empty arrays requires at least Julia 1.1.

Examples

julia> using Statistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> mean(A, dims=1)
1×2 Matrix{Float64}:
 2.0  3.0

julia> mean(A, dims=2)
2×1 Matrix{Float64}:
 1.5
 3.5

Statistics.median! —Function

median!(v)

Like median, but may overwrite the input vector.
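Since median! is allowed to reorder its argument, one cautious pattern is to pass a copy when the original order still matters. A minimal sketch with made-up data:

```julia
using Statistics

v = [5.0, 1.0, 4.0, 2.0, 3.0]   # illustrative data
w = copy(v)                     # scratch copy that median! is free to reorder

m = median!(w)                  # w may now be in a different order than v

# v is untouched, and w still holds the same values (possibly rearranged).
same_values = sort(w) == sort(v)
```

When the input is a temporary that will not be reused, calling median! on it directly avoids the extra copy.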

Statistics.median —Function

median(itr)

Compute the median of all elements in a collection. For an even number of elements no exact median element exists, so the result is equivalent to calculating the mean of the two median elements.

Note

If itr contains NaN or missing values, the result is also NaN or missing (missing takes precedence if itr contains both). Use the skipmissing function to omit missing entries and compute the median of non-missing values.

Examples

julia> using Statistics

julia> median([1, 2, 3])
2.0

julia> median([1, 2, 3, 4])
2.5

julia> median([1, 2, missing, 4])
missing

julia> median(skipmissing([1, 2, missing, 4]))
2.0

median(A::AbstractArray; dims)

Compute the median of an array along the given dimensions.

Examples

julia> using Statistics

julia> median([1 2; 3 4], dims=1)
1×2 Matrix{Float64}:
 2.0  3.0

Statistics.middle —Function

middle(x)

Compute the middle of a scalar value, which is equivalent to x itself, but of the type of middle(x, x) for consistency.

middle(x, y)

Compute the middle of two numbers x and y, which is equivalent in both value and type to computing their mean ((x + y) / 2).
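The scalar and two-argument methods can be illustrated with a few made-up values. This sketch only checks the relationships stated above.

```julia
using Statistics

m_int = middle(1, 2)          # mean of two integers
m_float = middle(1.0, 4.0)    # mean of two floats
m_single = middle(3)          # same value as 3, typed like middle(3, 3)

# The one-argument result has the type of the two-argument result.
same_type = typeof(m_single) == typeof(middle(3, 3))
```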

middle(a::AbstractArray)

Compute the middle of an array a, which consists of finding its extrema and then computing their mean.

julia> using Statistics

julia> middle(1:10)
5.5

julia> a = [1,2,3.6,10.9]
4-element Vector{Float64}:
  1.0
  2.0
  3.6
 10.9

julia> middle(a)
5.95

Statistics.quantile! —Function

quantile!([q::AbstractArray, ] v::AbstractVector, p; sorted=false, alpha::Real=1.0, beta::Real=alpha)

Compute the quantile(s) of a vector v at a specified probability or vector or tuple of probabilities p on the interval [0,1]. If p is a vector, an optional output array q may also be specified. (If not provided, a new output array is created.) The keyword argument sorted indicates whether v can be assumed to be sorted; if false (the default), then the elements of v will be partially sorted in-place.

Sample quantiles are defined by Q(p) = (1-γ)*x[j] + γ*x[j+1], where x[j] is the j-th order statistic of v, j = floor(n*p + m), m = alpha + p*(1 - alpha - beta) and γ = n*p + m - j.

By default (alpha = beta = 1), quantiles are computed via linear interpolation between the points ((k-1)/(n-1), x[k]), for k = 1:n where n = length(v). This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R and NumPy default.

The keyword arguments alpha and beta correspond to the same parameters in Hyndman and Fan. Setting them to different values allows quantiles to be calculated with any of the methods 4-9 defined in that paper:

Def. 4: alpha=0, beta=1
Def. 5: alpha=0.5, beta=0.5 (MATLAB default)
Def. 6: alpha=0, beta=0 (Excel PERCENTILE.EXC, Python default, Stata altdef)
Def. 7: alpha=1, beta=1 (Julia, R and NumPy default, Excel PERCENTILE and PERCENTILE.INC, Python 'inclusive')
Def. 8: alpha=1/3, beta=1/3
Def. 9: alpha=3/8, beta=3/8
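To make the formula for Q(p) concrete, the sketch below evaluates it directly and compares the result with quantile. The helper manual_quantile is hypothetical (it is not part of Statistics), and clamping j and γ at the ends of the data is an assumption of the sketch, added so that x[j] and x[j+1] stay in range.

```julia
using Statistics

# Hypothetical helper, not part of Statistics: evaluates
# Q(p) = (1-γ)*x[j] + γ*x[j+1] for a single probability p.
function manual_quantile(v::AbstractVector{<:Real}, p::Real;
                         alpha::Real=1.0, beta::Real=alpha)
    x = sort(v)                       # x[j] is the j-th order statistic
    n = length(x)
    n == 1 && return float(x[1])      # a single element is every quantile
    m = alpha + p * (1 - alpha - beta)
    # Assumption of this sketch: clamp so that indices j and j+1 are valid.
    j = clamp(floor(Int, n * p + m), 1, n - 1)
    γ = clamp(n * p + m - j, 0, 1)
    return (1 - γ) * x[j] + γ * x[j + 1]
end

v = [7.0, 1.0, 4.0, 10.0, 3.0]        # illustrative data

# Default parameters (Def. 7).
q7_manual = manual_quantile(v, 0.3)
q7_builtin = quantile(v, 0.3)

# Def. 6: alpha = 0, beta = 0.
q6_manual = manual_quantile(v, 0.3; alpha=0, beta=0)
q6_builtin = quantile(v, 0.3; alpha=0, beta=0)
```

For this data and p = 0.3 under the defaults, m = 0.7 and n*p + m = 2.2, so j = 2 and γ = 0.2, which interpolates one fifth of the way from the second to the third order statistic.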

Note

An ArgumentError is thrown if v contains NaN or missing values.

References

Hyndman, R.J. and Fan, Y. (1996) "Sample Quantiles in Statistical Packages", The American Statistician, Vol. 50, No. 4, pp. 361-365
Quantile on Wikipedia details the different quantile definitions

Examples

julia> using Statistics

julia> x = [3, 2, 1];

julia> quantile!(x, 0.5)
2.0

julia> x
3-element Vector{Int64}:
 1
 2
 3

julia> y = zeros(3);

julia> quantile!(y, x, [0.1, 0.5, 0.9]) === y
true

julia> y
3-element Vector{Float64}:
 1.2000000000000002
 2.0
 2.8000000000000003

Statistics.quantile —Function

quantile(itr, p; sorted=false, alpha::Real=1.0, beta::Real=alpha)

Compute the quantile(s) of a collection itr at a specified probability or vector or tuple of probabilities p on the interval [0,1]. The keyword argument sorted indicates whether itr can be assumed to be sorted.

Sample quantiles are defined by Q(p) = (1-γ)*x[j] + γ*x[j+1], where x[j] is the j-th order statistic of itr, j = floor(n*p + m), m = alpha + p*(1 - alpha - beta) and γ = n*p + m - j.

By default (alpha = beta = 1), quantiles are computed via linear interpolation between the points ((k-1)/(n-1), x[k]), for k = 1:n where n = length(itr). This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R and NumPy default.

The keyword arguments alpha and beta correspond to the same parameters in Hyndman and Fan. Setting them to different values allows quantiles to be calculated with any of the methods 4-9 defined in that paper:

Def. 4: alpha=0, beta=1
Def. 5: alpha=0.5, beta=0.5 (MATLAB default)
Def. 6: alpha=0, beta=0 (Excel PERCENTILE.EXC, Python default, Stata altdef)
Def. 7: alpha=1, beta=1 (Julia, R and NumPy default, Excel PERCENTILE and PERCENTILE.INC, Python 'inclusive')
Def. 8: alpha=1/3, beta=1/3
Def. 9: alpha=3/8, beta=3/8

Note

An ArgumentError is thrown if itr contains NaN or missing values. Use the skipmissing function to omit missing entries and compute the quantiles of non-missing values.
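The effect of alpha and beta, and the error raised for NaN input, can be seen in a small sketch. The data are made up, and the variable names are only for illustration.

```julia
using Statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # illustrative data

q7 = quantile(data, 0.25)                        # Def. 7 (default)
q5 = quantile(data, 0.25; alpha=0.5, beta=0.5)   # Def. 5
q6 = quantile(data, 0.25; alpha=0, beta=0)       # Def. 6

# NaN input is rejected rather than propagated.
threw_argument_error = try
    quantile([1.0, NaN, 3.0], 0.5)
    false
catch err
    err isa ArgumentError
end
```

The three definitions give slightly different first quartiles for the same data, which is worth keeping in mind when comparing results against other software.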

References

Hyndman, R.J. and Fan, Y. (1996) "Sample Quantiles in Statistical Packages", The American Statistician, Vol. 50, No. 4, pp. 361-365
Quantile on Wikipedia details the different quantile definitions

Examples

julia> using Statistics

julia> quantile(0:20, 0.5)
10.0

julia> quantile(0:20, [0.1, 0.5, 0.9])
3-element Vector{Float64}:
  2.0
 10.0
 18.000000000000004

julia> quantile(skipmissing([1, 10, missing]), 0.5)
5.5

This document was generated with Documenter.jl version 1.8.0 on Wednesday 9 July 2025. Using Julia version 1.11.6.
