Julia provides support for representing missing values in the statistical sense. This is for situations where no value is available for a variable in an observation, but a valid value theoretically exists. Missing values are represented via the missing object, which is the singleton instance of the type Missing. missing is equivalent to NULL in SQL and NA in R, and behaves like them in most situations.
missing values propagate automatically when passed to standard mathematical operators and functions. For these functions, uncertainty about the value of one of the operands induces uncertainty about the result. In practice, this means a math operation involving a missing value generally returns missing:
    julia> missing + 1
    missing

    julia> "a" * missing
    missing

    julia> abs(missing)
    missing
Since missing is a normal Julia object, this propagation rule only works for functions which have opted in to implement this behavior. This can be achieved by adding a specific method defined for arguments of type Missing, or by accepting arguments of this type and passing them to functions which propagate them (like standard math operators). Packages should consider whether it makes sense to propagate missing values when defining new functions, and define methods appropriately if this is the case. Passing a missing value to a function which does not have a method accepting arguments of type Missing throws a MethodError, just like for any other type.
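As a minimal sketch of opting in by adding a method for arguments of type Missing, consider the function rescale below. It is hypothetical and not part of Julia or any package; it only illustrates how a dedicated method makes a function propagate missing instead of throwing a MethodError:

```julia
# Hypothetical function used only for illustration.
# Scale a number into the unit interval given known bounds.
rescale(x::Real, lo::Real, hi::Real) = (x - lo) / (hi - lo)

# Opt in to propagation: a missing input gives a missing result.
rescale(::Missing, lo::Real, hi::Real) = missing

rescale(5, 0, 10)        # 0.5
rescale(missing, 0, 10)  # missing
```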
Functions that do not propagate missing values can be made to do so by wrapping them in the passmissing function provided by the Missings.jl package. For example, f(x) becomes passmissing(f)(x).
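To illustrate the idea behind such a wrapper without depending on a package, here is a hand-written sketch. The name lift_missing is hypothetical, and this is not how Missings.jl implements passmissing; it only shows the principle of returning missing when any argument is missing and calling the wrapped function otherwise:

```julia
# Hypothetical helper, not the Missings.jl implementation.
# Returns a new function that yields `missing` if any argument is missing.
lift_missing(f) = (args...) -> any(ismissing, args) ? missing : f(args...)

# `uppercase` has no method for Missing, so calling it directly on
# `missing` would throw a MethodError.
safe_upper = lift_missing(uppercase)

safe_upper("abc")    # "ABC"
safe_upper(missing)  # missing
```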
Standard equality and comparison operators follow the propagation rule presented above: if any of the operands is missing, the result is missing. Here are a few examples:
    julia> missing == 1
    missing

    julia> missing == missing
    missing

    julia> missing < 1
    missing

    julia> 2 >= missing
    missing
In particular, note that missing == missing returns missing, so == cannot be used to test whether a value is missing. To test whether x is missing, use ismissing(x).
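A short sketch of the difference, using an arbitrary example vector: comparing with == yields missing for every element, whereas ismissing gives a usable Bool.

```julia
data = [1, missing, 3]

# Broadcasting == against missing propagates: every result is missing.
eq_results = data .== missing

# ismissing always returns a Bool, so it can be used for counting or filtering.
n_missing = count(ismissing, data)   # 1
present = filter(!ismissing, data)   # keeps the entries 1 and 3
```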
Special comparison operators isequal and === are exceptions to the propagation rule. They will always return a Bool value, even in the presence of missing values, considering missing as equal to missing and as different from any other value. They can therefore be used to test whether a value is missing:
    julia> missing === 1
    false

    julia> isequal(missing, 1)
    false

    julia> missing === missing
    true

    julia> isequal(missing, missing)
    true
The isless operator is another exception: missing is considered as greater than any other value. This operator is used by sort!, which therefore places missing values after all other values:
    julia> isless(1, missing)
    true

    julia> isless(missing, Inf)
    false

    julia> isless(missing, missing)
    false
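For instance, sorting a small vector (the values are arbitrary) shows the missing entries ending up last:

```julia
v = [3, missing, 1, missing, 2]

# sort relies on isless, which treats missing as greater than any other value.
sorted = sort(v)
# sorted is [1, 2, 3, missing, missing]
```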
Logical (or boolean) operators |, & and xor are another special case since they only propagate missing values when it is logically required. For these operators, whether or not the result is uncertain depends on the particular operation. This follows the well-established rules of three-valued logic which are implemented by e.g. NULL in SQL and NA in R. This abstract definition corresponds to a relatively natural behavior which is best explained via concrete examples.
Let us illustrate this principle with the logical "or" operator |. Following the rules of boolean logic, if one of the operands is true, the value of the other operand does not have an influence on the result, which will always be true:
    julia> true | true
    true

    julia> true | false
    true

    julia> false | true
    true
Based on this observation, we can conclude that if one of the operands is true and the other missing, the result is true in spite of the uncertainty about the actual value of one of the operands. If we had been able to observe the actual value of the second operand, it could only be true or false, and in both cases the result would be true. Therefore, in this particular case, missingness does not propagate:
    julia> true | missing
    true

    julia> missing | true
    true
On the contrary, if one of the operands is false, the result could be either true or false depending on the value of the other operand. Therefore, if that operand is missing, the result has to be missing too:
    julia> false | true
    true

    julia> true | false
    true

    julia> false | false
    false

    julia> false | missing
    missing

    julia> missing | false
    missing
The behavior of the logical "and" operator & is similar to that of the | operator, with the difference that missingness does not propagate when one of the operands is false. For example, when that is the case for the first operand:
    julia> false & false
    false

    julia> false & true
    false

    julia> false & missing
    false
On the other hand, missingness propagates when one of the operands is true, for example the first one:
    julia> true & true
    true

    julia> true & false
    false

    julia> true & missing
    missing
Finally, the "exclusive or" logical operator xor always propagates missing values, since both operands always have an effect on the result. Also note that the negation operator ! returns missing when the operand is missing, just like other unary operators.
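The rules for these operators can be checked exhaustively. This sketch builds the full truth tables for |, & and xor over the three possible values; the helper name truth_table is hypothetical and only used for illustration:

```julia
# Hypothetical helper: evaluate a binary operator on all pairs of
# true, false and missing. Rows vary the first operand, columns the second.
truth_table(op) = [(a, b, op(a, b)) for a in (true, false, missing),
                                        b in (true, false, missing)]

for op in (|, &, xor)
    println("operator: ", op)
    for (a, b, result) in truth_table(op)
        println("  ", a, " ", op, " ", b, " = ", result)
    end
end
```

Reading the printed tables, the only combinations involving missing that give a definite result are those where the other operand of | is true or the other operand of & is false.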
Control flow operators including if, while and the ternary operator x ? y : z do not allow for missing values. This is because of the uncertainty about whether the actual value would be true or false if we could observe it. This implies we do not know how the program should behave. In this case, a TypeError is thrown as soon as a missing value is encountered in this context:
    julia> if missing
               println("here")
           end
    ERROR: TypeError: non-boolean (Missing) used in boolean context
For the same reason, contrary to logical operators presented above, the short-circuiting boolean operators && and || do not allow for missing values in situations where the value of the operand determines whether the next operand is evaluated or not. For example:
    julia> missing || false
    ERROR: TypeError: non-boolean (Missing) used in boolean context

    julia> missing && false
    ERROR: TypeError: non-boolean (Missing) used in boolean context

    julia> true && missing && false
    ERROR: TypeError: non-boolean (Missing) used in boolean context
In contrast, there is no error thrown when the result can be determined without the missing values. This is the case when the code short-circuits before evaluating the missing operand, and when the missing operand is the last one:
    julia> true && missing
    missing

    julia> false && missing
    false
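When a condition may be missing, one option is to decide explicitly how missing should be handled before branching. The sketch below treats missing as its own case and uses ismissing to avoid the TypeError; the function describe is hypothetical:

```julia
# Hypothetical function: branch safely on a condition that may be missing.
function describe(flag::Union{Missing, Bool})
    if ismissing(flag)
        return "unknown"
    elseif flag
        return "yes"
    else
        return "no"
    end
end

describe(1 < 2)        # "yes"
describe(missing < 2)  # "unknown"
```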
Arrays containing missing values can be created like other arrays:
    julia> [1, missing]
    2-element Vector{Union{Missing, Int64}}:
     1
      missing
As this example shows, the element type of such arrays is Union{Missing, T}, with T the type of the non-missing values. This reflects the fact that array entries can be either of type T (here, Int64) or of type Missing. This kind of array uses an efficient memory storage equivalent to an Array{T} holding the actual values combined with an Array{UInt8} indicating the type of the entry (i.e. whether it is Missing or T).
Arrays allowing for missing values can be constructed with the standard syntax. Use Array{Union{Missing, T}}(missing, dims) to create arrays filled with missing values:
    julia> Array{Union{Missing, String}}(missing, 2, 3)
    2×3 Matrix{Union{Missing, String}}:
     missing  missing  missing
     missing  missing  missing
Using undef or similar may currently give an array filled with missing, but this is not the correct way to obtain such an array. Use a missing constructor as shown above instead.
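As a sketch of a typical use, an array can be allocated filled with missing and then populated as values become available. The variable name and the values are arbitrary:

```julia
# Allocate a vector of three entries, all initially missing.
readings = Vector{Union{Missing, Float64}}(missing, 3)

# Fill in the entries that were actually observed.
readings[1] = 0.5
readings[3] = 2.0

eltype(readings)            # Union{Missing, Float64}
count(ismissing, readings)  # 1
```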
An array with element type allowing missing entries (e.g. Vector{Union{Missing, T}}) which does not contain any missing entries can be converted to an array type that does not allow for missing entries (e.g. Vector{T}) using convert. If the array contains missing values, a MethodError is thrown during conversion:
    julia> x = Union{Missing, String}["a", "b"]
    2-element Vector{Union{Missing, String}}:
     "a"
     "b"

    julia> convert(Array{String}, x)
    2-element Vector{String}:
     "a"
     "b"

    julia> y = Union{Missing, String}[missing, "b"]
    2-element Vector{Union{Missing, String}}:
     missing
     "b"

    julia> convert(Array{String}, y)
    ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type String
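To avoid the MethodError, one can check for missing entries before converting. This is a minimal sketch, where the helper name to_nonmissing is hypothetical; it uses nonmissingtype from Base to obtain T from Union{Missing, T}:

```julia
# Hypothetical helper: convert to Vector{T} only when no entry is missing;
# otherwise return the input unchanged.
function to_nonmissing(v::AbstractVector)
    T = nonmissingtype(eltype(v))
    return any(ismissing, v) ? v : convert(Vector{T}, v)
end

a = to_nonmissing(Union{Missing, String}["a", "b"])      # a Vector{String}
b = to_nonmissing(Union{Missing, String}[missing, "b"])  # returned unchanged
```

Returning the input unchanged is only one possible design; depending on the application, throwing a more descriptive error may be preferable.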
Since missing values propagate with standard mathematical operators, reduction functions return missing when called on arrays which contain missing values:
    julia> sum([1, missing])
    missing
In this situation, use the skipmissing function to skip missing values:
    julia> sum(skipmissing([1, missing]))
    1
This convenience function returns an iterator which filters out missing values efficiently. It can therefore be used with any function which supports iterators:
    julia> x = skipmissing([3, missing, 2, 1])
    skipmissing(Union{Missing, Int64}[3, missing, 2, 1])

    julia> maximum(x)
    3

    julia> sum(x)
    6

    julia> mapreduce(sqrt, +, x)
    4.146264369941973
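For example, a mean that ignores missing values can be computed by combining sum and count over the same iterator. This is a sketch with arbitrary data; the helper name mean_skipping is hypothetical:

```julia
# Hypothetical helper: arithmetic mean of the non-missing entries.
function mean_skipping(v)
    nm = skipmissing(v)
    # count with an always-true predicate gives the number of non-missing entries.
    return sum(nm) / count(_ -> true, nm)
end

mean_skipping([3, missing, 2, 1])  # 2.0
```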
Objects created by calling skipmissing on an array can be indexed using indices from the parent array. Indices corresponding to missing values are not valid for these objects, and an error is thrown when trying to use them (they are also skipped by keys and eachindex):
    julia> x[1]
    3

    julia> x[2]
    ERROR: MissingException: the value at index (2,) is missing
    [...]
This allows functions which operate on indices to work in combination with skipmissing. This is notably the case for search and find functions. These functions return indices valid for the object returned by skipmissing, which are also the indices of the matching entries in the parent array:
    julia> findall(==(1), x)
    1-element Vector{Int64}:
     4

    julia> findfirst(!iszero, x)
    1

    julia> argmax(x)
    1
Use collect to extract non-missing values and store them in an array:
    julia> collect(x)
    3-element Vector{Int64}:
     3
     2
     1
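A brief sketch of iterating over the valid indices, reusing the same data as above: eachindex on the skipmissing object visits only the positions holding non-missing values, and these positions index the parent array directly. The variable names are arbitrary.

```julia
parent_array = [3, missing, 2, 1]
x = skipmissing(parent_array)

# Indices of the missing entries are skipped.
valid = collect(eachindex(x))   # [1, 3, 4]

for i in valid
    # x[i] and parent_array[i] refer to the same entry.
    println(i, " => ", x[i])
end
```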
The three-valued logic described above for logical operators is also used by logical functions applied to arrays. Thus, array equality tests using the == operator return missing whenever the result cannot be determined without knowing the actual value of the missing entry. In practice, this means missing is returned if all non-missing values of the compared arrays are equal, but one or both arrays contain missing values (possibly at different positions):
    julia> [1, missing] == [2, missing]
    false

    julia> [1, missing] == [1, missing]
    missing

    julia> [1, 2, missing] == [1, missing, 2]
    missing
As for single values, use isequal to treat missing values as equal to other missing values, but different from non-missing values:
    julia> isequal([1, missing], [1, missing])
    true

    julia> isequal([1, 2, missing], [1, missing, 2])
    false
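When a definite Bool is needed from an array comparison, the possibly missing result of == has to be handled explicitly. A small sketch with arbitrary arrays:

```julia
a = [1, 2, missing]
b = [1, 2, missing]

result = a == b        # missing: equality cannot be determined
same = isequal(a, b)   # true: missing is treated as equal to missing

# Branch on the three possible outcomes of ==.
status = ismissing(result) ? "undetermined" : (result ? "equal" : "different")
```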
Functions any and all also follow the rules of three-valued logic. They therefore return missing when the result cannot be determined:
    julia> all([true, missing])
    missing

    julia> all([false, missing])
    false

    julia> any([true, missing])
    true

    julia> any([false, missing])
    missing
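Because all and any may return missing, code that needs a definite answer can compare the result with true using ===. A sketch with arbitrary data and a hypothetical helper name:

```julia
# Hypothetical helper: true only if every entry is known to be positive.
all_known_positive(v) = all(v .> 0) === true

all_known_positive([1, 2, 3])         # true
all_known_positive([1, missing, 3])   # false, because all returns missing
all_known_positive([1, -2, missing])  # false, because one entry is not positive
```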
This document was generated with Documenter.jl version 1.8.0 on Wednesday 9 July 2025. Using Julia version 1.11.6.