Integers and Floating-Point Numbers

Integers and floating-point values are the basic building blocks of arithmetic and computation. Built-in representations of such values are called numeric primitives, while representations of integers and floating-point numbers as immediate values in code are known as numeric literals. For example, 1 is an integer literal, while 1.0 is a floating-point literal; their binary in-memory representations as objects are numeric primitives.

Julia provides a broad range of primitive numeric types, and a full complement of arithmetic and bitwise operators as well as standard mathematical functions are defined over them. These map directly onto numeric types and operations that are natively supported on modern computers, thus allowing Julia to take full advantage of computational resources. Additionally, Julia provides software support for Arbitrary Precision Arithmetic, which can handle operations on numeric values that cannot be represented effectively in native hardware representations, but at the cost of slower performance.

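The following are Julia's primitive numeric types:

Integer types:

| Type    | Signed? | Number of bits | Smallest value | Largest value |
|---------|---------|----------------|----------------|---------------|
| Int8    | yes     | 8              | -2^7           | 2^7 - 1       |
| UInt8   | no      | 8              | 0              | 2^8 - 1       |
| Int16   | yes     | 16             | -2^15          | 2^15 - 1      |
| UInt16  | no      | 16             | 0              | 2^16 - 1      |
| Int32   | yes     | 32             | -2^31          | 2^31 - 1      |
| UInt32  | no      | 32             | 0              | 2^32 - 1      |
| Int64   | yes     | 64             | -2^63          | 2^63 - 1      |
| UInt64  | no      | 64             | 0              | 2^64 - 1      |
| Int128  | yes     | 128            | -2^127         | 2^127 - 1     |
| UInt128 | no      | 128            | 0              | 2^128 - 1     |
| Bool    | N/A     | 8              | false (0)      | true (1)      |

Floating-point types:

| Type    | Precision | Number of bits |
|---------|-----------|----------------|
| Float16 | half      | 16             |
| Float32 | single    | 32             |
| Float64 | double    | 64             |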

Additionally, full support for Complex and Rational Numbers is built on top of these primitive numeric types. All numeric types interoperate naturally without explicit casting, thanks to a flexible, user-extensible type promotion system.

Integers

Literal integers are represented in the standard manner:

julia> 1
1

julia> 1234
1234

The default type for an integer literal depends on whether the target system has a 32-bit architecture or a 64-bit architecture:

# 32-bit system:
julia> typeof(1)
Int32

# 64-bit system:
julia> typeof(1)
Int64

The Julia internal variable Sys.WORD_SIZE indicates whether the target system is 32-bit or 64-bit:

# 32-bit system:
julia> Sys.WORD_SIZE
32

# 64-bit system:
julia> Sys.WORD_SIZE
64

Julia also defines the types Int and UInt, which are aliases for the system's signed and unsigned native integer types respectively:

# 32-bit system:
julia> Int
Int32

julia> UInt
UInt32

# 64-bit system:
julia> Int
Int64

julia> UInt
UInt64

Larger integer literals that cannot be represented using only 32 bits but can be represented in 64 bits always create 64-bit integers, regardless of the system type:

# 32-bit or 64-bit system:
julia> typeof(3000000000)
Int64

Unsigned integers are input and output using the 0x prefix and hexadecimal (base 16) digits 0-9a-f (the capitalized digits A-F also work for input). The size of the unsigned value is determined by the number of hex digits used:

julia> x = 0x1
0x01

julia> typeof(x)
UInt8

julia> x = 0x123
0x0123

julia> typeof(x)
UInt16

julia> x = 0x1234567
0x01234567

julia> typeof(x)
UInt32

julia> x = 0x123456789abcdef
0x0123456789abcdef

julia> typeof(x)
UInt64

julia> x = 0x11112222333344445555666677778888
0x11112222333344445555666677778888

julia> typeof(x)
UInt128

This behavior is based on the observation that when one uses unsigned hex literals for integer values, one typically is using them to represent a fixed numeric byte sequence, rather than just an integer value.

Binary and octal literals are also supported:

julia> x = 0b10
0x02

julia> typeof(x)
UInt8

julia> x = 0o010
0x08

julia> typeof(x)
UInt8

julia> x = 0x00000000000000001111222233334444
0x00000000000000001111222233334444

julia> typeof(x)
UInt128

As for hexadecimal literals, binary and octal literals produce unsigned integer types. If the leading digit of the literal is not 0, the size of the binary data item is the minimal needed size. If the literal has leading zeros, the size is the minimal size needed for a literal of the same length whose leading digit is 1. This means that leading zero digits, even though they do not contribute to the value, count when determining the storage size of a literal: 0x01 is a UInt8 while 0x0001 is a UInt16. This lets the user control the size.
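As a small illustration of the sizing rule, the following sketch pairs a few literals with the type the rule predicts and checks each one. The name `literals` is chosen here only for illustration.

```julia
# Each pair is (literal, expected type). The digit count, including any
# leading zeros, decides the width of the unsigned type.
literals = (
    (0x1,         UInt8),   # 1 hex digit: fits in 8 bits
    (0x01,        UInt8),   # 2 hex digits: still 8 bits
    (0x001,       UInt16),  # 3 hex digits: sized like 0x100
    (0x0001,      UInt16),  # 4 hex digits: 16 bits
    (0x00000001,  UInt32),  # 8 hex digits: 32 bits
    (0b00000001,  UInt8),   # 8 binary digits: 8 bits
    (0b000000001, UInt16),  # 9 binary digits: sized like 0b100000000
)

for (value, T) in literals
    println(repr(value), " isa ", T, ": ", value isa T)
end
```

Every line printed by the loop should end in `true`.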

Unsigned literals (starting with 0x) that encode integers too large to be represented as UInt128 values will construct BigInt values instead. This is not an unsigned type, but it is the only built-in type big enough to represent such large integer values.

Binary, octal, and hexadecimal literals may be signed by a - immediately preceding the unsigned literal. They produce an unsigned integer of the same size as the unsigned literal would, holding the two's complement of the value:

julia> -0x2
0xfe

julia> -0x0002
0xfffe
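To make the two's complement behavior concrete, this short sketch checks that a negated unsigned literal keeps its unsigned type and behaves like "the maximum value minus one less than the magnitude". The variable names `a` and `b` are illustrative only.

```julia
a = -0x2       # UInt8 holding the two's complement of 2
b = -0x0002    # UInt16 holding the two's complement of 2

println(typeof(a), " ", a == typemax(UInt8) - 0x01)      # a is 0xfe
println(typeof(b), " ", b == typemax(UInt16) - 0x0001)   # b is 0xfffe

# Adding the magnitude back wraps around to zero within the same type.
println(a + 0x02 == 0x00)
```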

The minimum and maximum representable values of primitive numeric types such as integers are given by the typemin and typemax functions:

julia> (typemin(Int32), typemax(Int32))
(-2147483648, 2147483647)

julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128]
           println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]")
       end
   Int8: [-128,127]
  Int16: [-32768,32767]
  Int32: [-2147483648,2147483647]
  Int64: [-9223372036854775808,9223372036854775807]
 Int128: [-170141183460469231731687303715884105728,170141183460469231731687303715884105727]
  UInt8: [0,255]
 UInt16: [0,65535]
 UInt32: [0,4294967295]
 UInt64: [0,18446744073709551615]
UInt128: [0,340282366920938463463374607431768211455]

The values returned by typemin and typemax are always of the given argument type. (The above expression uses several features that have yet to be introduced, including for loops, Strings, and Interpolation, but should be easy enough to understand for users with some existing programming experience.)

Overflow behavior

In Julia, exceeding the maximum representable value of a given type results in wraparound behavior:

julia> x = typemax(Int64)
9223372036854775807

julia> x + 1
-9223372036854775808

julia> x + 1 == typemin(Int64)
true

Arithmetic operations with Julia's integer types inherently perform modular arithmetic, mirroring the characteristics of integer arithmetic on modern computer hardware. Where overflow is possible, it is crucial to check explicitly for the wraparound it can cause. The Base.Checked module provides a suite of arithmetic operations equipped with overflow checks, which throw an error if an overflow occurs. For use cases where overflow cannot be tolerated under any circumstances, using the BigInt type, as detailed in Arbitrary Precision Arithmetic, is advisable.

An example of overflow behavior and how to potentially resolve it is as follows:

julia> 10^19
-8446744073709551616

julia> big(10)^19
10000000000000000000
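The checked operations in Base.Checked can be used along the following lines. This is a minimal sketch: `try_add` is a hypothetical helper written for this example, and returning `nothing` on overflow is just one possible policy.

```julia
using Base.Checked: checked_add

# Plain `+` wraps around silently at the edge of the type's range.
wrapped = typemax(Int64) + 1

# `checked_add` throws an OverflowError instead of wrapping. This
# hypothetical helper catches the error and returns `nothing`.
function try_add(a::Int64, b::Int64)
    try
        return checked_add(a, b)
    catch err
        err isa OverflowError || rethrow()
        return nothing
    end
end

safe_small = try_add(Int64(1), Int64(2))        # no overflow
safe_large = try_add(typemax(Int64), Int64(1))  # overflow detected

# Widening to BigInt before the addition avoids the overflow altogether.
exact = big(typemax(Int64)) + 1

println((wrapped, safe_small, safe_large, exact))
```

Whether to throw, return a sentinel, or widen to BigInt depends on how much the extra cost of checking or arbitrary precision matters for the computation at hand.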

Division errors

Integer division (the div function) has two exceptional cases: dividing by zero, and dividing the lowest negative number (typemin) by -1. Both of these cases throw a DivideError. The remainder and modulus functions (rem and mod) throw a DivideError when their second argument is zero.
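Both exceptional cases can be handled with an ordinary try/catch block. The sketch below uses a hypothetical helper, `safe_div`, that returns `nothing` instead of propagating the DivideError.

```julia
# Returns the integer quotient, or `nothing` when `div` throws a DivideError.
# `safe_div` is a hypothetical helper, not part of Base.
function safe_div(a::Integer, b::Integer)
    try
        return div(a, b)
    catch err
        err isa DivideError || rethrow()
        return nothing
    end
end

println(safe_div(7, 2))                # ordinary case, truncates toward zero
println(safe_div(7, 0))                # division by zero
println(safe_div(typemin(Int64), -1))  # quotient would not fit in Int64
```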

Floating-Point Numbers

Literal floating-point numbers are represented in the standard formats, using E-notation when necessary:

julia> 1.0
1.0

julia> 1.
1.0

julia> 0.5
0.5

julia> .5
0.5

julia> -1.23
-1.23

julia> 1e10
1.0e10

julia> 2.5e-4
0.00025

The above results are all Float64 values. Literal Float32 values can be entered by writing an f in place of e:

julia> x = 0.5f0
0.5f0

julia> typeof(x)
Float32

julia> 2.5f-4
0.00025f0

Values can be converted to Float32 easily:

julia> x = Float32(-1.5)
-1.5f0

julia> typeof(x)
Float32

Hexadecimal floating-point literals are also valid, but only as Float64 values, with p preceding the base-2 exponent:

julia> 0x1p0
1.0

julia> 0x1.8p3
12.0

julia> x = 0x.4p-1
0.125

julia> typeof(x)
Float64

Half-precision floating-point numbers are also supported (Float16) on all platforms, with native instructions used on hardware which supports this number format. Otherwise, operations are implemented in software, and use Float32 for intermediate calculations. As an internal implementation detail, this is achieved under the hood by using LLVM's half type, which behaves similarly to what the GCC -fexcess-precision=16 flag does for C/C++ code.

julia> sizeof(Float16(4.))
2

julia> 2*Float16(4.)
Float16(8.0)

The underscore _ can be used as digit separator:

julia> 10_000, 0.000_000_005, 0xdead_beef, 0b1011_0010
(10000, 5.0e-9, 0xdeadbeef, 0xb2)

Floating-point zero

Floating-point numbers have two zeros, positive zero and negative zero. They are equal to each other but have different binary representations, as can be seen using the bitstring function:

julia> 0.0 == -0.0
true

julia> bitstring(0.0)
"0000000000000000000000000000000000000000000000000000000000000000"

julia> bitstring(-0.0)
"1000000000000000000000000000000000000000000000000000000000000000"
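Although the two zeros compare equal with ==, the sign is still observable. The following sketch, using the standard signbit and isequal functions (not otherwise covered in this section), shows a few ways the difference surfaces; the names `pos` and `neg` are illustrative.

```julia
pos, neg = 0.0, -0.0

println(pos == neg)                      # the two zeros compare equal
println(isequal(pos, neg))               # isequal distinguishes them
println((signbit(pos), signbit(neg)))    # only negative zero has the sign bit set
println((1 / pos, 1 / neg))              # the sign survives division
```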

Special floating-point values

There are three specified standard floating-point values that do not correspond to any point on the real number line:

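| Float16 | Float32 | Float64 | Name              | Description                                                     |
|---------|---------|---------|-------------------|-----------------------------------------------------------------|
| Inf16   | Inf32   | Inf     | positive infinity | a value greater than all finite floating-point values           |
| -Inf16  | -Inf32  | -Inf    | negative infinity | a value less than all finite floating-point values              |
| NaN16   | NaN32   | NaN     | not a number      | a value not == to any floating-point value (including itself)   |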

For further discussion of how these non-finite floating-point values are ordered with respect to each other and other floats, see Numeric Comparisons. By the IEEE 754 standard, these floating-point values are the results of certain arithmetic operations:

julia> 1/Inf
0.0

julia> 1/0
Inf

julia> -5/0
-Inf

julia> 0.000001/0
Inf

julia> 0/0
NaN

julia> 500 + Inf
Inf

julia> 500 - Inf
-Inf

julia> Inf + Inf
Inf

julia> Inf - Inf
NaN

julia> Inf * Inf
Inf

julia> Inf / Inf
NaN

julia> 0 * Inf
NaN

julia> NaN == NaN
false

julia> NaN != NaN
true

julia> NaN < NaN
false

julia> NaN > NaN
false
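Because NaN is not == to anything, including itself, searching for it with == silently finds nothing. A minimal sketch of the difference, using the standard isnan and isequal functions and a hypothetical `samples` vector:

```julia
samples = [1.0, NaN, 3.0, 0 / 0]   # 0 / 0 also evaluates to NaN

# `==` never matches NaN, so a direct comparison counts nothing.
naive_hits = count(v -> v == NaN, samples)

# `isnan` is the reliable test.
nan_hits = count(isnan, samples)

println((naive_hits, nan_hits))

# `isequal`, unlike `==`, treats NaN as equal to itself.
println(isequal(NaN, NaN))
```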

The typemin and typemax functions also apply to floating-point types:

julia> (typemin(Float16),typemax(Float16))
(-Inf16, Inf16)

julia> (typemin(Float32),typemax(Float32))
(-Inf32, Inf32)

julia> (typemin(Float64),typemax(Float64))
(-Inf, Inf)

Machine epsilon

Most real numbers cannot be represented exactly with floating-point numbers, and so for many purposes it is important to know the distance between two adjacent representable floating-point numbers, which is often known as machine epsilon.

Julia provides eps, which gives the distance between 1.0 and the next larger representable floating-point value:

julia> eps(Float32)
1.1920929f-7

julia> eps(Float64)
2.220446049250313e-16

julia> eps() # same as eps(Float64)
2.220446049250313e-16

These values are 2.0^-23 and 2.0^-52 as Float32 and Float64 values, respectively. The eps function can also take a floating-point value as an argument, and gives the absolute difference between that value and the next representable floating-point value. That is, eps(x) yields a value of the same type as x such that x + eps(x) is the next representable floating-point value larger than x:

julia> eps(1.0)
2.220446049250313e-16

julia> eps(1000.)
1.1368683772161603e-13

julia> eps(1e-27)
1.793662034335766e-43

julia> eps(0.0)
5.0e-324

The distance between two adjacent representable floating-point numbers is not constant, but is smaller for smaller values and larger for larger values. In other words, the representable floating-point numbers are densest in the real number line near zero, and grow sparser exponentially as one moves farther away from zero. By definition, eps(1.0) is the same as eps(Float64) since 1.0 is a 64-bit floating-point value.
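The following sketch illustrates both points for a few Float64 values: the absolute spacing eps(x) grows with x, while the spacing relative to x stays at or below eps(). The helper `absorbed` is hypothetical; it checks that an increment of a quarter of the spacing is lost to rounding.

```julia
# True when adding a quarter of the local spacing leaves x unchanged.
# `absorbed` is a hypothetical helper for this example.
absorbed(x) = x + eps(x) / 4 == x

for x in (1.0, 1000.0, 1.0e10)
    println("x = ", x,
            "  eps(x) = ", eps(x),
            "  eps(x)/x = ", eps(x) / x,
            "  absorbed: ", absorbed(x))
end
```

This is why a small increment that matters near 1.0 can vanish entirely when added to a large accumulated value.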

Julia also provides the nextfloat and prevfloat functions which return the next largest or smallest representable floating-point number to the argument respectively:

julia> x = 1.25f0
1.25f0

julia> nextfloat(x)
1.2500001f0

julia> prevfloat(x)
1.2499999f0

julia> bitstring(prevfloat(x))
"00111111100111111111111111111111"

julia> bitstring(x)
"00111111101000000000000000000000"

julia> bitstring(nextfloat(x))
"00111111101000000000000000000001"

This example highlights the general principle that the adjacent representable floating-point numbers also have adjacent binary integer representations.
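The same principle can be checked numerically by reinterpreting the bit patterns as unsigned integers. This sketch reuses the value 1.25f0 from the example above; the `bits_*` names are illustrative.

```julia
x = 1.25f0

# Reinterpret the Float32 bit patterns as unsigned 32-bit integers.
bits_prev = reinterpret(UInt32, prevfloat(x))
bits_x    = reinterpret(UInt32, x)
bits_next = reinterpret(UInt32, nextfloat(x))

# For this positive finite value, each neighbor is exactly one integer step away.
println(bits_x - bits_prev == 1)
println(bits_next - bits_x == 1)
```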

Rounding modes

If a number doesn't have an exact floating-point representation, it must be rounded to an appropriate representable value. However, the manner in which this rounding is done can be changed if required according to the rounding modes presented in the IEEE 754 standard.

The default mode used is always RoundNearest, which rounds to the nearest representable value, with ties rounded towards the nearest value with an even least significant bit.
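The tie-breaking rule can be observed just above 1.0, where half the spacing between neighbors is exactly representable. This sketch assumes the default rounding mode is in effect; the variable names are illustrative.

```julia
half_gap = eps(1.0) / 2   # exactly half the spacing just above 1.0

# 1.0 + half_gap lies midway between 1.0 and nextfloat(1.0). The tie goes
# to 1.0, whose least significant significand bit is 0 (even).
tie_down = 1.0 + half_gap

# nextfloat(1.0) + half_gap lies midway between nextfloat(1.0) (odd last bit)
# and the value after it (even last bit), so this tie rounds upward.
tie_up = nextfloat(1.0) + half_gap

println(tie_down == 1.0)
println(tie_up == nextfloat(nextfloat(1.0)))
```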

Background and References

Floating-point arithmetic entails many subtleties which can be surprising to users who are unfamiliar with the low-level implementation details. However, these subtleties are described in detail in most books on scientific computation, and also in the following references:

Arbitrary Precision Arithmetic

To allow computations with arbitrary-precision integers and floating-point numbers, Julia wraps the GNU Multiple Precision Arithmetic Library (GMP) and the GNU MPFR Library, respectively. The BigInt and BigFloat types are available in Julia for arbitrary-precision integer and floating-point numbers respectively.

Constructors exist to create these types from primitive numerical types, and the string literal @big_str or parse can be used to construct them from AbstractStrings. BigInts can also be input as integer literals when they are too big for other built-in integer types. Note that as there is no unsigned arbitrary-precision integer type in Base (BigInt is sufficient in most cases), hexadecimal, octal and binary literals can be used (in addition to decimal literals).

Once created, they participate in arithmetic with all other numeric types thanks to Julia's type promotion and conversion mechanism:

julia> BigInt(typemax(Int64)) + 1
9223372036854775808

julia> big"123456789012345678901234567890" + 1
123456789012345678901234567891

julia> parse(BigInt, "123456789012345678901234567890") + 1
123456789012345678901234567891

julia> string(big"2"^200, base=16)
"100000000000000000000000000000000000000000000000000"

julia> 0x100000000000000000000000000000000-1 == typemax(UInt128)
true

julia> 0x000000000000000000000000000000000
0

julia> typeof(ans)
BigInt

julia> big"1.23456789012345678901"
1.234567890123456789010000000000000000000000000000000000000000000000000000000004

julia> parse(BigFloat, "1.23456789012345678901")
1.234567890123456789010000000000000000000000000000000000000000000000000000000004

julia> BigFloat(2.0^66) / 3
2.459565876494606882133333333333333333333333333333333333333333333333333333333344e+19

julia> factorial(BigInt(40))
815915283247897734345611269596115894272000000000

However, type promotion between the primitive types above and BigInt/BigFloat is not automatic and must be explicitly stated.

julia> x = typemin(Int64)
-9223372036854775808

julia> x = x - 1
9223372036854775807

julia> typeof(x)
Int64

julia> y = BigInt(typemin(Int64))
-9223372036854775808

julia> y = y - 1
-9223372036854775809

julia> typeof(y)
BigInt

The default precision (in number of bits of the significand) and rounding mode of BigFloat operations can be changed globally by calling setprecision and setrounding, and all further calculations will take these changes into account. Alternatively, the precision or the rounding can be changed only within the execution of a particular block of code by using the same functions with a do block:

julia> setrounding(BigFloat, RoundUp) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.100000000000000000000000000000000000000000000000000000000000000000000000000003

julia> setrounding(BigFloat, RoundDown) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.099999999999999999999999999999999999999999999999999999999999999999999999999986

julia> setprecision(40) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.1000000000004
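That the do-block form is scoped can be confirmed by querying the precision before, inside, and after the block. This is a small sketch using the standard precision function; the variable names are illustrative, and the value 64 is arbitrary.

```julia
default_bits = precision(BigFloat)

# Inside the block, new BigFloat values are created with 64 bits of precision.
inside_bits, parsed_bits = setprecision(BigFloat, 64) do
    (precision(BigFloat), precision(parse(BigFloat, "0.1")))
end

# After the block, the previous default is restored.
after_bits = precision(BigFloat)

println((default_bits, inside_bits, parsed_bits, after_bits))
```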

Warning

The relation between setprecision or setrounding and @big_str, the macro used for big string literals (such as big"0.3"), might not be intuitive, as a consequence of the fact that @big_str is a macro. See the @big_str documentation for details.

Numeric Literal Coefficients

To make common numeric formulae and expressions clearer, Julia allows variables to be immediately preceded by a numeric literal, implying multiplication. This makes writing polynomial expressions much cleaner:

julia> x = 3
3

julia> 2x^2 - 3x + 1
10

julia> 1.5x^2 - .5x + 1
13.0

It also makes writing exponential functions more elegant:

julia> 2^2x
64

The precedence of numeric literal coefficients is slightly lower than that of unary operators such as negation. So -2x is parsed as (-2) * x and √2x is parsed as (√2) * x. However, numeric literal coefficients parse similarly to unary operators when combined with exponentiation. For example 2^3x is parsed as 2^(3x), and 2x^3 is parsed as 2*(x^3).
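These parsing rules can be checked by comparing each juxtaposed form with both of its possible readings. A minimal sketch, with x = 3 chosen so that the alternative readings give different values:

```julia
x = 3

# 2^3x means 2^(3x) = 2^9, not (2^3) * x = 24.
println((2^3x, 2^(3x), (2^3) * x))

# 2x^3 means 2 * (x^3) = 54, not (2x)^3 = 216.
println((2x^3, 2 * (x^3), (2x)^3))

# √2x means (√2) * x, not √(2x).
println((√2x, √2 * x, √(2x)))
```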

Numeric literals also work as coefficients to parenthesized expressions:

julia> 2(x-1)^2 - 3(x-1) + 1
3

Note

The precedence of numeric literal coefficients used for implicit multiplication is higher than other binary operators such as multiplication (*), and division (/, \, and //). This means, for example, that 1 / 2im equals -0.5im and 6 // 2(2 + 1) equals 1 // 1.
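Writing the multiplication explicitly changes the grouping, which the following sketch makes visible by placing the juxtaposed and explicit forms side by side. The variable names are illustrative.

```julia
# Juxtaposition binds tighter than division: 1 / (2im).
implicit_complex = 1 / 2im
# Explicit `*` has the same precedence as `/`, so this is (1 / 2) * im.
explicit_complex = 1 / 2 * im

# Juxtaposition binds tighter than `//`: 6 // (2 * (2 + 1)).
implicit_rational = 6 // 2(2 + 1)
# With explicit `*`, the expression is (6 // 2) * (2 + 1).
explicit_rational = 6 // 2 * (2 + 1)

println((implicit_complex, explicit_complex))
println((implicit_rational, explicit_rational))
```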

Additionally, parenthesized expressions can be used as coefficients to variables, implying multiplication of the expression by the variable:

julia> (x-1)x
6

Neither juxtaposition of two parenthesized expressions, nor placing a variable before a parenthesized expression, however, can be used to imply multiplication:

julia> (x-1)(x+1)
ERROR: MethodError: objects of type Int64 are not callable

julia> x(x+1)
ERROR: MethodError: objects of type Int64 are not callable

Both expressions are interpreted as function application: any expression that is not a numeric literal, when immediately followed by a parenthetical, is interpreted as a function applied to the values in parentheses (see Functions for more about functions). Thus, in both of these cases, an error occurs since the left-hand value is not a function.

The above syntactic enhancements significantly reduce the visual noise incurred when writing common mathematical formulae. Note that no whitespace may come between a numeric literal coefficient and the identifier or parenthesized expression which it multiplies.

Syntax Conflicts

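Juxtaposed literal coefficient syntax may conflict with some numeric literal syntaxes: hexadecimal, octal and binary integer literals and E-notation for floating-point literals. Here are some situations where syntactic conflicts arise:

- The hexadecimal integer literal expression 0xff could be interpreted as the numeric literal 0 multiplied by the variable xff. Similar ambiguities arise with octal and binary literals like 0o777 or 0b01001010.
- The floating-point literal expression 1e10 could be interpreted as the numeric literal 1 multiplied by the variable e10, and similarly with the equivalent E form.
- The 32-bit floating-point literal expression 1.5f22 could be interpreted as the numeric literal 1.5 multiplied by the variable f22.

In all cases the ambiguity is resolved in favor of interpretation as numeric literals:

- Expressions starting with 0x/0o/0b are always hexadecimal/octal/binary literals.
- Expressions starting with a numeric literal followed by e or E are always floating-point literals.
- Expressions starting with a numeric literal followed by f are always 32-bit floating-point literals.

Unlike E, which is equivalent to e in numeric literals for historical reasons, F is just another letter and does not behave like f in numeric literals. Hence, expressions starting with a numeric literal followed by F are interpreted as the numerical literal multiplied by a variable, which means that, for example, 1.5F22 is equal to 1.5 * F22.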
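The resolution rules can be seen by defining variables whose names collide with the literal syntax. In this sketch the variables `e10`, `f22` and `F22` are hypothetical and exist only to show that the literal reading wins except for the uppercase F.

```julia
e10 = 2.0   # never consulted by the literal 1e10
f22 = 2.0   # never consulted by the literal 1.5f22
F22 = 2.0   # used by 1.5F22, since F does not start an exponent

println(1e10 == 1.0e10)        # still the Float64 literal
println(1.5f22 isa Float32)    # still a Float32 literal
println(1.5F22 == 1.5 * F22)   # coefficient times the variable F22
println(0xff == 255)           # hexadecimal literal, not 0 * xff
```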

Literal zero and one

Julia provides functions which return literal 0 and 1 corresponding to a specified type or the type of a given variable.

| Function | Description                                        |
|----------|----------------------------------------------------|
| zero(x)  | Literal zero of type x or type of variable x       |
| one(x)   | Literal one of type x or type of variable x        |

These functions are useful in Numeric Comparisons to avoid overhead from unnecessary type conversion.

Examples:

julia> zero(Float32)
0.0f0

julia> zero(1.0)
0.0

julia> one(Int32)
1

julia> one(BigFloat)
1.0
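A common use of zero is to initialize an accumulator with the element type of a collection, so that the accumulator's type never changes inside the loop. The function `typed_sum` below is a hypothetical helper written for illustration, not part of Base.

```julia
# Sums a vector while keeping the accumulator in the element type T.
# `typed_sum` is a hypothetical helper for this example.
function typed_sum(xs::AbstractVector{T}) where {T<:Number}
    total = zero(T)          # starts as a T, not as an Int or Float64
    for x in xs
        total += x
    end
    return total
end

println(typed_sum(Float32[1, 2, 3]))   # result stays a Float32
println(typed_sum(Int8[1, 2]))         # result stays an Int8
println(typed_sum(Float32[]))          # empty input yields zero(Float32)
```

Starting from a literal 0 instead would give an Int accumulator that changes type on the first addition of a Float32 element.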

This document was generated with Documenter.jl version 1.16.0 on Thursday 20 November 2025. Using Julia version 1.12.2.
