Data Types

enum arrow::Type::type

Main data type enumeration.

This enumeration provides a quick way to interrogate the category of a DataType instance.

Values:

enumerator NA

A NULL type having no physical storage.

enumerator BOOL

Boolean as 1 bit, LSB bit-packed ordering.
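As a rough illustration of what "LSB bit-packed ordering" means, the sketch below uses only the C++ standard library rather than Arrow's own bitmap utilities. The helper names `GetBit` and `SetBit` are hypothetical: value i is assumed to live in byte i / 8, at bit position i % 8 counted from the least-significant bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers (not Arrow API): value i is stored in byte i / 8,
// at bit i % 8 counted from the least-significant bit.
inline bool GetBit(const std::vector<uint8_t>& bits, std::size_t i) {
  assert(i / 8 < bits.size());
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

inline void SetBit(std::vector<uint8_t>& bits, std::size_t i, bool value) {
  assert(i / 8 < bits.size());
  const uint8_t mask = static_cast<uint8_t>(1u << (i % 8));
  if (value) {
    bits[i / 8] = static_cast<uint8_t>(bits[i / 8] | mask);
  } else {
    bits[i / 8] = static_cast<uint8_t>(bits[i / 8] & ~mask);
  }
}
```

Under this layout the first eight values share one byte, with value 0 in the lowest bit.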

enumerator UINT8

Unsigned 8-bit little-endian integer.

enumerator INT8

Signed 8-bit little-endian integer.

enumerator UINT16

Unsigned 16-bit little-endian integer.

enumerator INT16

Signed 16-bit little-endian integer.

enumerator UINT32

Unsigned 32-bit little-endian integer.

enumerator INT32

Signed 32-bit little-endian integer.

enumerator UINT64

Unsigned 64-bit little-endian integer.

enumerator INT64

Signed 64-bit little-endian integer.

enumerator HALF_FLOAT

2-byte floating point value

enumerator FLOAT

4-byte floating point value

enumerator DOUBLE

8-byte floating point value

enumerator STRING

UTF8 variable-length string as List<Char>

enumerator BINARY

Variable-length bytes (no guarantee of UTF8-ness)

enumerator FIXED_SIZE_BINARY

Fixed-size binary. Each value occupies the same number of bytes.

enumerator DATE32

int32_t days since the UNIX epoch

enumerator DATE64

int64_t milliseconds since the UNIX epoch

enumerator TIMESTAMP

Exact timestamp encoded with int64 since UNIX epoch. Default unit millisecond.

enumerator TIME32

Time as signed 32-bit integer, representing either seconds or milliseconds since midnight.

enumerator TIME64

Time as signed 64-bit integer, representing either microseconds or nanoseconds since midnight.

enumerator INTERVAL_MONTHS

YEAR_MONTH interval in SQL style.

enumerator INTERVAL_DAY_TIME

DAY_TIME interval in SQL style.

enumerator DECIMAL128

Precision- and scale-based decimal type with 128 bits.

enumerator DECIMAL

Defined for backward-compatibility.

enumerator DECIMAL256

Precision- and scale-based decimal type with 256 bits.

enumerator LIST

A list of some logical data type.

enumerator STRUCT

Struct of logical types.

enumerator SPARSE_UNION

Sparse unions of logical types.

enumerator DENSE_UNION

Dense unions of logical types.

enumerator DICTIONARY

Dictionary-encoded type, also called “categorical” or “factor” in other programming languages.

Holds the dictionary value type but not the dictionary itself, which is part of the ArrayData struct.
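To make the idea of dictionary encoding concrete, here is a minimal standard-library sketch. It does not use Arrow: `DictionaryEncoded`, `DictionaryEncode` and `Decode` are hypothetical names, and the choice of `int32_t` indices is only an assumption for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration (not Arrow API): the unique values form the
// dictionary, and the column itself stores only signed integer indices.
struct DictionaryEncoded {
  std::vector<std::string> dictionary;  // unique values, in first-seen order
  std::vector<int32_t> indices;         // one index per original value
};

inline DictionaryEncoded DictionaryEncode(const std::vector<std::string>& values) {
  DictionaryEncoded out;
  std::unordered_map<std::string, int32_t> seen;
  for (const std::string& v : values) {
    auto it = seen.find(v);
    if (it == seen.end()) {
      // First occurrence: append to the dictionary and remember its index.
      it = seen.emplace(v, static_cast<int32_t>(out.dictionary.size())).first;
      out.dictionary.push_back(v);
    }
    out.indices.push_back(it->second);
  }
  return out;
}

// Looks up the original value of element i through its index.
inline std::string Decode(const DictionaryEncoded& enc, std::size_t i) {
  assert(i < enc.indices.size());
  return enc.dictionary[static_cast<std::size_t>(enc.indices[i])];
}
```

The split mirrors the description above: the type only needs to know the value type of the dictionary, while the dictionary contents travel with the data.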

enumerator MAP

Map, a repeated struct logical type.

enumerator EXTENSION

Custom data type, implemented by user.

enumerator FIXED_SIZE_LIST

Fixed size list of some logical type.

enumerator DURATION

Measure of elapsed time in either seconds, milliseconds, microseconds or nanoseconds.

enumerator LARGE_STRING

Like STRING, but with 64-bit offsets.

enumerator LARGE_BINARY

Like BINARY, but with 64-bit offsets.

enumerator LARGE_LIST

Like LIST, but with 64-bit offsets.

enumerator INTERVAL_MONTH_DAY_NANO

Calendar interval type with three fields.

enumerator RUN_END_ENCODED

Run-end encoded data.

enumerator STRING_VIEW

String (UTF8) view type with 4-byte prefix and inline small string optimization.

enumerator BINARY_VIEW

Bytes view type with 4-byte prefix and inline small string optimization.

enumerator LIST_VIEW

A list of some logical data type represented by offset and size.

enumerator LARGE_LIST_VIEW

Like LIST_VIEW, but with 64-bit offsets and sizes.

enumerator DECIMAL32

Precision- and scale-based decimal type with 32 bits.

enumerator DECIMAL64

Precision- and scale-based decimal type with 64 bits.

enumerator MAX_ID

class DataType : public std::enable_shared_from_this<DataType>, public arrow::detail::Fingerprintable, public arrow::util::EqualityComparable<DataType>

Base class for all data types.

Data types in this library are all logical. They can be expressed as either a primitive physical type (bytes or bits of some fixed size), a nested type consisting of other data types, or another data type (e.g. a timestamp encoded as an int64).

Simple datatypes may be entirely described by their Type::type id, but complex datatypes are usually parametric.

Subclassed by arrow::BaseBinaryType, arrow::BinaryViewType, arrow::ExtensionType, arrow::FixedWidthType, arrow::NestedType, arrow::NullType

Public Functions

bool Equals(const DataType &other, bool check_metadata = false) const

Return whether the types are equal.

Types that are logically convertible from one to another (e.g. List<UInt8> and Binary) are NOT equal.

bool Equals(const std::shared_ptr<DataType> &other, bool check_metadata = false) const

Return whether the types are equal.

inline const std::shared_ptr<Field> &field(int i) const

Return the child field at index i.

inline const FieldVector &fields() const

Return the children fields associated with this type.

inline int num_fields() const

Return the number of children fields associated with this type.

Status Accept(TypeVisitor *visitor) const

Apply the TypeVisitor::Visit() method specialized to the data type.

virtual std::string ToString(bool show_metadata = false) const = 0

A string representation of the type, including any children.

size_t Hash() const

Return hash value (excluding metadata in child fields).

virtual std::string name() const = 0

A string name of the type, omitting any child fields.

Since

0.7.0

virtual DataTypeLayout layout() const = 0

Return the data type layout.

Children are not included.

Note

Experimental API

inline constexpr Type::type id() const

Return the type category.

inline virtual Type::type storage_id() const

Return the type category of the storage type.

inline virtual int32_t byte_width() const

Returns the type’s fixed byte width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

inline virtual int bit_width() const

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

Factory functions

These functions are recommended for creating data types. They may return new objects or existing singletons, depending on the type requested.

const std::shared_ptr<DataType> &null()

Return a NullType instance.

const std::shared_ptr<DataType> &boolean()

Return a BooleanType instance.

const std::shared_ptr<DataType> &int8()

Return an Int8Type instance.

const std::shared_ptr<DataType> &int16()

Return an Int16Type instance.

const std::shared_ptr<DataType> &int32()

Return an Int32Type instance.

const std::shared_ptr<DataType> &int64()

Return an Int64Type instance.

const std::shared_ptr<DataType> &uint8()

Return a UInt8Type instance.

const std::shared_ptr<DataType> &uint16()

Return a UInt16Type instance.

const std::shared_ptr<DataType> &uint32()

Return a UInt32Type instance.

const std::shared_ptr<DataType> &uint64()

Return a UInt64Type instance.

const std::shared_ptr<DataType> &float16()

Return a HalfFloatType instance.

const std::shared_ptr<DataType> &float32()

Return a FloatType instance.

const std::shared_ptr<DataType> &float64()

Return a DoubleType instance.

const std::shared_ptr<DataType> &utf8()

Return a StringType instance.

const std::shared_ptr<DataType> &utf8_view()

Return a StringViewType instance.

const std::shared_ptr<DataType> &large_utf8()

Return a LargeStringType instance.

const std::shared_ptr<DataType> &binary()

Return a BinaryType instance.

const std::shared_ptr<DataType> &binary_view()

Return a BinaryViewType instance.

const std::shared_ptr<DataType> &large_binary()

Return a LargeBinaryType instance.

const std::shared_ptr<DataType> &date32()

Return a Date32Type instance.

const std::shared_ptr<DataType> &date64()

Return a Date64Type instance.

std::shared_ptr<DataType> fixed_size_binary(int32_t byte_width)

Create a FixedSizeBinaryType instance.

std::shared_ptr<DataType> decimal(int32_t precision, int32_t scale)

Create a DecimalType instance depending on the precision.

If the precision is greater than 38, a Decimal256Type is returned, otherwise a Decimal128Type.

Deprecated: prefer smallest_decimal instead.

std::shared_ptr<DataType> smallest_decimal(int32_t precision, int32_t scale)

Create the smallest DecimalType instance depending on precision.

Given the requested precision and scale, the smallest DecimalType which is able to represent that precision will be returned. As different bit-widths for decimal types are added, the concrete data type returned here can potentially change accordingly.
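The selection rule described above can be sketched without Arrow, using the maximum precisions quoted elsewhere in this reference (9, 18, 38 and 76 significant digits for the 32-, 64-, 128- and 256-bit decimals). `SmallestDecimalBitWidth` is a hypothetical helper, not the library's implementation, and the actual factory may choose differently as decimal widths are added.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Arrow API): returns the narrowest decimal bit
// width whose maximum precision covers `precision`, using the limits quoted
// in this reference (9, 18, 38 and 76 significant digits). Returns -1 when
// no decimal type is wide enough.
inline int SmallestDecimalBitWidth(int32_t precision) {
  assert(precision >= 1);
  if (precision <= 9) return 32;
  if (precision <= 18) return 64;
  if (precision <= 38) return 128;
  if (precision <= 76) return 256;
  return -1;
}
```

By contrast, the deprecated `decimal` factory described above only distinguishes two cases: 128 bits up to precision 38, and 256 bits beyond that.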

std::shared_ptr<DataType> decimal32(int32_t precision, int32_t scale)

Create a Decimal32Type instance.

std::shared_ptr<DataType> decimal64(int32_t precision, int32_t scale)

Create a Decimal64Type instance.

std::shared_ptr<DataType> decimal128(int32_t precision, int32_t scale)

Create a Decimal128Type instance.

std::shared_ptr<DataType> decimal256(int32_t precision, int32_t scale)

Create a Decimal256Type instance.

std::shared_ptr<DataType> list(std::shared_ptr<Field> value_type)

Create a ListType instance from its child Field type.

std::shared_ptr<DataType> list(std::shared_ptr<DataType> value_type)

Create a ListType instance from its child DataType.

std::shared_ptr<DataType> large_list(std::shared_ptr<Field> value_type)

Create a LargeListType instance from its child Field type.

std::shared_ptr<DataType> large_list(std::shared_ptr<DataType> value_type)

Create a LargeListType instance from its child DataType.

std::shared_ptr<DataType> list_view(std::shared_ptr<DataType> value_type)

Create a ListViewType instance.

std::shared_ptr<DataType> list_view(std::shared_ptr<Field> value_type)

Create a ListViewType instance from its child Field type.

std::shared_ptr<DataType> large_list_view(std::shared_ptr<DataType> value_type)

Create a LargeListViewType instance.

std::shared_ptr<DataType> large_list_view(std::shared_ptr<Field> value_type)

Create a LargeListViewType instance from its child Field type.

std::shared_ptr<DataType> map(std::shared_ptr<DataType> key_type, std::shared_ptr<DataType> item_type, bool keys_sorted = false)

Create a MapType instance from its key and value DataTypes.

std::shared_ptr<DataType> map(std::shared_ptr<DataType> key_type, std::shared_ptr<Field> item_field, bool keys_sorted = false)

Create a MapType instance from its key DataType and value field.

The field override is provided to communicate nullability of the value.

std::shared_ptr<DataType> fixed_size_list(std::shared_ptr<Field> value_type, int32_t list_size)

Create a FixedSizeListType instance from its child Field type.

std::shared_ptr<DataType> fixed_size_list(std::shared_ptr<DataType> value_type, int32_t list_size)

Create a FixedSizeListType instance from its child DataType.

std::shared_ptr<DataType> duration(TimeUnit::type unit)

Return a Duration instance (naming uses _type to avoid a namespace conflict with built-in time classes).

std::shared_ptr<DataType> day_time_interval()

Return a DayTimeIntervalType instance.

std::shared_ptr<DataType> month_interval()

Return a MonthIntervalType instance.

std::shared_ptr<DataType> month_day_nano_interval()

Return a MonthDayNanoIntervalType instance.

std::shared_ptr<DataType> timestamp(TimeUnit::type unit)

Create a TimestampType instance from its unit.

std::shared_ptr<DataType> timestamp(TimeUnit::type unit, const std::string &timezone)

Create a TimestampType instance from its unit and timezone.

std::shared_ptr<DataType> time32(TimeUnit::type unit)

Create a 32-bit time type instance.

Unit can be either SECOND or MILLI.

std::shared_ptr<DataType> time64(TimeUnit::type unit)

Create a 64-bit time type instance.

Unit can be either MICRO or NANO.

std::shared_ptr<DataType> struct_(const FieldVector &fields)

Create a StructType instance.

std::shared_ptr<DataType> struct_(std::initializer_list<std::pair<std::string, std::shared_ptr<DataType>>> fields)

Create a StructType instance from (name, type) pairs.

std::shared_ptr<DataType> run_end_encoded(std::shared_ptr<DataType> run_end_type, std::shared_ptr<DataType> value_type)

Create a RunEndEncodedType instance.

std::shared_ptr<DataType> sparse_union(FieldVector child_fields, std::vector<int8_t> type_codes = {})

Create a SparseUnionType instance.

std::shared_ptr<DataType> sparse_union(const ArrayVector &children, std::vector<std::string> field_names = {}, std::vector<int8_t> type_codes = {})

Create a SparseUnionType instance.

std::shared_ptr<DataType> dense_union(FieldVector child_fields, std::vector<int8_t> type_codes = {})

Create a DenseUnionType instance.

std::shared_ptr<DataType> dense_union(const ArrayVector &children, std::vector<std::string> field_names = {}, std::vector<int8_t> type_codes = {})

Create a DenseUnionType instance.

std::shared_ptr<DataType> dictionary(const std::shared_ptr<DataType> &index_type, const std::shared_ptr<DataType> &dict_type, bool ordered = false)

Create a DictionaryType instance.

Parameters:
  • index_type [in]: the type of the dictionary indices (must be a signed integer)

  • dict_type [in]: the type of the values in the variable dictionary

  • ordered [in]: true if the order of the dictionary values has semantic meaning and should be preserved where possible

Concrete type subclasses

Primitive

class NullType : public arrow::DataType

Concrete type class for always-null data.

Public Functions

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual DataTypeLayout layout() const override

Return the data type layout.

Children are not included.

Note

Experimental API

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class BooleanType : public arrow::detail::CTypeImpl<BooleanType, PrimitiveCType, Type::BOOL, bool>

Concrete type class for boolean data.

Public Functions

inline virtual int bit_width() const final

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

inline virtual DataTypeLayout layout() const override

Return the data type layout.

Children are not included.

Note

Experimental API

class UInt8Type : public arrow::detail::IntegerTypeImpl<UInt8Type, Type::UINT8, uint8_t>
#include <arrow/type.h>

Concrete type class for unsigned 8-bit integer data.

class Int8Type : public arrow::detail::IntegerTypeImpl<Int8Type, Type::INT8, int8_t>
#include <arrow/type.h>

Concrete type class for signed 8-bit integer data.

class UInt16Type : public arrow::detail::IntegerTypeImpl<UInt16Type, Type::UINT16, uint16_t>
#include <arrow/type.h>

Concrete type class for unsigned 16-bit integer data.

class Int16Type : public arrow::detail::IntegerTypeImpl<Int16Type, Type::INT16, int16_t>
#include <arrow/type.h>

Concrete type class for signed 16-bit integer data.

class UInt32Type : public arrow::detail::IntegerTypeImpl<UInt32Type, Type::UINT32, uint32_t>
#include <arrow/type.h>

Concrete type class for unsigned 32-bit integer data.

class Int32Type : public arrow::detail::IntegerTypeImpl<Int32Type, Type::INT32, int32_t>
#include <arrow/type.h>

Concrete type class for signed 32-bit integer data.

class UInt64Type : public arrow::detail::IntegerTypeImpl<UInt64Type, Type::UINT64, uint64_t>
#include <arrow/type.h>

Concrete type class for unsigned 64-bit integer data.

class Int64Type : public arrow::detail::IntegerTypeImpl<Int64Type, Type::INT64, int64_t>
#include <arrow/type.h>

Concrete type class for signed 64-bit integer data.

class HalfFloatType : public arrow::detail::CTypeImpl<HalfFloatType, FloatingPointType, Type::HALF_FLOAT, uint16_t>
#include <arrow/type.h>

Concrete type class for 16-bit floating-point data.

class FloatType : public arrow::detail::CTypeImpl<FloatType, FloatingPointType, Type::FLOAT, float>
#include <arrow/type.h>

Concrete type class for 32-bit floating-point data (C “float”)

class DoubleType : public arrow::detail::CTypeImpl<DoubleType, FloatingPointType, Type::DOUBLE, double>
#include <arrow/type.h>

Concrete type class for 64-bit floating-point data (C “double”)

class DecimalType : public arrow::FixedSizeBinaryType
#include <arrow/type.h>

Base type class for (fixed-size) decimal data.

Subclassed by arrow::Decimal128Type, arrow::Decimal256Type, arrow::Decimal32Type, arrow::Decimal64Type

Public Static Functions

static Result<std::shared_ptr<DataType>> Make(Type::type type_id, int32_t precision, int32_t scale)

Constructs concrete decimal types.

static int32_t DecimalSize(int32_t precision)

Returns the number of bytes needed for precision.

precision must be >= 1.

class Decimal32Type : public arrow::DecimalType
#include <arrow/type.h>

Concrete type class for 32-bit decimal data.

Arrow decimals are fixed-point decimal numbers encoded as a scaled integer. The precision is the number of significant digits that the decimal type can represent; the scale is the number of digits after the decimal point (note the scale can be negative).

As an example, Decimal32Type(7, 3) can exactly represent the numbers 1234.567 and -1234.567 (encoded internally as the 32-bit integers 1234567 and -1234567, respectively), but neither 12345.67 nor 123.4567.

Decimal32Type has a maximum precision of 9 significant digits (also available as Decimal32Type::kMaxPrecision). If higher precision is needed, consider using Decimal64Type, Decimal128Type or Decimal256Type.
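The worked example above can be checked with a small standard-library sketch. `FitsDecimal` is a hypothetical helper, not part of Arrow. It takes a number as an unscaled integer plus its own scale (so 1234.567 is the pair (1234567, 3)) and, for simplicity, assumes non-negative scales and a precision of at most 18 digits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not Arrow API). The input number is
// unscaled * 10^(-value_scale); for example 1234.567 is (1234567, 3).
// Returns true if that number is exactly representable as an integer of at
// most `precision` digits scaled by 10^(-scale).
// Assumptions of this sketch: value_scale >= 0, scale >= 0, precision <= 18.
inline bool FitsDecimal(int64_t unscaled, int32_t value_scale,
                        int32_t precision, int32_t scale) {
  assert(precision >= 1 && precision <= 18);
  assert(value_scale >= 0 && scale >= 0);
  assert(unscaled != INT64_MIN);  // keeps std::llabs well defined
  // Drop trailing fractional zeros, which carry no information.
  while (value_scale > scale && unscaled % 10 == 0) {
    unscaled /= 10;
    --value_scale;
  }
  if (value_scale > scale) return false;  // fractional digits would be lost
  // Rescaling multiplies by 10^(scale - value_scale), so that many digits of
  // the precision budget are already spent.
  const int32_t digits = precision - (scale - value_scale);
  int64_t limit = 1;
  for (int32_t i = 0; i < digits; ++i) limit *= 10;
  return std::llabs(unscaled) < limit;
}
```

With precision 7 and scale 3, the pair (1234567, 3) fits, while (1234567, 2), which is 12345.67, needs eight digits after rescaling, and (1234567, 4), which is 123.4567, needs a fourth fractional digit.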

Public Functions

explicit Decimal32Type(int32_t precision, int32_t scale)

Decimal32Type constructor that aborts on invalid input.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

Public Static Functions

static Result<std::shared_ptr<DataType>> Make(int32_t precision, int32_t scale)

Decimal32Type constructor that returns an error on invalid input.

class Decimal64Type : public arrow::DecimalType
#include <arrow/type.h>

Concrete type class for 64-bit decimal data.

Arrow decimals are fixed-point decimal numbers encoded as a scaled integer. The precision is the number of significant digits that the decimal type can represent; the scale is the number of digits after the decimal point (note the scale can be negative).

As an example, Decimal64Type(7, 3) can exactly represent the numbers 1234.567 and -1234.567 (encoded internally as the 64-bit integers 1234567 and -1234567, respectively), but neither 12345.67 nor 123.4567.

Decimal64Type has a maximum precision of 18 significant digits (also available as Decimal64Type::kMaxPrecision). If higher precision is needed, consider using Decimal128Type or Decimal256Type.

Public Functions

explicit Decimal64Type(int32_t precision, int32_t scale)

Decimal64Type constructor that aborts on invalid input.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

Public Static Functions

static Result<std::shared_ptr<DataType>> Make(int32_t precision, int32_t scale)

Decimal64Type constructor that returns an error on invalid input.

class Decimal128Type : public arrow::DecimalType
#include <arrow/type.h>

Concrete type class for 128-bit decimal data.

Arrow decimals are fixed-point decimal numbers encoded as a scaled integer. The precision is the number of significant digits that the decimal type can represent; the scale is the number of digits after the decimal point (note the scale can be negative).

As an example, Decimal128Type(7, 3) can exactly represent the numbers 1234.567 and -1234.567 (encoded internally as the 128-bit integers 1234567 and -1234567, respectively), but neither 12345.67 nor 123.4567.

Decimal128Type has a maximum precision of 38 significant digits (also available as Decimal128Type::kMaxPrecision). If higher precision is needed, consider using Decimal256Type.

Public Functions

explicit Decimal128Type(int32_t precision, int32_t scale)

Decimal128Type constructor that aborts on invalid input.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

Public Static Functions

static Result<std::shared_ptr<DataType>> Make(int32_t precision, int32_t scale)

Decimal128Type constructor that returns an error on invalid input.

class Decimal256Type : public arrow::DecimalType
#include <arrow/type.h>

Concrete type class for 256-bit decimal data.

Arrow decimals are fixed-point decimal numbers encoded as a scaled integer. The precision is the number of significant digits that the decimal type can represent; the scale is the number of digits after the decimal point (note the scale can be negative).

Decimal256Type has a maximum precision of 76 significant digits (also available as Decimal256Type::kMaxPrecision).

For most use cases, the maximum precision offered by Decimal128Type is sufficient, and it will result in a more compact and more efficient encoding.

Public Functions

explicit Decimal256Type(int32_t precision, int32_t scale)

Decimal256Type constructor that aborts on invalid input.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

Public Static Functions

static Result<std::shared_ptr<DataType>> Make(int32_t precision, int32_t scale)

Decimal256Type constructor that returns an error on invalid input.

Temporal

enum arrow::TimeUnit::type

The unit for a time or timestamp DataType.

Values:

enumerator SECOND

enumerator MILLI

enumerator MICRO

enumerator NANO

std::ostream &operator<<(std::ostream &os, TimeUnit::type unit)

std::ostream &operator<<(std::ostream &os, DayTimeIntervalType::DayMilliseconds interval)

std::ostream &operator<<(std::ostream &os, MonthDayNanoIntervalType::MonthDayNanos interval)

class TemporalType : public arrow::FixedWidthType
#include <arrow/type.h>

Base type for all date and time types.

Subclassed by arrow::DateType, arrow::DurationType, arrow::IntervalType, arrow::TimeType, arrow::TimestampType

Public Functions

inline virtual DataTypeLayout layout() const override

Return the data type layout.

Children are not included.

Note

Experimental API

class DateType : public arrow::TemporalType
#include <arrow/type.h>

Base type class for date data.

Subclassed by arrow::Date32Type, arrow::Date64Type

class Date32Type : public arrow::DateType
#include <arrow/type.h>

Concrete type class for 32-bit date data (as number of days since UNIX epoch)

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class Date64Type : public arrow::DateType
#include <arrow/type.h>

Concrete type class for 64-bit date data (as number of milliseconds since UNIX epoch)
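The two date representations differ only in their unit, which the following standard-library sketch illustrates. The helper names are hypothetical and not part of Arrow; the requirement that a 64-bit date value be a whole number of days is an assumption made by this sketch so that the conversion back to days is lossless.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not Arrow API). Both representations count from the
// UNIX epoch (1970-01-01): the 32-bit form in days, the 64-bit form in
// milliseconds.
constexpr int64_t kMillisPerDay = 24LL * 60 * 60 * 1000;

inline int64_t Date32ToDate64(int32_t days) {
  return static_cast<int64_t>(days) * kMillisPerDay;
}

// Assumption of this sketch: `millis` is an exact multiple of one day.
inline int32_t Date64ToDate32(int64_t millis) {
  assert(millis % kMillisPerDay == 0);
  return static_cast<int32_t>(millis / kMillisPerDay);
}
```

Negative values denote dates before the epoch in both forms.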

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class TimeType : public arrow::TemporalType, public arrow::ParametricType
#include <arrow/type.h>

Base type class for time data.

Subclassed by arrow::Time32Type, arrow::Time64Type

class Time32Type : public arrow::TimeType
#include <arrow/type.h>

Concrete type class for 32-bit time data (as number of seconds or milliseconds since midnight)

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class Time64Type : public arrow::TimeType
#include <arrow/type.h>

Concrete type class for 64-bit time data (as number of microseconds or nanoseconds since midnight)

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class TimestampType : public arrow::TemporalType, public arrow::ParametricType
#include <arrow/type.h>

Concrete type class for datetime data (as number of seconds, milliseconds, microseconds or nanoseconds since UNIX epoch)

If supplied, the timezone string should take either the form (i) “Area/Location”, with values drawn from the names in the IANA Time Zone Database (such as “Europe/Zurich”); or (ii) “(+|-)HH:MM” indicating an absolute offset from GMT (such as “-08:00”). To indicate a native UTC timestamp, one of the strings “UTC”, “Etc/UTC” or “+00:00” should be used.

If any non-empty string is supplied as the timezone for a TimestampType, then the Arrow field containing that timestamp type (and by extension the column associated with such a field) is considered “timezone-aware”. The integer arrays that comprise a timezone-aware column must contain UTC normalized datetime values, regardless of the contents of their timezone string. More precisely, (i) the producer of a timezone-aware column must populate its constituent arrays with valid UTC values (performing offset conversions from non-UTC values if necessary); and (ii) the consumer of a timezone-aware column may assume that the column’s values are directly comparable (that is, with no offset adjustment required) to the values of any other timezone-aware column or to any other valid UTC datetime value (provided all values are expressed in the same units).

If a TimestampType is constructed without a timezone (or, equivalently, if the timezone supplied is an empty string) then the resulting Arrow field (column) is considered “timezone-naive”. The producer of a timezone-naive column may populate its constituent integer arrays with datetime values from any timezone; the consumer of a timezone-naive column should make no assumptions about the interoperability or comparability of the values of such a column with those of any other timestamp column or datetime value.

If a timezone-aware field contains a recognized timezone, its values may be localized to that locale upon display; the values of timezone-naive fields must always be displayed “as is”, with no localization performed on them.
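The producer-side rule for timezone-aware columns can be sketched for the fixed-offset form “(+|-)HH:MM” using only the standard library. Both helpers are hypothetical and not part of Arrow, the unit is assumed to be seconds, and “Area/Location” names are out of scope here because resolving them requires a time zone database.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not Arrow API). Parses an absolute offset of the form
// "(+|-)HH:MM" into seconds east of UTC. Returns false if the string does
// not have exactly that shape.
inline bool ParseUtcOffset(const std::string& tz, int64_t* offset_seconds) {
  assert(offset_seconds != nullptr);
  if (tz.size() != 6 || (tz[0] != '+' && tz[0] != '-') || tz[3] != ':') {
    return false;
  }
  const std::size_t digit_positions[] = {1, 2, 4, 5};
  for (std::size_t pos : digit_positions) {
    if (!std::isdigit(static_cast<unsigned char>(tz[pos]))) return false;
  }
  const int64_t hours = (tz[1] - '0') * 10 + (tz[2] - '0');
  const int64_t minutes = (tz[4] - '0') * 10 + (tz[5] - '0');
  if (minutes >= 60) return false;
  const int64_t magnitude = hours * 3600 + minutes * 60;
  *offset_seconds = (tz[0] == '-') ? -magnitude : magnitude;
  return true;
}

// Hypothetical helper (not Arrow API). Converts a wall-clock reading taken at
// the given offset (in seconds since the epoch, as if the clock showed UTC)
// into the UTC-normalized value a timezone-aware column is expected to store.
inline int64_t LocalToUtcSeconds(int64_t local_seconds, int64_t offset_seconds) {
  return local_seconds - offset_seconds;
}
```

For instance, midnight on 1970-01-01 read on a clock at “-08:00” corresponds to 08:00 UTC, so the stored value is 8 * 3600 seconds rather than 0.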

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class IntervalType : public arrow::TemporalType, public arrow::ParametricType
#include <arrow/type.h>

Subclassed by arrow::DayTimeIntervalType, arrow::MonthDayNanoIntervalType, arrow::MonthIntervalType

class MonthIntervalType : public arrow::IntervalType
#include <arrow/type.h>

Represents a number of months.

Type representing a number of months. Corresponds to YearMonth type in Schema.fbs (years are defined as 12 months).

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

inline virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class DayTimeIntervalType : public arrow::IntervalType
#include <arrow/type.h>

Represents a number of days and milliseconds (fraction of day).

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

inline virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

struct DayMilliseconds
#include <arrow/type.h>

class MonthDayNanoIntervalType : public arrow::IntervalType
#include <arrow/type.h>

Represents a number of months, days and nanoseconds between two dates.

All fields are independent from one another.

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

inline virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

struct MonthDayNanos
#include <arrow/type.h>

class DurationType : public arrow::TemporalType, public arrow::ParametricType
#include <arrow/type.h>

Represents an elapsed time without any relation to a calendar artifact.

Public Functions

inline virtual int bit_width() const override

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType.

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

Binary-like

class BinaryType : public arrow::BaseBinaryType
#include <arrow/type.h>

Concrete type class for variable-size binary data.

Subclassed by arrow::StringType

Public Functions

inline virtual DataTypeLayout layout() const override

Return the data type layout.

Children are not included.

Note

Experimental API

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

class BinaryViewType : public arrow::DataType
#include <arrow/type.h>

Concrete type class for variable-size binary view data.

Subclassed by arrow::StringViewType

Public Functions

inline virtual DataTypeLayout layout() const override

Return the data type layout.

Children are not included.

Note

Experimental API

virtual std::string ToString(bool show_metadata = false) const override

A string representation of the type, including any children.

inline virtual std::string name() const override

A string name of the type, omitting any child fields.

Since

0.7.0

unionc_type#
#include <arrow/type.h>

Variable length string or binary with inline optimization for small values (12 bytes or fewer).

This is similar to std::string_view except limited in size to INT32_MAX and at least the first four bytes of the string are copied inline (accessible without pointer dereference). This inline prefix allows failing comparisons early. Furthermore when dealing with short strings the CPU cache working set is reduced since many can be inline.

This union supports two states:

  • Entirely inlined string data

      |----|--------------|
       ^    ^
       |    |
    size    in-line string data, zero padded

  • Reference into a buffer

      |----|----|----|----|
       ^    ^    ^    ^
       |    |    |    |
    size    |    |    `------.
        prefix   |           |
              buffer index   |
                        offset in buffer

Adapted from TU Munich’s UmbraDB, Velox, and DuckDB.

Alignment to 64 bits enables an aligned load of the size and prefix into a single 64 bit integer, which is useful to the comparison fast path.
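The two states and the early-fail comparison can be sketched with the standard library alone. This is an illustrative layout, not Arrow's implementation: `ViewSketch`, `MakeView`, `InlineData`, and `MayBeEqual` are hypothetical names, and the 12-byte and 4-byte sizes are taken from the description above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string_view>

// Illustrative stand-in for the 16-byte view; all names here are hypothetical.
constexpr int32_t kInlineSize = 12;  // longest value stored fully inline
constexpr int32_t kPrefixSize = 4;   // bytes copied inline for longer values

union ViewSketch {
  struct {
    int32_t size;
    std::array<uint8_t, kInlineSize> data;  // zero padded
  } inlined;
  struct {
    int32_t size;
    std::array<uint8_t, kPrefixSize> prefix;
    int32_t buffer_index;
    int32_t offset;
  } ref;
};
static_assert(sizeof(ViewSketch) == 16, "a view occupies 16 bytes");

// Both states start with the size, so it can be read through either member.
inline bool IsInline(const ViewSketch& v) { return v.inlined.size <= kInlineSize; }

// Build a view of `s`. `buffer_index` and `offset` only matter when `s` is
// too long to be stored inline. Assumes s.size() fits in int32_t.
inline ViewSketch MakeView(std::string_view s, int32_t buffer_index, int32_t offset) {
  ViewSketch v{};  // zero-initializes the inlined state, giving the zero padding
  const auto size = static_cast<int32_t>(s.size());
  if (size <= kInlineSize) {
    v.inlined.size = size;
    if (!s.empty()) std::memcpy(v.inlined.data.data(), s.data(), s.size());
  } else {
    v.ref.size = size;
    std::memcpy(v.ref.prefix.data(), s.data(), kPrefixSize);
    v.ref.buffer_index = buffer_index;
    v.ref.offset = offset;
  }
  return v;
}

// Pointer to the bytes stored in the view itself: the whole value for inline
// views, the 4-byte prefix otherwise.
inline const uint8_t* InlineData(const ViewSketch& v) {
  return IsInline(v) ? v.inlined.data.data() : v.ref.prefix.data();
}

// Returns false as soon as the size or the first four bytes differ. A true
// result means a full comparison is still needed to decide equality.
inline bool MayBeEqual(const ViewSketch& a, const ViewSketch& b) {
  if (a.inlined.size != b.inlined.size) return false;
  return std::memcmp(InlineData(a), InlineData(b), kPrefixSize) == 0;
}
```

In this sketch, two long values with different prefixes are rejected without touching any data buffer, which is the fast path the inline prefix is meant to enable.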

Public Functions

inlineint32_tsize()const#

The number of bytes viewed.

inlineboolis_inline()const#

True if the view’s data is entirely stored inline.

inlineconstuint8_t*inline_data()const&#

Return a pointer to the inline data of a view.

For inline views, this points to the entire data of the view. For other views, this points to the 4 byte prefix.

constuint8_t*inline_data()&&=delete#

Public Members

int32_tsize#
std::array<uint8_t,kInlineSize>data#
structarrow::BinaryViewType::c_type::[anonymous]inlined#
std::array<uint8_t,kPrefixSize>prefix#
int32_tbuffer_index#
int32_toffset#
structarrow::BinaryViewType::c_type::[anonymous]ref#
classLargeBinaryType:publicarrow::BaseBinaryType#
#include <arrow/type.h>

Concrete type class for large variable-size binary data.

Subclassed by arrow::LargeStringType

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classStringType:publicarrow::BinaryType#
#include <arrow/type.h>

Concrete type class for variable-size string data, utf8-encoded.

Public Functions

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classStringViewType:publicarrow::BinaryViewType#
#include <arrow/type.h>

Concrete type class for variable-size string view data, utf8-encoded.

Public Functions

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classLargeStringType:publicarrow::LargeBinaryType#
#include <arrow/type.h>

Concrete type class for large variable-size string data, utf8-encoded.

Public Functions

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classFixedSizeBinaryType:publicarrow::FixedWidthType,publicarrow::ParametricType#
#include <arrow/type.h>

Concrete type class for fixed-size binary data.

Subclassed by arrow::DecimalType

Public Functions

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

inlinevirtualintbyte_width()constoverride#

Returns the type’s fixed byte width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType

virtualintbit_width()constoverride#

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType

Nested#

classBaseListType:publicarrow::NestedType#
#include <arrow/type.h>

Base class for all variable-size list data types.

Subclassed by arrow::FixedSizeListType, arrow::LargeListType, arrow::LargeListViewType, arrow::ListType, arrow::ListViewType

classListType:publicarrow::BaseListType#
#include <arrow/type.h>

Concrete type class for list data.

List data is nested data where each value is a variable number of child items. Lists can be recursively nested, for example list(list(int32)).
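As a rough illustration of "a variable number of child items", the sketch below slices one list out of a flat child array using 32-bit offsets. It assumes the usual offsets layout (one more offset than there are lists, with list i spanning offsets[i] to offsets[i + 1]); `ListAt` is a hypothetical helper, not an Arrow function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Arrow: copy out list `i`, which spans
// [offsets[i], offsets[i + 1]) in the flat child array.
inline std::vector<int32_t> ListAt(const std::vector<int32_t>& offsets,
                                   const std::vector<int32_t>& child,
                                   std::size_t i) {
  assert(i + 1 < offsets.size());
  return std::vector<int32_t>(child.begin() + offsets[i],
                              child.begin() + offsets[i + 1]);
}
```

With offsets {0, 2, 2, 5} over child values {1, 2, 3, 4, 5}, the three lists are {1, 2}, the empty list, and {3, 4, 5}.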

Subclassed by arrow::MapType

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classLargeListType:publicarrow::BaseListType#
#include <arrow/type.h>

Concrete type class for large list data.

LargeListType is like ListType but with 64-bit rather than 32-bit offsets.

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classListViewType:publicarrow::BaseListType#
#include <arrow/type.h>

Type class for array of list views.

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classLargeListViewType:publicarrow::BaseListType#
#include <arrow/type.h>

Concrete type class for large list-view data.

LargeListViewType is like ListViewType but with 64-bit rather than 32-bit offsets and sizes.
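A minimal sketch of how a list-view locates its items, assuming each list is described by an (offset, size) pair into a flat child array. `LargeListViewAt` is a hypothetical helper written for this illustration, using 64-bit offsets and sizes as in the large variant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Arrow: copy out list `i`, which spans
// [offsets[i], offsets[i] + sizes[i]) in the flat child array.
inline std::vector<int32_t> LargeListViewAt(const std::vector<int64_t>& offsets,
                                            const std::vector<int64_t>& sizes,
                                            const std::vector<int32_t>& child,
                                            std::size_t i) {
  assert(i < offsets.size() && offsets.size() == sizes.size());
  const auto first = child.begin() + offsets[i];
  return std::vector<int32_t>(first, first + sizes[i]);
}
```

Because every list carries its own size, the lists in this sketch need not appear in child order: offsets {3, 0, 0} with sizes {2, 3, 0} yields {4, 5}, {1, 2, 3}, and an empty list from child values {1, 2, 3, 4, 5}.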

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classMapType:publicarrow::ListType#
#include <arrow/type.h>

Concrete type class for map data.

Map data is nested data where each value is a variable number of key-item pairs. Its physical representation is the same as a list of {key, item} structs.

Maps can be recursively nested, for example map(utf8, map(utf8, int32)).

Public Functions

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classFixedSizeListType:publicarrow::BaseListType#
#include <arrow/type.h>

Concrete type class for fixed size list data.

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classStructType:publicarrow::NestedType#
#include <arrow/type.h>

Concrete type class for struct data.

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

std::shared_ptr<Field>GetFieldByName(conststd::string&name)const#

Returns null if name not found.

FieldVectorGetAllFieldsByName(conststd::string&name)const#

Return all fields having this name.

intGetFieldIndex(conststd::string&name)const#

Returns -1 if name not found or if there are multiple fields having the same name.

std::vector<int>GetAllFieldIndices(conststd::string&name)const#

Return the indices of all fields having this name in sorted order.
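The difference between the single-index and all-indices lookups can be illustrated on a plain vector of field names. This is a sketch of the documented return values only; `GetFieldIndexSketch` and `GetAllFieldIndicesSketch` are hypothetical stand-ins, not the StructType implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: indices of every field called `name`, in ascending order.
inline std::vector<int> GetAllFieldIndicesSketch(const std::vector<std::string>& names,
                                                 const std::string& name) {
  std::vector<int> out;
  for (std::size_t i = 0; i < names.size(); ++i) {
    if (names[i] == name) out.push_back(static_cast<int>(i));
  }
  return out;
}

// Hypothetical stand-in: the unique index of `name`, or -1 when the name is
// missing or ambiguous (more than one field carries it).
inline int GetFieldIndexSketch(const std::vector<std::string>& names,
                               const std::string& name) {
  const std::vector<int> all = GetAllFieldIndicesSketch(names, name);
  return all.size() == 1 ? all[0] : -1;
}
```

For field names {"a", "b", "a"}, looking up "b" gives 1, while "a" gives -1 because it is ambiguous; the all-indices variant returns {0, 2} for "a".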

Result<std::shared_ptr<StructType>>AddField(inti,conststd::shared_ptr<Field>&field)const#

Create a new StructType with field added at given index.

Result<std::shared_ptr<StructType>>RemoveField(inti)const#

Create a new StructType by removing the field at given index.

Result<std::shared_ptr<StructType>>SetField(inti,conststd::shared_ptr<Field>&field)const#

Create a new StructType by changing the field at given index.

classUnionType:publicarrow::NestedType#
#include <arrow/type.h>

Base type class for union data.

Subclassed by arrow::DenseUnionType, arrow::SparseUnionType

Public Functions

virtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlineconststd::vector<int8_t>&type_codes()const#

The array of logical type ids.

For example, the first type in the union might be denoted by the id 5 (instead of 0).

inlineconststd::vector<int>&child_ids()const#

An array mapping logical type ids to physical child ids.
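One way to picture the relationship between the two accessors is a lookup table indexed by logical type id. The sketch below is an assumption-laden illustration: `BuildChildIds` is hypothetical, and the table size of 128 entries and the use of -1 for unused ids are choices made for this example rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: invert `type_codes` (child position -> logical id)
// into a table mapping logical id -> child position.
// Assumptions of this sketch: ids are non-negative int8_t values, so 128
// slots suffice, and -1 marks an id that no child uses.
inline std::vector<int> BuildChildIds(const std::vector<int8_t>& type_codes) {
  std::vector<int> child_ids(128, -1);
  for (std::size_t i = 0; i < type_codes.size(); ++i) {
    assert(type_codes[i] >= 0);
    child_ids[static_cast<std::size_t>(type_codes[i])] = static_cast<int>(i);
  }
  return child_ids;
}
```

With type codes {5, 7}, logical id 5 maps to child 0 and logical id 7 maps to child 1, matching the example of a first child denoted by id 5 instead of 0.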

classSparseUnionType:publicarrow::UnionType#
#include <arrow/type.h>

Concrete type class for sparse union data.

A sparse union is a nested type where each logical value is taken from a single child. A buffer of 8-bit type ids indicates which child a given logical value is to be taken from.

In a sparse union, each child array should have the same length as the union array, regardless of the actual number of union values that refer to it.

Note that, unlike most other types, unions don’t have a top-level validity bitmap.
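A minimal sketch of a sparse union lookup, using two children (32-bit integers and strings) held in standard containers. `SparseUnionSketch`, `SparseValue`, and `SparseAt` are hypothetical names, and the convention that type id 0 selects the integer child is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

using SparseValue = std::variant<int32_t, std::string>;

// Hypothetical model of a sparse union with two children.
struct SparseUnionSketch {
  std::vector<int8_t> type_ids;      // one per logical value: 0 = ints, 1 = strings
  std::vector<int32_t> ints;         // same length as type_ids
  std::vector<std::string> strings;  // same length as type_ids
};

// Logical value i is read from slot i of whichever child type_ids[i] selects.
inline SparseValue SparseAt(const SparseUnionSketch& u, std::size_t i) {
  if (u.type_ids[i] == 0) return u.ints[i];
  return u.strings[i];
}
```

Every child in this sketch has one slot per logical value, so slots belonging to the non-selected child are simply ignored placeholders.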

Public Functions

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classDenseUnionType:publicarrow::UnionType#
#include <arrow/type.h>

Concrete type class for dense union data.

A dense union is a nested type where each logical value is taken from a single child, at a specific offset. A buffer of 8-bit type ids indicates which child a given logical value is to be taken from, and a buffer of 32-bit offsets indicates at which physical position in the given child array the logical value is to be taken from.

Unlike a sparse union, a dense union allows encoding only the child array values which are actually referred to by the union array. This is counterbalanced by the additional footprint of the offsets buffer, and the additional indirection cost when looking up values.

Note that, unlike most other types, unions don’t have a top-level validity bitmap.
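The extra indirection of a dense union can be sketched as follows, again with an integer child and a string child. `DenseUnionSketch`, `DenseValue`, and `DenseAt` are hypothetical names, and type id 0 selecting the integer child is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

using DenseValue = std::variant<int32_t, std::string>;

// Hypothetical model of a dense union with two children.
struct DenseUnionSketch {
  std::vector<int8_t> type_ids;      // one per logical value: 0 = ints, 1 = strings
  std::vector<int32_t> offsets;      // one per logical value: position in the child
  std::vector<int32_t> ints;         // only the integer values actually used
  std::vector<std::string> strings;  // only the string values actually used
};

// Logical value i is read from position offsets[i] of the selected child.
inline DenseValue DenseAt(const DenseUnionSketch& u, std::size_t i) {
  const auto pos = static_cast<std::size_t>(u.offsets[i]);
  if (u.type_ids[i] == 0) return u.ints[pos];
  return u.strings[pos];
}
```

Three logical values (1, "x", 3) need only two integer slots and one string slot here, at the cost of one offsets entry per value.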

Public Functions

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

classRunEndEncodedType:publicarrow::NestedType#
#include <arrow/type.h>

Type class for run-end encoded data.

Public Functions

inlinevirtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

Dictionary-encoded#

classDictionaryType:publicarrow::FixedWidthType#

Dictionary-encoded value type with data-dependent dictionary.

Indices may be of any integer type.
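Dictionary encoding can be pictured as integer indices pointing into a separate array of distinct values. The sketch below decodes such a pair with standard containers; `DecodeDictionary` is a hypothetical helper, and int8_t is just one of the integer index types that could be used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of Arrow: expand indices into the values
// they refer to in `dictionary`.
inline std::vector<std::string> DecodeDictionary(const std::vector<int8_t>& indices,
                                                 const std::vector<std::string>& dictionary) {
  std::vector<std::string> out;
  out.reserve(indices.size());
  for (int8_t index : indices) {
    assert(index >= 0 && static_cast<std::size_t>(index) < dictionary.size());
    out.push_back(dictionary[static_cast<std::size_t>(index)]);
  }
  return out;
}
```

Repeated values are stored once in the dictionary: indices {0, 1, 0, 2} over {"a", "b", "c"} decode to "a", "b", "a", "c".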

Public Functions

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

virtualintbit_width()constoverride#

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType

virtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

Extension types#

classExtensionType:publicarrow::DataType#

The base class for custom / user-defined types.

Subclassed by arrow::extension::Bool8Type, arrow::extension::FixedShapeTensorType, arrow::extension::JsonExtensionType, arrow::extension::OpaqueType, arrow::extension::UuidType

Public Functions

inlineconststd::shared_ptr<DataType>&storage_type()const#

The type of array used to represent this extension type’s data.

inlinevirtualType::typestorage_id()constoverride#

Return the type category of the storage type.

virtualDataTypeLayoutlayout()constoverride#

Return the data type layout.

Children are not included.

Note

Experimental API

virtualstd::stringToString(boolshow_metadata=false)constoverride#

A string representation of the type, including any children.

inlinevirtualstd::stringname()constoverride#

A string name of the type, omitting any child fields.

Since

0.7.0

inlinevirtualint32_tbyte_width()constoverride#

Returns the type’s fixed byte width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType

inlinevirtualintbit_width()constoverride#

Returns the type’s fixed bit width, if any.

Returns -1 for non-fixed-width types, and should only be used for subclasses of FixedWidthType

virtualstd::stringextension_name()const=0#

Unique name of extension type used to identify type for serialization.

Returns:

the string name of the extension

virtualboolExtensionEquals(constExtensionType&other)const=0#

Determine if two instances of the same extension types are equal.

Invoked from ExtensionType::Equals.

Parameters:

other[in] the type to compare this type with

Returns:

bool true if type instances are equal

virtualstd::shared_ptr<Array>MakeArray(std::shared_ptr<ArrayData>data)const=0#

Wrap a built-in Array type in a user-defined ExtensionArray instance.

Parameters:

data[in] the physical storage for the extension type

virtualResult<std::shared_ptr<DataType>>Deserialize(std::shared_ptr<DataType>storage_type,conststd::string&serialized_data)const=0#

Create an instance of the ExtensionType given the actual storage type and the serialized representation.

Parameters:
  • storage_type[in] the physical storage type of the extension

  • serialized_data[in] the serialized representation produced by Serialize

virtualstd::stringSerialize()const=0#

Create a serialized representation of the extension type’s metadata.

The storage type is handled automatically in IPC code paths.

Returns:

the serialized representation

Public Static Functions

staticstd::shared_ptr<Array>WrapArray(conststd::shared_ptr<DataType>&ext_type,conststd::shared_ptr<Array>&storage)#

Wrap the given storage array as an extension array.

staticstd::shared_ptr<ChunkedArray>WrapArray(conststd::shared_ptr<DataType>&ext_type,conststd::shared_ptr<ChunkedArray>&storage)#

Wrap the given chunked storage array as a chunked extension array.

Fields and Schemas#

std::shared_ptr<Field>field(std::stringname,std::shared_ptr<DataType>type,boolnullable=true,std::shared_ptr<constKeyValueMetadata>metadata=NULLPTR)#

Create a Field instance.

Parameters:
  • name – the field name

  • type – the field value type

  • nullable – whether the values are nullable, default true

  • metadata – any custom key-value metadata, default null

std::shared_ptr<Field>field(std::stringname,std::shared_ptr<DataType>type,std::shared_ptr<constKeyValueMetadata>metadata)#

Create a Field instance with metadata.

The field will be assumed to be nullable.

Parameters:
  • name – the field name

  • type – the field value type

  • metadata – any custom key-value metadata

std::shared_ptr<Schema>schema(FieldVectorfields,std::shared_ptr<constKeyValueMetadata>metadata=NULLPTR)#

Create a Schema instance.

Parameters:
  • fields – the schema’s fields

  • metadata – any custom key-value metadata, default null

Returns:

schema shared_ptr to Schema

std::shared_ptr<Schema>schema(std::initializer_list<std::pair<std::string,std::shared_ptr<DataType>>>fields,std::shared_ptr<constKeyValueMetadata>metadata=NULLPTR)#

Create a Schema instance from (name, type) pairs.

The schema’s fields will all be nullable with no associated metadata.

Parameters:
  • fields – (name, type) pairs of the schema’s fields

  • metadata – any custom key-value metadata, default null

Returns:

schema shared_ptr to Schema

std::shared_ptr<Schema>schema(FieldVectorfields,Endiannessendianness,std::shared_ptr<constKeyValueMetadata>metadata=NULLPTR)#

Create a Schema instance.

Parameters:
  • fields – the schema’s fields

  • endianness – the endianness of the data

  • metadata – any custom key-value metadata, default null

Returns:

schema shared_ptr to Schema

std::shared_ptr<Schema>schema(std::initializer_list<std::pair<std::string,std::shared_ptr<DataType>>>fields,Endiannessendianness,std::shared_ptr<constKeyValueMetadata>metadata=NULLPTR)#

Create a Schema instance.

The schema’s fields will all be nullable with no associated metadata.

Parameters:
  • fields – (name, type) pairs of the schema’s fields

  • endianness – the endianness of the data

  • metadata – any custom key-value metadata, default null

Returns:

schema shared_ptr to Schema

classField:publicarrow::detail::Fingerprintable,publicarrow::util::EqualityComparable<Field>#

The combination of a field name and data type, with optional metadata.

Fields are used to describe the individual constituents of a nested DataType or a Schema.

A field’s metadata is represented by a KeyValueMetadata instance, which holds arbitrary key-value pairs.

Public Functions

inlinestd::shared_ptr<constKeyValueMetadata>metadata()const#

Return the field’s attached metadata.

boolHasMetadata()const#

Return whether the field has non-empty metadata.

std::shared_ptr<Field>WithMetadata(conststd::shared_ptr<constKeyValueMetadata>&metadata)const#

Return a copy of this field with the given metadata attached to it.

std::shared_ptr<Field>WithMergedMetadata(conststd::shared_ptr<constKeyValueMetadata>&metadata)const#

EXPERIMENTAL: Return a copy of this field with the given metadata merged with existing metadata (any colliding keys will be overridden by the passed metadata)

std::shared_ptr<Field>RemoveMetadata()const#

Return a copy of this field without any metadata attached to it.

std::shared_ptr<Field>WithType(conststd::shared_ptr<DataType>&type)const#

Return a copy of this field with the replaced type.

std::shared_ptr<Field>WithName(conststd::string&name)const#

Return a copy of this field with the replaced name.

std::shared_ptr<Field>WithNullable(boolnullable)const#

Return a copy of this field with the replaced nullability.

Result<std::shared_ptr<Field>>MergeWith(constField&other,MergeOptionsoptions=MergeOptions::Defaults())const#

Merge the current field with a field of the same name.

The two fields must be compatible, i.e.:

  • have the same name

  • have the same type, or compatible types according to options.

The metadata of the current field is preserved; the metadata of the other field is discarded.

boolEquals(constField&other,boolcheck_metadata=false)const#

Indicate whether the fields are equal.

Parameters:
  • other[in] field to check equality with.

  • check_metadata[in] controls if it should check for metadata equality.

Returns:

true if fields are equal, false otherwise.

boolIsCompatibleWith(constField&other)const#

Indicate whether the fields are compatible.

See the criteria of MergeWith.

Returns:

true if fields are compatible, false otherwise.

std::stringToString(boolshow_metadata=false)const#

Return a string representation of the field.

Parameters:

show_metadata[in] when true, if KeyValueMetadata is non-empty, print keys and values in the output

inlineconststd::string&name()const#

Return the field name.

inlineconststd::shared_ptr<DataType>&type()const#

Return the field data type.

inlineboolnullable()const#

Return whether the field is nullable.

structMergeOptions:publicarrow::util::ToStringOstreamable<MergeOptions>#

Options that control the behavior of MergeWith.

Options are to be added to allow type conversions, including integer widening, promotion from integer to float, or conversion to or from boolean.

Public Functions

std::stringToString()const#

Get a human-readable representation of the options.

Public Members

boolpromote_nullability=true#

If true, a Field of NullType can be unified with a Field of another type.

The unified field will be of the other type and become nullable. Nullability will be promoted to the looser option (the result is nullable if either field is nullable).

boolpromote_decimal=false#

Allow a decimal to be unified with another decimal of the same width, adjusting scale and precision as appropriate.

May fail if the adjustment is not possible.

boolpromote_decimal_to_float=false#

Allow a decimal to be promoted to a float.

The float type will not itself be promoted (e.g. Decimal128 + Float32 = Float32).

boolpromote_integer_to_decimal=false#

Allow an integer to be promoted to a decimal.

May fail if the decimal has insufficient precision to accommodate the integer (see promote_numeric_width).

boolpromote_integer_to_float=false#

Allow an integer of a given bit width to be promoted to a float; the result will be a float of an equal or greater bit width to both of the inputs.

Examples:

  • int8 + float32 = float32

  • int32 + float32 = float64

  • int32 + float64 = float64

int32 + float32 widens to float64 because an int32 cannot always be represented exactly in the 24 bits of a float32 mantissa.
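The precision limit behind these examples can be checked directly in standard C++, assuming `float` is an IEEE 754 binary32 type as on common platforms. `RoundTripsThroughFloat32` is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `v` survives conversion to a 32-bit float
// unchanged. The comparison is done in double, which represents every
// int32 value exactly.
inline bool RoundTripsThroughFloat32(int32_t v) {
  const float as_float = static_cast<float>(v);
  return static_cast<double>(as_float) == static_cast<double>(v);
}
```

Integers up to 2^24 = 16777216 are exact in a float32, but 16777217 is not, which is why merging int32 with float32 in this scheme needs a float64 result to stay lossless.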

boolpromote_integer_sign=false#

Allow an unsigned integer of a given bit width to be promoted to a signed integer that fits into the signed type: uint8 + int16 = int16. When widening is needed, set promote_numeric_width to true: uint16 + int16 = int32.

boolpromote_numeric_width=false#

Allow an integer, float, or decimal of a given bit width to be promoted to an equivalent type of a greater bit width.

boolpromote_binary=false#

Allow strings to be promoted to binary types.

This also covers promotion of fixed-size binary types to variable-size formats, binary to large binary, and string to large string.

boolpromote_temporal_unit=false#

Allow promotion of temporal units and types, e.g. second to millisecond, Time32 to Time64, Time32(SECOND) to Time32(MILLI), etc.

boolpromote_list=false#

Allow promotion from a list to a large list and from a fixed-size list to a variable-size list.

boolpromote_dictionary=false#

Unify dictionary index types and dictionary value types.

boolpromote_dictionary_ordered=false#

Allow merging ordered and non-ordered dictionaries.

The result will be ordered if and only if both inputs are ordered.

Public Static Functions

staticinlineMergeOptionsDefaults()#

Get default options. Only NullType will be merged with other types.

staticMergeOptionsPermissive()#

Get permissive options.

All options are enabled, except promote_dictionary_ordered.

classSchema:publicarrow::detail::Fingerprintable,publicarrow::util::EqualityComparable<Schema>,publicarrow::util::ToStringOstreamable<Schema>#

Sequence of arrow::Field objects describing the columns of a record batch or table data structure.

Public Functions

boolEquals(constSchema&other,boolcheck_metadata=false)const#

Returns true if all of the schema fields are equal.

std::shared_ptr<Schema>WithEndianness(Endiannessendianness)const#

Set endianness in the schema.

Returns:

new Schema

Endiannessendianness()const#

Return endianness in the schema.

boolis_native_endian()const#

Indicate if endianness is equal to platform-native endianness.

intnum_fields()const#

Return the number of fields (columns) in the schema.

conststd::shared_ptr<Field>&field(inti)const#

Return the i-th schema element. Does not perform bounds checking.

std::shared_ptr<Field>GetFieldByName(std::string_viewname)const#

Returns null if name not found.

FieldVectorGetAllFieldsByName(std::string_viewname)const#

Return all fields having this name.

intGetFieldIndex(std::string_viewname)const#

Returns -1 if name not found.

std::vector<int>GetAllFieldIndices(std::string_viewname)const#

Return the indices of all fields having this name.

StatusCanReferenceFieldByName(std::string_viewname)const#

Indicate if the field named name can be found unambiguously in the schema.

StatusCanReferenceFieldsByNames(conststd::vector<std::string>&names)const#

Indicate if the fields named names can be found unambiguously in the schema.

conststd::shared_ptr<constKeyValueMetadata>&metadata()const#

The custom key-value metadata, if any.

Returns:

metadata may be null

std::stringToString(boolshow_metadata=false)const#

Render a string representation of the schema suitable for debugging.

Parameters:

show_metadata[in] when true, if KeyValueMetadata is non-empty, print keys and values in the output

Result<std::shared_ptr<Schema>>WithNames(conststd::vector<std::string>&names)const#

Replace field names with new names.

Parameters:

names[in] new names

Returns:

new Schema

std::shared_ptr<Schema>WithMetadata(conststd::shared_ptr<constKeyValueMetadata>&metadata)const#

Replace key-value metadata with new metadata.

Parameters:

metadata[in] new KeyValueMetadata

Returns:

new Schema

std::shared_ptr<Schema>RemoveMetadata()const#

Return a copy of the Schema without the KeyValueMetadata.

boolHasMetadata()const#

Indicate whether the Schema has non-empty KeyValueMetadata.

boolHasDistinctFieldNames()const#

Indicate whether the Schema has distinct field names.

classKeyValueMetadata#

A container for key-value pair type metadata. Not thread-safe.

Helpers for looking up fields#

classFieldPath#

Represents a path to a nested field using indices of child fields.

For example, given indices {5, 9, 3} the field would be retrieved with schema->field(5)->type()->field(9)->type()->field(3)

Attempting to retrieve a child field using a FieldPath which is not valid for a given schema will raise an error. Invalid FieldPaths include:

  • an index is out of range

  • the path is empty (note: a default-constructed FieldPath will be empty)

FieldPaths provide a number of accessors for drilling down to potentially nested children. They are overloaded for convenience to support Schema (returns a field), DataType (returns a child field), Field (returns a child field of this field’s type), Array (returns a child array), and RecordBatch (returns a column).
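The index-by-index traversal, including the two invalid cases listed above, can be sketched on a toy field tree. `FieldSketch` and `GetByPath` are hypothetical; in particular, this sketch signals an invalid path by returning nullptr, whereas the real accessors report an error.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model: a field has a name and, if nested, child fields.
struct FieldSketch {
  std::string name;
  std::vector<FieldSketch> children;
};

// Follow `indices` one level at a time, starting from the top-level fields.
// Returns nullptr for an empty path or an out-of-range index.
inline const FieldSketch* GetByPath(const std::vector<FieldSketch>& top_level,
                                    const std::vector<int>& indices) {
  if (indices.empty()) return nullptr;
  const std::vector<FieldSketch>* fields = &top_level;
  const FieldSketch* out = nullptr;
  for (int index : indices) {
    if (index < 0 || static_cast<std::size_t>(index) >= fields->size()) return nullptr;
    out = &(*fields)[static_cast<std::size_t>(index)];
    fields = &out->children;
  }
  return out;
}
```

For a schema with a struct field "a" containing "n", followed by a flat field "b", the path {0, 0} reaches "n" and {1} reaches "b", while {} and {2} are rejected.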

Public Functions

FieldPath()=default#
inlineFieldPath(std::vector<int>indices)#
inlineFieldPath(std::initializer_list<int>indices)#
std::stringToString()const#
size_thash()const#
inlineboolempty()const#
inlinebooloperator==(constFieldPath&other)const#
inlinebooloperator!=(constFieldPath&other)const#
inlineconststd::vector<int>&indices()const#
inlineintoperator[](size_ti)const#
inlinestd::vector<int>::const_iteratorbegin()const#
inlinestd::vector<int>::const_iteratorend()const#
Result<std::shared_ptr<Field>>Get(constSchema&schema)const#

Retrieve the referenced child Field from a Schema, Field, or DataType.

Result<std::shared_ptr<Field>>Get(constField&field)const#
Result<std::shared_ptr<Field>>Get(constDataType&type)const#
Result<std::shared_ptr<Field>>Get(constFieldVector&fields)const#
Result<std::shared_ptr<Array>>Get(constRecordBatch&batch)const#

Retrieve the referenced column from a RecordBatch or Table.

Result<std::shared_ptr<ChunkedArray>>Get(constTable&table)const#
Result<std::shared_ptr<Array>>Get(constArray&array)const#

Retrieve the referenced child from an Array or ArrayData.

Result<std::shared_ptr<ArrayData>>Get(constArrayData&data)const#
Result<std::shared_ptr<ChunkedArray>>Get(constChunkedArray&chunked_array)const#

Retrieve the referenced child from a ChunkedArray.

Result<std::shared_ptr<Array>>GetFlattened(constArray&array,MemoryPool*pool=NULLPTR)const#

Retrieve the referenced child/column from an Array, ArrayData, ChunkedArray, RecordBatch, or Table.

Unlike FieldPath::Get, these variants are not zero-copy and the retrieved child’s null bitmap is ANDed with its ancestors’.

Result<std::shared_ptr<ArrayData>>GetFlattened(constArrayData&data,MemoryPool*pool=NULLPTR)const#
Result<std::shared_ptr<ChunkedArray>>GetFlattened(constChunkedArray&chunked_array,MemoryPool*pool=NULLPTR)const#
Result<std::shared_ptr<Array>>GetFlattened(constRecordBatch&batch,MemoryPool*pool=NULLPTR)const#
Result<std::shared_ptr<ChunkedArray>>GetFlattened(constTable&table,MemoryPool*pool=NULLPTR)const#
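The effect of ANDing a child's null bitmap with its ancestors' can be shown on plain boolean vectors, where true means "valid". `FlattenValidity` is a hypothetical helper for a single parent level; it models only the validity logic, not Arrow's bit-packed buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a flattened child slot is valid only if both the
// child slot and the corresponding parent slot are valid.
inline std::vector<bool> FlattenValidity(const std::vector<bool>& parent_valid,
                                         const std::vector<bool>& child_valid) {
  assert(parent_valid.size() == child_valid.size());
  std::vector<bool> out(child_valid.size());
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = parent_valid[i] && child_valid[i];
  }
  return out;
}
```

A child value that is itself non-null but sits under a null parent therefore comes out as null in the flattened result, which is also why a new bitmap has to be allocated instead of reusing the child's.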

Public Static Functions

staticResult<std::shared_ptr<Schema>>GetAll(constSchema&schema,conststd::vector<FieldPath>&paths)#
structHash#

Public Functions

inlinesize_toperator()(constFieldPath&path)const#
classFieldRef:publicarrow::util::EqualityComparable<FieldRef>#

Descriptor of a (potentially nested) field within a schema.

Unlike FieldPath (which exclusively uses indices of child fields), FieldRef may reference a field by name. It is intended to replace parameters like int field_index and const std::string& field_name; it can be implicitly constructed from either a field index or a name.

Nested fields can be referenced as well. Given

    schema({field("a", struct_({field("n", null())})), field("b", int32())})

the following all indicate the nested field named "n":

    FieldRef ref1(0, 0);
    FieldRef ref2("a", 0);
    FieldRef ref3("a", "n");
    FieldRef ref4(0, "n");
    ARROW_ASSIGN_OR_RAISE(FieldRef ref5, FieldRef::FromDotPath(".a[0]"));

FieldPaths matching a FieldRef are retrieved using the member function FindAll. Multiple matches are possible because field names may be duplicated within a schema. For example:

    Schema a_is_ambiguous({field("a", int32()), field("a", float32())});
    auto matches = FieldRef("a").FindAll(a_is_ambiguous);
    assert(matches.size() == 2);
    assert(matches[0].Get(a_is_ambiguous)->Equals(a_is_ambiguous.field(0)));
    assert(matches[1].Get(a_is_ambiguous)->Equals(a_is_ambiguous.field(1)));

Convenience accessors are available which raise a helpful error if the field is not found or ambiguous, and for immediately calling FieldPath::Get to retrieve any matching children:

    auto maybe_match = FieldRef("struct", "field_i32").FindOneOrNone(schema);
    auto maybe_column = FieldRef("struct", "field_i32").GetOne(some_table);
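Why a by-name reference can resolve to several index paths is easy to see on a toy field tree. The sketch below handles only the "sequence of names" case; `NodeSketch`, `FindAllImpl`, and `FindAllByNames` are hypothetical and are not how FieldRef is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model: a field has a name and, if nested, child fields.
struct NodeSketch {
  std::string name;
  std::vector<NodeSketch> children;
};

// Depth-first search: at each level, descend into every field whose name
// matches names[depth], recording the index taken.
inline void FindAllImpl(const std::vector<NodeSketch>& fields,
                        const std::vector<std::string>& names, std::size_t depth,
                        std::vector<int>& prefix,
                        std::vector<std::vector<int>>& out) {
  if (depth == names.size()) {
    if (!prefix.empty()) out.push_back(prefix);
    return;
  }
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (fields[i].name != names[depth]) continue;
    prefix.push_back(static_cast<int>(i));
    FindAllImpl(fields[i].children, names, depth + 1, prefix, out);
    prefix.pop_back();
  }
}

// Every index path whose field names match `names` level by level.
inline std::vector<std::vector<int>> FindAllByNames(
    const std::vector<NodeSketch>& fields, const std::vector<std::string>& names) {
  std::vector<std::vector<int>> out;
  std::vector<int> prefix;
  FindAllImpl(fields, names, 0, prefix, out);
  return out;
}
```

With two top-level fields both named "a", the name "a" yields the paths {0} and {1}; a caller wanting exactly one match would then have to treat a result of size other than one as an error, as the single-match accessors do.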

Public Types

template<typenameT>
usingGetType=decltype(std::declval<FieldPath>().Get(std::declval<T>()).ValueOrDie())#

Public Functions

FieldRef()=default#
FieldRef(FieldPathindices)#

Construct a FieldRef using a string of indices.

The reference will be retrieved as: schema.fields[self.indices[0]].type.fields[self.indices[1]] …

Empty indices are not valid.

inlineFieldRef(std::stringname)#

Construct a by-name FieldRef.

Multiple fields may match a by-name FieldRef: [f for f in schema.fields where f.name == self.name]

inlineFieldRef(constchar*name)#
inlineFieldRef(intindex)#

Equivalent to a single index string of indices.

inlineexplicitFieldRef(std::vector<FieldRef>refs)#

Construct a nested FieldRef.

template<typenameA0,typenameA1,typename...A>
inlineFieldRef(A0&&a0,A1&&a1,A&&...a)#

Convenience constructor for nested FieldRefs: each argument will be used to construct a FieldRef.

std::stringToDotPath()const#
inlineboolEquals(constFieldRef&other)const#
std::stringToString()const#
size_thash()const#
inlineexplicitoperatorbool()const#
inlinebooloperator!()const#
inlineboolIsFieldPath()const#
inlineboolIsName()const#
inlineboolIsNested()const#
inlineboolIsNameSequence()const#

Return true if this ref is a name or a nested sequence of only names.

Useful for determining if iteration is possible without recursion or inner loops.

inlineconstFieldPath*field_path()const#
inlineconststd::string*name()const#
inlineconststd::vector<FieldRef>*nested_refs()const#
std::vector<FieldPath>FindAll(constSchema&schema)const#

Retrieve the FieldPath of every child field which matches this FieldRef.

std::vector<FieldPath> FindAll(const Field& field) const#
std::vector<FieldPath> FindAll(const DataType& type) const#
std::vector<FieldPath> FindAll(const FieldVector& fields) const#
std::vector<FieldPath> FindAll(const ArrayData& array) const#

Convenience function which applies FindAll to arg’s type or schema.

std::vector<FieldPath> FindAll(const Array& array) const#
std::vector<FieldPath> FindAll(const ChunkedArray& chunked_array) const#
std::vector<FieldPath> FindAll(const RecordBatch& batch) const#
std::vector<FieldPath> FindAll(const Table& table) const#
template <typename T>
inline Status CheckNonEmpty(const std::vector<FieldPath>& matches, const T& root) const#

Convenience function: raise an error if matches is empty.

template <typename T>
inline Status CheckNonMultiple(const std::vector<FieldPath>& matches, const T& root) const#

Convenience function: raise an error if matches contains multiple FieldPaths.

template <typename T>
inline Result<FieldPath> FindOne(const T& root) const#

Retrieve the FieldPath of a single child field which matches this FieldRef.

Emit an error if none or multiple match.

template <typename T>
inline Result<FieldPath> FindOneOrNone(const T& root) const#

Retrieve the FieldPath of a single child field which matches this FieldRef.

Emit an error if multiple match. An empty (invalid) FieldPath will be returned if none match.
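
The difference between FindOne and FindOneOrNone lies only in how an empty match list is treated. The sketch below models that decision over a precomputed list of matches. `SketchPath`, `MatchError`, `MatchResult`, `SelectOne`, and `SelectOneOrNone` are hypothetical names for illustration; Arrow itself reports these outcomes through Result<FieldPath>.

```cpp
#include <cassert>
#include <vector>

using SketchPath = std::vector<int>;  // hypothetical stand-in for FieldPath

enum class MatchError { kNone, kNoMatch, kMultipleMatches };

struct MatchResult {
  MatchError error;
  SketchPath path;  // left empty when nothing matched or on error
};

// Mirrors the FindOneOrNone contract: multiple matches are an error,
// zero matches yield an empty path without an error.
inline MatchResult SelectOneOrNone(const std::vector<SketchPath>& matches) {
  if (matches.size() > 1) return {MatchError::kMultipleMatches, {}};
  if (matches.empty()) return {MatchError::kNone, {}};
  return {MatchError::kNone, matches[0]};
}

// Mirrors the FindOne contract: zero matches are an error as well.
inline MatchResult SelectOne(const std::vector<SketchPath>& matches) {
  if (matches.empty()) return {MatchError::kNoMatch, {}};
  return SelectOneOrNone(matches);
}
```

A caller that treats a missing field as acceptable would use the OrNone form and test whether the returned path is empty.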

template <typename T>
inline std::vector<GetType<T>> GetAll(const T& root) const#

Get all children matching this FieldRef.

template <typename T>
inline Result<std::vector<GetType<T>>> GetAllFlattened(const T& root, MemoryPool* pool = NULLPTR) const#

Get all children matching this FieldRef.

Unlike FieldRef::GetAll, this variant is not zero-copy and the retrieved children’s null bitmaps are ANDed with their ancestors’.
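
To illustrate what ANDing a child's null bitmap with its ancestor's means: a row is valid in the flattened child only if it is valid in both the child and the parent. Producing that combined bitmap requires allocating a new buffer, which is presumably why the flattened variants take a MemoryPool and are not zero-copy. The sketch below uses `std::vector<bool>` for readability, whereas Arrow stores validity as packed bits; `FlattenValidity` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combine per-row validity flags (true = non-null) of a parent and a child.
// A child row counts as valid only when the parent row is valid too.
inline std::vector<bool> FlattenValidity(const std::vector<bool>& parent_valid,
                                         const std::vector<bool>& child_valid) {
  assert(parent_valid.size() == child_valid.size());
  std::vector<bool> flattened(child_valid.size());
  for (std::size_t i = 0; i < flattened.size(); ++i) {
    flattened[i] = parent_valid[i] && child_valid[i];
  }
  return flattened;
}
```

For deeper nesting the same step would be applied once per ancestor level.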

template <typename T>
inline Result<GetType<T>> GetOne(const T& root) const#

Get the single child matching this FieldRef.

Emit an error if none or multiple match.

template <typename T>
inline Result<GetType<T>> GetOneFlattened(const T& root, MemoryPool* pool = NULLPTR) const#

Get the single child matching this FieldRef.

Unlike FieldRef::GetOne, this variant is not zero-copy and the retrieved child’s null bitmap is ANDed with its ancestors’.

template <typename T>
inline Result<GetType<T>> GetOneOrNone(const T& root) const#

Get the single child matching this FieldRef.

Return nullptr if none match; emit an error if multiple match.

template <typename T>
inline Result<GetType<T>> GetOneOrNoneFlattened(const T& root, MemoryPool* pool = NULLPTR) const#

Get the single child matching this FieldRef.

Return nullptr if none match; emit an error if multiple match. Unlike FieldRef::GetOneOrNone, this variant is not zero-copy and the retrieved child’s null bitmap is ANDed with its ancestors’.

Public Static Functions

static Result<FieldRef> FromDotPath(const std::string& dot_path)#

Parse a dot path into a FieldRef.

dot_path = '.' name | '[' digit+ ']' | dot_path+

Examples:

    ".alpha" => FieldRef("alpha")
    "[2]" => FieldRef(2)
    ".beta[3]" => FieldRef("beta", 3)
    "[5].gamma.delta[7]" => FieldRef(5, "gamma", "delta", 7)
    ".hello world" => FieldRef("hello world")
    R"(.\[y\]\tho.\)" => FieldRef(R"([y]\tho.\)")

Note: When parsing a name, a '\' preceding any other character will be dropped from the resulting name. Therefore if a name must contain the characters '.', '\', or '[' those must be escaped with a preceding '\'.
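
The grammar and escaping rule above can be exercised with a small hand-written parser. This is a simplified sketch, not Arrow's implementation: `Segment` and `ParseDotPath` are hypothetical names, and edge cases such as empty names, a trailing backslash, or integer overflow are handled in whatever way was simplest here, which may differ from FromDotPath.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One parsed component: either a name or a numeric index.
struct Segment {
  bool is_index;
  int index;
  std::string name;
};

// Parse `path` into segments. Returns false on malformed input.
inline bool ParseDotPath(const std::string& path, std::vector<Segment>* out) {
  out->clear();
  std::size_t i = 0;
  while (i < path.size()) {
    if (path[i] == '.') {
      ++i;
      std::string name;
      // A name runs until the next unescaped '.' or '['.
      while (i < path.size() && path[i] != '.' && path[i] != '[') {
        // Drop an escaping backslash and keep the character after it.
        if (path[i] == '\\' && i + 1 < path.size()) ++i;
        name += path[i++];
      }
      out->push_back(Segment{false, 0, name});
    } else if (path[i] == '[') {
      ++i;
      const std::size_t start = i;
      int value = 0;
      while (i < path.size() &&
             std::isdigit(static_cast<unsigned char>(path[i]))) {
        value = value * 10 + (path[i] - '0');
        ++i;
      }
      // Require at least one digit followed by ']'.
      if (i == start || i >= path.size() || path[i] != ']') return false;
      ++i;
      out->push_back(Segment{true, value, ""});
    } else {
      return false;  // every component must start with '.' or '['
    }
  }
  return !out->empty();
}
```

Note that a name continues across spaces, so ".hello world" is a single component, and that an escaped '.' does not start a new component.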

struct Hash#

Public Functions

inline size_t operator()(const FieldRef& ref) const#

Utilities#

class TypeVisitor#

Abstract type visitor class.

Subclass this to create a visitor that can be used with the DataType::Accept() method.

Public Functions

virtual ~TypeVisitor() = default#
virtual Status Visit(const NullType& type)#
virtual Status Visit(const BooleanType& type)#
virtual Status Visit(const Int8Type& type)#
virtual Status Visit(const Int16Type& type)#
virtual Status Visit(const Int32Type& type)#
virtual Status Visit(const Int64Type& type)#
virtual Status Visit(const UInt8Type& type)#
virtual Status Visit(const UInt16Type& type)#
virtual Status Visit(const UInt32Type& type)#
virtual Status Visit(const UInt64Type& type)#
virtual Status Visit(const HalfFloatType& type)#
virtual Status Visit(const FloatType& type)#
virtual Status Visit(const DoubleType& type)#
virtual Status Visit(const StringType& type)#
virtual Status Visit(const StringViewType& type)#
virtual Status Visit(const BinaryType& type)#
virtual Status Visit(const BinaryViewType& type)#
virtual Status Visit(const LargeStringType& type)#
virtual Status Visit(const LargeBinaryType& type)#
virtual Status Visit(const FixedSizeBinaryType& type)#
virtual Status Visit(const Date64Type& type)#
virtual Status Visit(const Date32Type& type)#
virtual Status Visit(const Time32Type& type)#
virtual Status Visit(const Time64Type& type)#
virtual Status Visit(const TimestampType& type)#
virtual Status Visit(const MonthDayNanoIntervalType& type)#
virtual Status Visit(const MonthIntervalType& type)#
virtual Status Visit(const DayTimeIntervalType& type)#
virtual Status Visit(const DurationType& type)#
virtual Status Visit(const Decimal32Type& type)#
virtual Status Visit(const Decimal64Type& type)#
virtual Status Visit(const Decimal128Type& type)#
virtual Status Visit(const Decimal256Type& type)#
virtual Status Visit(const ListType& type)#
virtual Status Visit(const LargeListType& type)#
virtual Status Visit(const ListViewType& scalar)#
virtual Status Visit(const LargeListViewType& scalar)#
virtual Status Visit(const MapType& type)#
virtual Status Visit(const FixedSizeListType& type)#
virtual Status Visit(const StructType& type)#
virtual Status Visit(const SparseUnionType& type)#
virtual Status Visit(const DenseUnionType& type)#
virtual Status Visit(const DictionaryType& type)#
virtual Status Visit(const RunEndEncodedType& type)#
virtual Status Visit(const ExtensionType& type)#
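
The subclass-and-Accept pattern described for TypeVisitor is a form of double dispatch. The following standard-library-only sketch shows the mechanism with two made-up types. All names (`SketchType`, `SketchInt32`, `SketchString`, `SketchVisitor`, `ByteWidthVisitor`) are hypothetical, and `bool` stands in for Status. The default overloads returning a "not handled" result are an assumption made for this sketch, chosen so that a subclass only needs to override the types it cares about.

```cpp
#include <cassert>

struct SketchInt32;
struct SketchString;

// Hypothetical visitor: one virtual overload per concrete type.
// Each default returns false, meaning "not handled".
struct SketchVisitor {
  virtual ~SketchVisitor() = default;
  virtual bool Visit(const SketchInt32&) { return false; }
  virtual bool Visit(const SketchString&) { return false; }
};

// Hypothetical type hierarchy with an Accept() entry point.
struct SketchType {
  virtual ~SketchType() = default;
  virtual bool Accept(SketchVisitor* visitor) const = 0;
};

struct SketchInt32 : SketchType {
  // `*this` has the concrete static type here, so overload resolution
  // selects Visit(const SketchInt32&).
  bool Accept(SketchVisitor* visitor) const override {
    return visitor->Visit(*this);
  }
};

struct SketchString : SketchType {
  bool Accept(SketchVisitor* visitor) const override {
    return visitor->Visit(*this);
  }
};

// A visitor that handles only the fixed-width type and inherits the
// "not handled" default for everything else.
struct ByteWidthVisitor : SketchVisitor {
  using SketchVisitor::Visit;  // keep the inherited overloads visible
  int byte_width = -1;
  bool Visit(const SketchInt32&) override {
    byte_width = 4;
    return true;
  }
};
```

Calling Accept through a base-class reference dispatches first on the dynamic type of the object and then on the visitor, so no explicit type switch is needed at the call site.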