Acero Overview#

This page gives an overview of the basic Acero concepts and helps distinguish Acero from other modules in the Arrow code base. It’s intended for users, developers, potential contributors, and for those that would like to extend Acero, either for research or for business use. This page assumes the reader is already familiar with core Arrow concepts. This page does not expect any existing knowledge of relational algebra.

What is Acero?#

Acero is a C++ library that can be used to analyze large (potentially infinite) streams of data. Acero allows computation to be expressed as an “execution plan” (ExecPlan). An execution plan takes in zero or more streams of input data and emits a single stream of output data. The plan describes how the data will be transformed as it passes through. For example, a plan might:

  • Merge two streams of data using a common column

  • Create additional columns by evaluating expressions against the existing columns

  • Consume a stream of data by writing it to disk in a partitioned layout

A sample execution plan that joins three streams of data and writes to disk

Acero is not…#

A Library for Data Scientists#

Acero is not intended to be used directly by data scientists. It is expected that end users will typically use some kind of frontend, for example Pandas, Ibis, or SQL. The API for Acero is focused on capabilities and available algorithms. However, such users may be interested in knowing more about how Acero works so that they can better understand how the backend processing for their libraries operates.

A Database#

A database (or DBMS) is typically a much more expansive application and often packaged as a standalone service. Acero could be a component in a database (most databases have some kind of execution engine) or could be a component in some other data processing application that hardly resembles a database. Acero does not concern itself with user management, external communication, isolation, durability, or consistency. In addition, Acero is focused primarily on the read path, and the write utilities lack any sort of transaction support.

An Optimizer#

Acero does not have an SQL parser. It does not have a query planner. It does not have any sort of optimizer. Acero expects to be given very detailed and low-level instructions on how to manipulate data, and then it will perform that manipulation exactly as described.

Creating the best execution plan is very hard. Small details can have a big impact on performance. We do think an optimizer is important, but we believe it should be implemented independently of Acero, hopefully in a composable way through standards such as Substrait, so that any backend could leverage it.

Distributed#

Acero does not provide distributed execution. However, Acero aims to be usable by a distributed query execution engine. In other words, Acero will not configure and coordinate workers, but it does expect to be used as a worker. Sometimes, the distinction is a bit fuzzy. For example, an Acero source may be a smart storage device that is capable of performing filtering or other advanced analytics. One might consider this a distributed plan. The key distinction is that Acero does not have the capability of transforming a logical plan into a distributed execution plan. That step will need to be done elsewhere.

Acero vs…#

Arrow Compute#

This is described in more detail in Relation to Arrow C++, but the key difference is that Acero handles streams of data while Arrow Compute handles situations where all the data is in memory.

Arrow Datasets#

The Arrow datasets library provides some basic routines for discovering, scanning, and writing collections of files. The datasets module depends on Acero. Both scanning and writing datasets use Acero. The scan node and the write node are part of the datasets module. This helps to keep the complexity of file formats and filesystems out of the core Acero logic.

Substrait#

Substrait is a project establishing standards for query plans. Acero executes query plans and generates data. This makes Acero a Substrait consumer. There are more details on the Substrait capabilities in Using Acero with Substrait.

DataFusion / DuckDB / Velox / Etc.#

There are many columnar data engines emerging. We view this as a good thing and encourage projects like Substrait to help allow switching between engines as needed. We generally discourage comparative benchmarks as they are almost inevitably going to be workload-driven and rarely manage to capture an apples-vs-apples comparison. Discussions of the pros and cons of each are beyond the scope of this guide.

Relation to Arrow C++#

The Acero module is part of the Arrow C++ implementation. It is built as a separate module but it depends on core Arrow modules and does not stand alone. Acero uses and extends the capabilities from the core Arrow module and the Arrow compute kernels.

A diagram of layers with core on the left, compute in the middle, and acero on the right

The core Arrow library provides containers for buffers and arrays that are laid out according to the Arrow columnar format. With few exceptions, the core Arrow library does not examine or modify the contents of buffers. For example, converting a string array from lowercase strings to uppercase strings would not be a part of the core Arrow library because that would require examining the contents of the array.

The compute module expands on the core library and provides functions which analyze and transform data. The compute module’s capabilities are all exposed via a function registry. An Arrow “function” accepts zero or more arrays, batches, or tables, and produces an array, batch, or table. In addition, function calls can be combined, along with field references and literals, to form an expression (a tree of function calls) which the compute module can evaluate. For example, calculating x + (y * 3) given a table with columns x and y.

A sample expression tree

Acero expands on these capabilities by adding compute operations for streams of data. For example, a project node can apply a compute expression on a stream of batches. This will create a new stream of batches with the result of the expression added as a new column. These nodes can be combined into a graph to form a more complex execution plan. This is very similar to the way functions are combined into a tree to form a complex expression.

A simple plan that uses compute expressions

Note

Acero does not use the arrow::Table or arrow::ChunkedArray containers from the core Arrow library. This is because Acero operates on streams of batches and so there is no need for a multi-batch container of data. This helps to reduce the complexity of Acero and avoids tricky situations that can arise from tables whose columns have different chunk sizes. Acero will often use arrow::Datum, which is a variant from the core module that can hold many different types. Within Acero, a datum will always hold either an arrow::Array or an arrow::Scalar.

Core Concepts#

ExecNode#

The most basic concept in Acero is the ExecNode. An ExecNode has zero or more inputs and zero or one output. If an ExecNode has zero inputs we call it a source, and if an ExecNode does not have an output then we call it a sink. There are many different kinds of nodes, and each one transforms its inputs in different ways. For example:

  • A scan node is a source node that reads data from files

  • An aggregate node accumulates batches of data to compute summary statistics

  • A filter node removes rows from the data according to a filter expression

  • A table sink node accumulates data into a table

Note

A full list of the available nodes is included in the user’s guide

ExecBatch#

Batches of data are represented by the ExecBatch class. An ExecBatch is a 2D structure that is very similar to a RecordBatch. It can have zero or more columns, and all of the columns must have the same length. There are a few key differences between an ExecBatch and a RecordBatch:


Both the record batch and the exec batch have strong ownership of the arrays & buffers#

  • An ExecBatch does not have a schema. This is because an ExecBatch is assumed to be part of a stream of batches, and the stream is assumed to have a consistent schema. So the schema for an ExecBatch is typically stored in the ExecNode.

  • Columns in an ExecBatch are either an Array or a Scalar. When a column is a Scalar this means that the column has a single value for every row in the batch. An ExecBatch also has a length property which describes how many rows are in a batch. So another way to view a Scalar is as a constant array with length elements.

  • An ExecBatch contains additional information used by the exec plan. For example, an index can be used to describe a batch’s position in an ordered stream. We expect that ExecBatch will also evolve to contain additional fields such as a selection vector.


There are four different ways to represent the given batch of data using different combinationsof arrays and scalars. All four exec batches should be considered semantically equivalent.#

Converting from a record batch to an exec batch is always zero copy. Both RecordBatch and ExecBatch refer to the exact same underlying arrays. Converting from an exec batch to a record batch is only zero copy if there are no scalars in the exec batch.

Note

Both Acero and the compute module have “lightweight” versions of batches and arrays. In the compute module these are called BatchSpan, ArraySpan, and BufferSpan. In Acero the concept is called KeyColumnArray. These types were developed concurrently and serve the same purpose. They aim to provide an array container that can be completely stack allocated (provided the data type is non-nested) in order to avoid heap allocation overhead. Ideally these two concepts will be merged someday.

ExecPlan#

An ExecPlan represents a graph of ExecNode objects. A valid ExecPlan must always have at least one source node, but it does not technically need to have a sink node. The ExecPlan contains resources shared by all of the nodes and has utility functions to control starting and stopping execution of the nodes. Both ExecPlan and ExecNode are tied to the lifecycle of a single execution. They have state and are not expected to be restartable.

Warning

The structures within Acero, including ExecBatch, are still experimental. The ExecBatch class should not be used outside of Acero. Instead, an ExecBatch should be converted to a more standard structure such as a RecordBatch.

Similarly, an ExecPlan is an internal concept. Users creating plans should be using Declarationobjects. APIs for consuming and executing plans should abstract away the details of the underlyingplan and not expose the object itself.

Declaration#

A Declaration is a blueprint for an ExecNode. Declarations can be combined into a graph to form the blueprint for an ExecPlan. A Declaration describes the computation that needs to be done but is not actually responsible for carrying out the computation. In this way, a Declaration is analogous to an expression. It is expected that Declarations will need to be converted to and from various query representations (e.g. Substrait). The Declaration objects, combined with the DeclarationToXyz methods, are the current public API for Acero.


A declaration is a blueprint that is used to instantiate exec plan instances#