PUBLIC INTERFACE FOR AIRFLOW 3.0+¶
Warning
This documentation covers the Public Interface for Airflow 3.0+
If you are using Airflow 2.x, please refer to the Airflow 2.11 Public Interface Documentation for the legacy interface.
Public Interface of Airflow¶
The Public Interface of Apache Airflow is the collection of interfaces and behaviors in Apache Airflow whose changes are governed by semantic versioning. A user interacts with Airflow’s public interface by creating and managing dags, managing tasks and dependencies, and extending Airflow capabilities by writing new executors, plugins, operators and providers. The Public Interface can be useful for building custom tools and integrations with other systems, and for automating certain aspects of the Airflow workflow.
The Airflow Task SDK, exposed through the airflow.sdk namespace, is the primary public interface for DAG Authors and for task execution. Direct access to the metadata database from task code is no longer allowed. Instead, use the Stable REST API, Python Client, or Task Context methods.
For comprehensive Task SDK documentation, see the Task SDK Reference.
Using Airflow Public Interfaces¶
The following are some examples of the public interface of Airflow:
When you are writing your own operators or hooks. This is commonly done when no hook or operator exists for your use case, or when one exists but you need to customize its behavior.
When writing new Plugins that extend Airflow’s functionality beyond DAG building blocks. Secrets, Timetables, Triggers, and Listeners are all examples of such functionality. This is usually done by users who manage Airflow instances.
Bundling custom Operators, Hooks and Plugins and releasing them together via providers - this is usually done by those who intend to provide a reusable set of functionality for external services or applications Airflow integrates with.
Using the TaskFlow API to write tasks
Relying on the consistent behavior of Airflow objects
One aspect of the “public interface” is extending or using Airflow Python classes and functions. The classes and functions mentioned below can be relied on to maintain backwards-compatible signatures and behaviours within a MAJOR version of Airflow. On the other hand, classes and methods starting with _ (also known as protected Python methods) and __ (also known as private Python methods) are not part of the Public Airflow Interface and might change at any time.
You can also use Airflow’s Public Interface via the Stable REST API (based on the OpenAPI specification). For specific needs you can also use the Airflow Command Line Interface (CLI), though its behaviour might change in details (such as output format and available flags), so if you want to rely on it in a programmatic way, the Stable REST API is recommended.
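For example, listing dags through the Stable REST API from Python might look like the sketch below. The base URL, the /api/v2 path prefix, and bearer-token authentication are assumptions that depend on your deployment and auth manager configuration; check the REST API reference for the exact endpoints.

import requests

# Hypothetical deployment URL and token; adjust for your environment and auth manager.
AIRFLOW_BASE_URL = "http://localhost:8080"
TOKEN = "my-access-token"

response = requests.get(
    f"{AIRFLOW_BASE_URL}/api/v2/dags",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
response.raise_for_status()
for dag in response.json()["dags"]:
    print(dag["dag_id"])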
Using the Public Interface for DAG Authors¶
The primary interface for DAG Authors is the airflow.sdk namespace. This provides a stable, well-defined interface for creating DAGs and tasks that is not subject to internal implementation changes. The goal of this change is to decouple DAG authoring from Airflow internals (Scheduler, API Server, etc.), providing a version-agnostic, stable interface for writing and maintaining DAGs across Airflow versions.
Key Imports from airflow.sdk:
Classes:
Asset
BaseHook
BaseNotifier
BaseOperator
BaseOperatorLink
BaseSensorOperator
Connection
Context
DAG
EdgeModifier
Label
ObjectStoragePath
Param
TaskGroup
Variable
Decorators and Functions:
@asset
@dag
@setup
@task
@task_group
@teardown
chain
chain_linear
cross_downstream
get_current_context
get_parsing_context
Migration from Airflow 2.x:
For detailed migration instructions from Airflow 2.x to 3.x, including import changes and other breaking changes, see the Migration Guide.
For an exhaustive list of available classes, decorators, and functions, check airflow.sdk.__all__.
All DAGs should update imports to use airflow.sdk instead of referencing internal Airflow modules directly. Legacy import paths (e.g., airflow.models.dag.DAG, airflow.decorators.task) are deprecated and will be removed in a future Airflow version.
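For example, a typical import migration looks like this (the legacy imports are shown only as comments for comparison):

# Airflow 2.x (deprecated):
# from airflow.models.dag import DAG
# from airflow.decorators import task

# Airflow 3.x:
from airflow.sdk import DAG, task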
Dags¶
The DAG is Airflow’s core entity that represents a recurring workflow. You can create a DAG by instantiating the DAG class in your DAG file. Dags can also have parameters specified via the Param class.
The recommended way to create DAGs is using the dag() decorator from the airflow.sdk namespace.
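For illustration, a minimal DAG defined with the dag() decorator might look like this sketch (the dag name, schedule, and task body are arbitrary examples):

from datetime import datetime

from airflow.sdk import dag, task


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def my_example_dag():
    @task
    def say_hello():
        print("hello")

    say_hello()


my_example_dag()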
Airflow has a set of example dags that you can use to learn how to write dags.
You can read more about dags in Dags.
References for the modules used in dags are here:
Note
The airflow.sdk namespace provides the primary interface for DAG Authors. For detailed API documentation, see the Task SDK Reference.
Note
The DagBag class is used internally by Airflow for loading DAGs from files and folders. DAG Authors should use the DAG class from the airflow.sdk namespace instead.
Note
The DagRun class is used internally by Airflow for DAG run management. DAG Authors should access DAG run information through the Task Context via get_current_context() or use the DagRunProtocol interface.
Operators¶
The base classes BaseOperator and BaseSensorOperator are public and may be extended to make new operators.
The base class for new operators is BaseOperator from the airflow.sdk namespace.
Subclasses of BaseOperator that are published in Apache Airflow are public in behavior but not in structure. That is to say, an Operator’s parameters and behavior are governed by semver, but its methods are subject to change at any time.
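As an illustration, a minimal custom operator might be sketched as follows (the class name and behavior are hypothetical):

from airflow.sdk import BaseOperator


class HelloOperator(BaseOperator):
    """Hypothetical operator that logs and returns a greeting."""

    def __init__(self, name: str, **kwargs):
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        # execute() is called when the task runs; its return value is pushed to XCom.
        message = f"Hello, {self.name}!"
        self.log.info(message)
        return message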
Task Instances¶
Task instances are the individual runs of a single task in a DAG (in a DAG Run). Task instances are accessed through the Task Context via get_current_context(). Direct database access is not possible.
Note
Task Context is part of the airflow.sdk namespace. For detailed API documentation, see the Task SDK Reference.
Task Instance Keys¶
Task instance keys are unique identifiers of task instances in a DAG (in a DAG Run). A key is a tuple that consists of dag_id, task_id, run_id, try_number, and map_index.
Direct access to task instance keys via the TaskInstance model is no longer allowed from task code. Instead, use the Task Context via get_current_context() to access task instance information.
Example of accessing task instance information through Task Context:
from airflow.sdk import get_current_context


def my_task():
    context = get_current_context()
    ti = context["ti"]
    dag_id = ti.dag_id
    task_id = ti.task_id
    run_id = ti.run_id
    try_number = ti.try_number
    map_index = ti.map_index
    print(f"Task: {dag_id}.{task_id}, Run: {run_id}, Try: {try_number}, Map Index: {map_index}")
Note
The TaskInstanceKey class is used internally by Airflow for identifying task instances. DAG Authors should access task instance information through the Task Context via get_current_context() instead.
Hooks¶
Hooks are interfaces to external platforms and databases, implementing a common interface when possible and acting as building blocks for operators. All hooks are derived from BaseHook.
Airflow has a set of Hooks that are considered public. You are free to extend their functionalityby extending them:
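For example, a custom hook for a hypothetical external service could be sketched like this (the service, connection id, and returned client are illustrative assumptions):

from airflow.sdk import BaseHook


class MyServiceHook(BaseHook):
    """Hypothetical hook that builds a client from an Airflow Connection."""

    def __init__(self, my_conn_id: str = "my_service_default"):
        super().__init__()
        self.my_conn_id = my_conn_id

    def get_conn(self):
        # Look up the credentials stored in the referenced Airflow Connection.
        conn = self.get_connection(self.my_conn_id)
        # Build and return whatever client object the external service needs.
        return {"host": conn.host, "login": conn.login, "password": conn.password}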
Public Airflow utilities¶
When writing or extending Hooks and Operators, DAG Authors and developers can use the following classes:
The Connection, which provides access to external service credentials and configuration.
The Variable, which provides access to Airflow configuration variables.
The XCom, which is used to access inter-task communication data.
Connection and Variable operations should be performed through the Task Context using get_current_context() and the task instance’s methods, or through the airflow.sdk namespace. Direct database access to Connection and Variable models is no longer allowed from task code.
Example of accessing Connections and Variables through Task Context:
from airflow.sdk import get_current_context


def my_task():
    context = get_current_context()
    conn = context["conn"]
    my_connection = conn.get("my_connection_id")
    var = context["var"]
    my_variable = var.value.get("my_variable_name")
Example of using airflow.sdk namespace directly:
from airflow.sdk import Connection, Variable

conn = Connection.get("my_connection_id")
var = Variable.get("my_variable_name")
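XCom values are usually exchanged through the task instance available in the Task Context. A minimal sketch with two hypothetical tasks (return values of TaskFlow tasks are also pushed to XCom automatically):

from airflow.sdk import get_current_context, task


@task
def producer():
    context = get_current_context()
    context["ti"].xcom_push(key="greeting", value="hello")


@task
def consumer():
    context = get_current_context()
    greeting = context["ti"].xcom_pull(task_ids="producer", key="greeting")
    print(greeting)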
You can read more about the public Airflow utilities in Managing Connections, Variables, and XComs.
References for the classes used for these utilities are here:
Note
Connection, Variable, and XCom classes are now part of the airflow.sdk namespace. For detailed API documentation, see the Task SDK Reference.
Public Exceptions¶
When writing custom Operators and Hooks, you can handle and raise the public exceptions that Airflow exposes:
Public Utility classes¶
Using Public Interface to extend Airflow capabilities¶
Airflow uses a Plugin mechanism to extend the platform’s capabilities. Plugins allow you to extend the Airflow UI, and they are also the way to expose the customizations below (Triggers, Timetables, Listeners, etc.). Providers can also implement plugin endpoints and customize the Airflow UI and these customizations.
You can read more about plugins in Plugins. You can read how to extend the Airflow UI in Customize view of Apache from Airflow web UI. Note that there are some simple customizations of the UI that do not require plugins - you can read more about them in Customizing the UI.
Here are the ways Plugins can be used to extend Airflow:
Triggers¶
Airflow uses Triggers to implement asyncio-compatible Deferrable Operators. All Triggers derive from BaseTrigger.
Airflow has a set of Triggers that are considered public. You are free to extend their functionalityby extending them:
You can read more about Triggers in Deferrable Operators & Triggers.
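As a rough illustration, a custom trigger might be sketched as follows (the module path in serialize() and the waiting logic are hypothetical; see the deferrable operators documentation for the full contract):

import asyncio

from airflow.triggers.base import BaseTrigger, TriggerEvent


class WaitSecondsTrigger(BaseTrigger):
    """Hypothetical trigger that fires after a fixed number of seconds."""

    def __init__(self, seconds: int):
        super().__init__()
        self.seconds = seconds

    def serialize(self):
        # Return the class path and the kwargs needed to recreate the trigger in the triggerer.
        return ("my_package.triggers.WaitSecondsTrigger", {"seconds": self.seconds})

    async def run(self):
        await asyncio.sleep(self.seconds)
        yield TriggerEvent({"status": "done"})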
Timetables¶
Custom timetable implementations provide Airflow’s scheduler additional logic to schedule DAG runs in ways not possible with built-in schedule expressions. All Timetables derive from Timetable.
Airflow has a set of Timetables that are considered public. You are free to extend their functionalityby extending them:
You can read more about Timetables in Customizing DAG Scheduling with Timetables.
Listeners¶
Listeners enable you to respond to DAG/Task lifecycle events.
This is implemented via the ListenerManager class, which provides hooks that can be implemented to respond to DAG/Task lifecycle events.
Added in version 2.5: The Listener public interface has been added in version 2.5.
You can read more about Listeners in Listeners.
Extra Links¶
Extra links are dynamic links that can be added to Airflow independently of custom Operators. Normally they are defined by Operators, but plugins allow you to override the links on a global level.
You can read more about the Extra Links in Define an operator extra link.
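A minimal sketch of an operator extra link (the link name and URL are hypothetical; a link class is attached to operators or registered globally through a plugin):

from airflow.sdk import BaseOperatorLink


class MyDocsLink(BaseOperatorLink):
    """Hypothetical extra link pointing at external documentation."""

    name = "My Docs"

    def get_link(self, operator, *, ti_key):
        # Return the URL to show in the UI for this operator's task instance.
        return "https://example.com/docs"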
Using Public Interface to integrate with external services and applications¶
Tasks in Airflow can orchestrate external services via Hooks and Operators. The core functionality of Airflow (such as authentication) can also be extended to leverage external services. You can read more about providers and the core extensions they can provide in providers.
Executors¶
Executors are the mechanism by which task instances get run. All executors are derived from BaseExecutor. There are several executor implementations built into Airflow, each with its own unique characteristics and capabilities.
The executor interface itself (the BaseExecutor class) is public, but the built-in executors are not (i.e. KubernetesExecutor, LocalExecutor, etc). This means that, to use KubernetesExecutor as an example, we may make changes to KubernetesExecutor in minor or patch Airflow releases which could break an executor that subclasses KubernetesExecutor. This is necessary to allow Airflow developers sufficient freedom to continue to improve the executors we offer. Accordingly, if you want to modify or extend a built-in executor, you should incorporate the full executor code into your project so that such changes will not break your derivative executor.
You can read more about executors and how to write your own in Executor.
Added in version 2.6: The executor interface has been present in Airflow for quite some time, but prior to 2.6 there was executor-specific code elsewhere in the codebase. As of version 2.6, executors are fully decoupled, in the sense that Airflow core no longer needs to know about the behavior of specific executors. You could have succeeded in implementing a custom executor before Airflow 2.6, and a number of people did, but there were some hard-coded behaviours that preferred built-in executors, and custom executors could not provide the full functionality that built-in executors had.
Secrets Backends¶
Airflow can be configured to rely on secrets backends to retrieve Connection and Variable. All secrets backends derive from BaseSecretsBackend.
All Secrets Backend implementations are public. You can extend their functionality:
You can read more about Secret Backends in Secrets Backend. You can also find all the available Secrets Backends implemented in community providers in Secret backends.
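As an illustration, a custom secrets backend could be sketched like this (the in-memory dictionaries stand in for a real secrets store, and the import path should be checked against the current reference):

from airflow.secrets import BaseSecretsBackend


class DictSecretsBackend(BaseSecretsBackend):
    """Hypothetical backend serving connections and variables from dictionaries."""

    _conns = {"my_connection_id": "scheme://login:password@host:1234/schema"}
    _vars = {"my_variable_name": "some-value"}

    def get_conn_value(self, conn_id: str):
        # Return the connection in its URI (or JSON) representation, or None if unknown.
        return self._conns.get(conn_id)

    def get_variable(self, key: str):
        return self._vars.get(key)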
Auth managers¶
Auth managers are responsible for user authentication and user authorization in Airflow. All auth managers are derived from BaseAuthManager.
The auth manager interface itself (the BaseAuthManager class) is public, but the different implementations of auth managers are not (i.e. FabAuthManager).
You can read more about auth managers and how to write your own in Auth manager.
Connections¶
When creating Hooks, you can add custom Connections. You can read more about connections in Connections, which also lists the available Connections implemented in the community providers.
Extra Links¶
When creating Hooks, you can add custom Extra Links that are displayed when the tasks are run. You can find out more about extra links in Extra Links, which also shows the available extra links implemented in the community providers.
Logging and Monitoring¶
You can extend the way Airflow writes logs. You can find out more about log writing in Logging & Monitoring.
The Writing logs page also shows the available log writers implemented in the community providers.
Decorators¶
DAG Authors can use decorators to author dags using the TaskFlow concept. All Decorators derive from TaskDecorator.
The primary decorators and functions for DAG Authors are now in the airflow.sdk namespace: dag(), task(), asset(), setup(), task_group(), teardown(), chain(), chain_linear(), cross_downstream(), get_current_context(), and get_parsing_context().
Airflow has a set of Decorators that are considered public. You are free to extend their functionalityby extending them:
Note
Decorators are now part of the airflow.sdk namespace. For detailed API documentation, see the Task SDK Reference.
You can read more about creating custom Decorators in Creating Custom @task Decorators.
Email notifications¶
Airflow has a built-in way of sending email notifications, and it can be extended by adding custom email notification classes. You can read more about email notifications in Email Configuration.
Notifications¶
Airflow has a built-in extensible way of sending notifications using the various on_*_callback hooks. You can read more about notifications in Creating a notifier.
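A minimal sketch of a custom notifier (the delivery mechanism is hypothetical; an instance is typically passed to callbacks such as on_failure_callback on a DAG or task):

from airflow.sdk import BaseNotifier


class MyNotifier(BaseNotifier):
    """Hypothetical notifier that simply prints a message."""

    def __init__(self, message: str):
        super().__init__()
        self.message = message

    def notify(self, context):
        # Deliver the notification; a real notifier would call an external service here.
        print(f"{self.message} (task: {context['ti'].task_id})")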
Cluster Policies¶
Cluster Policies are the way to dynamically apply cluster-wide policies to the dags being parsed or tasks being executed. You can read more about Cluster Policies in Cluster Policies.
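For example, a task policy is a function defined in your airflow_local_settings.py; a minimal sketch that enforces a default execution timeout might look like this (the specific rule is illustrative):

# airflow_local_settings.py
from datetime import timedelta


def task_policy(task):
    # Called for every task as dags are parsed; mutate or validate the task here.
    if task.execution_timeout is None:
        task.execution_timeout = timedelta(hours=2)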
Lineage¶
Airflow can help track the origins of data, what happens to it, and where it moves over time. You can read more about lineage in Lineage.
What is not part of the Public Interface of Apache Airflow?¶
Everything not mentioned in this document should be considered non-Public Interface.
In other applications such components can sometimes be relied on to keep backwards compatibility, but in Airflow they are not part of the Public Interface and might change at any time:
Database structure is considered to be an internal implementation detail and you should not assume the structure is going to be maintained in a backwards-compatible way.
Web UI is continuously evolving and there are no backwards compatibility guarantees on HTML elements.
Python classes, except those explicitly mentioned in this document, are considered an internal implementation detail and you should not assume they will be maintained in a backwards-compatible way.
Direct metadata database access from task code is no longer allowed. Task code cannot directly access the metadata database to query DAG state, task history, or DAG runs. Instead, use one of the following alternatives:
Task Context: Use get_current_context() to access task instance information and methods like get_dr_count(), get_dagrun_state(), and get_task_states().
REST API: Use the Stable REST API for programmatic access to Airflow metadata.
Python Client: Use the Python Client for Python-based interactions with Airflow.
This change improves architectural separation and enables remote execution capabilities.
Example of using Task Context instead of direct database access:
from airflow.sdk import dag, get_current_context, task
from airflow.utils.state import DagRunState
from datetime import datetime


@dag(
    dag_id="example_dag",
    start_date=datetime(2025, 1, 1),
    schedule="@hourly",
    tags=["misc"],
    catchup=False,
)
def example_dag():
    @task(task_id="check_dagrun_state")
    def check_state():
        context = get_current_context()
        ti = context["ti"]
        dag_run = context["dag_run"]

        # Use Task Context methods instead of direct DB access
        dr_count = ti.get_dr_count(dag_id="example_dag")
        dagrun_state = ti.get_dagrun_state(dag_id="example_dag", run_id=dag_run.run_id)

        return f"DAG run count: {dr_count}, current state: {dagrun_state}"

    check_state()


example_dag()