Docker Apache Airflow


puckel/docker-airflow


This repository contains the Dockerfile of apache-airflow for Docker's automated build, published to the public Docker Hub Registry.

Information

Installation

Pull the image from the Docker repository.

docker pull puckel/docker-airflow

Build

Optionally install Extra Airflow Packages and/or Python dependencies at build time:

docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" -t puckel/docker-airflow .
docker build --rm --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .

or combined:

docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .

Don't forget to update the airflow images in the docker-compose files to puckel/docker-airflow:latest.

Usage

By default, docker-airflow runs Airflow with SequentialExecutor:

docker run -d -p 8080:8080 puckel/docker-airflow webserver

If you want to run another executor, use the other docker-compose.yml files provided in this repository.

For LocalExecutor:

docker-compose -f docker-compose-LocalExecutor.yml up -d

For CeleryExecutor:

docker-compose -f docker-compose-CeleryExecutor.yml up -d

NB: Example DAGs are not loaded by default (LOAD_EX=n). To load them, set the environment variable LOAD_EX=y:

docker run -d -p 8080:8080 -e LOAD_EX=y puckel/docker-airflow

If you want to use Ad hoc query, make sure you've configured connections: go to Admin -> Connections, edit "postgres_default" and set these values (equivalent to the values in airflow.cfg/docker-compose*.yml):

  • Host : postgres
  • Schema : airflow
  • Login : airflow
  • Password : airflow

For encrypted connection passwords (in Local or Celery Executor), all containers must share the same fernet_key. By default docker-airflow generates the fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to use the same key across containers. To generate a fernet_key:

docker run puckel/docker-airflow python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
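For reference, a Fernet key is 32 random bytes encoded as URL-safe base64. The sketch below produces a key of that shape using only the Python standard library, which may be handy when the cryptography package is not installed locally. The function name generate_fernet_key is illustrative and not part of docker-airflow.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return 32 random bytes encoded as URL-safe base64 (the Fernet key format)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


if __name__ == "__main__":
    # Use the printed value as the shared fernet key in every container's environment
    print(generate_fernet_key())
```

Generate the key once and reuse it; a key regenerated at each startup cannot decrypt passwords stored with a previous one.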

Configuring Airflow

It's possible to set any configuration value for Airflow from environment variables, which take precedence over values in airflow.cfg.

The general rule is that the environment variable should be named AIRFLOW__<section>__<key>. For example, AIRFLOW__CORE__SQL_ALCHEMY_CONN sets the sql_alchemy_conn config option in the [core] section.

Check out the Airflow documentation for more details.
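As a quick illustration of the naming rule, the following sketch builds the variable name from a section and a key. The helper airflow_env_name is hypothetical and only mirrors the rule stated above.

```python
def airflow_env_name(section: str, key: str) -> str:
    """Build the environment variable name that overrides `key` in `[section]`."""
    # Note the double underscores around the section name
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


if __name__ == "__main__":
    print(airflow_env_name("core", "sql_alchemy_conn"))
    # prints: AIRFLOW__CORE__SQL_ALCHEMY_CONN
```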

You can also define connections via environment variables by prefixing them with AIRFLOW_CONN_, for example AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:password@localhost:5432/master for a connection called "postgres_master". The value is parsed as a URI. This works for hooks etc., but the connection won't show up in the "Ad-hoc Query" section unless an (empty) connection is also created in the DB.
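Because the value is parsed as a URI, credentials containing characters such as @ or / need percent-encoding. Below is a minimal sketch of assembling such a variable; the helper conn_env and its parameters are illustrative assumptions, not part of Airflow or docker-airflow.

```python
from typing import Tuple
from urllib.parse import quote


def conn_env(conn_id: str, scheme: str, user: str, password: str,
             host: str, port: int, schema: str) -> Tuple[str, str]:
    """Return (variable name, URI value) for an AIRFLOW_CONN_* variable."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    # Percent-encode each component so reserved characters survive URI parsing
    uri = (f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
           f"@{host}:{port}/{quote(schema, safe='')}")
    return name, uri


if __name__ == "__main__":
    name, uri = conn_env("postgres_master", "postgres", "user", "p@ss/word",
                         "localhost", 5432, "master")
    print(f"{name}={uri}")
    # prints: AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:p%40ss%2Fword@localhost:5432/master
```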

Custom Airflow plugins

Airflow allows for custom user-created plugins, which are typically found in the ${AIRFLOW_HOME}/plugins folder. Documentation on plugins can be found in the Airflow documentation.

In order to incorporate plugins into your docker container:

  • Create the plugins folder plugins/ with your custom plugins.
  • Mount the folder as a volume by doing either of the following:
    • Include the folder as a volume on the command line: -v $(pwd)/plugins/:/usr/local/airflow/plugins
    • Use docker-compose-LocalExecutor.yml or docker-compose-CeleryExecutor.yml, which contain support for adding the plugins folder as a volume

Install custom python package

  • Create a file "requirements.txt" with the desired Python modules
  • Mount this file as a volume: -v $(pwd)/requirements.txt:/requirements.txt (or add it as a volume in the docker-compose file)
  • The entrypoint.sh script executes the pip install command (with the --user option)

UI Links

Scale the number of workers

Easy scaling using docker-compose:

docker-compose -f docker-compose-CeleryExecutor.yml scale worker=5

This can be used to scale to a multi node setup using docker swarm.

Running other airflow commands

If you want to run other airflow sub-commands, such as list_dags or clear, you can do so like this:

docker run --rm -ti puckel/docker-airflow airflow list_dags

or with your docker-compose set up like this:

docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow list_dags

You can also use this to run a bash shell or any other command in the same environment that airflow would be run in:

docker run --rm -ti puckel/docker-airflow bash
docker run --rm -ti puckel/docker-airflow ipython

Simplified SQL database configuration using PostgreSQL

If the executor type is set to anything other than SequentialExecutor, you'll need an SQL database. Here is a list of PostgreSQL configuration variables and their default values. They're used to compute the AIRFLOW__CORE__SQL_ALCHEMY_CONN and AIRFLOW__CELERY__RESULT_BACKEND variables for you when needed, if you don't provide them explicitly:

| Variable | Default value | Role |
|---|---|---|
| POSTGRES_HOST | postgres | Database server host |
| POSTGRES_PORT | 5432 | Database server port |
| POSTGRES_USER | airflow | Database user |
| POSTGRES_PASSWORD | airflow | Database password |
| POSTGRES_DB | airflow | Database name |
| POSTGRES_EXTRAS | empty | Extras parameters |

You can also use those variables to adapt your compose file to match an existing PostgreSQL instance managed elsewhere.
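To make the computation concrete, here is a rough Python sketch of how the two settings could be derived from the POSTGRES_* variables and their defaults. The function name and the URL schemes (postgresql+psycopg2:// and db+postgresql://) are assumptions for illustration; the actual logic lives in entrypoint.sh and should be checked there.

```python
import os


def postgres_settings(env=os.environ) -> dict:
    """Derive both Airflow settings from POSTGRES_* variables and their defaults."""
    host = env.get("POSTGRES_HOST", "postgres")
    port = env.get("POSTGRES_PORT", "5432")
    user = env.get("POSTGRES_USER", "airflow")
    password = env.get("POSTGRES_PASSWORD", "airflow")
    db = env.get("POSTGRES_DB", "airflow")
    extras = env.get("POSTGRES_EXTRAS", "")
    location = f"{user}:{password}@{host}:{port}/{db}{extras}"
    # Assumption: these URL schemes are illustrative, not confirmed by this README
    computed = {
        "AIRFLOW__CORE__SQL_ALCHEMY_CONN": f"postgresql+psycopg2://{location}",
        "AIRFLOW__CELERY__RESULT_BACKEND": f"db+postgresql://{location}",
    }
    # Values provided explicitly take precedence over the computed ones
    return {name: env.get(name, value) for name, value in computed.items()}


if __name__ == "__main__":
    for name, value in postgres_settings().items():
        print(f"{name}={value}")
```

Passing a plain dict instead of os.environ makes it easy to preview the result for a given compose file without touching the real environment.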

Please refer to the Airflow documentation to understand the use of extras parameters, for example in order to configure a connection that uses TLS encryption.

Here's an important thing to consider:

When specifying the connection as a URI (in an AIRFLOW_CONN_* variable), you should specify it following the standard syntax of DB connections, where extras are passed as parameters of the URI (note that all components of the URI should be URL-encoded).

Therefore you must provide extras parameters URL-encoded, starting with a leading ?. For example:

POSTGRES_EXTRAS="?sslmode=verify-full&sslrootcert=%2Fetc%2Fssl%2Fcerts%2Fca-certificates.crt"
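A value like the one above can be produced with Python's standard urllib.parse.urlencode rather than encoded by hand. The helper name postgres_extras is hypothetical; this is only a sketch of the encoding step.

```python
from urllib.parse import urlencode


def postgres_extras(params: dict) -> str:
    """Encode extras as a URL query string, including the leading '?'."""
    if not params:
        return ""
    # urlencode percent-encodes reserved characters such as '/' in the values
    return "?" + urlencode(params)


if __name__ == "__main__":
    print(postgres_extras({
        "sslmode": "verify-full",
        "sslrootcert": "/etc/ssl/certs/ca-certificates.crt",
    }))
    # prints: ?sslmode=verify-full&sslrootcert=%2Fetc%2Fssl%2Fcerts%2Fca-certificates.crt
```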

Simplified Celery broker configuration using Redis

If the executor type is set to CeleryExecutor, you'll need a Celery broker. Here is a list of Redis configuration variables and their default values. They're used to compute the AIRFLOW__CELERY__BROKER_URL variable for you if you don't provide it explicitly:

| Variable | Default value | Role |
|---|---|---|
| REDIS_PROTO | redis:// | Protocol |
| REDIS_HOST | redis | Redis server host |
| REDIS_PORT | 6379 | Redis server port |
| REDIS_PASSWORD | empty | If Redis is password protected |
| REDIS_DBNUM | 1 | Database number |

You can also use those variables to adapt your compose file to match an existing Redis instance managed elsewhere.
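The broker URL computation can be pictured with a small Python sketch using the defaults from the table. The function name is hypothetical, and placing the password in the URL as :password@ (empty user name) is an assumption based on the usual Redis URL form; entrypoint.sh remains the reference.

```python
import os


def redis_broker_url(env=os.environ) -> str:
    """Compute the Celery broker URL from REDIS_* variables unless set explicitly."""
    explicit = env.get("AIRFLOW__CELERY__BROKER_URL")
    if explicit:
        return explicit
    proto = env.get("REDIS_PROTO", "redis://")
    host = env.get("REDIS_HOST", "redis")
    port = env.get("REDIS_PORT", "6379")
    password = env.get("REDIS_PASSWORD", "")
    dbnum = env.get("REDIS_DBNUM", "1")
    # Assumption: the password goes in the userinfo part with an empty user name
    auth = f":{password}@" if password else ""
    return f"{proto}{auth}{host}:{port}/{dbnum}"


if __name__ == "__main__":
    print(redis_broker_url({}))
    # prints: redis://redis:6379/1
```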

Wanna help?

Fork, improve and PR.

