Troubleshooting KubernetesExecutor tasks
This page describes how to troubleshoot issues with tasks run by KubernetesExecutor and provides solutions for common issues.
General approach to troubleshooting KubernetesExecutor
To troubleshoot issues with a task executed by KubernetesExecutor, complete the following steps in the listed order:
1. Check the task's logs in the DAG UI or Airflow UI.
2. Check the scheduler logs in the Google Cloud console:
   - In the Google Cloud console, go to the Environments page.
   - In the list of environments, click the name of your environment. The Environment details page opens.
   - Go to the Logs tab and check the Airflow logs > Scheduler section.
3. For a given time range, inspect the KubernetesExecutor worker pod that was running the task. If the pod no longer exists, skip this step. The pod has the airflow-k8s-worker prefix and a DAG or task name in its name. Look for any reported issues, such as a failed task or the task being unschedulable. If you prefer to query these pod logs programmatically, see the sketch after this list.
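If you want to query worker pod logs programmatically instead of using the Logs tab, a query along the following lines can work. This is a minimal sketch that assumes the pod logs are available in Cloud Logging under the k8s_container resource type; the project ID, resource type, pod-name filter, and time range are placeholders to adjust for your environment.

```python
# Minimal sketch: query Cloud Logging for entries from a KubernetesExecutor
# worker pod. The project ID, resource type, pod-name filter, and time range
# are assumptions; adjust them to match your environment.
from datetime import datetime, timedelta, timezone

from google.cloud import logging

client = logging.Client(project="your-project-id")  # hypothetical project ID

start = (datetime.now(timezone.utc) - timedelta(hours=2)).isoformat()
log_filter = (
    'resource.type="k8s_container" '
    'AND resource.labels.pod_name:"airflow-k8s-worker" '
    f'AND timestamp>="{start}"'
)

for entry in client.list_entries(
    filter_=log_filter, order_by=logging.DESCENDING, max_results=50
):
    print(entry.timestamp, entry.payload)
```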
Common troubleshooting scenarios for KubernetesExecutor
This section lists common troubleshooting scenarios that you might encounter with KubernetesExecutor.
The task gets to the Running state, then fails during execution.
Symptoms:
- There are logs for the task in the Airflow UI and on the Logs tab in the Workers section.
Solution: The task logs indicate the problem.
The task instance gets to the Queued state, then is marked as UP_FOR_RETRY or FAILED after some time.
Symptoms:
- There are no logs for the task in the Airflow UI or on the Logs tab in the Workers section.
- There are logs on the Logs tab in the Scheduler section with a message that the task is marked as UP_FOR_RETRY or FAILED.
Solution:
- Inspect the scheduler logs for details about the issue.
Possible causes:
- If the scheduler logs contain the Adopted tasks were still pending after... message followed by the printed task instance, check that CeleryKubernetesExecutor is enabled in your environment (see the sketch below).
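After confirming the executor, you can also verify that the affected task is actually routed to KubernetesExecutor. With CeleryKubernetesExecutor, only tasks whose queue matches the [celery_kubernetes_executor] kubernetes_queue setting (kubernetes by default) are handed to KubernetesExecutor. The following minimal sketch uses a hypothetical DAG and task ID and assumes the default queue name:

```python
# Minimal sketch: route one task to KubernetesExecutor when the environment
# runs CeleryKubernetesExecutor. The DAG ID, task ID, and dates are
# hypothetical; "kubernetes" assumes the default
# [celery_kubernetes_executor] kubernetes_queue value.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="kubernetes_executor_routing_example",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    BashOperator(
        task_id="run_on_kubernetes",
        bash_command="echo 'running in a KubernetesExecutor worker pod'",
        # Tasks with this queue go to KubernetesExecutor; all other tasks
        # stay on Celery workers.
        queue="kubernetes",
    )
```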
The task instance gets to the Queued state and is immediately marked as UP_FOR_RETRY or FAILED
Symptoms:
- There are no logs for the task in the Airflow UI or on the Logs tab in the Workers section.
- The scheduler logs on the Logs tab in the Scheduler section have the Pod creation failed with reason ... Failing task message, and a message that the task is marked as UP_FOR_RETRY or FAILED.
Solution:
- Check the scheduler logs for the exact response and failure reason.
Possible reason:
- If the error message is quantities must match the regular expression ..., then the issue is most likely caused by custom values set for the Kubernetes resources (requests and limits) of task worker pods.
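Kubernetes accepts resource quantities only in its quantity format, for example 500m or 0.5 for CPU and 512Mi or 1Gi for memory. The following minimal sketch shows one way to pass correctly formatted requests and limits to a KubernetesExecutor task through executor_config; the DAG ID, task ID, and resource values are hypothetical examples to adapt to your workload.

```python
# Minimal sketch: pass correctly formatted Kubernetes resource quantities to a
# KubernetesExecutor task through executor_config. DAG ID, task ID, and the
# resource values are examples only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from kubernetes.client import models as k8s

with DAG(
    dag_id="kubernetes_executor_resources_example",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    PythonOperator(
        task_id="task_with_custom_resources",
        python_callable=lambda: print("done"),
        queue="kubernetes",
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        k8s.V1Container(
                            name="base",
                            resources=k8s.V1ResourceRequirements(
                                # Quantities must follow the Kubernetes format,
                                # for example "500m" or "0.5" for CPU and
                                # "512Mi" or "1Gi" for memory.
                                requests={"cpu": "500m", "memory": "512Mi"},
                                limits={"cpu": "1", "memory": "1Gi"},
                            ),
                        )
                    ]
                )
            )
        },
    )
```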
KubernetesExecutor tasks fail without logs when a large number of tasks is executed
When your environment executes a large number of tasks with KubernetesExecutor or KubernetesPodOperator at the same time, Cloud Composer 3 doesn't accept new tasks until some of the existing tasks are finished. Extra tasks are marked as failed, and Airflow retries them later if you define retries for the tasks (Airflow does this by default).
Symptom: Tasks executed with KubernetesExecutor or KubernetesPodOperator fail without task logs in the Airflow UI or DAG UI. In the scheduler's logs, you can see error messages similar to the following:
pods \"airflow-k8s-worker-*\" is forbidden: exceeded quota: k8s-resources-quota,requested: pods=1, used: pods=*, limited: pods=*","reason":"Forbidden"Possible solutions:
- Adjust the DAG run schedule so that tasks are distributed more evenly over time.
- Reduce the number of tasks by consolidating small tasks.
Workaround:
If you prefer tasks to stay in the scheduled state until your environment can execute them, you can define an Airflow pool with a limited number of slots in the Airflow UI and then associate all container-based tasks with this pool. We recommend setting the number of slots in the pool to 50 or less. Extra tasks stay in the scheduled state until the Airflow pool has a free slot to execute them. If you use this workaround without applying the possible solutions, you can still experience a large queue of tasks in the Airflow pool.
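As a sketch of this workaround, the following DAG assigns its container-based tasks to a pool. It assumes that a pool named kubernetes_tasks with 50 or fewer slots was already created in the Airflow UI; the pool name, DAG ID, and task IDs are hypothetical.

```python
# Minimal sketch: associate container-based tasks with a limited Airflow pool
# so that extra tasks wait in the scheduled state instead of failing. Assumes
# a pool named "kubernetes_tasks" (50 slots or fewer) already exists; the pool
# name, DAG ID, and task IDs are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="pooled_kubernetes_tasks_example",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    for i in range(100):
        BashOperator(
            task_id=f"container_task_{i}",
            bash_command="echo 'container-based work'",
            queue="kubernetes",
            # Only as many of these tasks run at once as the pool has slots;
            # the rest stay in the scheduled state until a slot frees up.
            pool="kubernetes_tasks",
        )
```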
What's next