How to get the list of all tasks along with their status for the current DAG run in Airflow
Question:
I need help extracting the list of all tasks along with their current status [Success/Failed] for the current DAG run.
I have a task with a PythonOperator that executes at the end of the workflow. This task is responsible for returning the number of tasks executed, along with their statuses.
Answers:
You can use the Airflow REST API to extract information about your workflows; the REST API documentation lists the available endpoints. For instance, to list all DAG runs for a specific DAG:
http://<AIRFLOW_IP>:8080/api/v1/dags/{dag_id}/dagRuns
Then you can list all task instances for a specific DAG run:
http://<AIRFLOW_IP>:8080/api/v1/dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances
In the Airflow UI, under Docs >> REST API Reference (Swagger UI), you can browse and test the API through its Swagger documentation.
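As a rough sketch of how the taskInstances endpoint above might be called from Python: the helper names (`task_instances_url`, `fetch_task_instances`, `summarize`) are hypothetical, basic authentication is assumed to be enabled on the API, and the response is assumed to carry a `task_instances` list whose items have `task_id` and `state` fields. Check these against your Airflow version's REST API reference.

```python
import base64
import json
import urllib.request
from collections import Counter
from urllib.parse import quote


def task_instances_url(base_url, dag_id, dag_run_id):
    """Build the taskInstances URL for one DAG run.

    Run IDs often contain ':' and '+', so both path parts are URL-encoded.
    """
    return (
        f"{base_url}/api/v1/dags/{quote(dag_id, safe='')}"
        f"/dagRuns/{quote(dag_run_id, safe='')}/taskInstances"
    )


def fetch_task_instances(base_url, dag_id, dag_run_id, user, password):
    """GET the task instances of a DAG run (assumes basic auth is enabled)."""
    request = urllib.request.Request(task_instances_url(base_url, dag_id, dag_run_id))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize(payload):
    """Return ({task_id: state}, {state: count}) from the response body."""
    states = {ti["task_id"]: ti["state"] for ti in payload["task_instances"]}
    return states, dict(Counter(states.values()))
```

A call such as `summarize(fetch_task_instances("http://<AIRFLOW_IP>:8080", dag_id, dag_run_id, user, password))` would then give both the per-task status and the number of tasks in each state.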
You can create a PythonOperator that reads all task instances from the dag_run.
The callable returns a dict mapping each task_id to its state, which is stored as the XCom of task_id="tasks":
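import datetime

from airflow import DAG
from airflow.models import DagRun
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="get_tasks",
    description="get tasks",
    schedule_interval=None,
    start_date=datetime.datetime(2021, 1, 1),
    tags=["tasks"],
) as dag:
    start_dag = EmptyOperator(task_id="start")
    end_dag = EmptyOperator(task_id="end")

    def get_tasks(**context):
        # The current DagRun object is available in the task context.
        dagrun: DagRun = context["dag_run"]
        # Map every task_id in this run to its current state.
        tasks = {}
        for ti in dagrun.get_task_instances():
            tasks[ti.task_id] = ti.state
        # The returned dict is pushed to XCom by the PythonOperator.
        return tasks

    tasks = PythonOperator(
        task_id="tasks",
        python_callable=get_tasks,
    )

    start_dag >> tasks >> end_dag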
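Since the question also asks for the number of tasks per status, the dict built above could be extended with a count. The following is a minimal sketch in plain Python; `summarize_task_instances` and its `skip_task_id` parameter are hypothetical names, and the function only assumes objects exposing `task_id` and `state`, as the items returned by `dag_run.get_task_instances()` do.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_task_instances(task_instances, skip_task_id=None):
    """Return per-task states and the number of tasks in each state.

    `skip_task_id` lets the reporting task leave itself out of the summary,
    since it is still running while it collects the states.
    """
    states = {
        ti.task_id: ti.state if ti.state is not None else "no_status"
        for ti in task_instances
        if ti.task_id != skip_task_id
    }
    return {"states": states, "counts": dict(Counter(states.values()))}


# Stand-in objects for illustration; inside Airflow the callable could be:
#   def get_tasks(**context):
#       return summarize_task_instances(
#           context["dag_run"].get_task_instances(),
#           skip_task_id=context["ti"].task_id,
#       )
sample = [
    SimpleNamespace(task_id="start", state="success"),
    SimpleNamespace(task_id="extract", state="failed"),
    SimpleNamespace(task_id="tasks", state="running"),
]
report = summarize_task_instances(sample, skip_task_id="tasks")
```

If the reporting task should also run when an upstream task has failed, it would likely need `trigger_rule="all_done"` on the PythonOperator, because the default trigger rule only runs a task after all of its upstream tasks succeed.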
You can also query the Airflow metadata database directly: task instances are stored in the task_instance table and DAG runs in the dag_run table.
See the Airflow ERD for details.
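The query shape might look like the sketch below. It runs against a throwaway in-memory SQLite table so that it is self-contained; the column names (`dag_id`, `run_id`, `task_id`, `state`) are an assumption based on the Airflow 2.2+ schema and should be checked against the ERD for your version. The helper names are hypothetical, and the `?` placeholder style is SQLite's (a PostgreSQL driver would typically use `%s`).

```python
import sqlite3

# Assumed columns of task_instance; verify against the ERD before use.
TASK_STATES_SQL = """
SELECT task_id, state
FROM task_instance
WHERE dag_id = ? AND run_id = ?
ORDER BY task_id
"""


def task_states(conn, dag_id, run_id):
    """Return (task_id, state) rows for one DAG run."""
    return conn.execute(TASK_STATES_SQL, (dag_id, run_id)).fetchall()


def demo_connection():
    """Create an in-memory table mimicking only the columns used above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance "
        "(dag_id TEXT, run_id TEXT, task_id TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
        [
            ("get_tasks", "manual__1", "start", "success"),
            ("get_tasks", "manual__1", "tasks", "running"),
            ("get_tasks", "manual__1", "end", None),
            ("get_tasks", "manual__2", "start", "failed"),
        ],
    )
    return conn


rows = task_states(demo_connection(), "get_tasks", "manual__1")
```

Querying the metadata database ties the code to Airflow's internal schema, which can change between versions, so the REST API or the `dag_run` context object is usually the safer choice.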