airflow-2.x

Restrict/Exclude specific logs from Airflow to Datadog

Restrict/Exclude specific logs from Airflow to Datadog Question: We are observing that Airflow is sending a large amount of logs to Datadog, and we want to restrict/reduce these logs by excluding logs from the tasks below: pod_manager.py base.py base_aws.py logging_mixin.py Do we have any configuration setting where I can define this requirement? We have Airflow 2.0 running on Kubernetes. …
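
One possible approach (a sketch, not necessarily the thread's answer): raise the log level of the chatty loggers through a custom logging config referenced by the [logging] logging_config_class option. The fully qualified logger names below are assumptions; they depend on which provider packages actually emit the messages.

```python
# log_config.py -- a hypothetical module on PYTHONPATH, referenced in
# airflow.cfg via: [logging] logging_config_class = log_config.LOGGING_CONFIG
from copy import deepcopy

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

# Raise the level of the noisy loggers so only WARNING and above reach the
# handlers (and therefore Datadog). Logger names are assumptions and may
# differ depending on installed provider versions.
for noisy in (
    "airflow.providers.cncf.kubernetes.utils.pod_manager",
    "airflow.hooks.base",
    "airflow.providers.amazon.aws.hooks.base_aws",
):
    LOGGING_CONFIG["loggers"][noisy] = {
        "handlers": ["console"],
        "level": "WARNING",
        "propagate": False,
    }
```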

Total answers: 1

How to use a python list as a global variable within @task.external_python?

How to use a python list as a global variable within @task.external_python? Question: GOAL: Have a python list as a global variable between tasks. Currently it crashes at the first task. 1.) I am trying to have a simple python list that is carried from one task to the next and append a few string …
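
A minimal sketch of the usual pattern (not taken from the thread): each task runs in its own process, so a module-level list is never shared; returning the list instead passes it between tasks via XCom. The interpreter path is a placeholder.

```python
import pendulum

from airflow.decorators import dag, task


@dag(schedule=None, start_date=pendulum.datetime(2023, 1, 1), catchup=False)
def list_between_tasks():
    # /opt/venv/bin/python is a hypothetical venv interpreter path.
    @task.external_python(python="/opt/venv/bin/python")
    def make_list():
        # Returned values are serialized to XCom, not kept in process memory.
        return ["a", "b"]

    @task.external_python(python="/opt/venv/bin/python")
    def extend_list(items: list):
        items.append("c")
        return items

    # Passing the return value wires the XCom dependency between the tasks.
    extend_list(make_list())


list_between_tasks()
```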

Total answers: 1

How to remove a downstream or upstream task dependency in Airflow

How to remove a downstream or upstream task dependency in Airflow Question: Assuming we have the two following Airflow tasks in a DAG, from airflow.operators.dummy import DummyOperator t1 = DummyOperator(task_id='dummy_1') t2 = DummyOperator(task_id='dummy_2') we can specify dependencies as: # Option A t1 >> t2 # Option B t2.set_upstream(t1) # Option C t1.set_downstream(t2) My question is …
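
For context, Airflow exposes no public inverse of set_downstream/set_upstream. One workaround sketch, relying on internal attributes that may change between versions, is to discard the task ids from both sides of the dependency sets:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG("remove_dep_demo", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
    t1 = DummyOperator(task_id="dummy_1")
    t2 = DummyOperator(task_id="dummy_2")
    t1 >> t2

    # Internal, version-dependent: remove the edge from both task-id sets.
    t1.downstream_task_ids.discard("dummy_2")
    t2.upstream_task_ids.discard("dummy_1")
```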

Total answers: 1

Multiple inheritance using `BaseBranchOperator` in Airflow

Multiple inheritance using `BaseBranchOperator` in Airflow Question: Can one use multiple inheritance with BaseBranchOperator in Airflow? I want to define an operator like: from airflow.models import BaseOperator from airflow.operators.branch import BaseBranchOperator class MyOperator(BaseOperator, BaseBranchOperator): def execute(self, context): print('hi') def choose_branch(self, context): if True: return 'task_A' else: return 'task_B' In that case, is it accurate to …
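
Worth noting: BaseBranchOperator already subclasses BaseOperator and implements execute() to call choose_branch() and skip the paths not chosen, so overriding execute() as above would disable the branching. A minimal sketch that keeps the built-in behavior:

```python
from airflow.operators.branch import BaseBranchOperator


class MyOperator(BaseBranchOperator):
    # BaseBranchOperator.execute() handles skipping the unchosen branches,
    # so only choose_branch() needs to be implemented.
    def choose_branch(self, context):
        print("hi")  # side effects formerly placed in execute() can live here
        if True:
            return "task_A"
        return "task_B"
```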

Total answers: 2

How to get Airflow Docker ExternalPythonOperator working within a python venv?

How to get Airflow Docker ExternalPythonOperator working within a python venv? Question: Situation: Since the release of Apache Airflow 2.4.0 on 19 September 2022, Airflow supports the ExternalPythonOperator. I have asked the main contributors as well, and I should be able to add 2 Python virtual environments to the base image of Airflow Docker 2.4.1 and …
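
A sketch of the DAG side, assuming the Docker image was extended with a virtual environment baked in at /opt/airflow/venv1 (a hypothetical path, e.g. created with python -m venv in a Dockerfile layer); the operator only needs that venv's interpreter path:

```python
from airflow.decorators import task


# /opt/airflow/venv1 is an assumed path created while building the image.
@task.external_python(python="/opt/airflow/venv1/bin/python")
def callable_in_venv():
    import sys

    # Runs under the venv interpreter, with whatever packages it has installed.
    print(sys.executable)
```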

Total answers: 1

What parameters can be passed to Airflow @task decorator?

What parameters can be passed to Airflow @task decorator? Question: I am using Airflow 2.4.0. from airflow.decorators import task from airflow import DAG with DAG( "hello_world", start_date=datetime(2022, 1, 1), schedule_interval="@daily", catchup=False, ) as dag: @task() def get_name(): return { 'first_name': 'Hongbo', 'last_name': 'Miao', } get_name() Currently I am using @task(). Based on the document https://airflow.apache.org/docs/apache-airflow/stable/tutorial/taskflow.html, …
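
A sketch of the answer's shape: @task() forwards its keyword arguments to the underlying PythonOperator, so the usual BaseOperator arguments (task_id, retries, retry_delay, pool, ...) apply, plus TaskFlow-specific ones such as multiple_outputs:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.decorators import task

with DAG(
    "hello_world",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    @task(
        task_id="get_name",                # defaults to the function name
        retries=2,                         # any BaseOperator argument works
        retry_delay=timedelta(minutes=5),
        multiple_outputs=True,             # TaskFlow-specific: one XCom per dict key
    )
    def get_name():
        return {"first_name": "Hongbo", "last_name": "Miao"}

    get_name()
```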

Total answers: 1

Execute cleanup task on upstream task failure

Execute cleanup task on upstream task failure Question: I’m using the taskflow API on Airflow v2.3.0. I have a DAG that has a bunch of tasks. The first task (TaskA) turns on an EC2 instance and one of the last tasks (TaskX) turns off the EC2 instance. TaskA returns the instance ID of the EC2 …
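
A common pattern for this (a sketch, not necessarily the accepted answer): relax the cleanup task's trigger rule so it runs once every upstream task has finished, whether or not any failed. Task and argument names below are illustrative.

```python
from airflow.decorators import task
from airflow.utils.trigger_rule import TriggerRule


# ALL_DONE fires once all upstreams have finished, regardless of their state,
# so the instance is stopped even when an intermediate task failed.
@task(trigger_rule=TriggerRule.ALL_DONE)
def task_x(instance_id: str):
    ...  # stop the EC2 instance identified by instance_id (illustrative body)
```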

Total answers: 2

All DAGs broken after MWAA update from 2.0.2 to 2.2.2

All DAGs broken after MWAA update from 2.0.2 to 2.2.2 Question: I am getting the following errors in the AWS MWAA UI after I updated from 2.0.2 to 2.2.2. I have exhaustively searched for more details on these errors, to no avail: from airflow.providers.slack.operators.slack_webhook import SlackWebhookOperator ModuleNotFoundError: No module named 'airflow.providers.slack' Broken plugin: [/usr/local/airflow/plugins/__MACOSX/awsairflowlib/.___init__.py] source code …

Total answers: 2

How to execute multiple sql files in airflow using PostgresOperator?

How to execute multiple sql files in airflow using PostgresOperator? Question: I have multiple sql files in my sql folder. I am not sure how to execute all the sql files within a DAG. – dags – sql – dummy1.sql – dummy2.sql For a single file, the code below works: sql_insert = PostgresOperator(task_id='sql_insert', postgres_conn_id='postgres_conn', sql='sql/dummy1.sql') Asked By: …
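
A minimal sketch: the sql argument of PostgresOperator also accepts a list of statements or templated file paths, which are executed in order within one task. The file names and connection id here are the ones from the question.

```python
from airflow.providers.postgres.operators.postgres import PostgresOperator

# Passing a list runs each templated .sql file in order within a single task.
sql_insert = PostgresOperator(
    task_id="sql_insert",
    postgres_conn_id="postgres_conn",
    sql=["sql/dummy1.sql", "sql/dummy2.sql"],
)
```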

Total answers: 2