How >> operator defines task dependencies in Airflow?

Question:

I was going through Apache Airflow tutorial https://github.com/hgrif/airflow-tutorial and encountered this section for defining task dependencies.

with DAG('airflow_tutorial_v01',
     default_args=default_args,
     schedule_interval='0 * * * *',
     ) as dag:

print_hello = BashOperator(task_id='print_hello',
                           bash_command='echo "hello"')
sleep = BashOperator(task_id='sleep',
                     bash_command='sleep 5')
print_world = PythonOperator(task_id='print_world',
                             python_callable=print_world)


print_hello >> sleep >> print_world

The line that confuses me is

print_hello >> sleep >> print_world

What does >> mean in Python? I know bitwise operator, but can’t relate to the code here.

Asked By: idazuwaika

||

Answers:

Airflow represents workflows as directed acyclic graphs. A workflow is any number of tasks that have to be executed, either in parallel or sequentially. The “>>” is Airflow syntax for setting a task downstream of another.

Diving into the incubator-airflow project repo, models.py in the airflow directory defines the behavior of much of the high level abstractions of Airflow. You can dig into the other classes if you’d like there, but the one that answers your question is the BaseOperator class. All operators in Airflow inherit from the BaseOperator. The __rshift__ method of the BaseOperator class implements the Python right shift logical operator in the context of setting a task or a DAG downstream of another.

See implementation here.

Answered By: rob