How to install packages (pandas) in Airflow?

Question:

Airflow is installed on Linux (Debian), simply by following the official tutorial in the most basic way: no Docker, etc.
(official tutorial: https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html)

I created a DAG with a PythonOperator that uses the pandas package, but I am getting an error:

Broken DAG: [/home/airflow/airflow/dags/air_etl.py] Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/airflow/airflow/dags/air_etl.py", line 12, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'

I installed pandas with pip and it shows up in pip list.

I found many similar questions on the forum (How to install packages in Airflow?, How to install packages in Airflow (docker-compose)?), but they deal with this problem in Docker. The usual recommendation there is to rebuild the Docker image with the required libraries added. Without Docker, is it possible to add libraries somehow without reinstalling?

Or maybe I am misunderstanding something fundamental.

Answers:

Honestly, I found the problem. In general, the job that uses the pandas module WORKS correctly, but sometimes the Airflow web interface shows this error:

[Screenshot of the error in the Airflow web interface]

As I understand it, Airflow does not pick up newly installed libraries until its database is reloaded, so to make them available you need to run "airflow db reset" (which drops and recreates the metadata database). "airflow db init" does not help here.
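
Before resetting anything, it may be worth confirming that the Python interpreter that runs Airflow can actually see pandas, since "pip list" can report on a different environment than the one Airflow uses. The following is only a minimal diagnostic sketch, not part of the original answer; the helper name describe_module is hypothetical, and it assumes you run the script with the same Python that starts the Airflow scheduler and webserver.

```python
import importlib.util
import sys


def describe_module(name: str) -> dict:
    """Report whether a top-level module is importable by this interpreter.

    `describe_module` is a hypothetical helper for illustration only.
    """
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,  # path of the Python running this script
        "module": name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # Print where pandas would be loaded from, if it is visible at all.
    for key, value in describe_module("pandas").items():
        print(f"{key}: {value}")
```

If "found" is False here while "pip list" shows pandas, the package was probably installed for a different interpreter or virtual environment than the one Airflow runs in.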

Answered By: Vladislav Zhilmanov