Broken DAG issue (Airflow 2.5.0)

Question:

Broken DAG: [/opt/airflow/dags/dag.py] Traceback (most recent call last):
  File "/opt/airflow/dags/dag.py", line 7, in <module>
    from training import training
  File "/opt/airflow/dags/training.py", line 6, in <module>
    from joblib import dump
ModuleNotFoundError: No module named 'joblib'

I already have the ‘joblib’ module installed, so why is it showing this module-not-found error?

Asked By: Rohan Anand


Answers:

Since you have not shared the contents of your docker-compose.yml file, I am assuming Airflow v2.5.0, as mentioned in the question.

From Airflow v2.1.1 onward, you can specify additional Python libraries when configuring the docker-compose.yml file:

_PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:- joblib==1.2.0 lxml==4.6.3 charset-normalizer==1.4.1}

You can find more information about the snippet above here:

docker-compose.yml

Environment variables supported by Docker Compose (check the bottom section).
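Because the compose line uses the ${VAR:-default} form, the value can also be supplied from outside the file. Here is a minimal sketch, assuming you run Compose from the directory that contains docker-compose.yaml and keep the variable in a .env file there, which Compose reads for variable substitution:

```shell
# Append the variable to the .env file next to docker-compose.yaml.
# Assumption: only joblib is needed; the version pin is copied from the snippet above.
printf '%s\n' '_PIP_ADDITIONAL_REQUIREMENTS=joblib==1.2.0' >> .env
```

A value set this way takes precedence over the default written after ":-" in docker-compose.yml.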

Don’t forget to restart the containers after this change:

docker-compose restart

Note that docker-compose restart does not apply changes made to docker-compose.yml itself, so if the change is not picked up, recreate the containers with:

docker-compose up

This will install all the required Python libraries along with the Airflow services.
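A package installed on the host machine is not visible inside the Airflow containers, which is the likely reason for the error in the question. To check what the container actually sees, something like the following can help. It is only a sketch: check_module is a hypothetical helper, and the service name airflow-worker is an assumption based on the official docker-compose.yaml.

```shell
# Hypothetical helper: try to import a module inside a running Compose service
# and print its __version__ attribute (joblib provides one).
# Assumption: the service is named airflow-worker, as in the official
# docker-compose.yaml; change it if your file uses different service names.
check_module() {
  service="$1"
  module="$2"
  # -T disables pseudo-TTY allocation so the command also works in scripts.
  docker-compose exec -T "$service" python -c "import $module; print($module.__version__)"
}

# Usage (requires the containers to be running):
#   check_module airflow-worker joblib
```

If the import fails here, the library is missing from the image or container, regardless of what is installed on the host.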

Answered By: Nikhil Sawant

You should build your own image, extending the official one with the packages you need: https://airflow.apache.org/docs/docker-stack/build.html#adding-new-pypi-packages-individually

Using _PIP_ADDITIONAL_REQUIREMENTS is highly discouraged for anything but quick iteration while debugging your installation (see the note in the documentation for details).

You don’t even have to do much. Docker Compose fully supports automatically building an image with the new dependencies and using it. See this comment in the docker-compose file:

https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/docker-compose/docker-compose.yaml#L46

    # In order to add custom dependencies or upgrade provider packages you can use your extended image.
    # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
    # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
    image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:|version|}
    build: .

You can also run docker compose up --build as a shortcut if you do not want to run docker compose build separately.
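The steps in that comment can be sketched as follows. This is a minimal example, assuming the apache/airflow:2.5.0 base image (the version from the question) and the joblib==1.2.0 pin used in the other answer; adjust both to your setup. It writes a Dockerfile next to docker-compose.yaml:

```shell
# Run this in the directory that contains docker-compose.yaml.
# Assumption: the base image tag 2.5.0 matches the Airflow version in use.
cat > Dockerfile <<'EOF'
FROM apache/airflow:2.5.0
RUN pip install --no-cache-dir joblib==1.2.0
EOF
```

With the "build: ." line enabled in docker-compose.yaml, docker compose build (or docker compose up --build) then builds this image and uses it for the Airflow services.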

Answered By: Rohan Anand