DAG not visible in Web-UI
Question:
I am new to Airflow. I am following a tutorial and have written the following code.
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
from models.correctness_prediction import CorrectnessPrediction

default_args = {
    'owner': 'abc',
    'depends_on_past': False,
    'start_date': datetime.now(),
    'email': ['[email protected]'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

def correctness_prediction(arg):
    CorrectnessPrediction.train()

dag = DAG('daily_processing', default_args=default_args)

task_1 = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=correctness_prediction,
    dag=dag)
On running the script, it doesn’t show any errors, but when I check for DAGs in the Web UI, nothing shows up under Menu -> DAGs. However, I can see the scheduled job under Menu -> Browse -> Jobs. I also cannot see anything in $AIRFLOW_HOME/dags. Is it supposed to be like this? Can someone explain why?
Answers:
The SchedulerJob that you see on the Jobs page is an entry for the Scheduler itself; that's not your DAG being scheduled.
It's odd that your $AIRFLOW_HOME/dags is empty. All DAGs must live within the $AIRFLOW_HOME/dags directory (more precisely, in the dags_folder configured in your airflow.cfg file). It looks like you are not storing the actual DAG file in the right directory (the dags directory).
Alternatively, sometimes you also need to restart the webserver for the DAG to show up (though that doesn't seem to be the issue here).
Check the dags_folder variable in airflow.cfg. If you are working in a virtual environment, run export AIRFLOW_HOME=$(pwd) from the main project directory; note that this expects your DAGs to be in a dags subdirectory of the project directory.
Run airflow dags list (or airflow list_dags for Airflow 1.x) to check whether the DAG file is located correctly.
For some reason, I didn't see my DAG in the browser UI before I executed this. It must be an issue with the browser cache or something.
If that doesn't work, restart the webserver with airflow webserver -p 8080 -D
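As a quick sanity check that does not require importing Airflow at all, you can print the folder Airflow will scan under the default configuration (a sketch assuming the default layout; a custom dags_folder in airflow.cfg overrides it):

```python
import os

# Airflow falls back to ~/airflow when AIRFLOW_HOME is unset, and the
# default dags_folder is the "dags" subdirectory inside it.
airflow_home = os.environ.get("AIRFLOW_HOME", os.path.expanduser("~/airflow"))
dags_folder = os.path.join(airflow_home, "dags")

print("Expected dags folder:", dags_folder)
print("Exists:", os.path.isdir(dags_folder))
```

If the printed path differs from where you saved your DAG file, move the file (or fix dags_folder in airflow.cfg) and re-run airflow dags list.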
A few things to clarify:
- You do not need to run the DAG file yourself (unless you are testing it for syntax errors). That is the job of the Scheduler/Executor.
- For a DAG file to be visible to the Scheduler (and consequently the webserver), you need to add it to the dags_folder (specified in airflow.cfg; by default it is the $AIRFLOW_HOME/dags subfolder).
The Airflow Scheduler checks the dags_folder for new DAG files every 5 minutes by default (governed by dag_dir_list_interval in airflow.cfg). So if you just added a new file, you have two options:
- Restart the Scheduler
- Wait until the current Scheduler process picks up the new DAG.
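During development, that scan interval can be lowered in airflow.cfg so new files are picked up faster (a config sketch; 300 seconds is the default, and the value shown here is just an example):

```ini
[scheduler]
# How often (in seconds) the scheduler scans the dags folder for new files.
# The default is 300; lowering it speeds up pickup at the cost of extra I/O.
dag_dir_list_interval = 30
```

The same setting can also be supplied via the environment variable AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL.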
I had the same issue. To resolve it, I needed to run the scheduler:
airflow scheduler
Without this command, I don't see my new DAGs.
By the way, the UI shows a warning related to this problem:
The scheduler does not appear to be running. Last heartbeat was received 9 seconds ago. The DAGs list may not update, and new tasks will not be scheduled.
Check the paused DAGs; your DAG might have ended up there. If you are sure that you have added the .py file correctly, then manually type the URL of the DAG using its dag_id, e.g. http://AIRFLOW_URL/graph?dag_id=dag_id. Then you can see whether Airflow has accepted your DAG or not.
I had the same issue. I had installed Airflow twice, once without sudo and once with sudo, and I was using the sudo version while the directories were under my user path. I simply ran:
export AIRFLOW_HOME=~/airflow
I experienced the same issue. In my case, the permissions of the new DAG file were incorrect.
Run ls -l to see the permissions of the new DAG. For me, the owner was listed as myself instead of the default airflow user (which in my case should have been root).
Once I changed the ownership (chown root:root <file_name>), the file showed up in the Web UI immediately.
Listing the DAGs or restarting the webserver didn't help me, but resetting the database did:
airflow db reset
Be aware that this wipes all Airflow metadata (DAG runs, task history, connections), so only do it on a disposable installation.
In my case, the DAG was exactly one of the default ones that I had copy-pasted to check the volume mappings of a docker-compose installation.
It turns out that while the web UI showed no errors, the command line airflow dags list returned the error:
Error: Failed to load all files. For details, run airflow dags list-import-errors.
Which was the key to the solution: the DAG was not added since it was a duplicate of an already loaded DAG.
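If you suspect a copy-pasted duplicate, a rough way to spot clashing dag_id strings without starting Airflow is to scan the sources yourself (find_duplicate_dag_ids is a hypothetical helper, not an Airflow API, and the regex only catches dag_ids passed as literal strings):

```python
import re
from collections import Counter

# Matches the first string literal passed to DAG(...), e.g. DAG('my_dag').
DAG_ID_RE = re.compile(r"""DAG\(\s*['"]([^'"]+)['"]""")

def find_duplicate_dag_ids(sources):
    """Return dag_ids that appear in more than one source string."""
    ids = [m.group(1) for src in sources for m in DAG_ID_RE.finditer(src)]
    return [dag_id for dag_id, count in Counter(ids).items() if count > 1]

print(find_duplicate_dag_ids([
    "dag = DAG('daily_processing', default_args=default_args)",
    "dag = DAG('daily_processing')",  # copy-pasted file with the same id
    "dag = DAG('other_dag')",
]))  # -> ['daily_processing']
```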
After reading the previous answers, this worked for me:
- Restart the webserver, e.g. pkill -f "airflow webserver" and then airflow webserver -D.
- Also restart the scheduler with pkill -f "airflow scheduler" and airflow scheduler -D.
Besides that, make sure that your DAG is contained in the dags folder specified in airflow.cfg, located in $AIRFLOW_HOME.
This worked for me after I could see the DAG with airflow dags list but could neither see it in the UI nor trigger it.
I just ran into the same problem. Airflow suggested the following command to evaluate my DAG:
Error: Failed to load all files. For details, run `airflow dags list-import-errors`
It was just a stray comma in my way :).
I had the same problem using WSL on Windows 10: I had to shut down the scheduler and the webserver, then run them again, and it worked fine.
NOTE: It seems that each time you change the dags path in airflow.cfg, you have to restart the webserver.
There is an easier way than those described above. DAG metadata is stored in the database, but information about it is also cached by the browser. You don't need to reboot your server or Airflow containers; you just need to do an "empty cache and hard reload" of the browser page showing the Airflow DAGs.
In Chrome: open DevTools (F12), right-click the reload icon, and choose "Empty Cache and Hard Reload".
NIT: I'll improve my answer when I find a setting like "cache lifetime" or a less hacky way to do it.
Airflow uses a heuristic to pre-check whether a Python file contains a DAG definition. It checks for the presence of the strings "dag" and "airflow" in the file; if a file doesn't contain both of those words, Airflow will ignore it. This is documented as a note in the Core Concepts / DAGs / Loading DAGs section of the documentation.
The check is case-insensitive since Airflow 2. This behavior can be turned off with the dag_discovery_safe_mode configuration variable (available since Airflow 1.10.3).
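A rough sketch of that pre-check (an approximation of the documented behavior, not Airflow's actual implementation):

```python
def might_contain_dag(source: str) -> bool:
    """Approximation of Airflow's safe-mode pre-check: a file is only
    parsed if its raw text contains both "dag" and "airflow",
    case-insensitively (Airflow 2.x behavior)."""
    text = source.lower()
    return "dag" in text and "airflow" in text

print(might_contain_dag("from airflow import DAG"))      # True
print(might_contain_dag("print('no workflows here')"))   # False
```

This is why a file that builds its DAG through indirection (e.g. a factory imported under another name) can be silently skipped unless safe mode is disabled.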