I assume that you are familiar with Airflow's scheduling mechanism. If not, please read Problem with start date and scheduled date in Apache airflow before reading the rest of this answer.
As for your case:
You had one or several runs, as expected, when you deployed the DAG. At some point you paused the DAG on 2021-04-07, and today (2021-04-19) you unpaused it. Airflow then executed a DAG run with execution_date='2021-04-18'.
This is expected.
The reason lies in Airflow's scheduling mechanism: a run for a given execution_date is triggered only once its interval has ended.
Your last run was on 2021-04-07, and the interval is 45 07 * * * (every day at 07:45). Since you paused the DAG, the runs for 2021-04-08, 2021-04-09, ..., 2021-04-17 were never created. When you unpaused the DAG, Airflow didn't create these runs because of catchup=False. However, today's run (2021-04-19) isn't part of the catchup: it was scheduled because the interval of execution_date=2021-04-18 has reached its end, so the run started.
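To make the interval arithmetic concrete, here is a minimal sketch in plain Python (no Airflow import) of how the latest completed daily interval can be worked out. The function name latest_completed_interval and the 09:00 unpause time are assumptions made up for illustration. This is not Airflow's actual scheduler code, only an approximation of the rule described above.

```python
from datetime import datetime, timedelta


def latest_completed_interval(now, hour=7, minute=45):
    """Return (execution_date, run_after) for the most recent daily
    interval that has fully ended by `now`.

    Hypothetical helper: it only mimics the '45 07 * * *' schedule.
    """
    # Most recent schedule tick (07:45) at or before `now`.
    tick = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if tick > now:
        tick -= timedelta(days=1)
    # The run triggered at `tick` covers the interval that started one day
    # earlier, and that start is what Airflow reports as execution_date.
    execution_date = tick - timedelta(days=1)
    return execution_date, tick


# Assumption: the DAG was unpaused at 09:00 on 2021-04-19.
execution_date, run_after = latest_completed_interval(datetime(2021, 4, 19, 9, 0))
print(execution_date)  # 2021-04-18 07:45:00
print(run_after)       # 2021-04-19 07:45:00
```

Under this sketch, unpausing before 07:45 on 2021-04-19 would instead point at the interval of 2021-04-17, because the interval of 2021-04-18 would not have ended yet.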
The behavior that you are experiencing is no different from deploying this fresh DAG:
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

default_args = {
    'owner': 'airflow',
    'start_date': datetime(2020, 1, 1),  # far in the past
}

with DAG(
    dag_id='stackoverflow_question',
    default_args=default_args,
    schedule_interval='45 07 * * *',  # every day at 07:45
    catchup=False,  # do not backfill missed intervals
) as dag:
    DummyOperator(task_id='some_task')
As soon as you deploy it, a single run will be created.
The DAG start_date is 2020-01-01 with catchup=False. I deployed the DAG today (19/Apr/2021), so it created a run with execution_date='2021-04-18' that started to run today, 2021-04-19.
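For comparison, the following rough sketch enumerates the daily intervals that have ended since the start_date, to show what catchup changes. The helper daily_execution_dates and the 09:00 deployment time are hypothetical, and the code only approximates the scheduler's behavior rather than reproducing it.

```python
from datetime import datetime, timedelta


def daily_execution_dates(start_date, now, hour=7, minute=45):
    """List the execution dates of every daily 07:45 interval that has
    ended by `now` (hypothetical helper, not an Airflow API)."""
    # First schedule tick at or after start_date.
    first = start_date.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if first < start_date:
        first += timedelta(days=1)
    dates = []
    current = first
    # An interval counts only once its end (start + 1 day) has passed.
    while current + timedelta(days=1) <= now:
        dates.append(current)
        current += timedelta(days=1)
    return dates


# Assumption: the DAG was deployed at 09:00 on 2021-04-19.
all_runs = daily_execution_dates(datetime(2020, 1, 1), datetime(2021, 4, 19, 9, 0))
with_catchup = all_runs          # catchup=True: every ended interval gets a run
without_catchup = all_runs[-1:]  # catchup=False: only the latest ended interval
print(len(with_catchup))         # 474
print(without_catchup[0])        # 2021-04-18 07:45:00
```

With catchup=True the same deployment would backfill every interval since 2020-01-01, which is why catchup=False leaves you with exactly one run.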