I'm learning Airflow and have a simple question. Below is my DAG called dog_retriever
:
import airflow
from airflow import DAG
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.operators.sensors import HttpSensor
from datetime import datetime, timedelta
import json
default_args = {
'owner': 'Loftium',
'depends_on_past': False,
'start_date': datetime(2017, 10, 9),
'email': '[email protected]',
'email_on_failure': False,
'email_on_retry': False,
'retries': 3,
'retry_delay': timedelta(minutes=3),
}
dag = DAG('dog_retriever',
schedule_interval='@once',
default_args=default_args)
t1 = SimpleHttpOperator(
task_id='get_labrador',
method='GET',
http_conn_id='http_default',
endpoint='api/breed/labrador/images',
headers={"Content-Type": "application/json"},
dag=dag)
t2 = SimpleHttpOperator(
task_id='get_breeds',
method='GET',
http_conn_id='http_default',
endpoint='api/breeds/list',
headers={"Content-Type": "application/json"},
dag=dag)
t2.set_upstream(t1)
As a means to test out Airflow, I'm simply making two GET requests to some endpoints in this very simple http://dog.ceo API. The goal is to learn how to work with some data retrieved via Airflow
The execution is working- my code successfully calls the endpoints in tasks t1 and t2, I can see them being logged in the Airflow UI, in the correct order based on the set_upstream
rule I wrote.
What I cannot figure out is how to ACCESS the JSON response of these 2 tasks. It seems so simple, but I cannot figure it out. In the SimpleHtttpOperator I see a param for response_check, but nothing to simply print, or store, or view the JSON response.
Thanks.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…