Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

airflow - Execute bash script (kerberos kinit) when reschedule DAG task (HiveCLIPartitionSensor)

I've got two tasks:

  1. BashOperator [kinit], which obtains a Kerberos ticket for Hadoop.
  2. Hive sensor [check_partition], which checks whether the partition exists.

My problem is that the Kerberos ticket is valid for 9 hours, while the Hive sensor might wait anywhere from 1 to 15 hours because the data arrival time is unpredictable. Therefore, I would like to execute kinit each time the Hive sensor is rescheduled (every hour).

kinit = BashOperator(
    task_id="CIDF_BASH_KINIT",
    bash_command="bash kinit command",
    dag=dag
)


check_partition = HiveCLIPartitionSensor(
    task_id="CIDF_BASH_HIVE_CHECK_PARTITION",
    table='table',
    partition="partition='{}'".format('{{ ds }}'),
    poke_interval=60*60,
    mode='reschedule',
    retries=0,
    timeout=60*60*23,
    dag=dag
)

kinit >> check_partition

Question from: https://stackoverflow.com/questions/65906134/execute-bash-script-kerberos-kinit-when-reschedule-dag-task-hiveclipartitions

1 Answer

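You can run a cron job, or some other scheduled background process, that automatically generates a new Kerberos ticket every 5-6 hours. That interval is well inside the 9-hour ticket lifetime, so the sensor always finds a valid ticket no matter how long it waits.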
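
As a rough illustration of that idea, the renewal step could be a small Python helper that the scheduled job calls. This is only a sketch under assumptions: the question does not show the actual kinit command, so authenticating from a keytab with `kinit -k -t` is assumed here, and the keytab path, the principal, and the function names are all hypothetical placeholders.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical placeholders: replace with your own keytab path and principal.
KEYTAB = "/path/to/user.keytab"
PRINCIPAL = "user@EXAMPLE.COM"


def build_kinit_command(keytab: str, principal: str) -> List[str]:
    """Build a non-interactive kinit call that authenticates from a keytab."""
    return ["kinit", "-k", "-t", keytab, principal]


def renew_ticket(
    keytab: str = KEYTAB,
    principal: str = PRINCIPAL,
    run: Callable[[Sequence[str]], int] = subprocess.check_call,
) -> int:
    """Run kinit and return its exit code.

    With the default runner (subprocess.check_call), a non-zero exit
    raises CalledProcessError, so a failed renewal is not silently ignored.
    """
    return run(build_kinit_command(keytab, principal))
```

A cron entry firing every five hours or so could run a short script that calls `renew_ticket()`. The `run` parameter exists only so that the command can be checked without a Kerberos installation.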

