I use Spark 1.6.0 with Cloudera 5.8.3.
I have a DStream
object and plenty of transformations defined on top of it,
val stream = KafkaUtils.createDirectStream[...](...)
val mappedStream = stream.transform { ... }.map { ... }
mappedStream.foreachRDD { ... }
mappedStream.foreachRDD { ... }
mappedStream.map { ... }.foreachRDD { ... }
Is there a way to register a last foreachRDD
that is guaranteed to be executed last and only if the above foreachRDD
s finished executing?
In other words, when the Spark UI shows that the job was complete - that's when I want to execute a lightweight function.
Is there something in the API that allows me to achieve that?
Thanks
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…