I believe the documentation is a bit misleading here; when you work with Scala you actually see a warning like this:
... WARN SparkSession$Builder: Use an existing SparkSession, some configuration may not take effect.
This was more obvious prior to Spark 2.0, when there was a clear separation between the contexts:

- SparkContext configuration cannot be modified at runtime. You have to stop the existing context first.
- SQLContext configuration can be modified at runtime.

spark.app.name, like many other options, is bound to SparkContext and cannot be modified without stopping the context.
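For reference, here is a minimal sketch of how this looked before 2.0 (assuming a local master; sc and sqlContext mirror the usual shell bindings, and the app names are just placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("before").setMaster("local[*]"))
val sqlContext = new SQLContext(sc)

// SQLContext options can be changed on a live context:
sqlContext.setConf("spark.sql.shuffle.partitions", "10")

// SparkContext options like spark.app.name require stopping the context first:
sc.stop()
val sc2 = new SparkContext(new SparkConf().setAppName("after").setMaster("local[*]"))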
Reusing existing SparkContext / SparkSession
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
spark.conf.get("spark.sql.shuffle.partitions")
String = 200
val conf = new SparkConf()
  .setAppName("foo")
  .set("spark.sql.shuffle.partitions", "2001")
val spark = SparkSession.builder.config(conf).getOrCreate()
... WARN SparkSession$Builder: Use an existing SparkSession ...
spark: org.apache.spark.sql.SparkSession = ...
spark.conf.get("spark.sql.shuffle.partitions")
String = 2001
While the spark.app.name config is updated:
spark.conf.get("spark.app.name")
String = foo
it doesn't affect the SparkContext:
spark.sparkContext.appName
String = Spark shell
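As a side note, options that are mutable at runtime (the SQL / runtime-scope ones) don't need the builder at all; you can set them directly on the runtime config of the existing session. A minimal sketch against the same session as above (the value 42 is just an illustration):

spark.conf.set("spark.sql.shuffle.partitions", "42")
spark.conf.get("spark.sql.shuffle.partitions")
// String = 42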
Stopping existing SparkContext / SparkSession
Now let's stop the session and repeat the process:
spark.stop()
val spark = SparkSession.builder.config(conf).getOrCreate()
... WARN SparkContext: Use an existing SparkContext ...
spark: org.apache.spark.sql.SparkSession = ...
spark.sparkContext.appName
String = foo
Interestingly, when we stop the session we still get a warning about using an existing SparkContext, but you can check that it has actually been stopped.
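If you want to convince yourself, here is one way to check (a sketch, not the only way; oldSc, newSpark and conf are just illustrative names, and the exact exception message may differ between versions):

val oldSc = spark.sparkContext   // keep a handle before stopping
spark.stop()
val newSpark = SparkSession.builder.config(conf).getOrCreate()

newSpark.sparkContext eq oldSc   // false: a brand-new SparkContext backs the new session
// Using the old handle now fails, e.g. oldSc.parallelize(1 to 3).count()
// throws IllegalStateException: Cannot call methods on a stopped SparkContext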