Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

intellij idea - How to set Master address for Spark examples from command line

NOTE: The author is looking for ways to set the Spark master when running the Spark examples without changing the source code, using only command-line options if at all possible.

Let us consider the run() method of the BinaryClassification example:

  def run(params: Params) {
    val conf = new SparkConf().setAppName(s"BinaryClassification with $params")
    val sc = new SparkContext(conf)

Notice that this code never sets a master URL on the SparkConf.

When running this program from IntelliJ with the following arguments:

--algorithm LR --regType L2 --regParam 1.0 data/mllib/sample_binary_classification_data.txt

the following error occurs:

Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:166)
    at org.apache.spark.examples.mllib.BinaryClassification$.run(BinaryClassification.scala:105)

I have also tried passing the Spark master URL as a program argument anyway (though the code does not seem to support it):

  spark://10.213.39.125:17088 --algorithm LR --regType L2 --regParam 1.0 data/mllib/sample_binary_classification_data.txt

and

--algorithm LR --regType L2 --regParam 1.0 spark://10.213.39.125:17088 data/mllib/sample_binary_classification_data.txt

Neither works; both fail with the error:

Error: Unknown argument 'data/mllib/sample_binary_classification_data.txt'

For reference, here is the option parsing, which has no option for the Spark master:

val parser = new OptionParser[Params]("BinaryClassification") {
  head("BinaryClassification: an example app for binary classification.")
  opt[Int]("numIterations")
    .text("number of iterations")
    .action((x, c) => c.copy(numIterations = x))
  opt[Double]("stepSize")
    .text(s"initial step size, default: ${defaultParams.stepSize}")
    .action((x, c) => c.copy(stepSize = x))
  opt[String]("algorithm")
    .text(s"algorithm (${Algorithm.values.mkString(",")}), " +
    s"default: ${defaultParams.algorithm}")
    .action((x, c) => c.copy(algorithm = Algorithm.withName(x)))
  opt[String]("regType")
    .text(s"regularization type (${RegType.values.mkString(",")}), " +
    s"default: ${defaultParams.regType}")
    .action((x, c) => c.copy(regType = RegType.withName(x)))
  opt[Double]("regParam")
    .text(s"regularization parameter, default: ${defaultParams.regParam}")
  arg[String]("<input>")
    .required()
    .text("input paths to labeled examples in LIBSVM format")
    .action((x, c) => c.copy(input = x))

So yes, I could go ahead and modify the source code. But I suspect I am missing an existing setting that would make this work without modifying the source.

1 Answer

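You can set the Spark master from the command line by adding a JVM parameter (in IntelliJ this goes in the "VM options" field of the run configuration, not in "Program arguments"):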

-Dspark.master=spark://myhost:7077
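
This works because a SparkConf created with the default constructor copies every JVM system property whose name starts with "spark." into its settings, so the unmodified example picks up the master on its own. A minimal sketch of that behavior follows, assuming spark-core is on the classpath. The object name, the resolvedMaster helper, and the local[2] master are made up for illustration and are not part of the Spark examples.

```scala
import org.apache.spark.SparkConf

object MasterFromSystemProperty {

  // Hypothetical helper: builds a SparkConf the same way the example's
  // run() method does (no setMaster call) and reports which master, if
  // any, it picked up from the JVM system properties.
  def resolvedMaster(): Option[String] =
    new SparkConf()
      .setAppName("MasterFromSystemProperty")
      .getOption("spark.master")

  def main(args: Array[String]): Unit = {
    // Stand-in for passing -Dspark.master=local[2] on the JVM command line.
    // local[2] is an assumed value; a spark://host:port URL is handled
    // the same way.
    if (!sys.props.contains("spark.master")) {
      sys.props("spark.master") = "local[2]"
    }
    println(resolvedMaster())
  }
}
```

If resolvedMaster() comes back empty when launched from the IDE, the flag most likely ended up among the program arguments, where the example's OptionParser rejects it.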
