I've been running some Spark LDA topic modeling jobs on Google Cloud Dataproc and keep hitting problems, mainly executor disassociation errors at seemingly random intervals, which I suspect come down to insufficient memory allocation on my executors. That in turn seems related to the automatic cluster configuration. My latest attempt uses n1-standard-8 machines (8 cores, 30GB RAM) for both the master and the worker nodes (6 workers, so 48 worker cores in total).
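In case it matters, the cluster was created with something like the following (the cluster name matches the cluster-3-m master hostname in the config below; the zone is a placeholder, and I'm reconstructing the command from memory rather than quoting it exactly):

# Create a 6-worker Dataproc cluster of n1-standard-8 machines (8 vCPUs, 30GB RAM each)
gcloud dataproc clusters create cluster-3 \
  --zone us-central1-a \
  --master-machine-type n1-standard-8 \
  --worker-machine-type n1-standard-8 \
  --num-workers 6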
But when I look at /etc/spark/conf/spark-defaults.conf, I see this:
spark.master yarn-client
spark.eventLog.enabled true
spark.eventLog.dir hdfs://cluster-3-m/user/spark/eventlog
# Dynamic allocation on YARN
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.initialExecutors 100000
spark.dynamicAllocation.maxExecutors 100000
spark.shuffle.service.enabled true
spark.scheduler.minRegisteredResourcesRatio 0.0
spark.yarn.historyServer.address cluster-3-m:18080
spark.history.fs.logDirectory hdfs://cluster-3-m/user/spark/eventlog
spark.executor.cores 4
spark.executor.memory 9310m
spark.yarn.executor.memoryOverhead 930
# Overkill
spark.yarn.am.memory 9310m
spark.yarn.am.memoryOverhead 930
spark.driver.memory 7556m
spark.driver.maxResultSize 3778m
spark.akka.frameSize 512
# Add ALPN for Bigtable
spark.driver.extraJavaOptions -Xbootclasspath/p:/usr/local/share/google/alpn/alpn-boot-8.1.3.v20150130.jar
spark.executor.extraJavaOptions -Xbootclasspath/p:/usr/local/share/google/alpn/alpn-boot-8.1.3.v20150130.jar
But these values don't make much sense to me. Why do executors get only 4 of the 8 cores? And only 9.3GB of the 30GB of RAM? My impression was that all of this configuration was supposed to be handled automatically, but even my attempts at manual tweaking aren't getting me anywhere.
For instance, I tried launching the shell with:
spark-shell --conf spark.executor.cores=8 --conf spark.executor.memory=24g
But this failed with:
java.lang.IllegalArgumentException: Required executor memory (24576+930 MB) is above the max threshold (22528 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
Following the error's suggestion, I tried increasing yarn.scheduler.maximum-allocation-mb in /etc/hadoop/conf/yarn-site.xml, but it had no effect.
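The edit was along these lines (the exact number isn't the point; 26624MB here is just an illustrative value comfortably above the 24576+930MB the executor would need):

<property>
  <!-- Raised from the cluster default of 22528MB reported in the error; 26624MB is just an example -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>26624</value>
</property>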
Even when I try a different cluster setup (e.g. using executors with 60+GB of RAM), I end up with the same problem: for some reason the max threshold stays at 22528MB.
Is there something I'm doing wrong here, or is this a problem with Google's automatic configuration?