Spark – Error “A master URL must be set in your configuration” when submitting an app

The TLDR:

.config("spark.master", "local")

A list of the options for spark.master in Spark 2.2.1

I ended up on this page after trying to run a simple Spark SQL Java program in local mode. To do this, I found that I could set spark.master using:

import org.apache.spark.sql.SparkSession;

// Setting spark.master to "local" runs Spark in a single JVM
SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark SQL basic example")
    .config("spark.master", "local")
    .getOrCreate();

An update to my answer:

To be clear, this is not what you should do in a production environment. In production, spark.master should be specified in one of a couple of other places: either in $SPARK_HOME/conf/spark-defaults.conf (this is where Cloudera Manager will put it), or on the command line when you submit the app (e.g. spark-submit --master yarn).
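For example, a spark-defaults.conf entry for a YARN cluster might look something like this (yarn is just an illustration here; use whatever cluster manager you actually run):

# $SPARK_HOME/conf/spark-defaults.conf
spark.master    yarn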

If you specify spark.master to be 'local' in this way, Spark will try to run in a single JVM, as indicated by the comments below. If you then try to specify --deploy-mode cluster, you will get the error 'Cluster deploy mode is not compatible with master "local"'. This is because setting spark.master=local means that you are NOT running in cluster mode.

Instead, for a production app, within your main function (or in functions called by your main function), you should simply use:

SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark SQL basic example")
    .getOrCreate();  // no spark.master set here; it comes from spark-submit or spark-defaults.conf

This will use the configuration specified on the command line or in config files.
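For example, a submit command along these lines supplies the master from outside the code (the class and jar names below are placeholders):

spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar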

Also, to be clear on this too: --master and "spark.master" are the exact same parameter, just specified in different ways. Setting spark.master in code, like in my answer above, will override attempts to set --master, and will override values in spark-defaults.conf, so don't do it in production. It's great for tests though.
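As a sketch of that test-only usage, the builder's .master() method is shorthand for setting the same spark.master property (the "local[*]" value, which uses all local cores, is just one option):

import org.apache.spark.sql.SparkSession;

// Test-only session: .master("local[*]") is equivalent to .config("spark.master", "local[*]")
SparkSession testSpark = SparkSession
    .builder()
    .appName("local test")
    .master("local[*]")
    .getOrCreate();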

Also, see this answer, which links to a list of the options for spark.master in Spark 2.2.1 and what each one actually does.
