What is Spark's default cluster manager?
Question:
When using PySpark and creating the Spark session with the following statement:
    from pyspark.sql import SparkSession

    spark = SparkSession.builder \
        .appName("sample-app") \
        .getOrCreate()
the app works fine, but I am unsure which cluster manager is being used with this Spark session. Is it local or standalone? I read through the docs, but nowhere did I find this documented. They explain what the standalone and local cluster managers are, but there is no mention of which one is the default.
Answers:
It depends on multiple factors: whether the application was submitted with spark-submit (which may have the --master option specified), whether the master is set in a config file, and so on. All of this is described in the Spark documentation.
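For example, you can also set the master explicitly in the builder instead of relying on spark-submit or config files. A minimal sketch (the "local[2]" value is just an illustration; any valid master URL could be used):

    from pyspark.sql import SparkSession

    # Setting the master directly in code takes precedence over the --master
    # flag of spark-submit and over spark-defaults.conf. "local[2]" runs Spark
    # locally with two worker threads; a standalone cluster would use a URL
    # like "spark://host:7077" instead.
    spark = (
        SparkSession.builder
        .master("local[2]")
        .appName("sample-app")
        .getOrCreate()
    )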
If you just run this code as a plain Python script, the master will be local[*], meaning that Spark runs locally, using all available cores. You can check this by calling spark.conf.get("spark.master").
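Putting it together, a quick sketch to confirm which master the session actually picked up (when run as a plain Python script with no master configured anywhere, it should print local[*]):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder \
        .appName("sample-app") \
        .getOrCreate()

    # Prints the effective master URL, e.g. "local[*]" when nothing was configured.
    print(spark.conf.get("spark.master"))

    spark.stop()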