SparkSession on an M1 Mac fails with RuntimeError: Java gateway process exited before sending its port number

Question:

I am trying to run a simple command, spark = SparkSession.builder.appName("Basics").getOrCreate(), on my M1 Mac (macOS Monterey 12.6.2), but it throws an error:

The operation couldn’t be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.

/Users/user/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/bin/spark-class: line 96: CMD: bad array subscript
head: illegal line count -- -1
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 2
      1 # May take a little while on a local computer
----> 2 spark = SparkSession.builder.appName("Basics").getOrCreate()

File ~/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/sql/session.py:269, in SparkSession.Builder.getOrCreate(self)
    267     sparkConf.set(key, value)
    268 # This SparkContext may be an existing one.
--> 269 sc = SparkContext.getOrCreate(sparkConf)
    270 # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    271 # by all sessions.
    272 session = SparkSession(sc, options=self._options)

File ~/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/context.py:483, in SparkContext.getOrCreate(cls, conf)
    481 with SparkContext._lock:
    482     if SparkContext._active_spark_context is None:
--> 483         SparkContext(conf=conf or SparkConf())
    484     assert SparkContext._active_spark_context is not None
    485     return SparkContext._active_spark_context

File ~/miniforge3/envs/bigdata/lib/python3.9/site-packages/pyspark/context.py:195, in SparkContext.__init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls, udf_profiler_cls)
    189 if gateway is not None and gateway.gateway_parameters.auth_token is None:
    190     raise ValueError(
    191         "You are trying to pass an insecure Py4j gateway to Spark. This"
...
--> 106     raise RuntimeError("Java gateway process exited before sending its port number")
    108 with open(conn_info_file, "rb") as info:
    109     gateway_port = read_int(info)

RuntimeError: Java gateway process exited before sending its port number

After a lot of searching, I decided to follow the solution in "RuntimeError: Java gateway process exited before sending its port number", so I tried to get to my zshrc by typing ~/.zshrc in order to add this line:

export JAVA_HOME="/path/to/java_home/"

However, that gives me this error:

zsh: permission denied: /Users/user/.zshrc

I have tried the solutions at https://www.stellarinfo.com/blog/fixed-zsh-permission-denied-in-mac-terminal/, but they don't work. I have also given Terminal Full Disk Access.

So I have two problems right now:

  1. Java gateway process exited before sending its port number.
  2. zsh permission denied.

Would anyone please help?

Asked By: yts61


Answers:

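Step 1

Open your terminal.

Step 2

Open .zshrc in an editor. Typing ~/.zshrc by itself asks zsh to execute the file, which is what produces the "permission denied" error.

cd ~
vim .zshrc

Step 3

Press i to enter insert mode, and use the arrow keys to navigate.
Add this line:

export JAVA_HOME="/path/to/java_home/"

Press Esc, then type :wq and press Enter to save and quit. Open a new terminal (or run source ~/.zshrc) so the change takes effect.

Try the above first. If it throws an error, you may need to remove the trailing slash at the end of the path.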
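To confirm that the export actually reached the Python process that starts Spark, something like the following could be run in the notebook before calling getOrCreate(). This is only a rough sketch: diagnose_java_home is a hypothetical helper name, not part of PySpark, and it assumes a usable JDK has an executable at bin/java under JAVA_HOME.

```python
import os
from pathlib import Path


def diagnose_java_home(environ=None):
    """Return a short message describing whether JAVA_HOME looks usable.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    # Default to the environment of the current Python process.
    environ = os.environ if environ is None else environ

    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"

    # Assumption: a JDK keeps its launcher at <JAVA_HOME>/bin/java.
    java_bin = Path(java_home) / "bin" / "java"
    if not java_bin.is_file():
        return f"JAVA_HOME is set, but {java_bin} does not exist"

    return f"JAVA_HOME looks usable: {java_bin}"
```

Calling print(diagnose_java_home()) in the same notebook shows whether the kernel sees the variable. If it reports that JAVA_HOME is not set even after editing .zshrc, the notebook was probably started from a shell that had not loaded the new line yet.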

From your error output, it looks like you are running this in a conda environment (bigdata). If the error persists, try removing the current environment with conda env remove and creating it again. Then remember to conda install openjdk first, and conda install pyspark after that. Hope this helps.
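After recreating the environment, it may help to check which java the notebook actually picks up. macOS ships a /usr/bin/java stub that prints the "Unable to locate a Java Runtime" message when no JDK is installed, so finding some java on PATH is not enough by itself. The sketch below checks whether the java found on PATH lives inside the active environment. java_inside_env is a hypothetical helper name, and it assumes the conda openjdk package puts a java launcher somewhere under the environment prefix.

```python
import shutil
import sys
from pathlib import Path


def java_inside_env(prefix=None, search_path=None):
    """Return the resolved path of `java` if it lives under `prefix`, else None.

    Hypothetical helper for illustration only.
    prefix      -- environment root; defaults to sys.prefix (the active env)
    search_path -- PATH-style string to search; defaults to the process PATH
    """
    prefix_path = Path(sys.prefix if prefix is None else prefix).resolve()

    # shutil.which returns None when no executable named java is found.
    found = shutil.which("java", path=search_path)
    if found is None:
        return None

    # Resolve symlinks so a launcher linked from bin/ is still recognised.
    found_path = Path(found).resolve()
    if prefix_path in found_path.parents:
        return str(found_path)
    return None
```

If java_inside_env() returns None inside the bigdata environment, the kernel is most likely still resolving java to the system stub, which would explain the same gateway error showing up again.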

Answered By: EntzY