Spark submit error 'Cannot allocate memory'

Question:

I’m running standalone Spark 1.6.0 on a 64-bit Ubuntu machine.

When another application is already running on Spark and I try to submit a new application, this error is raised as soon as I create the default config with conf = SparkConf():

# Native memory allocation (mmap) failed to map 17896046592 bytes for committing reserved memory.

However, I am creating the context this way:

conf = SparkConf()
conf.setMaster(spark_master)
conf.set('spark.cores.max', 60)
conf.set('spark.executor.memory', '256m')
conf.set('spark.rpc.askTimeout', 240)
conf.set('spark.task.maxFailures', 1)
conf.set('spark.driver.memory', '128m')
conf.set('spark.dynamicAllocation.enabled', True)
conf.set('spark.shuffle.service.enabled', True)
ctxt = SparkContext(conf=conf)

So I cannot figure out where the 17896046592 bytes (16.6 GB) come from.
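As a sanity check (not part of the original question), the mmap request works out to roughly 16.7 GiB, far larger than the 128 MB driver heap requested in the code above:

```python
# The JVM tried to commit 17896046592 bytes while reserving its heap.
# Convert to GiB to see the scale of the request; float() keeps the
# division exact under both Python 2.7 (used in the traceback) and 3.
requested_bytes = 17896046592
gib = requested_bytes / float(1024 ** 3)
print('%.2f GiB' % gib)  # roughly 16.67 GiB
```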

This is the master’s output:

Successfully imported Spark Modules
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000180000000, 17896046592, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 17896046592 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/spark/.pyenv/930221056_2001_st-dev/hs_err_pid631.log
Traceback (most recent call last):
  File "/home/spark/.pyenv/930221056_2001_st-dev/bin/run_backtes2.py", line 178, in <module>
    args.config_id, args.mnemonic, args.start_date, args.end_date, extra_backtest_args, cmd_line)
  File "/home/spark/.pyenv/930221056_2001_st-dev/bin/run_backtest2.py", line 26, in _main_notify
    args, extra_backtest_args, config_id, mnemonic, cmd_line, env_config, range)
  File "/home/spark/.pyenv/930221056_2001_st-dev/bin/run_backtest2.py", line 100, in run_backtest_main
    res = runner.run_and_log_backtest(backtest_range)
  File "/home/spark/.pyenv/930221056_2001_st-dev/local/lib/python2.7/site-packages/st/backtesting/backtest_runner.py", line 563, in run_and_log_backtest
    subranges_output = self._run_subranges_on_spark(subranges_to_run)
  File "/home/spark/.pyenv/930221056_2001_st-dev/local/lib/python2.7/site-packages/st/backtesting/backtest_runner.py", line 611, in _run_subranges_on_spark
    executor_memory='128m', max_failures=1, driver_memory='128m')
  File "/home/spark/.pyenv/930221056_2001_st-dev/local/lib/python2.7/site-packages/st/backtesting/backtest_runner.py", line 98, in __init__
    max_failures=max_failures, driver_memory=driver_memory)
  File "/home/spark/.pyenv/930221056_2001_st-dev/local/lib/python2.7/site-packages/st/backtesting/backtest_runner.py", line 66, in create_context
    conf = SparkConf()
  File "/home/spark/spark-1.6.0-bin-cdh4/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/home/spark/spark-1.6.0-bin-cdh4/python/pyspark/context.py", line 245, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/home/spark/spark-1.6.0-bin-cdh4/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 17896046592 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2627), pid=19061, tid=0x00007fb15f814700
#
# JRE version:  (8.0_111-b14) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

This seems to happen only when other applications are already running on the Spark cluster, even though the master machine has ~10 GB of free memory. The other running applications all set conf.set('spark.driver.memory', '1g').

Asked By: Lorenzo Belli


Answers:

Solution:

The setting

spark.executor.memory 22g

in the configuration file (spark-defaults.conf) takes precedence over the value set programmatically with

conf.set('spark.executor.memory', '256m')
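One way to spot such an override is to scan the defaults file for explicitly set keys. A minimal sketch, assuming the standard whitespace-separated spark-defaults.conf format; the sample contents and the $SPARK_HOME path are illustrative:

```python
def parse_spark_defaults(text):
    """Parse spark-defaults.conf-style 'key value' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # key, then the rest of the line as value
        if len(parts) == 2:
            settings[parts[0]] = parts[1]
    return settings

# Illustrative contents; on a real install, read
# $SPARK_HOME/conf/spark-defaults.conf instead.
sample = """
# Default system properties
spark.executor.memory   22g
spark.driver.memory     1g
"""
print(parse_spark_defaults(sample))
```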
Answered By: Lorenzo Belli

You can try one or more of the following steps:

Increase the amount of memory available to your cluster nodes.

Use a memory profiler to identify and fix any memory leaks in your Spark application.

Reduce the number of concurrent tasks or increase the amount of memory available to each task.

Partition your data sets into smaller chunks or increase the amount of memory available to your cluster.
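For the steps above, per-application memory can also be capped at submit time with standard spark-submit flags; the values and master URL below are illustrative, not a recommendation:

```shell
# Illustrative values; tune to your cluster's free memory.
spark-submit \
  --master spark://master:7077 \
  --driver-memory 512m \
  --executor-memory 1g \
  --conf spark.cores.max=8 \
  my_app.py
```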

Answered By: steven