Is there a way to prevent ray.init() from hanging when using Python on Apple silicon (the M1 Max)?

Question:

I am trying to run ray[rllib] in a Jupyter notebook (in a Miniforge virtual environment) on Apple silicon (an M1 Max). I can import ray into the notebook normally, but the very next step, running ray.init(), causes the notebook to hang. No error is returned; ray.init() simply never completes. Is there a fix for this?

This is my first time using Ray. I don’t think the notebook or the commands I am entering are the issue, because the notebook came pre-made from an instructor and I have managed to get an identical notebook to run normally in a Miniforge environment on Windows 10.

I followed advice from developers at Ray (see "Ray M1 Mac (Apple Silicon) Support") to install Miniforge for the M1 and create a virtual environment. I also used the thread "What is the proper way to install TensorFlow on Apple M1 in 2022" to work out how to install the packages I need for a reinforcement learning application. Here are the contents of the environment.yml file I used to set up the Miniforge virtual environment:

name: tf-metal
channels:
  - apple
  - conda-forge
dependencies:
  - python=3.9
  - gym-all=0.21.0
  - pip
  - tensorflow-deps

  ## uncommented for use with Jupyter
  - ipykernel

  ## PyPI packages
  - pip:
    - jupyterlab
    - ray[rllib]==1.11
    - tensorflow-macos
    - tensorflow-metal

The steps I used in Terminal for creating the virtual environment were these:

# Download Miniforge3-MacOSX-arm64.sh and make it executable:
chmod u+x ./Miniforge3-MacOSX-arm64.sh

# run Miniforge
./Miniforge3-MacOSX-arm64.sh
# (or update it) ./Miniforge3-MacOSX-arm64.sh -u

# accept terms and conditions...
# run 'conda init' by entering 'yes'
# configure conda (then close and reopen Terminal):
conda config --set auto_activate_base false
# confirm '~/.bash_profile' reflects miniforge settings
# good-to-go...

# set up virtual environment
conda create --name rl_course2  # (choose any name you want)
# confirm acceptability of location (enter 'yes')
# activate env:
conda activate rl_course2
# configure channels (settings recommended by an instructor)
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
# install dependencies using environment.yml file shown above:
conda env update --name rl_course2 --file '/Users/.../environment.yml'
# check output for errors...(none found via text search)
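
As a sanity check after these steps (not part of the original procedure), the following minimal Python sketch reports which interpreter is active and whether it describes itself as running natively on Apple silicon. The helper names describe_interpreter and looks_native_apple_silicon are made up for illustration. It assumes that an x86_64 Python running under Rosetta reports "x86_64" rather than "arm64", which is the usual behavior.

```python
import platform
import sys


def describe_interpreter():
    """Collect basic facts about the running Python interpreter."""
    return {
        "executable": sys.executable,         # path of the Python binary in use
        "prefix": sys.prefix,                 # root of the active environment
        "python": platform.python_version(),  # e.g. "3.9.15"
        "system": platform.system(),          # "Darwin" on macOS
        "machine": platform.machine(),        # "arm64" for a native Apple silicon build
    }


def looks_native_apple_silicon(info):
    """True when the interpreter reports macOS on arm64 (hypothetical helper)."""
    return info["system"] == "Darwin" and info["machine"] == "arm64"


if __name__ == "__main__":
    info = describe_interpreter()
    for key, value in info.items():
        print(f"{key}: {value}")
    print("native arm64:", looks_native_apple_silicon(info))
```

Run inside the activated rl_course2 environment, the executable and prefix paths should point into that environment rather than the base Miniforge install.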

So I created the virtual environment and installed all the dependencies with no errors, as far as I could tell:

Successfully installed MarkupSafe-2.1.1 PyWavelets-1.4.1 Send2Trash-1.8.0 absl-py-1.3.0 anyio-3.6.2 argon2-cffi-21.3.0 argon2-cffi-bindings-21.2.0 astunparse-1.6.3 async-timeout-4.0.2 attrs-22.1.0 babel-2.11.0 beautifulsoup4-4.11.1 bleach-5.0.1 cachetools-5.2.0 certifi-2022.9.24 cffi-1.15.1 charset-normalizer-2.1.1 click-8.1.3 contourpy-1.0.6 cycler-0.11.0 defusedxml-0.7.1 dm-tree-0.1.7 fastjsonschema-2.16.2 filelock-3.8.0 flatbuffers-22.10.26 fonttools-4.38.0 gast-0.4.0 google-auth-2.14.1 google-auth-oauthlib-0.4.6 google-pasta-0.2.0 grpcio-1.43.0 idna-3.4 imageio-2.22.4 importlib-metadata-5.0.0 ipython-genutils-0.2.0 jinja2-3.1.2 json5-0.9.10 jsonschema-4.17.1 jupyter-server-1.23.3 jupyterlab-3.5.0 jupyterlab-pygments-0.2.2 jupyterlab-server-2.16.3 keras-2.10.0 keras-preprocessing-1.1.2 kiwisolver-1.4.4 libclang-14.0.6 markdown-3.4.1 matplotlib-3.6.2 mistune-2.0.4 msgpack-1.0.4 nbclassic-0.4.8 nbclient-0.7.0 nbconvert-7.2.5 nbformat-5.7.0 networkx-2.8.8 notebook-6.5.2 notebook-shim-0.2.2 oauthlib-3.2.2 opt-einsum-3.3.0 pandas-1.5.1 pandocfilters-1.5.0 pillow-9.3.0 prometheus-client-0.15.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycparser-2.21 pyrsistent-0.19.2 pytz-2022.6 pyyaml-6.0 ray-1.11.0 redis-4.3.5 requests-2.28.1 requests-oauthlib-1.3.1 rsa-4.9 scikit-image-0.19.3 sniffio-1.3.0 soupsieve-2.3.2.post1 tabulate-0.9.0 tensorboard-2.10.1 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorboardX-2.5.1 tensorflow-estimator-2.10.0 tensorflow-macos-2.10.0 tensorflow-metal-0.6.0 termcolor-2.1.1 terminado-0.17.0 tifffile-2022.10.10 tinycss2-1.2.1 tomli-2.0.1 typing-extensions-4.4.0 urllib3-1.26.12 webencodings-0.5.1 websocket-client-1.4.2 werkzeug-2.2.2 wrapt-1.14.1 zipp-3.10.0

Last step (while working in the rl_course2 environment) using Terminal: launch Jupyter…

(rl_course2) MacBook-Pro ~$ jupyter notebook

Now, in the Jupyter/Python notebook (Chrome browser):

import ray   # works!
ray.init()   # never completes (no errors)!

So I tried similar steps in the same environment using Terminal (no notebook):

(rl_course2) MacBook-Pro ~$ python3
Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:48:25) 
[Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>> import ray
>>> ray.init()
[no errors, but never completes]

Is there a way to fix this and run Ray normally in my Jupyter environment?

Update 1: I was just able to run the simple TensorFlow test script recommended by Apple (see "Get started with tensorflow-metal") in the virtual environment discussed above. Five epochs of training completed with no errors in about two minutes on an M1 Max with 64 GB of memory, so the environment itself appears to be working fine. Perhaps the issue involves Ray?
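
One way to probe a hang like this without freezing the notebook kernel or the terminal is to run the call in a child process under a timeout. The sketch below is an assumption-laden illustration rather than anything Ray provides: run_with_timeout is a hypothetical helper built only on the standard library, and the 60-second limit is an arbitrary choice.

```python
import subprocess
import sys


def run_with_timeout(snippet, timeout_s=60.0):
    """Run a Python snippet in a child process under a time limit.

    Returns (finished, returncode); returncode is None if the limit was hit.
    Hypothetical helper for illustration only.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.DEVNULL,  # output is discarded to keep the probe simple
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,          # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False, None
    return True, proc.returncode


# Example (requires Ray to be installed in the same environment):
#   finished, code = run_with_timeout("import ray; ray.init()", timeout_s=60)
#   finished == False suggests ray.init() did not return within 60 seconds.
```

If the probe times out, background processes that Ray started may still be running; the `ray stop` command can be used to shut them down.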

Asked By: hackr


Answers:

I have found one of possibly several answers to my question. Changing the environment.yml file (described above) to install an unpinned ray[rllib] rather than ray[rllib]==1.11 allowed ray.init() to run normally in the Jupyter notebook, and the remainder of the code in the notebook then executed as expected. It appears there was a bug in ray[rllib] version 1.11 that prevented ray.init() from running on the M1 Max under some circumstances.

To summarize: I resolved the ray.init() hang on Apple Silicon (M1 Max) by changing the environment.yml file to this:

name: tf-metal
channels:
  - apple
  - conda-forge
dependencies:
  - python=3.9
  - gym-all=0.21.0
  - pip
  - tensorflow-deps

  ## uncommented for use with Jupyter
  - ipykernel

  ## PyPI packages
  - pip:
    - jupyterlab
    - ray[rllib]
    - tensorflow-macos
    - tensorflow-metal

I subsequently created a Miniforge environment using the procedure described above. Python version 3.9.15 and Ray version 2.1.0 were installed in the environment automatically, and the notebook ran normally on the M1 Max.
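
To confirm which versions actually ended up in the environment, a small check along these lines can be run in the notebook or in a Python session. It is only a sketch: installed_package_info is a hypothetical helper, and it assumes each package's metadata includes an INSTALLER file naming the tool that installed it (commonly "pip" or "conda"), which is not guaranteed.

```python
from importlib import metadata


def installed_package_info(names):
    """Map each distribution name to (version, installer), or None if absent."""
    info = {}
    for name in names:
        try:
            dist = metadata.distribution(name)
        except metadata.PackageNotFoundError:
            info[name] = None
            continue
        # INSTALLER usually names the installing tool; it may be missing or
        # empty, so fall back to "unknown" in that case.
        installer = (dist.read_text("INSTALLER") or "unknown").strip()
        info[name] = (dist.version, installer)
    return info


if __name__ == "__main__":
    # Package names taken from the environment.yml and pip output above.
    packages = ["ray", "grpcio", "tensorflow-macos", "tensorflow-metal"]
    for name, details in installed_package_info(packages).items():
        print(name, details)
```

A package that is not installed shows up as None, which makes it easy to spot a kernel that is running from a different environment than the one just created.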

Update (12/09/2022): I recently learned from another source in the community that the following environment.yml is also effective for installing a stable version of Ray (1.11.0) with Gym (0.21.0) and TensorFlow (2.10.0) on Apple Silicon using the Miniforge environment described above. Unlike the file above, it installs grpcio 1.43.0 through conda rather than pip. It also omits tensorflow-metal, so your GPU may not be used explicitly; however, Ray will likely work smoothly with Gym environments and TensorFlow.

dependencies:
  - python=3.9
  - gym-all=0.21.0
  - grpcio=1.43.0
  - pip
  - pip:
      - jupyterlab
      - ray[rllib]==1.11
      - tensorflow-macos==2.10.0

Answered By: hackr

I followed the steps above and used the environment.yml file on my M1 Mac, and the following errors disappeared:

ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

It seems to work fine!

Thank you!

Answered By: Rusaac Saac