System libraries in conda environment not seen by reticulate

Question:

I’m trying to get the R package reticulate working on a CentOS 7.8 system using RStudio Server v1.2.5042 with a custom environment created with conda. When I initiate a Python job with reticulate, I get an error that some system libraries are not the correct versions, specifically, libstdc++.so.6 and libz.so.1.

First off, I realize CentOS 7.8 is a bit old and some of the problem might be solved by upgrading the OS, but that’s not an option in this case.

The conda environment does work and I can run the target Python script in a terminal window without any errors. In RStudio using reticulate, the code is extremely simple at this point:

   library(reticulate)
   use_condaenv('test')
   py_run_file('test_script.py')

When the script is run, I get the following error:

ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/<user>/.conda/envs/test/lib/python3.8/site-packages/scipy/_lib/_uarray/_uarray.cpython-38-x86_64-linux-gnu.so)

When I look into the /usr/lib64 directory, I find libstdc++.so, but running strings libstdc++.so | grep ^GLIBC | sort shows me that it indeed does not support version GLIBCXX_3.4.21. No surprises. If I navigate to the /home/<user>/.conda/envs/test/lib directory, I find another copy of libstdc++.so.6, and this one does support version GLIBCXX_3.4.21. So the correct version of the library is present in the conda environment's directory, but for some reason RStudio and reticulate are not finding it.
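The same check can be scripted so the two copies of the library are easy to compare. This is only a minimal sketch: the helper names glibcxx_versions and supports are hypothetical, and the code simply scans the raw bytes of a file for GLIBCXX_x.y.z tags, which is roughly what the strings | grep pipeline does.

```python
import re
from pathlib import Path
from typing import List

# Matches version tags such as GLIBCXX_3.4.21 embedded in the library file.
_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(data: bytes) -> List[str]:
    """Return the de-duplicated GLIBCXX version numbers found in `data`."""
    found = {m.group(1).decode() for m in _TAG.finditer(data)}
    # Sort numerically so that 3.4.9 is listed before 3.4.21.
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def supports(path: str, version: str) -> bool:
    """True if the file at `path` advertises GLIBCXX_<version>."""
    return version in glibcxx_versions(Path(path).read_bytes())
```

Calling supports() once with the system path from the error message (/lib64/libstdc++.so.6) and once with the copy under the conda environment's lib directory should reproduce the difference described above.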

I tried changing LD_LIBRARY_PATH to have the conda environment directory listed first, but that does not work. I found a lengthy discussion here, which points out that LD_LIBRARY_PATH isn’t really a good fix unless it is set prior to the RStudio process initialization. (And then goes on a tangent about which version of Python gets used.) For my situation, there may be multiple conda environments to support, it isn’t possible to know which will be active for any given session, and any given user may use different or multiple environments. I would rather not try to harmonize all conda environments into one big uber-environment.
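To see which copy of a library the running session has actually mapped, one option is to read /proc/self/maps from inside the embedded Python. The following is a minimal sketch, assuming Linux; the helper names are hypothetical and are not part of reticulate.

```python
from typing import List


def loaded_libraries(maps_text: str, name: str = "libstdc++") -> List[str]:
    """Return unique paths of mapped files whose file name contains `name`."""
    paths: List[str] = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if name in path.rsplit("/", 1)[-1] and path not in paths:
                paths.append(path)
    return paths


def current_process_libraries(name: str = "libstdc++") -> List[str]:
    """Linux only: list matching libraries mapped into this process."""
    with open("/proc/self/maps") as fh:
        return loaded_libraries(fh.read(), name)
```

Running print(current_process_libraries()) from repl_python() or via py_run_string() shows the path of the copy in use. If it is the one under /lib64 or /usr/lib64 rather than the conda environment, that would suggest the system copy was already loaded by the R process before Python started, which fits the observation that changing LD_LIBRARY_PATH afterwards has no effect.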

I’ve also verified that the Python version and other libraries are correctly set:

python:         /home/<user>/.conda/envs/test/bin/python
libpython:      /home/<user>/.conda/envs/test/lib/libpython3.8.so
pythonhome:     /home/<user>/.conda/envs/test:/home/<user>/.conda/envs/test
version:        3.8.5 (default, Aug  5 2020, 08:36:46)  [GCC 7.3.0]
numpy:          /home/<user>/.conda/envs/test/lib/python3.8/site-packages/numpy
numpy_version:  1.19.1

NOTE: Python version was forced by use_python function

I have been able to get it running by resetting the links in /usr/lib64 to point to the copy in the conda directory. While this gets it working for this instance, I’m not sure I want to push a fix like this to production. My guess is that if I link to the most inclusive version of the library across all conda environments, and that version fully supports all versions that the system level library supports, everything will be fine, but this feels like a hack, at best.

If anyone has found a good solution to this, I would appreciate the details.

Asked By: KirkD-CO


Answers:

After struggling with a similar issue for several days, and trying many of the suggested solutions on the web (mostly based on symlinks, the LD_LIBRARY_PATH variable, or installing/upgrading/downgrading packages like libgcc), I finally found something that is mentioned only once, here: https://github.com/rstudio/reticulate/issues/338#issuecomment-415472406

The problem seems to be a conflict that arises when R is installed on the OS (with apt-get install r-base, for example) rather than with (Ana/Mini)conda.
In that case, it tries to load system libraries rather than the conda environment's.

So here is a possible solution, which worked for me, for people still having this kind of issue:

  1. (maybe optional) Uninstall R on the local system: apt-get remove r-base r-base-dev
  2. Activate the conda environment: conda activate my_R_env
  3. Install R within this environment: conda install r r-essentials --channel conda-forge
  4. Install R packages like reticulate inside the same env: conda install r-reticulate --channel conda-forge

You can check that the Python packages installed in your conda environment are now properly loaded with reticulate:

  • R
  • > library(reticulate)
  • > repl_python()
  • >>> import pandas (or any package that was causing the initial issue)

Note: I still had some issues due to R loading the base conda environment instead of the activated one, but deactivating and reactivating the environment, or calling use_condaenv('my_R_env'), seems to solve the problem.
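A quick way to tell whether reticulate picked up the base environment or the intended one is to ask the embedded interpreter where it lives. This is a small sketch to paste at the repl_python() prompt; the function name environment_report is hypothetical.

```python
import os
import sys


def environment_report(expected_env: str) -> dict:
    """Summarise which Python installation this interpreter belongs to."""
    prefix = sys.prefix
    return {
        "executable": sys.executable,
        "prefix": prefix,
        # Set by `conda activate`; may be missing if R was started another way.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        # Compare the last path component of the prefix with the env name.
        "matches_expected": os.path.basename(os.path.normpath(prefix)) == expected_env,
    }
```

For example, environment_report("my_R_env") should report matches_expected as True when the right environment is in use. A prefix pointing at the base conda installation would indicate the situation described in the note.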

Answered By: Beinje

I have also been struggling with this issue, but I am unable to uninstall/reinstall R because I am on a work system and don’t have privileges for that.

What worked for me was abandoning the idea of using conda altogether and moving over to virtual environments instead, so consider that option if you’re going round in circles.

Workflow:

  1. In terminal: python -m venv ~/.virtualenvs/myenv
  2. In R: reticulate::use_virtualenv("myenv")
  3. In terminal: source ~/.virtualenvs/myenv/bin/activate
  4. In terminal: pip install matplotlib or whatever packages you need
  5. In Python, either sourcing a script from RStudio or using reticulate::repl_python() to get the Python prompt: import matplotlib.pyplot as plt or whatever was causing the issue
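Step 1 can also be done from Python itself with the standard-library venv module, which is what python -m venv uses under the hood. A minimal sketch follows. The helper name create_env is hypothetical, and the default location mirrors the ~/.virtualenvs directory used in the steps above.

```python
import venv
from pathlib import Path
from typing import Optional


def create_env(root: Optional[Path] = None, name: str = "myenv",
               with_pip: bool = True) -> Path:
    """Create a virtual environment at <root>/<name> and return its path."""
    if root is None:
        # Same location as in the workflow above: ~/.virtualenvs
        root = Path.home() / ".virtualenvs"
    env_dir = root / name
    # with_pip=True bootstraps pip into the new environment.
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    return env_dir
```

After create_env() has run, reticulate::use_virtualenv("myenv") in R and the pip install step proceed as in the list above.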

Hope that helps

Answered By: goblinshark