jax woes (on an NVIDIA DGX box, no less)

Question:

I am trying to run JAX on an NVIDIA DGX box, but am failing miserably, thus:

>>> import jax
>>> import jax.numpy as jnp
>>> x = jnp.arange(10)
2021-10-25 13:00:05.863667: W 
external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't 
get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2021-10-25 13:00:05.864713: F 
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:435] 
ptxas returned an error during compilation of ptx to sass: 'INTERNAL: Failed to 
launch ptxas'  If the error message indicates that a file could not be written, 
please verify that sufficient filesystem space is provided.
Aborted (core dumped)

Any suggestions would be much appreciated.

Asked By: Igor Rivin


Answers:

This means that your CUDA installation is not configured correctly, and can generally be fixed by ensuring that the CUDA toolkit binaries (including ptxas) are present in your $PATH. See https://github.com/google/jax/discussions/6843 and https://github.com/google/jax/issues/7239 for responses to users reporting similar issues.
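A quick way to confirm whether your Python process can actually see ptxas is a sketch along these lines (the /usr/local/cuda/bin path is an assumption; adjust it to wherever your toolkit's bin directory actually lives):

    import os
    import shutil

    # Does the current process see ptxas on its PATH?
    print(shutil.which("ptxas"))   # None means jax/XLA cannot invoke it

    # If it prints None, prepend the toolkit's bin directory (path is an
    # assumption -- adjust to your install) *before* importing jax.
    os.environ["PATH"] = "/usr/local/cuda/bin:" + os.environ.get("PATH", "")

    import jax
    import jax.numpy as jnp
    print(jnp.arange(10))          # should no longer abort in ptxas
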

Answered By: jakevdp

For this problem you need to install the NVIDIA driver, CUDA, and cuDNN correctly. The risky command here is sudo apt install nvidia-cuda-toolkit; avoid it if you have already installed those three.

The way that works for me:

  • Install the NVIDIA driver, making sure to pick the proper version. On Ubuntu you can run sudo ubuntu-drivers devices to see the recommended driver.

  • Install CUDA: to find which CUDA version works for you, run nvidia-smi; the header of its output shows the highest CUDA version your driver supports. Then go to the NVIDIA CUDA archive and follow the instructions there.
    After this step you should be able to see a cuda folder when you type ls /usr/local. If you also want to install the headers, you can find the relevant commands in the NVIDIA installation guide for CUDA.

  • Install cuDNN, which essentially means copying some files into the /usr/local/cuda directory; the NVIDIA cuDNN installation guide describes the best way to do this.

  • As the last step, make sure everything refers to the CUDA path (/usr/local/cuda if you followed the above); for example, if you use Docker you need to mount that directory into the container. Avoid installing nvidia-cuda-toolkit, as it would remove your previous installation; instead you can install the compiler inside a conda env with conda install -c nvidia cuda-nvcc, which doesn't interfere with your CUDA installation. A short sanity-check sketch follows this list.
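
Once the driver, CUDA, and cuDNN are in place, a sanity check like the sketch below (assuming the default /usr/local/cuda location) should confirm that jax can find everything:

    import shutil
    import subprocess
    from pathlib import Path

    # Driver: nvidia-smi should run without error and report a CUDA version.
    subprocess.run(["nvidia-smi"], check=True)

    # Toolkit: the install location is an assumption -- adjust if you used
    # a different prefix or the conda cuda-nvcc route.
    print(Path("/usr/local/cuda").exists())   # True after the CUDA step
    print(shutil.which("ptxas"))              # a path, not None

    # cuDNN + jax: the GPU should show up as a device and simple ops should run.
    import jax
    import jax.numpy as jnp
    print(jax.devices())                      # e.g. a GPU device rather than CPU
    print(jnp.arange(10))
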

Answered By: Hamid