Limit number of threads in numpy

Question:

It seems that my numpy library is using 4 threads, and setting OMP_NUM_THREADS=1 does not stop this.

numpy.show_config() gives me these results:

atlas_threads_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\"3.8.4\""')]
    language = f77
    include_dirs = ['/usr/include']
blas_opt_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\"3.8.4\""')]
    language = c
    include_dirs = ['/usr/include']
atlas_blas_threads_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\"3.8.4\""')]
    language = c
    include_dirs = ['/usr/include']
openblas_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\"3.8.4\""')]
    language = f77
    include_dirs = ['/usr/include']

So I know it is using blas, but I can’t figure out how to make it use 1 thread for matrix multiplication.

Asked By: drjrm3


Answers:

Several multithreaded libraries are commonly used for numerical computation, including inside NumPy. You can set a few environment variables before running the script to limit the number of threads they use.

Try setting all of the following:

export MKL_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export OMP_NUM_THREADS=1

Sometimes it’s a bit tricky to see where exactly multithreading is introduced.

Other answers show environment flags for other libraries. They may also work.
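
If you would rather launch the script from Python than type the exports in a shell, here is a minimal sketch of the same idea. The helper name run_with_thread_limit is made up for illustration, and it assumes Python 3.7 or newer. It copies the current environment, sets the three variables above, and starts the command in a child process:

```python
import os
import subprocess
import sys


def run_with_thread_limit(command, num_threads=1):
    """Run `command` in a child process with the thread-count variables set."""
    env = os.environ.copy()  # leave the parent's environment untouched
    for name in ("MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS", "OMP_NUM_THREADS"):
        env[name] = str(num_threads)
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Demonstration: the child interpreter sees the limited value.
result = run_with_thread_limit(
    [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
)
print(result.stdout.strip())  # prints 1
```

In real use you would pass something like [sys.executable, "my_script.py"] as the command (my_script.py being a placeholder for your own script).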

Answered By: Christian Zielinski

To do this from within a Python script rather than at the bash prompt, per this thread you can do the following (same settings as the answer above):

import os
os.environ["MKL_NUM_THREADS"] = "1" 
os.environ["NUMEXPR_NUM_THREADS"] = "1" 
os.environ["OMP_NUM_THREADS"] = "1" 

However, you have to put that before import numpy. Apparently these variables are only checked when numpy is imported.
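
Since the ordering is easy to get wrong in a larger project, here is a small sketch that warns when numpy has already been imported by the time the variables are set. The helper name limit_threads is hypothetical, and the check only looks at sys.modules, so treat it as a rough guard rather than a guarantee:

```python
import os
import sys
import warnings

THREAD_VARS = ("MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS", "OMP_NUM_THREADS")


def limit_threads(num_threads=1):
    """Set the thread variables; return False if numpy was already imported."""
    numpy_loaded = "numpy" in sys.modules
    if numpy_loaded:
        warnings.warn("numpy is already imported; these settings may be ignored")
    for name in THREAD_VARS:
        os.environ[name] = str(num_threads)
    return not numpy_loaded


limit_threads(1)  # call this at the very top of the script
# import numpy as np  # import numpy only after the call above
```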

(this is reposted as an answer based on @kηives comment above.)

Answered By: seth127

There are more than the three environment variables mentioned. The following is a fuller list of environment variables, each with the package that uses it to control the number of threads it spawns. Note that you need to set these variables before doing import numpy:

OMP_NUM_THREADS: openmp,
OPENBLAS_NUM_THREADS: openblas,
MKL_NUM_THREADS: mkl,
VECLIB_MAXIMUM_THREADS: accelerate,
NUMEXPR_NUM_THREADS: numexpr

So in practice you can do:

import os
os.environ["OMP_NUM_THREADS"] = "4" # export OMP_NUM_THREADS=4
os.environ["OPENBLAS_NUM_THREADS"] = "4" # export OPENBLAS_NUM_THREADS=4 
os.environ["MKL_NUM_THREADS"] = "6" # export MKL_NUM_THREADS=6
os.environ["VECLIB_MAXIMUM_THREADS"] = "4" # export VECLIB_MAXIMUM_THREADS=4
os.environ["NUMEXPR_NUM_THREADS"] = "6" # export NUMEXPR_NUM_THREADS=6

Note that as of November 2018 the Numpy developers are working on making this possible to do after you do import numpy as well. I’ll update this post once they commit those changes.

Answered By: Amir

I was able to fix this at run-time the following way:

import mkl
mkl.set_num_threads(1)

I use the following code to make this snippet less likely to cause problems in scripts/packages:

try:
    import mkl
    mkl.set_num_threads(1)
except ImportError:
    # the mkl module is not installed; leave threading unchanged
    pass

Answered By: The Unfun Cat

After trying a number of the solutions above without luck, I found a reference to threadpoolctl in the Numpy docs. This worked and it can be used even if numpy is already imported.

from threadpoolctl import threadpool_limits

with threadpool_limits(limits=1, user_api='blas'):
    ...  # single-threaded numpy code goes here

Just make sure to use the user_api which is listed when you do:

from threadpoolctl import threadpool_info
from pprint import pprint
import numpy
pprint(threadpool_info())
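
As a rough sketch of how the two snippets above fit together, you can check that 'blas' appears in the threadpool_info() output before relying on the limit. The helper names has_user_api and single_threaded_matmul are made up for illustration, and the example assumes numpy and threadpoolctl are installed. The imports sit inside the function so that numpy is loaded before threadpoolctl inspects the process:

```python
def has_user_api(info, user_api="blas"):
    """Return True if threadpool_info() output lists the given user_api."""
    return any(lib.get("user_api") == user_api for lib in info)


def single_threaded_matmul(n=200):
    """Multiply two n x n matrices of ones with the BLAS pool capped at 1 thread."""
    import numpy as np  # imported first so its BLAS library is loaded
    from threadpoolctl import threadpool_info, threadpool_limits

    if not has_user_api(threadpool_info(), "blas"):
        # threadpoolctl found no BLAS thread pool it can control
        raise RuntimeError("no 'blas' entry in threadpool_info()")

    a = np.ones((n, n))
    with threadpool_limits(limits=1, user_api="blas"):
        return a @ a
```

If the RuntimeError is raised, threadpoolctl did not recognize the BLAS library your numpy is linked against, and the environment variables from the other answers are the fallback.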
Answered By: Josh Broomberg