Enabling numpy array to be modified by several threads

Question:

I want to take the minimum of an array. Instead of having a single thread do all the work, I have several threads populate the array entry by entry (computing each entry is an expensive call that can't be changed), and then the main thread takes the minimum.

Here is the code I have so far. It only prints 1, because the array is never populated:

import multiprocessing
import numpy as np
import psutil
import sys

def limit(n):
    entry = expensiveFunction()
    min_array[n] = entry

def run_parallel(function, nmax, nthreads, debug=False):
    pool = multiprocessing.Pool(nthreads)
    try:
        pool.map_async(function, list(range(nmax))).get(720000)
    except KeyboardInterrupt:
        print('Caught interrupt!')
        pool.terminate()
        exit(1)
    else:
        pool.close()
    pool.join()

if __name__ == "__main__":
    nthreads = psutil.cpu_count()
    number_expensive_calls = int(sys.argv[1])  # argv entries are strings
    min_array = np.ones(number_expensive_calls)

    run_parallel(limit, number_expensive_calls, nthreads, debug=False)
    print(np.min(min_array)) #always printing 1

I tried using the multiprocessing.shared_memory example for NumPy arrays from https://docs.python.org/3/library/multiprocessing.shared_memory.html, but I end up with nan values.

Asked By: Cesar Diaz Blanco


Answers:

Each process has its own copy of min_array, so modifying it in one process has no effect on the copies held by the others, including the parent. multiprocessing.Pool starts worker processes, not threads. It is simpler to have each worker return its value and let the main process decide what to do with the results.
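To see the isolation for yourself, here is a minimal sketch. The names write_entry and demo are made up for this illustration, and the array is a small stand-in for the min_array in the question. The workers read back their own writes, but the parent's array never changes:

```python
import multiprocessing

import numpy as np

# Module-level array, standing in for min_array in the question.
min_array = np.ones(4)


def write_entry(n):
    # Runs in a worker process and modifies that worker's copy only.
    min_array[n] = 0.0
    return float(min_array[n])


def demo():
    with multiprocessing.Pool(2) as pool:
        seen_by_workers = pool.map(write_entry, range(4))
    # The workers saw their own writes; the parent's array is untouched.
    return seen_by_workers, min_array.tolist()


if __name__ == "__main__":
    seen_by_workers, parent_copy = demo()
    print(seen_by_workers)  # [0.0, 0.0, 0.0, 0.0]
    print(parent_copy)      # [1.0, 1.0, 1.0, 1.0]
```

This is why np.min(min_array) in the parent keeps returning 1.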

Why not try something like:

def limit(n):
    return expensiveFunction()

def run_parallel(function, nmax, nthreads, debug=False):
    with multiprocessing.Pool(nthreads) as pool:
        return min(pool.map(function, range(nmax)))

After reading the documentation, you may want to change map to imap or imap_unordered, since you don't care about the order in which the results come back.
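A rough sketch of the imap_unordered variant, as a complete script. Here expensive_function is a hypothetical stand-in for your expensive call; I made it depend on n only so the result is easy to check, and dropped the unused debug parameter:

```python
import multiprocessing
import random
import time


def expensive_function(n):
    # Hypothetical stand-in for the real expensive call.
    time.sleep(random.uniform(0.0, 0.01))
    return (n - 3) ** 2


def limit(n):
    return expensive_function(n)


def run_parallel(function, nmax, nthreads):
    with multiprocessing.Pool(nthreads) as pool:
        # imap_unordered yields each result as soon as a worker finishes it,
        # in whatever order that happens; min() consumes them as they arrive.
        return min(pool.imap_unordered(function, range(nmax)))


if __name__ == "__main__":
    print(run_parallel(limit, nmax=8, nthreads=2))  # 0
```

Note that min() is called inside the with block, so every result is collected before the pool is shut down.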

Answered By: Frank Yellin