How to make a Python threaded program (with Locks) run on multiple processes?

Question:

I have a multi-threaded program, and I want to let the user choose how to run it: serially, in multiple threads, or on multiple cores, at least at the top level. A runnable demo illustrating my program's logic is shown below.

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from queue import Queue
# from multiprocessing import Queue


class Handler:
    def __init__(self):
        self.queue = Queue()  # this object is essential

    def put(self, item):
        self.queue.put(item)

    def run(self):
        while True:
            item = self.queue.get()
            # do other things on the item ...
            print(item)


class Runner:
    def __init__(self, name):
        self.name = name
        self.a = Handler()
        self.b = Handler()

    def start(self):
        self.a.put(f'{self.name}: hello a')
        self.b.put(f'{self.name}: hello b')
        with ThreadPoolExecutor() as exe:
            futures = [exe.submit(r.run) for r in [self.a, self.b]]
        for future in futures:
            future.result()


# current implementation
def run_in_multi_thread():
    rA = Runner('A')
    rB = Runner('B')
    rC = Runner('C')
    with ThreadPoolExecutor() as exe:
        futures = [exe.submit(r.start) for r in [rA, rB, rC]]
        for future in futures:
            future.result()


# how to implement this?
def run_in_multi_process():
    rA = Runner('A')
    rB = Runner('B')
    rC = Runner('C')
    with ProcessPoolExecutor() as exe:
        futures = [exe.submit(r.start) for r in [rA, rB, rC]]
        for future in futures:
            future.result()


if __name__ == '__main__':
    # run_in_multi_thread()  # this is currently running fine
    run_in_multi_process()  # how to make this work as well?

My goal is simple: I want to put many Runners into separate processes so they run in true parallel.

The problem is that when I change the outermost ThreadPoolExecutor to a ProcessPoolExecutor, Python always raises TypeError: cannot pickle '_thread.lock' object.

After googling, I know this happens because I use queue.Queue in all my Handlers, and queue.Queue relies on threading.Lock, which is not picklable. However, I cannot avoid these classes, because the Handlers' core functionality is built on queue.Queue and threading.Event so they can communicate with each other.

I have also tried replacing queue.Queue with multiprocessing.Queue, but then it raises RuntimeError: Queue objects should only be shared between processes through inheritance. I have also heard of third-party libraries such as dill and pathos, but they caused other pickling issues, so I ended up sticking with the built-in libraries.
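For reference, both errors can be reproduced in isolation just by pickling the two queue types directly (this is only an illustration of where the errors come from, not part of my program):

import pickle
from multiprocessing import Queue as MPQueue
from queue import Queue

# queue.Queue holds a threading.Lock internally
try:
    pickle.dumps(Queue())
except TypeError as e:
    print(e)  # cannot pickle '_thread.lock' object

# multiprocessing.Queue refuses to be pickled outside of process creation
try:
    pickle.dumps(MPQueue())
except RuntimeError as e:
    print(e)  # Queue objects should only be shared between processes through inheritance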

Any suggestions on how to refactor my code are welcome.

Asked By: quantum.snowball


Answers:

I had run into the same problem myself. The standard answer is:

from multiprocessing import Pool, Queue


def init_pool_processes(q):
    # runs once in every worker process; store the queue in a global
    # so that worker functions can reach it without it being pickled
    global queue
    queue = q

...

def main(tasks):
    queue = Queue()
    with Pool(initializer=init_pool_processes, initargs=(queue,)) as pool:
        ...

This will allow all the worker processes to share the same queue.
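A self-contained sketch of that pattern, assuming a trivial stand-in worker function (your real Runner/Handler logic would go in its place), looks like this:

from multiprocessing import Pool, Queue


def init_pool_processes(q):
    # runs once in each worker process; stash the queue in a module-level
    # global so worker functions can use it without it being pickled
    global queue
    queue = q


def worker(i):
    # stand-in worker: consumes one item from the shared queue
    item = queue.get()
    return f'worker {i} got item {item}'


if __name__ == '__main__':
    q = Queue()
    for n in range(4):
        q.put(n)
    with Pool(processes=4, initializer=init_pool_processes, initargs=(q,)) as pool:
        print(pool.map(worker, range(4)))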

Note that as written, your code will never exit. Your Handler.run method never returns, so your futures will never have a value. You need some way of letting your runners know that there are no more jobs to run, and that no more jobs will ever be added.

Answered By: Frank Yellin

Synchronization primitives indeed need to be passed by inheritance in a multiprocessing context. The mpire library supports this using the shared_objects functionality. See docs.

from multiprocessing import Queue
from mpire import WorkerPool

class Handler: ...
class Runner: ...

def task(runners, idx):
    # mpire passes shared_objects as the first argument to the task function
    runners[idx].start()

if __name__ == '__main__':
    runners = [Runner('A'), Runner('B'), Runner('C')]
    with WorkerPool(n_jobs=3, shared_objects=runners, 
                    start_method='fork') as pool:
        pool.map(task, range(len(runners)), chunk_size=1)

Shared objects are passed by inheritance in mpire, which solves your problem. You do have to use multiprocessing.Queue if you need to communicate across multiple processes.
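As a minimal illustration of that last point with plain multiprocessing (the consumer function here is only for demonstration):

from multiprocessing import Process, Queue


def consumer(q):
    # runs in a child process; receives items put by the parent
    while True:
        item = q.get()
        if item == 'exit':
            break
        print(item)


if __name__ == '__main__':
    q = Queue()  # multiprocessing.Queue, not queue.Queue
    p = Process(target=consumer, args=(q,))
    p.start()
    q.put('hello from the parent process')
    q.put('exit')
    p.join()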

You can change start_method to 'threading' and it will use threading instead of the default 'fork' (on Unix) or 'spawn' (on Windows).

Note that mpire will very soon release support for apply/apply_async, which will let you change this to:

runners = [Runner('A'), Runner('B'), Runner('C')]
with WorkerPool(n_jobs=3, shared_objects=runners,
                start_method='fork') as pool:
    futures = [pool.apply_async(task, idx)
               for idx in range(len(runners))]
    for future in futures:
        future.get()

However, what Frank Yellin says is true. Your run method will never exit, so you might want to have a look at that part.

Answered By: Sybren Jansen

After much trial and error, I finally came up with a working solution, so I think I can answer my own question. I have given up on ProcessPoolExecutor and now use multiprocessing.Process directly. Below is the revised demo code showing how this works in my use case.

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from multiprocessing import Process
from queue import Queue
from time import sleep


class Handler:
    def __init__(self):
        self.queue = Queue()  # this object is essential

    def put(self, item):
        self.queue.put(item)

    def run(self):
        while True:
            item = self.queue.get()
            if item == 'exit':
                break
            # do other things on the item ...
            print(item)
            sleep(1)


class Runner:
    def __init__(self, name):
        self.name = name
        self.a = Handler()
        self.b = Handler()

    def start(self):
        # some dummy messages
        for _ in range(3):
            self.a.put(f'{self.name}: hello a')
            self.b.put(f'{self.name}: hello b')
        # request to shutdown gracefully
        self.a.put('exit')
        self.b.put('exit')
        with ThreadPoolExecutor() as exe:
            futures = [exe.submit(r.run) for r in [self.a, self.b]]
            for f in futures:
                f.result()


# this requires everything to be picklable
def run_in_process_pool():
    rA = Runner('A')
    rB = Runner('B')
    rC = Runner('C')
    with ProcessPoolExecutor() as exe:
        futures = [exe.submit(r.start) for r in [rA, rB, rC]]
        for future in futures:
            future.result()


# this does not pickle anything, but why?
def run_in_processes():
    rA = Runner('A')
    rB = Runner('B')
    rC = Runner('C')
    procs = [Process(target=r.start) for r in [rA, rB, rC]]
    for p in procs:
        p.start()
    for p in procs:
        p.join()


if __name__ == '__main__':
    # run_in_process_pool()  # `TypeError: cannot pickle '_thread.lock' object`
    run_in_processes()  # this is working

Surprisingly, this code no longer complains about classes not being picklable! It turns out I just had to use the old-fashioned way (i.e. create process -> start -> join). Total run time dropped from 4:39 to 1:05 (my CPU has 4 cores, so a true-parallel run time of roughly 1/4 makes perfect sense). The next time you have multi-threaded code that uses queue.Queue or threading.Lock under the hood, you may consider wrapping it in multiprocessing.Process directly.

However, I still want to know why ProcessPoolExecutor requires everything to be picklable while Process does not. To my knowledge, Process makes a system call to create a new process, and the implementation depends on the OS. Since I am on a Windows machine running WSL2, I guess I am effectively on Unix, so the default start method should be 'fork', which just clones the whole process without needing to pickle anything?
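(As a quick sanity check, the start method in use can be printed with multiprocessing.get_start_method(); on Linux, including WSL2, the default has historically been 'fork', while Windows and macOS default to 'spawn'.)

import multiprocessing

# 'fork' on Linux (including WSL2), 'spawn' on Windows and macOS
print(multiprocessing.get_start_method())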

Answered By: quantum.snowball