How can I append to class variables using multiprocessing in python?

Question

I have a program in which everything is built into a class. One method runs 50 computations of another function, each with a different input, so I decided to use multiprocessing to speed it up. However, the list that should be returned at the end is always empty. Any ideas? Here is a simplified version of my problem. The output of main_function() should be a list containing the numbers 0-9, but the list comes back empty.

import multiprocessing

class MyClass(object):
    def __init__(self):
        self.arr = list()

    def helper_function(self, n):
        self.arr.append(n)

    def main_function(self):
        jobs = []

        for i in range(0,10):
            p = multiprocessing.Process(target=self.helper_function, args=(i,))
            jobs.append(p)
            p.start()

        for job in jobs:
            job.join()

        print(self.arr)

Asked By: Randy Maldonado

Answer 1

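arr is a plain list, so it is not shared between processes: each child process appends to its own copy of the object, and the parent's list stays empty.

To share it, you have to use a Manager object to create a managed list, which is a proxy that is aware of the fact that it's shared between processes.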

The key is:

self.arr = multiprocessing.Manager().list()

Full working example:

import multiprocessing

class MyClass(object):
    def __init__(self):
        self.arr = multiprocessing.Manager().list()

    def helper_function(self, n):
        self.arr.append(n)

    def main_function(self):
        jobs = []

        for i in range(0,10):
            p = multiprocessing.Process(target=self.helper_function, args=(i,))
            jobs.append(p)
            p.start()

        for job in jobs:
            job.join()

        print(self.arr)

if __name__ == "__main__":
    a = MyClass()
    a.main_function()

This code now prints: [7, 9, 2, 8, 6, 0, 4, 3, 1, 5]

(Of course, the order cannot be relied on between executions, but all the numbers are there, which means that every process contributed to the result.)
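
To see the difference side by side, here is a minimal sketch that runs the same worker against a plain list and then against a managed list. The names append_value and run_workers are made up for this illustration and are not part of the original code; the list is passed as an argument rather than stored on a class only to keep the example short.

```python
import multiprocessing


def append_value(target, n):
    # Runs in a child process; `target` is whatever list-like object was passed in.
    # (Hypothetical helper for this sketch.)
    target.append(n)


def run_workers(target, count=5):
    # Start one process per value, then wait for all of them to finish.
    # (Hypothetical helper for this sketch.)
    jobs = [
        multiprocessing.Process(target=append_value, args=(target, i))
        for i in range(count)
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()


if __name__ == "__main__":
    # Plain list: each child appends to its own copy, so the parent sees nothing.
    plain = []
    run_workers(plain)
    print(plain)  # prints []

    # Managed list: the children talk to the manager process through a proxy.
    with multiprocessing.Manager() as manager:
        shared = manager.list()
        run_workers(shared)
        # Sorted because the arrival order is not deterministic.
        print(sorted(shared))  # prints [0, 1, 2, 3, 4]
```

Every append on the managed list is a round trip to the manager process, so this is convenient for a handful of results but may be slow if the workers append very often.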

Answered By: Jean-François Fabre

Answer 2

multiprocessing is touchy.

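For simple parallel tasks, I would recommend the thread pool from multiprocessing.dummy. It exposes the same API as multiprocessing but runs on threads, which share memory, so self.arr can stay a plain list: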

from multiprocessing.dummy import Pool as ThreadPool


class MyClass(object):
    def __init__(self):
        self.arr = list()

    def helper_function(self, n):
        self.arr.append(n)

    def main_function(self):
        pool = ThreadPool(4)
        pool.map(self.helper_function, range(10))
        print(self.arr)


if __name__ == '__main__':
    c = MyClass()
    c.main_function()

The idea of using map instead of complicated multithreading calls is from one of my favorite blog posts: https://chriskiehl.com/article/parallelism-in-one-line
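
One thing worth checking for yourself is that multiprocessing.dummy really does run on threads inside the current process. The sketch below is only an illustration: tag_with_pid and run_in_thread_pool are hypothetical names, not part of the answer's code. It also returns the values from map instead of appending to a shared list, which keeps the results in input order.

```python
import os
from multiprocessing.dummy import Pool as ThreadPool


def tag_with_pid(n):
    # Return the input together with the PID of the process that handled it.
    # (Hypothetical helper for this sketch.)
    return n, os.getpid()


def run_in_thread_pool(values, workers=4):
    # map() returns the results in the same order as the inputs.
    # (Hypothetical helper for this sketch.)
    with ThreadPool(workers) as pool:
        return pool.map(tag_with_pid, values)


if __name__ == "__main__":
    results = run_in_thread_pool(range(10))
    print([n for n, _ in results])  # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    # Every task reports the parent's own PID, so no extra processes were used.
    print({pid for _, pid in results} == {os.getpid()})  # prints True
```

Because these are threads, CPU-bound work in pure Python is still limited by the GIL in standard CPython builds; this approach tends to help most when the 50 computations spend their time waiting on I/O.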

Answered By: James Gabriel
