python multiprocessing – Sharing a dictionary of classes between processes with subsequent writes from the process reflected to the shared memory

Question

Problem

I need to share a dictionary between processes that contains an instance of a class inside of the value component of the key-value pair. The dictionary created using multiprocessing’s dict() from the manager class is able to store values, but subsequent writes to update the values aren’t reflected to the shared memory.

What I’ve tried

To attempt to solve this problem, I know I have to use a dict() created by a manager from python’s multiprocessing library so that it can be shared between processes. This works with simple values likes integers and strings. However, I had hoped that the created dictionary would handle deeper levels of synchronization for me so I could just create a class inside of the dictionary and that change would be reflected, but it seems multiprocessing is much more complicated than that.

Example

Below I have provided an example program that doesn’t work as intended. The printed values aren’t what they were set to be inside of the worker function f().

Note: I am using python3 for this example

from multiprocessing import Manager
import multiprocessing as mp
import random


class ExampleClass:
    def __init__(self, stringVar):
        # these variables aren't saved across processes?
        self.stringVar = stringVar
        self.count = 0


class ProcessContainer(object):
    processes = []

    def __init__(self, *args, **kwargs):
        manager = Manager()
        self.dict = manager.dict()

    def f(self, dict):
        # generate a random index to add the class to
        index = str(random.randint(0, 100))

        # create a new class at that index
        dict[index] = ExampleClass(str(random.randint(100, 200)))

        # this is the problem, it doesn't share the updated variables in the dictionary between the processes <----------------------
        # attempt to change the created variables
        dict[index].count += 1
        dict[index].stringVar = "yeAH"

        # print what's inside
        for x in dict.values():
            print(x.count, x.stringVar)

    def Run(self):
        # create the processes
        for str in range(3):
            p = mp.Process(target=self.f, args=(self.dict,))
            self.processes.append(p)

        # start the processes
        [proc.start() for proc in self.processes]

        # wait for the processes to finish
        [proc.join() for proc in self.processes]


if __name__ == '__main__':
    test = ProcessContainer()
    test.Run()

Asked By: saist

||

Source

Answer 1

This is a "gotcha" that holds a lot of surprises for the uninitiated. The problem is that when you have a managed dictionary, to see updates propagated you need to change a key or a value of a key. Here technically you have not changed the value, that is, you are still referencing the same object instance (type ExampleClass) and are only changing something within that reference. Bizarre, I know. This is the modified method f that you need:

def f(self, dict):
    # generate a random index to add the class to
    index = str(random.randint(0, 100))

    # create a new class at that index
    dict[index] = ExampleClass(str(random.randint(100, 200)))

    # this is the problem, it doesn't share the updated variables in the dictionary between the processes <----------------------
    # attempt to change the created variables
    ec = dict[index]
    ec.count += 1
    ec.stringVar = "yeAH"
    dict[index] = ec # show new reference
    # print what's inside
    for x in dict.values():
        print(x.count, x.stringVar)

Note:

Had you used the following code to set the key/pair values, the following would actually print False:

ec = ExampleClass(str(random.randint(100, 200)))
dict[index] = ec
print(dict[index] is ec)

This is why in the modifed method f, dict[index] = ec # show new reference appears to be a new reference being set as the value.

Also, you should consider not using dict, a builtin data type, as a variable name.

Answered By: Booboo

python multiprocessing – Sharing a dictionary of classes between processes with subsequent writes from the process reflected to the shared memory

Question:

Problem

What I’ve tried

Example

Answers: