python appending does not work while using multiprocessing

Question:

The problem is that appending "e" to my list "ok" has no effect, which is weird because print(e), just one line above ok.append(e), prints the value of "e" as it should.

There is no need to understand the program and what it does; the main issue here is just that appending a value to my list has no effect, even though the value is real.

I tried to define ok = [] inside of if __name__=='__main__': but that gave me the error NameError: name 'ok' is not defined. I then tried to use "global ok" inside of "some_function", but that gave me the same result.

import time
import multiprocessing as mp

ratios1 = [1/x for x in range(1,11)]
ratios2 = [y/1 for y in range(1,11)]

x = 283
y = 436

ok = []

def some_function(x_, y_):

    list_ = [[a, b] for a in range(1, 1980 + 1) for b in range(1, 1980 + 1) if a / b == x_ / y_]
    for e in list_:
        if not e[0] in [h[0] for h in ok]:
            if not e[1] in [u[1] for u in ok]:
                print(e)
                ok.append(e)


if __name__=='__main__':

    processes = []

    if x / y in ratios1 or x / y in ratios2:
        some_function(x_=x, y_=y)
    else:

        for X_, Y_ in [

            [x, y],
            [x - 1, y], [x, y - 1], [x + 1, y], [x, y + 1],
            [x - 2, y], [x, y - 2], [x + 2, y], [x, y + 2],
            [x - 3, y], [x, y - 3], [x + 3, y], [x, y + 3]

        ]:

            p = mp.Process(target=some_function, args=(X_,Y_))
            processes.append(p)

    start = time.time()

    for p_ in processes:
        p_.start()

    for p_ in processes:
        p_.join()

    end = time.time()

    print(f"finished in {end - start} sec")
    print(ok)

When running, this is the output:

[...]                                   # other values of "e"
[283, 433]                              # some random "e" value
[566, 866]                              # some random "e" value
[849, 1299]                             # some random "e" value
[1132, 1732]                            # some random "e" value
finished in 0.8476874828338623 sec      # execution time
[]                                      # the "ok" list being printed out at the end

After adding print(id(ok)) both in "some_function" and at the end of the script, I get the following output:

Note: I removed print(e) for this output

2489040444480
3014871358528
2324227431488
2471301880896
1803966487616
2531583073344
1665411652672
2149818113088
2330038901824
1283883998272
2498472320064
2147028311104
2509405887552
finished in 0.8341867923736572 sec
2589544128640
[]
Asked By: Neverland1337

||

Answers:

This should work. The problem is that when you start a process, the objects it uses are not shared with it; they are copied. Each process appends to its own copy of ok, and the parent's list stays empty. Using multiprocessing.Pool.starmap lets us return values from the worker processes, which circumvents this issue.
We use starmap and not just map so that we can pass multiple parameters to some_function.

import time
from multiprocessing import Pool

ratios1 = [1/x for x in range(1,11)]
ratios2 = [y/1 for y in range(1,11)]

x = 283
y = 436

def some_function(x_, y_):
    # Build and return a local list. Each worker has its own memory,
    # so appending to a global list would never reach the parent.
    return [[a, b] for a in range(1, 1980 + 1) for b in range(1, 1980 + 1) if a / b == x_ / y_]

def merge(results):
    # Apply the original uniqueness checks in the parent process,
    # where a single "ok" list sees the results of every call.
    ok = []
    for list_ in results:
        for e in list_:
            if not e[0] in [h[0] for h in ok]:
                if not e[1] in [u[1] for u in ok]:
                    ok.append(e)
    return ok

if __name__=='__main__':

    # Start the timer before branching so "start" and "end" always exist.
    start = time.time()

    if x / y in ratios1 or x / y in ratios2:
        results = [some_function(x_=x, y_=y)]
    else:
        with Pool(13) as p:
            # starmap returns one list per argument pair, in input order.
            results = p.starmap(some_function, [[x, y],
                [x - 1, y], [x, y - 1], [x + 1, y], [x, y + 1],
                [x - 2, y], [x, y - 2], [x + 2, y], [x, y + 2],
                [x - 3, y], [x, y - 3], [x + 3, y], [x, y + 3]])

    ok = merge(results)
    end = time.time()

    print(f"finished in {end - start} sec")
    print(ok)
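
To see the copying in isolation, here is a minimal sketch that is not taken from the question: the names results, record and record_and_return are made up for illustration. A child process appends to a module-level list, the parent's list stays empty, and returning the value through Pool.starmap is what gets it back.

```python
import multiprocessing as mp

results = []  # hypothetical module-level list, standing in for "ok"


def record(a, b):
    # In a child process this appends to the child's own copy of results.
    results.append([a, b])


def record_and_return(a, b):
    # Return the value instead, so the pool can send it back to the parent.
    return [a, b]


def main():
    p = mp.Process(target=record, args=(1, 2))
    p.start()
    p.join()
    print(results)  # still [] in the parent

    with mp.Pool(2) as pool:
        returned = pool.starmap(record_and_return, [(1, 2), (3, 4)])
    print(returned)  # one return value per argument tuple, in input order


if __name__ == "__main__":
    main()
```

Return values are pickled and sent back to the parent, so this only works for picklable results, which plain lists of integers are.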
Answered By: thomas

You need a list that can be accessed from more than one process. You create one with multiprocessing.Manager().list(), and you have to pass it to each process as an argument. You cannot rely on it being a global, because whether a child process inherits globals is OS-specific.

A managed list is slower than a normal list because every operation on it goes through inter-process communication (IPC), which is expensive. If you find the performance unacceptable, you should really try to work with only local variables and forget about using globals.

import time
import multiprocessing as mp

ratios1 = [1/x for x in range(1,11)]
ratios2 = [y/1 for y in range(1,11)]

x = 283
y = 436

def some_function(x_, y_, ok_list):

    list_ = [[a, b] for a in range(1, 1980 + 1) for b in range(1, 1980 + 1) if a / b == x_ / y_]
    for e in list_:
        if not e[0] in [h[0] for h in ok_list]:
            if not e[1] in [u[1] for u in ok_list]:
                print(e)
                ok_list.append(e)


if __name__=='__main__':
    manager = mp.Manager()
    ok_list = manager.list()
    processes = []

    if x / y in ratios1 or x / y in ratios2:
        some_function(x_=x, y_=y, ok_list=ok_list)
    else:

        for X_, Y_ in [

            [x, y],
            [x - 1, y], [x, y - 1], [x + 1, y], [x, y + 1],
            [x - 2, y], [x, y - 2], [x + 2, y], [x, y + 2],
            [x - 3, y], [x, y - 3], [x + 3, y], [x, y + 3]

        ]:

            p = mp.Process(target=some_function, args=(X_,Y_,ok_list))
            processes.append(p)

    start = time.time()

    for p_ in processes:
        p_.start()

    for p_ in processes:
        p_.join()

    end = time.time()

    print(f"finished in {end - start} sec")
    print(ok_list)
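
One caveat with a shared list: the "is it already there?" check and the append are two separate steps, so two processes can both pass the check before either of them appends. Below is a hedged sketch of guarding both steps with a lock. The helpers add_if_new and worker and the sample pairs are hypothetical, and the lock is an assumption added for illustration, not part of the code above.

```python
import multiprocessing as mp


def add_if_new(shared, lock, pair):
    # Hold the lock so the membership check and the append act as one step.
    with lock:
        snapshot = shared[:]  # copy the current contents in a single call
        if any(pair[0] == s[0] or pair[1] == s[1] for s in snapshot):
            return False
        shared.append(pair)
        return True


def worker(shared, lock, pairs):
    for pair in pairs:
        add_if_new(shared, lock, pair)


def main():
    with mp.Manager() as manager:
        shared = manager.list()
        lock = manager.Lock()
        # Made-up input: [1, 2] and [1, 5] share a first value, so only one is kept.
        jobs = [[[1, 2], [3, 4]], [[1, 5], [6, 7]]]
        procs = [mp.Process(target=worker, args=(shared, lock, j)) for j in jobs]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(shared[:])  # plain-list copy, taken before the manager shuts down


if __name__ == "__main__":
    main()
```

Which of two conflicting pairs ends up in the list depends on which process acquires the lock first, so the result can differ between runs.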
Answered By: Ahmed AEK