Python: Multiprocessing functions with parameters

Question:

I have some Python functions that read files from remote hosts and process their content. I have put all of these functions in one Python file. The problem is that each function doesn’t start until the previous one has finished, which wastes time. I thought about running the functions in parallel, so I searched and found an interesting solution that uses multiprocessing.

I have tried the accepted answer in:

Python: How can I run python functions in parallel?

from multiprocessing import Process

def func1():
   print('func1: starting')
   for i in range(10000000): pass
   print('func1: finishing')

def func2():
   print('func2: starting')
   for i in range(10000000): pass
   print('func2: finishing')

def runInParallel(*fns):
   proc = []
   for fn in fns:
      p = Process(target=fn)
      p.start()
      proc.append(p)
   for p in proc:
      p.join()

if __name__ == '__main__':
   runInParallel(func1, func2)

It works and gives this output:

$ python multi-process.py
func1: starting
func2: starting
func2: finishing
func1: finishing

But I need to pass parameters to functions. So I changed the example code into:

def func1(a): 
   ......

def func2(b):
   ......

if __name__ == '__main__':
   runInParallel(func1(1), func2(2))

But the output changed to:

$ python multi-process.py 
func1: starting
func1: finishing
func2: starting
func2: finishing

Now the functions no longer run in parallel, and I don’t know why.

Asked By: hd.


Answers:

runInParallel(func1(1), func2(2)) actually invokes the functions (synchronously, in the current process) and applies runInParallel to their return values, not to the functions.
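This evaluation order can be seen without multiprocessing at all. The following is a minimal sketch, not code from the question: `describe` is a hypothetical helper that only reports whether each argument it received is callable, and the body of `func1` is a stand-in because the question elides it.

```python
def func1(a):
    # Stand-in body (assumption); the real one is elided in the question.
    print('func1: running with', a)
    # No return statement, so calling func1 evaluates to None.

def describe(*fns):
    # Hypothetical helper: report whether each received argument is callable.
    return [callable(fn) for fn in fns]

# Passing the function object: nothing runs yet, and the callee gets a callable.
print(describe(func1))

# Calling the function in the argument list: func1 runs first, in this process,
# and the callee only receives its return value (None here).
print(describe(func1(1)))
```

If func1 and func2 return nothing, runInParallel therefore receives None for each argument. A Process created with target=None does nothing when started, which would explain why the output looks sequential and no error is raised.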

Instead, parameters to the functions should be passed through the args or kwargs parameters of Process(target=fn, ...), e.g. by modifying runInParallel to accept tuples of (function, function args, function kwargs), like so:

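from multiprocessing import Process

def runInParallel(*fns_params):
   # Each item is a (function, positional args tuple, keyword args dict) triple.
   proc = []
   for fn, fn_args, fn_kwargs in fns_params:
      p = Process(target=fn, args=fn_args, kwargs=fn_kwargs)
      p.start()
      proc.append(p)
   # Wait for every process to finish.
   for p in proc:
      p.join()

if __name__ == '__main__':
   # func2_args (a tuple) and func2_kwargs (a dict) are placeholders for func2's arguments.
   runInParallel(
      (func1, ('positional', 'argument', 'values'), {'name': 'value', 'argument': 'pairs'}),
      (func2, func2_args, func2_kwargs)
   )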
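Applied to the call from the question, where func1 and func2 each take one positional argument, this might look like the sketch below. The function bodies are stand-ins (the question elides them), and the values 1 and 2 are taken from the question's own call. Note the trailing comma in (1,): args expects a tuple, and (1) alone is just the integer 1.

```python
from multiprocessing import Process

def func1(a):
    # Stand-in body (assumption); the real one is elided in the question.
    print('func1: starting with', a)
    print('func1: finishing')

def func2(b):
    # Stand-in body (assumption), as above.
    print('func2: starting with', b)
    print('func2: finishing')

def runInParallel(*fns_params):
    # Each item is a (function, args tuple, kwargs dict) triple.
    procs = []
    for fn, fn_args, fn_kwargs in fns_params:
        p = Process(target=fn, args=fn_args, kwargs=fn_kwargs)
        p.start()
        procs.append(p)
    # Wait for all child processes to finish.
    for p in procs:
        p.join()

if __name__ == '__main__':
    # One-element tuples for the positional argument, empty dicts for kwargs.
    runInParallel((func1, (1,), {}), (func2, (2,), {}))
```

The order in which the two processes print their lines is not guaranteed and may differ between runs.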

Answered By: Yuri Feldman