multiprocessing: map vs map_async
Question:
What’s the difference between using map
and map_async
? Are they not running the same function after distributing the items from the list to 4 processes?
So is it wrong to presume both are running asynchronous and parallel?
def f(x):
return 2*x
p=Pool(4)
l=[1,2,3,4]
out1=p.map(f,l)
#vs
out2=p.map_async(f,l)
Answers:
There are four choices to mapping jobs to processes. You have to consider multi-args, concurrency, blocking, and ordering. map
and map_async
only differ with respect to blocking. map_async
is non-blocking where as map
is blocking
So let’s say you had a function
from multiprocessing import Pool
import time
def f(x):
print x*x
if __name__ == '__main__':
pool = Pool(processes=4)
pool.map(f, range(10))
r = pool.map_async(f, range(10))
# DO STUFF
print 'HERE'
print 'MORE'
r.wait()
print 'DONE'
Example output:
0
1
9
4
16
25
36
49
64
81
0
HERE
1
4
MORE
16
25
36
9
49
64
81
DONE
pool.map(f, range(10))
will wait for all 10 of those function calls to finish so we see all the prints in a row.
r = pool.map_async(f, range(10))
will execute them asynchronously and only block when r.wait()
is called so we see HERE
and MORE
in between but DONE
will always be at the end.
What’s the difference between using map
and map_async
? Are they not running the same function after distributing the items from the list to 4 processes?
So is it wrong to presume both are running asynchronous and parallel?
def f(x):
return 2*x
p=Pool(4)
l=[1,2,3,4]
out1=p.map(f,l)
#vs
out2=p.map_async(f,l)
There are four choices to mapping jobs to processes. You have to consider multi-args, concurrency, blocking, and ordering. map
and map_async
only differ with respect to blocking. map_async
is non-blocking where as map
is blocking
So let’s say you had a function
from multiprocessing import Pool
import time
def f(x):
print x*x
if __name__ == '__main__':
pool = Pool(processes=4)
pool.map(f, range(10))
r = pool.map_async(f, range(10))
# DO STUFF
print 'HERE'
print 'MORE'
r.wait()
print 'DONE'
Example output:
0
1
9
4
16
25
36
49
64
81
0
HERE
1
4
MORE
16
25
36
9
49
64
81
DONE
pool.map(f, range(10))
will wait for all 10 of those function calls to finish so we see all the prints in a row.
r = pool.map_async(f, range(10))
will execute them asynchronously and only block when r.wait()
is called so we see HERE
and MORE
in between but DONE
will always be at the end.