Astroscrappy not working on multiprocessing Jupyter
Question:
I have some multiprocessing code here that tries to run multiple astroscrappy processes at once. However, everything stops when it actually has to call astroscrappy. I am running this in a Jupyter notebook.
def a_test(i, q):
    import astroscrappy
    print(1)
    path = i
    s = fits.getdata(path)
    print(2)
    print(2.5)
    a = astroscrappy.detect_cosmics(s)
    print(3)
    q.put([a, i])

bundle = []
import multiprocessing as mp
queue = mp.Manager().Queue()
processes = []
for k, item in enumerate(paths):
    processes.append(mp.Process(target=a_test, args=(item, queue)))
# Run processes
for p in processes:
    p.start()
for p in processes:
    bundle.append(queue.get())
It only prints 1, 2, and 2.5, but never 3, which comes after the astroscrappy call. Any ideas why it won't work?
Answers:
The code spawns multiple processes, and each process must be given time to finish before the parent moves on. Calling join() on each process does exactly that. I tested the code below with 3 files and was able to observe the processes running concurrently. The execution time is 7-9 seconds on my machine.
import time
import astroscrappy
from astropy.io import fits
import multiprocessing as mp

bundle = []
processes = []
paths = [r"C:\Users\xxxx\FITS\sample.fits", r"C:\Users\xxxx\FITS\sample1.fits", r"C:\Users\xxxx\FITS\sample2.fits"]

def a_test(i, q):
    print(1)
    path = i
    s = fits.getdata(path)
    print(2)
    print(2.5)
    a = astroscrappy.detect_cosmics(s)
    print(3)
    q.put([a, path])

if __name__ == '__main__':
    start = time.time()
    queue = mp.Manager().Queue()
    for item in paths:
        processes.append(mp.Process(target=a_test, args=(item, queue)))
    # Run processes
    for p in processes:
        p.start()
    for p in processes:
        bundle.append(queue.get())
    # Wait for child processes to finish
    for p in processes:
        p.join()
    end = time.time()
    print(f'Execution time: {end - start} seconds')
Output:
1
2
2.5
1
2
2.5
1
2
2.5
3
3
3
Execution time: 9.224327564239502 seconds
This code does not run in a Jupyter notebook out of the box. To read why multiprocessing does not work as expected in Jupyter notebooks, you can refer to this discussion.
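The core of the problem is that multiprocessing hands the target function to a child process by pickling a reference to it, and the child must then re-import that function by name; callables defined interactively in a notebook often cannot be delivered this way. A minimal stdlib-only sketch of that failure mode (the lambda stands in for any callable a spawned child cannot re-import):

```python
import pickle

def can_pickle(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

# A function defined at a module's top level pickles fine: it is
# serialized as a named reference that a child process can re-import.
def top_level(x):
    return x + 1

# A lambda (standing in for a callable the child cannot look up by
# name, e.g. one defined only in a notebook session) does not.
print(can_pickle(top_level))    # True
print(can_pickle(lambda x: x))  # False
```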
But there is a workaround. To make this code work in a Jupyter notebook, you need to define a_test in a separate Python file. For example, I created a Python file (in the same directory where your notebook is running) called functest.py with the following code:
import astroscrappy
from astropy.io import fits

def a_test(*args):
    path = args[0]
    s = fits.getdata(path)
    a = astroscrappy.detect_cosmics(s)
    return [a, path]
Now, run the code below in your notebook. Note that I've used a Pool instead of Process, and the output will not include the print statements from a_test (1, 2, 2.5, etc.). I removed them from a_test deliberately because they will not appear in the Jupyter notebook output. Instead, I've printed out bundle to verify the processing.
import time
import multiprocessing as mp
import functest as f

paths = [r"C:\Users\xxxx\FITS\sample.fits", r"C:\Users\xxxx\FITS\sample1.fits", r"C:\Users\xxxx\FITS\sample2.fits"]

def main():
    t1 = time.time()
    pool_size = len(paths)
    with mp.Pool(processes=pool_size) as pool:
        bundle = pool.map(f.a_test, paths)
    print(bundle)
    print(f"Execution time: {time.time() - t1} seconds")

if __name__ == '__main__':
    main()
Output:
[[(array([[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]]), array([[4.8158903, 4.808729 , 4.7792015, ..., 4.6927767, 4.7188225,
4.706318 ],
[4.765932 , 4.75....
Execution time: 8.333278179168701
Another alternative to multiprocessing is the concurrent.futures module, which runs in Jupyter notebooks without any issues. The code below can be run in a Jupyter notebook. I was able to bring the execution time down to 5-6 seconds.
import time
import astroscrappy
from astropy.io import fits
from concurrent.futures import ThreadPoolExecutor

paths = [r"C:\Users\xxxx\FITS\sample.fits", r"C:\Users\xxxx\FITS\sample1.fits", r"C:\Users\xxxx\FITS\sample2.fits"]

def a_test1(*args):
    print(1)
    path = args[0]
    s = fits.getdata(path)
    print(2)
    print(2.5)
    a = astroscrappy.detect_cosmics(s)
    print(3)
    return [a, path]

def main():
    t1 = time.time()
    n_threads = len(paths)
    with ThreadPoolExecutor(n_threads) as executor:
        futures = [executor.submit(a_test1, item) for item in paths]
        bundle = [future.result() for future in futures]
    print(bundle)
    print(f"Execution time: {time.time() - t1}")

if __name__ == '__main__':
    main()
Output:
1
1
1
2
2.5
2
2.5
2
2.5
3
3
3
[[(array([[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]]), array([[4.8158903, 4.808729 , 4.7792015, ..., 4.6927767, 4.7188225,
4.706318 ]....
Execution time: 5.936185836791992
Another option is to use the threading module. Here's a simple example which can be run in Jupyter. I get execution times on the order of 5-6 seconds.
import time
import astroscrappy
from astropy.io import fits
from threading import Thread

paths = [r"C:\Users\xxxx\FITS\sample.fits", r"C:\Users\xxxx\FITS\sample1.fits", r"C:\Users\xxxx\FITS\sample2.fits"]
thread_objs = []

def a_test(i):
    print(1)
    path = i
    s = fits.getdata(path)
    print(2)
    print(2.5)
    a = astroscrappy.detect_cosmics(s)
    print(3)

def main():
    t1 = time.time()
    for item in paths:
        thread_objs.append(Thread(target=a_test, args=(item,)))
    # Run each thread
    for thread in thread_objs:
        thread.start()
    # Wait for each thread to finish
    for thread in thread_objs:
        thread.join()
    print(f"Execution time: {time.time() - t1}")

main()
Output:
1
1
1
2
2.5
2
2.5
2
2.5
3
3
3
Execution time: 6.320343971252441
Note that if you don't need the result of a_test outside of that function, then you don't need to return anything from it. This saves further time.
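As an illustration of that fire-and-forget pattern (with a stdlib stand-in for the astroscrappy work, since the exact workload doesn't matter here): submit the tasks and let the executor's context manager wait for completion, collecting nothing.

```python
from concurrent.futures import ThreadPoolExecutor

processed = []  # side-effect target; stands in for e.g. saving a cleaned FITS file

def process(path):
    # Placeholder for fits.getdata + astroscrappy.detect_cosmics + save;
    # note there is no return value.
    processed.append(path)

paths = ["a.fits", "b.fits", "c.fits"]

# Fire-and-forget: submit the work and rely on the context manager's
# exit to wait for all tasks; no futures are kept, nothing is returned.
with ThreadPoolExecutor(max_workers=len(paths)) as executor:
    for p in paths:
        executor.submit(process, p)

print(sorted(processed))  # ['a.fits', 'b.fits', 'c.fits']
```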
I also ran some tests with joblib. I've refactored your code and shared some test results below:
import time
from joblib import Parallel, delayed
import astroscrappy
from astropy.io import fits

paths = [r"C:\Users\xxxxx\FITS\sample.fits", r"C:\Users\xxxxx\FITS\sample1.fits", r"C:\Users\xxxx\FITS\sample2.fits"]

def a_test(i):
    print(1)
    path = i
    s = fits.getdata(path)
    print(2)
    print(2.5)
    a = astroscrappy.detect_cosmics(s)
    print(3)
    return [a, i]

def main():
    t1 = time.time()
    a = Parallel(n_jobs=len(paths))(delayed(a_test)(i) for i in paths)
    print(f"Execution time: {time.time() - t1}")

main()
Output:
Execution time: 6.360047101974487
The print statement output didn't appear in the Jupyter notebook, but the execution time was in the range of 6-7 seconds.
In conclusion, I didn't observe a significant difference in execution times among these methods. This could be because I tested on a small dataset (just 3 files). However, concurrent.futures consistently showed slightly better results. You can try all these methods and compare which one works best for your use case.
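If you want to run the same comparison on your own data, a small harness along these lines keeps the measurement consistent across methods. The sleep-based dummy_task is a stand-in for the real fits.getdata + detect_cosmics call; sleeping releases the GIL, roughly as GIL-releasing C-extension work does, so threads can overlap it.

```python
import time
from concurrent.futures import ThreadPoolExecutor

paths = ["a.fits", "b.fits", "c.fits"]

def dummy_task(path):
    # Stand-in for the real per-file workload.
    time.sleep(0.2)
    return path

def timed(label, fn):
    """Run fn once and report the elapsed wall-clock time."""
    t0 = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - t0
    print(f"{label}: {elapsed:.2f} s")
    return elapsed

def run_sequential():
    return [dummy_task(p) for p in paths]

def run_threaded():
    with ThreadPoolExecutor(len(paths)) as ex:
        return list(ex.map(dummy_task, paths))

sequential = timed("sequential", run_sequential)
threaded = timed("threads", run_threaded)
# With GIL-releasing work, the threaded run approaches the longest single task.
```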
Using joblib’s Parallel, I was able to make this code work much faster without getting stuck.
def a_test_parallel(i):
    import astroscrappy
    print(1)
    path = i
    s = fits.getdata(path)
    print(2)
    print(2.5)
    a = astroscrappy.detect_cosmics(s)
    print(3)
    return [a, i]

a = Parallel(n_jobs=15)(delayed(a_test_parallel)(i) for i in paths[:100])
I ran it a couple of times against code that doesn't use Parallel, and it ran almost twice as fast. I'm still not sure why multiprocessing doesn't work, but at least this does.
Edit: After running it on a larger dataset with some extra adjustments, this code doesn't actually run twice as fast. It is in fact a bit slower than the synchronous code.
If you look at astroscrappy's GitHub repository (https://github.com/astropy/astroscrappy), you will see that it already parallelizes internally using OpenMP, a multithreading framework. Adding another layer of multiprocessing on top of such a function will yield marginal gains at best, or none at all. Your best bet is to stick with the speed you already have, which should be fairly quick.
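If astroscrappy's internal OpenMP threads end up competing with your own process or thread pool for cores, you can cap them with the standard OpenMP environment variable. OpenMP runtimes read it at startup, so it must be set before the library spins up its threads; the sketch below only shows setting the variable (the astroscrappy import is left commented out as an illustration).

```python
import os

# Cap the number of OpenMP worker threads. This is read by the OpenMP
# runtime when it starts, so set it before importing/first using an
# OpenMP-backed library.
os.environ["OMP_NUM_THREADS"] = "2"

# import astroscrappy  # each detect_cosmics call would now use at most 2 threads
print(os.environ["OMP_NUM_THREADS"])  # 2
```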