Vectorizing code to make it faster
Question:
I have a small piece of code that I need to vectorize to make it faster. I'm not very experienced with Python, and I suspect the for loop is inefficient.
Is there any way to reduce the time?
import numpy as np
import time
start = time.time()
N = 10000000 #9 seconds
#N = 100000000 #93 seconds
alpha = np.linspace(0.00000000000001, np.pi/2, N)
tmp = 2.47*np.sin(alpha)
for i in range(N):
    if (abs(tmp[i])>1.0):
        tmp[i]=1.0*np.sign(tmp[i])
beta = np.arcsin(tmp)
end = time.time()
print("Executed time: ",round(end-start,1),"Seconds")
I have read about some NumPy functions, but I couldn't find a solution for this.
Answers:
Instead of looping with a condition, you can compute a boolean mask and use it to select the values to change. Here is an example:
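import numpy as np

N = 10000000
alpha = np.linspace(0.00000000000001, np.pi/2, N)
tmp = 2.47*np.sin(alpha)
# Boolean mask of the elements whose magnitude exceeds 1
indices = np.abs(tmp) > 1.0
# Replace only those elements with their sign (+1.0 or -1.0)
tmp[indices] = np.sign(tmp[indices])
beta = np.arcsin(tmp)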
Results on my setup:
- before: 5.66 s ± 30.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- after: 182 ms ± 877 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
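To check that the mask gives the same values as the original loop, something like the following sketch can be used. The small N and the names looped, masked, and same are only illustrative assumptions, chosen so the loop finishes instantly; they are not part of the original post.

```python
import numpy as np

# Small N (an assumption for this check) so the Python loop runs instantly.
N = 1_000
alpha = np.linspace(1e-14, np.pi / 2, N)
tmp = 2.47 * np.sin(alpha)

# Reference result: the element-by-element loop from the question.
looped = tmp.copy()
for i in range(N):
    if abs(looped[i]) > 1.0:
        looped[i] = 1.0 * np.sign(looped[i])

# Vectorized result: boolean mask plus masked assignment.
masked = tmp.copy()
mask = np.abs(masked) > 1.0
masked[mask] = np.sign(masked[mask])

# Both approaches should produce identical arrays.
same = np.array_equal(looped, masked)
beta = np.arcsin(masked)
print("loop and mask agree:", same)
```

Since alpha only spans 0 to pi/2 here, tmp is never negative, so the sign is always +1 for this particular input; the mask version still handles negative values the same way the loop does.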
Clip the array:
tmp = np.clip(2.47 * np.sin(alpha), -1.0, 1.0)
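As a rough sketch of how np.clip compares with a boolean-mask assignment, the snippet below checks that both give the same array and also shows the out= argument of np.clip for clipping in place. The small array size and the variable names are assumptions made for illustration only.

```python
import numpy as np

# Small array (an assumption) so the comparison is quick.
alpha = np.linspace(1e-14, np.pi / 2, 1_000)
tmp = 2.47 * np.sin(alpha)

# Boolean-mask version: overwrite out-of-range values with their sign.
masked = tmp.copy()
mask = np.abs(masked) > 1.0
masked[mask] = np.sign(masked[mask])

# np.clip version: limit every value to the interval [-1.0, 1.0].
clipped = np.clip(tmp, -1.0, 1.0)
agree = np.array_equal(masked, clipped)

# Optional: clip in place to avoid allocating a second large array.
np.clip(tmp, -1.0, 1.0, out=tmp)
beta = np.arcsin(tmp)
print("mask and clip agree:", agree)
```

For large N the in-place form may be worth considering, since it avoids holding an extra copy of the array in memory.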