Different results with np.searchsorted and np.argmin when finding nearest indexes
Question:
I have an array of timestamps (arr) and a list of start/end pairs (cuts). The goal is to extract the timestamps that fall between each start and end and collect them in a new array. I have tried two methods, np.searchsorted() and np.argmin(), but they give different results. Is there an explanation for this?
Thank you!
Here is my code:
import numpy as np

# Initialization data
arr = np.arange(761.55643, 1525.5704932002686, 1 / 1000)
cuts = [[810.211186646, 899.102014549], [903.520741867, 982.000921478], [985.201032795, 993.400610844],
        [998.303881868, 1085.500698357], [1090.200656211, 1168.101925871], [1171.299249968, 1179.611318749],
        [1184.610645285, 1271.597569677], [1275.600586067, 1363.696138556], [1368.301122947, 1455.500707533]]

# Function
vector_validity = np.zeros(len(arr))
new_arr_with_argmin = np.zeros(0)
for cut in cuts:
    vector_validity[int(np.searchsorted(arr, cut[0])) : int(np.searchsorted(arr, cut[1]))] = 1
    print(f"np.searchsorted start: {np.searchsorted(arr, cut[0])}")
    print(f"np.argmin start: {np.argmin(abs(arr - cut[0]))}")
    print(f"np.searchsorted end: {np.searchsorted(arr, cut[1])}")
    print(f"np.argmin end: {np.argmin(abs(arr - cut[1]))}")
    new_arr_with_argmin = np.concatenate((new_arr_with_argmin, arr[np.argmin(abs(arr - cut[0])) : np.argmin(abs(arr - cut[1]))]))
new_arr_with_searchsorted = arr[vector_validity == 1]
The result of the print:
> np.searchsorted start: 48655
> np.argmin start: 48655
> np.searchsorted end: 137546
> np.argmin end: 137546
> np.searchsorted start: 141965
> np.argmin start: 141964
> np.searchsorted end: 220445
> np.argmin end: 220444
> np.searchsorted start: 223645
> np.argmin start: 223645
> np.searchsorted end: 231845
> np.argmin end: 231844
> np.searchsorted start: 236748
> np.argmin start: 236747
> np.searchsorted end: 323945
> np.argmin end: 323944
> np.searchsorted start: 328645
> np.argmin start: 328644
> np.searchsorted end: 406546
> np.argmin end: 406545
> np.searchsorted start: 409743
> np.argmin start: 409743
> np.searchsorted end: 418055
> np.argmin end: 418055
> np.searchsorted start: 423055
> np.argmin start: 423054
> np.searchsorted end: 510042
> np.argmin end: 510041
> np.searchsorted start: 514045
> np.argmin start: 514044
> np.searchsorted end: 602140
> np.argmin end: 602140
> np.searchsorted start: 606745
> np.argmin start: 606745
> np.searchsorted end: 693945
> np.argmin end: 693944
So starting from the second interval, the two methods give different indexes.
Is there an explanation for this result?
Answers:
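The argmin method finds the index of the closest value, which is not what searchsorted does. searchsorted returns the insertion index that keeps the array sorted: with the default side='left', that is the index of the first element greater than or equal to the value. For a value that falls strictly between two elements, the two results therefore agree when the value is closer to the element above it, and differ by one when it is closer to the element below it.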
Here’s a simple example:
In [130]: a = np.array([1, 2])
For inputs such as v=1.05 and v=1.95 (both between 1 and 2), the position returned by searchsorted(a, v) is 1:
In [131]: np.searchsorted(a, [1.05, 1.95])
Out[131]: array([1, 1])
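To make that result concrete, here is a small sketch (not part of the original session) that checks one way of reading the default side='left' behaviour: the returned index equals the number of elements strictly less than v. The name count_less is only illustrative.

```python
import numpy as np

a = np.array([1, 2])
v = np.array([1.05, 1.95])

# Default side='left': insertion index that keeps `a` sorted.
idx = np.searchsorted(a, v)

# Same numbers obtained by counting, for each v, the elements of `a`
# that are strictly less than v (illustrative cross-check only).
count_less = (a[None, :] < v[:, None]).sum(axis=1)

print(idx)         # [1 1]
print(count_less)  # [1 1]
```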
Your method based on argmin does not give the same result for input values that are closer to 1 than 2:
In [137]: np.argmin(abs(a - 1.05))
Out[137]: 0
In [138]: np.argmin(abs(a - 1.5))
Out[138]: 0
In [139]: np.argmin(abs(a - 1.51))
Out[139]: 1
In [140]: np.argmin(abs(a - 1.95))
Out[140]: 1
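If what you actually want is the nearest index, but without scanning the whole array for every cut as argmin does, one possible approach is to take the searchsorted position and compare the two neighbouring elements. The following is a minimal sketch under some assumptions: nearest_index is a hypothetical helper (not a NumPy function), the array is sorted in ascending order, and it has at least two elements.

```python
import numpy as np

def nearest_index(sorted_arr, values):
    """Index of the element of sorted_arr closest to each value.

    Hypothetical helper; assumes sorted_arr is ascending with len >= 2.
    """
    sorted_arr = np.asarray(sorted_arr)
    values = np.asarray(values)
    # Insertion point: first index whose element is >= value.
    right = np.searchsorted(sorted_arr, values)
    # Clip so that both neighbours (right - 1 and right) are valid indexes.
    right = np.clip(right, 1, len(sorted_arr) - 1)
    left = right - 1
    # Choose the left neighbour when it is at least as close. Ties go left,
    # which matches np.argmin returning the first minimum.
    use_left = (values - sorted_arr[left]) <= (sorted_arr[right] - values)
    return np.where(use_left, left, right)

a = np.array([1, 2])
print(nearest_index(a, [1.05, 1.5, 1.51, 1.95]))  # [0 0 1 1]
```

On the small array above this reproduces the argmin results. Whether nearest-index or insertion-index slicing is the right choice for your cuts depends on whether a timestamp just before a start should be included.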