Different results with np.searchsorted and np.argmin when finding nearest indexes

Question:

I have an array of timestamps (arr) and a list of start/end pairs (cuts). The goal is to extract the timestamps that fall between each start and end and collect them into a new array. I tried two methods, np.searchsorted() and np.argmin(), but they give different results. Is there an explanation for this?

Thank you!

Here is my code:

import numpy as np

# Initialization data  
arr = np.arange(761.55643, 1525.5704932002686, 1/ 1000)

cuts = [[810.211186646, 899.102014549], [903.520741867, 982.000921478], [985.201032795, 993.400610844], 
       [998.303881868, 1085.500698357], [1090.200656211, 1168.101925871], [1171.299249968, 1179.611318749], 
       [1184.610645285, 1271.597569677], [1275.600586067, 1363.696138556], [1368.301122947, 1455.500707533]]
# Function


vector_validity = np.zeros(len(arr))
new_arr_with_argmin = np.zeros(0)
for cut in cuts:
    vector_validity[int(np.searchsorted(arr, cut[0])) : int(np.searchsorted(arr, cut[1]))] = 1
    print(f"np.searchsorted start: {np.searchsorted(arr, cut[0])}")
    print(f"np.argmin start: {np.argmin(abs(arr - cut[0]))}")
    print(f"np.searchsorted end: {np.searchsorted(arr, cut[1])}")
    print(f"np.argmin end: {np.argmin(abs(arr - cut[1]))}")
    
    new_arr_with_argmin = np.concatenate((new_arr_with_argmin, arr[np.argmin(abs(arr - cut[0])) : np.argmin(abs(arr - cut[1]))]))
new_arr_with_searchsorted = arr[vector_validity == 1] 

The result of the print:


>     np.searchsorted start: 48655
>     np.argmin start: 48655
>     np.searchsorted end: 137546
>     np.argmin end: 137546
>     np.searchsorted start: 141965
>     np.argmin start: 141964
>     np.searchsorted end: 220445
>     np.argmin end: 220444
>     np.searchsorted start: 223645
>     np.argmin start: 223645
>     np.searchsorted end: 231845
>     np.argmin end: 231844
>     np.searchsorted start: 236748
>     np.argmin start: 236747
>     np.searchsorted end: 323945
>     np.argmin end: 323944
>     np.searchsorted start: 328645
>     np.argmin start: 328644
>     np.searchsorted end: 406546
>     np.argmin end: 406545
>     np.searchsorted start: 409743
>     np.argmin start: 409743
>     np.searchsorted end: 418055
>     np.argmin end: 418055
>     np.searchsorted start: 423055
>     np.argmin start: 423054
>     np.searchsorted end: 510042
>     np.argmin end: 510041
>     np.searchsorted start: 514045
>     np.argmin start: 514044
>     np.searchsorted end: 602140
>     np.argmin end: 602140
>     np.searchsorted start: 606745
>     np.argmin start: 606745
>     np.searchsorted end: 693945
>     np.argmin end: 693944

So starting from the second interval, the two methods can give different indexes.
Is there an explanation for this result?

Asked By: HMH1013

||

Answers:

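The argmin method finds the index of the closest value, which is not what searchsorted does: searchsorted returns the insertion index, i.e. the first position whose element is greater than or equal to the input value.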

Here’s a simple example:

In [130]: a = np.array([1, 2])

For inputs such as v=1.05 and v=1.95 (both between 1 and 2), the position returned by searchsorted(a, v) is 1:

In [131]: np.searchsorted(a, [1.05, 1.95])
Out[131]: array([1, 1])

Your method based on argmin does not give the same result for input values that are closer to 1 than 2:

In [137]: np.argmin(abs(a - 1.05))
Out[137]: 0

In [138]: np.argmin(abs(a - 1.5))
Out[138]: 0

In [139]: np.argmin(abs(a - 1.51))
Out[139]: 1

In [140]: np.argmin(abs(a - 1.95))
Out[140]: 1
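
If the nearest index is what you actually want, it can also be derived from searchsorted by comparing the two neighbours of the insertion point. The following is only a minimal sketch: the helper name nearest_index is hypothetical (not a NumPy function), and it assumes the array is sorted, has at least two elements, and contains no duplicates.

```python
import numpy as np

def nearest_index(sorted_arr, values):
    """Index of the element closest to each value (hypothetical helper).

    Assumes sorted_arr is sorted ascending with at least two elements.
    """
    sorted_arr = np.asarray(sorted_arr)
    values = np.asarray(values)
    # Insertion index: first position whose element is >= value.
    idx = np.searchsorted(sorted_arr, values)
    # Clip so that both idx - 1 and idx are valid positions.
    idx = np.clip(idx, 1, len(sorted_arr) - 1)
    left = sorted_arr[idx - 1]
    right = sorted_arr[idx]
    # Step back by one when the left neighbour is at least as close.
    # Ties go to the left neighbour, matching what argmin returns.
    return idx - (values - left <= right - values)

a = np.array([1, 2])
print(nearest_index(a, [1.05, 1.5, 1.51, 1.95]))  # [0 0 1 1]
```

This reproduces the argmin results above while keeping the binary search of searchsorted, which avoids computing abs(arr - v) over the whole array for every cut.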
Answered By: Warren Weckesser