numpy get start, exact, end of convergence

Question

I have a numpy array of floats:

[..., 50.0, 51.0, 52.2, ..., 59.3, 60.4, 61.3, 62.1, ..., 67.9, 68.1, 69.2, ...]

You can see that the numbers are first converging to 60, and then diverging from it. There is a range: from 52.0 to 68.0; in the middle of this range is 60.0; 60.0 is called the exact, 52.0 is called the start, 68.0 is called the end.

I’m trying to find the indexes of start, exact, and end. In the array above: 52.2 is the start, 60.4 is the exact, 67.9 is the end. You can see that 52.2 is not exactly 52.0, but it is still the start because it’s closest to 52.0; 60.4 is not quite the same as 60.0, but it’s closest to 60.0 so it works as exact; same goes for 67.9, which is closest to 68.0.

Is there a way to find the indexes of start, exact, end with numpy? What complicates the problem a bit is that the floats in the array can be descending instead of ascending:

[..., 69.2, 68.1, 67.9, ..., 62.1, 61.3, 60.4, 59.3, ..., 52.2, 51.0, 50.0, ...]

In this case: 67.9 is the start, 60.4 is the exact, and 52.2 is the end.

So the numbers can be ascending or descending, and in any case I need to find the indexes.

I can more or less easily do it with loops, keeping track of the current and next number, comparing them, and seeing whether the values are converging to 60.0 and then diverging from it. But I have lots of numbers; in fact, I have an array of arrays, and I need the indexes for each one. That will be very slow with plain loops. And I’m a newbie in numpy, so I’m not sure how to vectorize it.

Please help if you can. If you need more information, ask in the comments and I will answer as quickly as possible. Thank you.

UPDATE

Thank you Sajad Safarveisi! Your solution works, and you are absolutely awesome!

However, it turns out that my arrays are not that simple. Sometimes an array contains multiple ranges, like this:

[..., 52.2, ..., 60.4, ..., 67.9, ..., 52.4, ..., 60.0, ..., 67.7, ..., 52.0, ..., 60.6, ..., 67.3, ...]

So in this case I would need a list of lists of indexes:

[
    [start0, exact0, end0],
    [start1, exact1, end1],
    [start2, exact2, end2],
]

Sajad, do you know a way to do this? I’d appreciate it. The problem is that different arrays can have a different number of ranges: sometimes it can be 1 range, sometimes 3, and sometimes 0. I’m sorry for not posting the whole problem.

Asked By: acmpo6ou

Answer 1

You can try the following to get the indexes for each array.

(1) Define a 1*3 numpy array that holds the start, exact, and end.

import numpy as np
bound = np.array([[52, 60, 68]])

(2) Now, assume that the numpy array for which the indexes should be found is as follows:

arr = np.array([50.0, 51.0, 60, 59.3, 60.4, 61.3, 62.1, 67.9, 68.1, 52.2])

(3) Finally, use broadcasting together with the argmin method of a numpy array:

np.abs((arr[:, np.newaxis] - bound)).argmin(axis=0)

The result is

array([9, 2, 7])

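The first entry is the index of the entry closest to 52 (start), the second is the index of the entry closest to 60 (exact), and the third is the index of the entry closest to 68 (end). As you can see, the order in which the entries appear in arr does not matter for this implementation. The only exception is a tie: when two entries are equally close to a target (as 67.9 and 68.1 are to 68), argmin returns the index of the first one.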
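To make the broadcasting step easier to follow, here is a small sketch that prints the intermediate shape and then handles a descending array. The descending array desc and the idea of flipping the targets with bound[:, ::-1] are my own illustration based on the question's description (start near 68, end near 52 when the values descend), not something the asker confirmed.

```python
import numpy as np

bound = np.array([[52, 60, 68]])   # shape (1, 3): start, exact, end
arr = np.array([50.0, 51.0, 60, 59.3, 60.4, 61.3, 62.1, 67.9, 68.1, 52.2])

# (10, 1) - (1, 3) broadcasts to (10, 3): one column of distances per target
diff = np.abs(arr[:, np.newaxis] - bound)
idx = diff.argmin(axis=0)          # row index of the smallest distance per column
print(diff.shape)                  # (10, 3)
print(idx)                         # [9 2 7]

# Illustrative descending array (an assumption, not data from the question).
desc = np.array([69.2, 67.9, 62.1, 61.3, 60.4, 59.3, 52.2, 51.0, 50.0])

# For descending data the question treats the value near 68 as the start,
# so the targets are flipped to (68, 60, 52) before searching.
idx_desc = np.abs(desc[:, np.newaxis] - bound[:, ::-1]).argmin(axis=0)
print(idx_desc)                    # [1 4 6]
```

Each column of diff holds the distances to one target, which is why argmin runs over axis 0.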

Answered By: Sajad Safarveisi

Answer 2

Regarding the extension you made (in your update), you can do the following which is an extension to my previous answer.

(1) Assume that you have three ranges for the sequence at hand (the same sequence as in my previous answer). This time, bound (which holds the different ranges) will be a 3 * 3 matrix. For the general case of r different ranges, you end up with an r * 3 matrix. So, when r = 3, we have

bound = np.array([[52, 60, 68], [60, 63, 69], [51.1, 61, 66]])

(2) The trick here is to reshape the matrix so that it has 3 dimensions.

bound = bound.reshape(3, 1, 3)

Note that the first argument should always be r (number of different ranges).

(3) Now, the rest is the same with one difference. The axis is now 1 when we call argmin, because the reshape added a new leading dimension and the entries of arr now run along axis 1.

arr = np.array([50.0, 51.0, 60, 59.3, 60.4, 61.3, 62.1, 67.9, 68.1, 52.2])
np.abs((arr[:, np.newaxis] - bound)).argmin(axis=1)

The resulting numpy array has shape (r, 3), the same as bound before it was reshaped: one row of indexes per range.

array([[9, 2, 7],
       [2, 6, 8],
       [1, 5, 7]])
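As a sanity check on the broadcasting, the following sketch compares the vectorized result with the single-range method applied to one row of targets at a time. The names diff, vectorized, and looped are only for illustration.

```python
import numpy as np

arr = np.array([50.0, 51.0, 60, 59.3, 60.4, 61.3, 62.1, 67.9, 68.1, 52.2])
bound = np.array([[52, 60, 68], [60, 63, 69], [51.1, 61, 66]])   # shape (3, 3)

# Vectorized: (10, 1) against (3, 1, 3) broadcasts to (3, 10, 3)
diff = np.abs(arr[:, np.newaxis] - bound.reshape(-1, 1, 3))
vectorized = diff.argmin(axis=1)   # argmin over the 10 entries of arr

# Reference: handle one row of targets at a time, as in the single-range case
looped = np.array(
    [np.abs(arr[:, np.newaxis] - row).argmin(axis=0) for row in bound]
)

print(diff.shape)                            # (3, 10, 3)
print(np.array_equal(vectorized, looped))    # True
```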

To sum up, you can use the following Python function, which takes care of all the steps described above.

def index(arr: np.ndarray, ranges: np.ndarray) -> np.ndarray:
    """Return, for each row of 'ranges', the indexes in 'arr' closest to start, exact, and end."""
    assert ranges.ndim == 2 and ranges.shape[1] == 3, "'ranges' should have shape (r, 3)"
    assert arr.ndim == 1, "'arr' should be uni-dimensional"
    # (n, 1) against (r, 1, 3) broadcasts to (r, n, 3); argmin runs over the n axis
    ranges = ranges.reshape(ranges.shape[0], 1, 3)
    return np.abs(arr[:, np.newaxis] - ranges).argmin(axis=1)

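Example (note that ranges must be the original r * 3 matrix, not the reshaped one):

ranges = np.array([[52, 60, 68], [60, 63, 69], [51.1, 61, 66]])
index(arr=arr, ranges=ranges)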

array([[9, 2, 7],
       [2, 6, 8],
       [1, 5, 7]])
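The question also mentions an array of arrays. If all of them have the same length, so that they stack into one 2-D array, the same idea can be applied to the whole batch at once. This is only a sketch under that assumption; index_batch and the sample data are hypothetical names and values of mine.

```python
import numpy as np

def index_batch(arrs: np.ndarray, targets: np.ndarray) -> np.ndarray:
    """For each row of arrs, return the index of the entry closest to each target."""
    arrs = np.asarray(arrs, dtype=float)         # shape (m, n), assumed rectangular
    targets = np.asarray(targets, dtype=float)   # shape (k,)
    # (m, n, 1) - (k,) broadcasts to (m, n, k); argmin runs over the n axis
    return np.abs(arrs[:, :, np.newaxis] - targets).argmin(axis=1)

# Made-up sample data: one ascending and one descending row
arrs = np.array([
    [50.0, 52.2, 59.3, 60.4, 67.9, 69.2],
    [69.2, 67.9, 60.4, 59.3, 52.2, 50.0],
])
result = index_batch(arrs, np.array([52, 60, 68]))
print(result)
# [[1 3 4]
#  [4 2 1]]
```

The intermediate array has m * n * k elements, so for very large inputs it may be worth processing the rows in chunks. Arrays of different lengths would not stack this way and would need padding or a per-array call.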

Answered By: Sajad Safarveisi
