# Find the index of the k smallest values of a numpy array

## Question:

In order to find the index of the smallest value, I can use `argmin`:

``````
import numpy as np
A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
print(A.argmin())     # 4 because A[4] = 0.1
``````

But how can I find the indices of the k-smallest values?

I’m looking for something like:

``````
print(A.argmin(numberofvalues=3))
# [4, 0, 7]  because A[4] <= A[0] <= A[7] <= all other A[i]
``````

Note: in my use case A has between ~10,000 and 100,000 values, and I’m only interested in the indices of the k=10 smallest values. k will never be > 10.

## Answers:

You can use `numpy.argsort` with slicing:

``````
>>> import numpy as np
>>> A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
>>> np.argsort(A)[:3]
array([4, 0, 7], dtype=int32)
``````

Use `np.argpartition`. It does not sort the entire array. It only guarantees that the `kth` element is in its sorted position and that all smaller elements are moved before it. Thus the first `k` entries of the result are the indices of the k smallest elements.

``````
import numpy as np

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
k = 3

idx = np.argpartition(A, k)
print(idx)
# [4 0 7 3 1 2 6 5]
``````

Indexing `A` with the first `k` of these indices returns the k smallest values. Note that these are not guaranteed to be in sorted order.

``````
print(A[idx[:k]])
# [ 0.1  1.   1.5]
``````
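If the indices are needed in ascending order of value, as in the `[4, 0, 7]` the question asks for, one option is to sort only the k selected entries afterwards. This is a minimal sketch, assuming `k < len(A)`; the names `smallest` and `smallest_sorted` are just illustrative.

```python
import numpy as np

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
k = 3

# Indices of the k smallest values, in no particular order.
smallest = np.argpartition(A, k)[:k]

# Sort only those k indices by the values they point to.
smallest_sorted = smallest[np.argsort(A[smallest])]
print(smallest_sorted.tolist())
# [4, 0, 7]
```

The extra `argsort` only touches k elements, so for small k its cost should be negligible next to the partition step.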

To obtain the k-largest values use

``````
idx = np.argpartition(A, -k)
# [4 0 7 3 1 2 6 5]

A[idx[-k:]]
# [  9.  17.  17.]
``````

WARNING: Do not (re)use `idx = np.argpartition(A, k); A[idx[-k:]]` to obtain the k-largest.
That won’t always work. For example, these are NOT the 3 largest values in `x`:

``````
x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
idx = np.argpartition(x, 3)
x[idx[-3:]]
# array([ 70,  80, 100])
``````
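For contrast, here is the same array partitioned around `-k`, which does place the k largest values in the last k slots. It is only a sketch; the order inside those slots is not guaranteed, so the values are sorted before printing.

```python
import numpy as np

x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
k = 3

# Partition around position -k so the k largest values land in the last k slots.
idx = np.argpartition(x, -k)
largest = x[idx[-k:]]

# Sort before printing because the order within the slice is unspecified.
print(np.sort(largest).tolist())
# [80, 90, 100]
```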

Here is a comparison against `np.argsort`, which also works but just sorts the entire array to get the result.

``````
In [2]: x = np.random.randn(100000)

In [3]: %timeit idx0 = np.argsort(x)[:100]
100 loops, best of 3: 8.26 ms per loop

In [4]: %timeit idx1 = np.argpartition(x, 100)[:100]
1000 loops, best of 3: 721 µs per loop

In [5]: np.all(np.sort(np.argsort(x)[:100]) == np.sort(np.argpartition(x, 100)[:100]))
Out[5]: True
``````
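Outside IPython, a similar comparison can be run with the standard `timeit` module. The sketch below makes its own assumptions (an arbitrary seed, 20 repetitions, `np.random.default_rng` for the data); the timings it prints will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability
x = rng.standard_normal(100_000)
k = 100
runs = 20  # arbitrary number of repetitions

# Time a full sort versus a partition, each followed by the same slice.
t_sort = timeit.timeit(lambda: np.argsort(x)[:k], number=runs)
t_part = timeit.timeit(lambda: np.argpartition(x, k)[:k], number=runs)
print(f"argsort:      {t_sort / runs * 1e3:.3f} ms per call")
print(f"argpartition: {t_part / runs * 1e3:.3f} ms per call")

# Both approaches should select the same set of indices (order may differ).
same = set(np.argsort(x)[:k].tolist()) == set(np.argpartition(x, k)[:k].tolist())
print(same)
```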

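`numpy.partition(your_array, k)` is an alternative if you need the values rather than the indices. It returns a partitioned copy in which the element at position `k` is in its sorted position and all smaller elements come before it. Those first `k` elements are not themselves sorted, so slice with `[:k]` (and sort the slice if order matters) to get the k smallest values.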
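A short sketch of the `np.partition` variant on the array from the question; the variable names are illustrative, and the final sort is only there to make the output deterministic.

```python
import numpy as np

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
k = 3

# Partitioned copy of the values: the k smallest occupy the first k slots.
part = np.partition(A, k)

# The first k slots are not ordered among themselves, so sort the slice.
k_smallest = np.sort(part[:k])
print(k_smallest.tolist())
# [0.1, 1.0, 1.5]
```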

For n-dimensional arrays, this function works well. The indices are returned as a tuple of index arrays that can be used directly to index the array. If you want a list of the indices instead, you need to transpose the array before converting it to a list.

To retrieve the `k` largest, simply pass in `-k`.

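``````
import numpy as np

def get_indices_of_k_smallest(arr, k):
    # Partition the flattened array; a negative k selects the k largest instead.
    idx = np.argpartition(arr.ravel(), k)
    # Convert flat indices back to n-d coordinates and keep the first k columns (or the last |k| when k < 0).
    return tuple(np.array(np.unravel_index(idx, arr.shape))[:, range(min(k, 0), max(k, 0))])
    # if you want it in a list of indices . . .
    # return np.array(np.unravel_index(idx, arr.shape))[:, range(k)].transpose().tolist()
``````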

Example:

``````
r = np.random.RandomState(1234)
arr = r.randint(1, 1000, 2 * 4 * 6).reshape(2, 4, 6)

indices = get_indices_of_k_smallest(arr, 4)
indices
# (array([1, 0, 0, 1], dtype=int64),
#  array([3, 2, 0, 1], dtype=int64),
#  array([3, 0, 3, 3], dtype=int64))

arr[indices]
# array([ 4, 31, 54, 77])

%%timeit
get_indices_of_k_smallest(arr, 4)
# 17.1 µs ± 651 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
``````
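As a variation on the list-of-indices idea, the sketch below returns index tuples for the k largest entries of an n-dimensional array. The helper name `k_largest_index_list` is hypothetical, and it assumes `1 <= k <= arr.size`.

```python
import numpy as np

def k_largest_index_list(arr, k):
    """Hypothetical helper: index tuples of the k largest entries of arr."""
    # Partition the flattened array so the k largest sit in the last k slots.
    flat_idx = np.argpartition(arr.ravel(), -k)[-k:]
    # Map the flat indices back to one coordinate array per dimension.
    coords = np.unravel_index(flat_idx, arr.shape)
    # Zip the coordinate arrays into one tuple per selected element.
    return list(zip(*(c.tolist() for c in coords)))

arr = np.array([[5, 1, 9],
                [7, 3, 8]])
pairs = k_largest_index_list(arr, 2)

# The order of the pairs is unspecified, so sort them for display.
print(sorted(pairs))
# [(0, 2), (1, 2)]
```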