Is it possible to use argsort in descending order?

Question:

Consider the following code:

avgDists = np.array([1, 8, 6, 9, 4])
ids = avgDists.argsort()[:n]

This gives me indices of the n smallest elements. Is it possible to use this same argsort in descending order to get the indices of n highest elements?

Asked By: shn

Answers:

As with ordinary Python sequences, [::-1] reverses the array returned by argsort(), and [:n] then takes the first n elements of the reversed array, i.e. the indices of the n largest values:

>>> avgDists=np.array([1, 8, 6, 9, 4])
>>> n=3
>>> ids = avgDists.argsort()[::-1][:n]
>>> ids
array([3, 1, 2])

The advantage of this method is that ids is a view of the index array returned by argsort(), so the reversing and slicing copy no data:

>>> ids.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

(The ‘OWNDATA’ being False indicates this is a view, not a copy)
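As a minimal sketch of what the flags imply, the check below uses np.shares_memory to see which array ids actually shares its buffer with. The intermediate name order is introduced here only for illustration:

```python
import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])
order = avgDists.argsort()   # freshly allocated index array
ids = order[::-1][:3]        # reversed, then truncated: a view of `order`

print(ids.flags['OWNDATA'])             # False: ids does not own its data
print(np.shares_memory(ids, order))     # True: it shares memory with the argsort result
print(np.shares_memory(ids, avgDists))  # False: it does not alias the original values
```

So the memory that is saved is the extra copy that negation would create; argsort itself still allocates one index array.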

Another way to do this is something like:

(-avgDists).argsort()[:n]

The drawback is that this works by negating every element of the array:

>>> (-avgDists)
array([-1, -8, -6, -9, -4])

And it creates a copy to do so:

>>> (-avgDists).flags['OWNDATA']
True

So if you time each, with this very small data set:

>>> import timeit
>>> timeit.timeit('(-avgDists).argsort()[:3]', setup="from __main__ import avgDists")
4.2879798610229045
>>> timeit.timeit('avgDists.argsort()[::-1][:3]', setup="from __main__ import avgDists")
2.8372560259886086

The view method is substantially faster on this small array, and it avoids allocating a negated copy of the data.

Answered By: dawg

If you negate an array, the lowest elements become the highest elements and vice-versa. Therefore, the indices of the n highest elements are:

(-avgDists).argsort()[:n]

Another way to reason about this, as mentioned in the comments, is to observe that the big elements are coming last in the argsort. So, you can read from the tail of the argsort to find the n highest elements:

avgDists.argsort()[::-1][:n]

Both methods are O(n log n) in time complexity, because the argsort call is the dominant term here. But the second approach has a nice advantage: it replaces an O(n) negation of the array with an O(1) slice. If you’re working with small arrays inside loops then you may get some performance gains from avoiding that negation, and if you’re working with huge arrays then you can save on memory usage because the negation creates a copy of the entire array.

Note that these methods do not always give equivalent results: if a stable sort is requested from argsort, e.g. by passing the keyword argument kind='mergesort', then the first strategy preserves the sorting stability, but the second strategy breaks it (i.e. the positions of equal items are reversed).

Example timings:

Using a small array of 100 floats and a length 30 tail, the view method was about 15% faster:

>>> avgDists = np.random.rand(100)
>>> n = 30
>>> timeit (-avgDists).argsort()[:n]
1.93 µs ± 6.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> timeit avgDists.argsort()[::-1][:n]
1.64 µs ± 3.39 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> timeit avgDists.argsort()[-n:][::-1]
1.64 µs ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

For larger arrays, the argsort is dominant and there is no significant timing difference:

>>> avgDists = np.random.rand(1000)
>>> n = 300
>>> timeit (-avgDists).argsort()[:n]
21.9 µs ± 51.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> timeit avgDists.argsort()[::-1][:n]
21.7 µs ± 33.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> timeit avgDists.argsort()[-n:][::-1]
21.9 µs ± 37.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Please note that the comment from nedim below is incorrect. Whether to truncate before or after reversing makes no difference in efficiency, since both of these operations are only striding a view of the array differently and not actually copying data.
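A quick way to check this is to compare the two slicing orders directly. This is only a sketch; the seed and the names reverse_then_cut and cut_then_reverse are chosen here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
avgDists = rng.random(1000)
n = 300

order = avgDists.argsort()
reverse_then_cut = order[::-1][:n]   # reverse first, then keep the first n
cut_then_reverse = order[-n:][::-1]  # keep the last n, then reverse

# Same indices either way, and neither result owns its data (both are views).
print(np.array_equal(reverse_then_cut, cut_then_reverse))  # True
print(reverse_then_cut.flags['OWNDATA'], cut_then_reverse.flags['OWNDATA'])  # False False
```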

Answered By: wim

You could create a copy of the array and multiply each element by -1.
As a result, the elements that were largest become the smallest.
The indices of the n smallest elements in the copy are the indices of the n largest elements in the original.

Answered By: MentholBonbon

Instead of using np.argsort you could use np.argpartition – if you only need the indices of the lowest/highest n elements.

This does not require sorting the whole array, only the part you need. Note, however, that the order inside the partition is undefined, so while it returns the correct indices, they may not be in sorted order:

>>> avgDists = [1, 8, 6, 9, 4]
>>> np.array(avgDists).argpartition(2)[:2]  # indices of lowest 2 items
array([0, 4], dtype=int64)

>>> np.array(avgDists).argpartition(-2)[-2:]  # indices of highest 2 items
array([1, 3], dtype=int64)
Answered By: MSeifert

You can use numpy.flipud() to reverse the indices returned by argsort and so get them in descending order (numpy.fliplr() reverses along the second axis instead, so it only applies to arrays with at least two dimensions). That's what I usually do.
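A small sketch of the flip approach on the question's array follows. The flag fliplr_failed is a name introduced here just to record what happens when fliplr is given a 1-D array:

```python
import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])
order = np.argsort(avgDists)

desc = np.flipud(order)  # reverses along axis 0, so it works on 1-D arrays
print(desc)              # [3 1 2 4 0]

# fliplr reverses along axis 1, which a 1-D array does not have.
fliplr_failed = False
try:
    np.fliplr(order)
except ValueError:
    fliplr_failed = True
print(fliplr_failed)     # True
```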

Answered By: Kanmani

Another way is to negate the argument to argsort, as in df[np.argsort(-df[:, 0])], provided df is a 2-D NumPy array (the [:, 0] indexing does not work directly on a pandas DataFrame) and you want to sort its rows in descending order by the first column (column index 0). Change the column index as appropriate. Of course, the column has to be a numeric one.
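For illustration, here is a minimal sketch on a made-up 2-D array (the name data and its values are hypothetical, not from the question):

```python
import numpy as np

# Hypothetical data: each row is a record, column 0 is the sort key.
data = np.array([[1, 10],
                 [9, 20],
                 [4, 30]])

order = np.argsort(-data[:, 0])  # row indices, largest key first
sorted_rows = data[order]        # fancy indexing reorders whole rows
print(sorted_rows)
```

This prints the rows with keys 9, 4, 1 in that order.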

Answered By: Biswajit Ghoshal

With your example:

avgDists = np.array([1, 8, 6, 9, 4])

Obtain indexes of n maximal values:

ids = np.argpartition(avgDists, -n)[-n:]

Sort them in descending order:

ids = ids[np.argsort(avgDists[ids])[::-1]]

Obtain results (for n=4):

>>> avgDists[ids]
array([9, 8, 6, 4])
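
The two steps above can be wrapped in a small helper. This is a sketch: top_n_indices is a hypothetical name, and it assumes 1 <= n <= len(values):

```python
import numpy as np

def top_n_indices(values, n):
    """Indices of the n largest entries, ordered from largest to smallest.

    Assumes 1 <= n <= len(values).
    """
    values = np.asarray(values)
    # Partial selection: the last n positions hold the n largest, in no particular order.
    candidates = np.argpartition(values, -n)[-n:]
    # Sort only those n candidates, then reverse for descending order.
    return candidates[np.argsort(values[candidates])[::-1]]

avgDists = np.array([1, 8, 6, 9, 4])
ids = top_n_indices(avgDists, 4)
print(ids)            # [3 1 2 4]
print(avgDists[ids])  # [9 8 6 4]
```
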
Answered By: Alexey Antonenko

As @Kanmani hinted, an easier to interpret implementation may use numpy.flip, as in the following:

import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])
ids = np.flip(np.argsort(avgDists))
print(ids)

By using function calls rather than chained methods and slices, it is easier to read the order of operations.

Answered By: Adam Erickson

An elegant way could be as follows –

ids = np.flip(np.argsort(avgDists))

This will give you indices of elements sorted in descending order.
Now you can use regular slicing…

top_n = ids[:n]
Answered By: NiteshK

Consider the order of equal elements

If you run a stable sorting routine and two elements are equal, their relative order is not changed. However, the flip/[::-1] approach reverses the order of equal elements.

>>> arr = np.array([3, 5, 4, 7, 3])
>>> 
>>> np.argsort(arr)[::-1]
array([3, 1, 2, 4, 0])  # equal elements reordered
>>> np.argsort(-arr)
array([3, 1, 2, 0, 4])  # equal elements not reordered (consistent with other sorting)

For compatibility reasons I would hence prefer the argsort of the negated array. This is especially relevant when arr is a numeric representation of more complex elements.

Example:

obj = ['street', 'house', 'bridge', 'station', 'rails']
arr = np.array([3, 5, 4, 7, 3])  # cost of obj in coins

Disclaimer: A more common approach is to solve the example above with sorted(list_of_tuples_obj_cost, key=lambda x: x[1], reverse=True), where reverse=True gives the descending order.
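
To make the example concrete, the sketch below orders obj by cost with both approaches. It passes kind='stable' explicitly so that the tie handling is guaranteed; the names by_negation, by_reversal and by_sorted are introduced here for illustration:

```python
import numpy as np

obj = ['street', 'house', 'bridge', 'station', 'rails']
arr = np.array([3, 5, 4, 7, 3])  # cost of obj in coins

# 'street' and 'rails' both cost 3; only the negation keeps their original order.
by_negation = [obj[i] for i in np.argsort(-arr, kind='stable')]
by_reversal = [obj[i] for i in np.argsort(arr, kind='stable')[::-1]]

print(by_negation)  # ['station', 'house', 'bridge', 'street', 'rails']
print(by_reversal)  # ['station', 'house', 'bridge', 'rails', 'street']

# Pure-Python counterpart: sorted() is stable, and reverse=True keeps ties in input order.
pairs = sorted(zip(obj, arr.tolist()), key=lambda x: x[1], reverse=True)
by_sorted = [name for name, _ in pairs]
print(by_sorted)    # ['station', 'house', 'bridge', 'street', 'rails']
```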

Answered By: Markus Dutschke