undo or reverse argsort(), python

Question

Given an array ‘a’ I would like to sort the array by columns sort(a, axis=0) do some stuff to the array and then undo the sort. By that I don’t mean re sort but basically reversing how each element was moved. I assume argsort() is what I need but it is not clear to me how to sort an array with the results of argsort() or more importantly apply the reverse/inverse of argsort()

Here is a little more detail

I have an array a, shape(a) = rXc I need to sort each column

aargsort = a.argsort(axis=0)  # May use this later
aSort = a.sort(axis=0)

now average each row

aSortRM = asort.mean(axis=1)

now replace each col in a row with the row mean.
is there a better way than this

aWithMeans = ones_like(a)
for ind in range(r)  # r = number of rows
    aWithMeans[ind]* aSortRM[ind]

Now I need to undo the sort I did in the first step. ????

Asked By: Vincent

||

Source

Answer 1

I’m not sure how best to do it in numpy, but, in pure Python, the reasoning would be:

aargsort is holding a permutation of range(len(a)) telling you where the items of aSort came from — much like, in pure Python:

>>> x = list('ciaobelu')
>>> r = range(len(x))
>>> r.sort(key=x.__getitem__)
>>> r
[2, 4, 0, 5, 1, 6, 3, 7]
>>>

i.e., the first argument of sorted(x) will be x[2], the second one x[4], and so forth.

So given the sorted version, you can reconstruct the original by “putting items back where they came from”:

>>> s = sorted(x)
>>> s
['a', 'b', 'c', 'e', 'i', 'l', 'o', 'u']
>>> original = [None] * len(s)
>>> for i, c in zip(r, s): original[i] = c
... 
>>> original
['c', 'i', 'a', 'o', 'b', 'e', 'l', 'u']
>>>

Of course there are going to be tighter and faster ways to express this in numpy (which unfortunately I don’t know inside-out as much as I know Python itself;-), but I hope this helps by showing the underlying logic of the “putting things back in place” operation you need to perform.

Answered By: Alex Martelli

Answer 2

I was not able to follow your example, but the more abstract problem–i.e., how to sort an array then reverse the sort–is straightforward.

import numpy as NP
# create an 10x6 array to work with
A = NP.random.randint(10, 99, 60).reshape(10, 6)
# for example, sort this array on the second-to-last column, 
# breaking ties using the second column (numpy requires keys in
# "reverse" order for some reason)
keys = (A[:,1], A[:,4])
ndx = NP.lexsort(keys, axis=0)
A_sorted = NP.take(A, ndx, axis=0)

To “reconstruct” A from A_sorted is trivial because remember that you used an index array (‘ndx’) to sort the array in the first place.

# ndx array for example above:  array([6, 9, 8, 0, 1, 2, 4, 7, 3, 5])

In other words, the 4th row in A_sorted was the 1st row in the original array, A, etc.

Answered By: doug

Answer 3

There are probably better solutions to the problem you are actually trying to solve than this (performing an argsort usually precludes the need to actually sort), but here you go:

>>> import numpy as np
>>> a = np.random.randint(0,10,10)
>>> aa = np.argsort(a)
>>> aaa = np.argsort(aa)
>>> a # original
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
>>> a[aa] # sorted
array([0, 2, 4, 4, 4, 4, 5, 6, 6, 7])
>>> a[aa][aaa] # undone
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])

Answered By: Paul

Answer 4

For all those still looking for an answer:

In [135]: r = rand(10)

In [136]: i = argsort(r)

In [137]: r_sorted = r[i]

In [138]: i_rev = zeros(10, dtype=int)

In [139]: i_rev[i] = arange(10)

In [140]: allclose(r, r_sorted[i_rev])

Out[140]: True

Answered By: jesse

Answer 5

Super late to the game, but here:

import numpy as np
N = 1000 # or any large integer
x = np.random.randn( N )
I = np.argsort( x )
J = np.argsort( I )
print( np.allclose( x[I[J]] , x ) )
>> True

Basically, argsort the argsort because the nth element of the reverse sort is J[n] = k : I[k] = n. That is, I[J[n]] = n, so J sorts I.

Answered By: W. Ross Morrow

Answer 6

indices=np.argsort(a) gives you the sorting indices, such that x = a[indices] is the sorted array. y = b[indices] pushs forward the array b to sorted domain. c[indices] = z pulls back z from sorted domain into c in source domain.

For instance,

import numpy as np

n = 3
a = np.random.randint(0,10,n) # [1, 5, 2]
b = np.random.randn(n) # [-.1, .5, .2]
c = np.empty_like(a)

# indices that sort a: x=a[indices], x==np.sort(a) is all True
indices = np.argsort(a) # [0,2,1]

# y[i] is the value in b at the index of the i-th smallest value in a
y = b[indices] # [-.1, .2, .5]

# say z[i] is some value related to the i-th smallest entry in a 
z = np.random.randn(n) # [-1.1, 1.2, 1.3]
c[indices] = z # inverted the sorting map, c = [-1.1, 1.3, 1.2]

Answered By: talbon

undo or reverse argsort(), python

Question:

Answers: