undo or reverse argsort(), python

Question:

Given an array ‘a’, I would like to sort the array by columns, sort(a, axis=0), do some stuff to the array, and then undo the sort. By that I don’t mean re-sorting, but reversing how each element was moved. I assume argsort() is what I need, but it is not clear to me how to sort an array with the results of argsort() or, more importantly, how to apply the reverse/inverse of argsort().

Here is a little more detail:

I have an array a with shape(a) = r x c. I need to sort each column:

aargsort = a.argsort(axis=0)  # May use this later
aSort = sort(a, axis=0)  # a.sort(axis=0) would sort in place and return None

Now average each row:

aSortRM = aSort.mean(axis=1)

Now replace each column entry in a row with the row mean.
Is there a better way than this?

aWithMeans = ones_like(a, dtype=float)
for ind in range(r):  # r = number of rows
    aWithMeans[ind] *= aSortRM[ind]

Now I need to undo the sort I did in the first step. How?

Asked By: Vincent


Answers:

I’m not sure how best to do it in numpy, but, in pure Python, the reasoning would be:

aargsort is holding a permutation of range(len(a)) telling you where the items of aSort came from — much like, in pure Python:

>>> x = list('ciaobelu')
>>> r = sorted(range(len(x)), key=x.__getitem__)
>>> r
[2, 4, 0, 5, 1, 6, 3, 7]
>>> 

i.e., the first item of sorted(x) will be x[2], the second one x[4], and so forth.

So given the sorted version, you can reconstruct the original by “putting items back where they came from”:

>>> s = sorted(x)
>>> s
['a', 'b', 'c', 'e', 'i', 'l', 'o', 'u']
>>> original = [None] * len(s)
>>> for i, c in zip(r, s): original[i] = c
... 
>>> original
['c', 'i', 'a', 'o', 'b', 'e', 'l', 'u']
>>> 

Of course there are going to be tighter and faster ways to express this in numpy (which unfortunately I don’t know inside-out as much as I know Python itself;-), but I hope this helps by showing the underlying logic of the “putting things back in place” operation you need to perform.
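
For reference, the same “putting items back where they came from” loop can be sketched in NumPy as a single indexed assignment. The variable names mirror the pure-Python session above, and the character array is only for illustration:

```python
import numpy as np

x = np.array(list('ciaobelu'))
r = np.argsort(x, kind='stable')   # r[k] = position in x of the k-th sorted item
s = x[r]                           # sorted version of x

# Scatter: item k of the sorted array goes back to position r[k].
original = np.empty_like(s)
original[r] = s
```

Here original[r] = s plays the role of the for loop over zip(r, s).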

Answered By: Alex Martelli

I was not able to follow your example, but the more abstract problem–i.e., how to sort an array then reverse the sort–is straightforward.

import numpy as NP
# create a 10x6 array to work with
A = NP.random.randint(10, 99, 60).reshape(10, 6)
# for example, sort this array on the second-to-last column,
# breaking ties using the second column (lexsort treats the
# last key as the primary sort key, so keys go in "reverse" order)
keys = (A[:,1], A[:,4])
ndx = NP.lexsort(keys, axis=0)
A_sorted = NP.take(A, ndx, axis=0)

Reconstructing A from A_sorted is trivial, because you used an index array (‘ndx’) to sort the array in the first place.

# ndx array for example above:  array([6, 9, 8, 0, 1, 2, 4, 7, 3, 5])

In other words, the 4th row in A_sorted was the 1st row in the original array, A, etc.
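
A minimal sketch of that reconstruction, assuming A, ndx and A_sorted are built as in the snippet above. A_restored is a name introduced here for illustration, and the seeded generator is only there to make the sketch repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only for repeatability
A = rng.integers(10, 99, size=(10, 6))
ndx = np.lexsort((A[:, 1], A[:, 4]))    # primary key: column 4, ties: column 1
A_sorted = np.take(A, ndx, axis=0)

# Undo the sort: row k of A_sorted came from row ndx[k] of A,
# so write it back to that row.
A_restored = np.empty_like(A_sorted)
A_restored[ndx] = A_sorted

print(np.array_equal(A_restored, A))    # True
```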

Answered By: doug

There are probably better solutions to the problem you are actually trying to solve than this (performing an argsort usually precludes the need to actually sort), but here you go:

>>> import numpy as np
>>> a = np.random.randint(0,10,10)
>>> aa = np.argsort(a)
>>> aaa = np.argsort(aa)
>>> a # original
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
>>> a[aa] # sorted
array([0, 2, 4, 4, 4, 4, 5, 6, 6, 7])
>>> a[aa][aaa] # undone
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
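
The same double-argsort idea can be applied per column for the 2-D case in the question. This is a sketch under assumptions rather than a solution checked against the asker's data: it assumes NumPy 1.15 or later for np.take_along_axis, and the names order, ranks, processed and a_with_means are made up here.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((5, 3))

order = np.argsort(a, axis=0)                    # per-column sorting indices
a_sorted = np.take_along_axis(a, order, axis=0)  # same result as np.sort(a, axis=0)

# "Do some stuff" in sorted order: replace every entry of a row by the row mean.
row_means = a_sorted.mean(axis=1, keepdims=True)
processed = np.broadcast_to(row_means, a.shape)

# Undo the sort: the argsort of the argsort gives each element's rank
# within its column, which is where to read from in the sorted layout.
ranks = np.argsort(order, axis=0)
a_with_means = np.take_along_axis(processed, ranks, axis=0)
```

With this layout, a_with_means[i, j] holds the mean of the sorted row whose rank matches the rank of a[i, j] within column j.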

Answered By: Paul

For all those still looking for an answer:

import numpy as np

r = np.random.rand(10)
i = np.argsort(r)
r_sorted = r[i]
i_rev = np.zeros(10, dtype=int)
i_rev[i] = np.arange(10)  # i_rev[i[k]] = k, so i_rev is the inverse permutation
np.allclose(r, r_sorted[i_rev])  # True

Answered By: jesse

Super late to the game, but here:

import numpy as np
N = 1000 # or any large integer
x = np.random.randn( N )
I = np.argsort( x )
J = np.argsort( I )
print( np.allclose( x[I[J]] , x ) )
>> True

Basically, argsort the argsort: the nth element of the inverse permutation is J[n] = k such that I[k] = n. That is, I[J[n]] = n, so J sorts I.
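
A quick check of that identity, as a sketch: I[J] should be the identity permutation, and J should match the inverse built by direct assignment. J_scatter is a name introduced here for illustration.

```python
import numpy as np

N = 1000
x = np.random.randn(N)
I = np.argsort(x)
J = np.argsort(I)

# I[J[n]] == n for every n, so J is the inverse permutation of I.
print(np.array_equal(I[J], np.arange(N)))   # True

# Undoing the sort on already-sorted data means indexing it with J.
x_sorted = x[I]
print(np.array_equal(x_sorted[J], x))       # True

# The same inverse without a second sort: position I[k] receives k.
J_scatter = np.empty(N, dtype=I.dtype)
J_scatter[I] = np.arange(N)
print(np.array_equal(J, J_scatter))         # True
```

The direct assignment is linear in N, while the second argsort costs another sort, which may matter for large arrays.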

Answered By: W. Ross Morrow

indices = np.argsort(a) gives you the sorting indices, such that x = a[indices] is the sorted array. y = b[indices] pushes the array b forward into the sorted domain. c[indices] = z pulls z back from the sorted domain into c in the source domain.

For instance,

import numpy as np

n = 3
a = np.random.randint(0,10,n) # [1, 5, 2]
b = np.random.randn(n) # [-.1, .5, .2]
c = np.empty(n) # float buffer; empty_like(a) would be an integer array and truncate z

# indices that sort a: x=a[indices], x==np.sort(a) is all True
indices = np.argsort(a) # [0,2,1]

# y[i] is the value in b at the index of the i-th smallest value in a
y = b[indices] # [-.1, .2, .5]

# say z[i] is some value related to the i-th smallest entry in a 
z = np.random.randn(n) # [-1.1, 1.2, 1.3]
c[indices] = z # inverted the sorting map, c = [-1.1, 1.3, 1.2] 

Answered By: talbon