numpy unique without sort
Question:
How can I use numpy unique without sorting the result, so that the values come out in the order they first appear in the sequence? Something like this:
a = [4,2,1,3,1,2,3,4]
np.unique(a) = [4,2,1,3]
rather than
np.unique(a) = [1,2,3,4]
A naive solution written as a simple function would be fine, but since I need to do this many times, is there a fast and neat way to do it?
Answers:
You can do this with the return_index parameter:
>>> import numpy as np
>>> a = [4,2,1,3,1,2,3,4]
>>> np.unique(a)
array([1, 2, 3, 4])
>>> indexes = np.unique(a, return_index=True)[1]
>>> [a[index] for index in sorted(indexes)]
[4, 2, 1, 3]
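If the input is already a NumPy array (or can be converted to one), the list comprehension can be replaced by fancy indexing. This is only a sketch; the function name unique_in_order is made up for illustration, and it relies on return_index reporting the first occurrence of each value, which is what the NumPy documentation describes:

```python
import numpy as np

def unique_in_order(values):
    # Hypothetical helper name, not part of NumPy.
    arr = np.asarray(values)
    # first_idx[i] is the position where the i-th sorted unique value first appears.
    _, first_idx = np.unique(arr, return_index=True)
    # Sorting those positions restores the order of first appearance.
    return arr[np.sort(first_idx)]

result = unique_in_order([4, 2, 1, 3, 1, 2, 3, 4])
print(result)
```

This keeps everything inside NumPy, which tends to matter more as the array gets large.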
You could do this using numpy by doing something like this. Mergesort is stable, so it lets you pick out the first or last occurrence of each value:
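import numpy as np

def unique(array, orderby='first'):
    array = np.asarray(array)
    # Stable sort, so equal values keep their original relative order.
    order = array.argsort(kind='mergesort')
    array = array[order]
    # True wherever a sorted value differs from its neighbour.
    diff = array[1:] != array[:-1]
    if orderby == 'first':
        diff = np.concatenate([[True], diff])
    elif orderby == 'last':
        diff = np.concatenate([diff, [True]])
    else:
        raise ValueError("orderby must be 'first' or 'last'")
    uniq = array[diff]
    index = order[diff]
    # Put the unique values back in order of appearance.
    return uniq[index.argsort()]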
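To see what the stable sort buys you, here is a rough trace of the intermediate arrays for the example from the question. The variable names (first_mask, last_mask and so on) are just for illustration:

```python
import numpy as np

a = np.asarray([4, 2, 1, 3, 1, 2, 3, 4])

# Stable argsort: among equal values, the smaller original index comes first.
order = a.argsort(kind='mergesort')
sorted_a = a[order]

# True at each boundary between runs of equal values.
is_new = sorted_a[1:] != sorted_a[:-1]
first_mask = np.concatenate([[True], is_new])   # start of each run
last_mask = np.concatenate([is_new, [True]])    # end of each run

first_idx = order[first_mask]   # original position of each first occurrence
last_idx = order[last_mask]     # original position of each last occurrence

by_first = a[np.sort(first_idx)]
by_last = a[np.sort(last_idx)]
print(by_first, by_last)
```

Because the sort is stable, the start of each run is the earliest original index for that value and the end of each run is the latest, which is why the two masks select first and last occurrences respectively.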
This answer is very similar to:
def unique(array):
    uniq, index = np.unique(array, return_index=True)
    return uniq[index.argsort()]
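However, older versions of numpy.unique used an unstable sort internally, so you were not guaranteed to get any specific index, i.e. first or last. Current NumPy documents return_index as giving the index of the first occurrence of each value, so check which version you are running.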
I think an ordered dict might also work:
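from collections import OrderedDict

def unique(array):
    uniq = OrderedDict()
    for i in array:
        uniq[i] = 1
    # list() so that Python 3 returns a list rather than a keys view.
    return list(uniq.keys())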
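On Python 3.7 and later a plain dict also preserves insertion order, so the same idea can probably be written more compactly. This is a sketch under that assumption; the name unique_ordered is made up, and the elements have to be hashable:

```python
def unique_ordered(seq):
    # Hypothetical helper name. dict.fromkeys keeps the first occurrence
    # of each key, in insertion order (guaranteed from Python 3.7).
    return list(dict.fromkeys(seq))

result = unique_ordered([4, 2, 1, 3, 1, 2, 3, 4])
print(result)
```

This runs in pure Python, so for large numeric arrays one of the numpy-based versions may be faster; it is worth timing on your own data.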