Quickest way to find the nth largest value in a numpy Matrix
Question:
There are lots of solutions to do this for a single array, but what about a matrix, such as:
>>> k
array([[ 35,  48,  63],
       [ 60,  77,  96],
       [ 91, 112, 135]])
You can use k.max(), but of course this only returns the highest value, 135. What if I want the second or third?
Answers:
You can flatten the matrix and then sort it:
>>> k = np.array([[ 35,  48,  63],
...               [ 60,  77,  96],
...               [ 91, 112, 135]])
>>> flat = k.flatten()
>>> flat.sort()
>>> flat
array([ 35,  48,  60,  63,  77,  91,  96, 112, 135])
>>> flat[-2]
112
>>> flat[-3]
96
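The same idea can be written without the in-place sort. This is only a minimal sketch using the matrix from the question: np.sort with axis=None flattens the array before sorting and returns a new array, so k is left untouched. The variable names are illustrative.

```python
import numpy as np

k = np.array([[35, 48, 63],
              [60, 77, 96],
              [91, 112, 135]])

# axis=None flattens the array before sorting; k itself is not modified.
ordered = np.sort(k, axis=None)

second_largest = ordered[-2]
third_largest = ordered[-3]
print(second_largest, third_largest)  # 112 96
```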
np.partition should be faster (O(n) running time, versus O(n log n) for a full sort):
np.partition(k.flatten(), -2)[-2]
This should return the 2nd largest element. (partition guarantees that the element at the given index is in its sorted position, that all elements before it are smaller or equal, and that all elements after it are larger or equal.)
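For an arbitrary n, the partition approach might be wrapped in a small helper. This is a sketch, not a reference implementation: the function name nth_largest and the bounds check are my own additions, and duplicates are counted as separate values.

```python
import numpy as np

def nth_largest(matrix, n):
    """Return the n-th largest value (n=1 is the maximum), counting duplicates."""
    flat = np.asarray(matrix).ravel()
    if not 1 <= n <= flat.size:
        raise ValueError("n must be between 1 and the number of elements")
    # After partitioning at index -n, the element at -n is the one that
    # would sit there in a fully sorted array.
    return np.partition(flat, -n)[-n]

k = np.array([[35, 48, 63],
              [60, 77, 96],
              [91, 112, 135]])

print(nth_largest(k, 2))  # 112
print(nth_largest(k, 3))  # 96
```

np.partition returns a partitioned copy, so the input matrix is not reordered.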
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
a = a.reshape(a.shape[0] * a.shape[1])  # flatten to a 1-D array
n = 2  # n is the nth largest taken by us
print(a[np.argsort(a)[-n]])  # argsort gives the indices that would sort a
Another way of doing this, for when the array at hand contains repeated elements.
If we have something like
a = np.array([[1,1],[3,4]])
then the second largest element will be 3, not 1.
Alternatively, one could use the following snippet:
second_largest = sorted(list(set(a.flatten().tolist())))[-2]
First, flatten the matrix, then keep only the unique elements, then convert back to a list, sort it and take the second element from the end. This should return the second largest distinct element even if there are repeated elements in the array.
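Whether duplicates count matters only when the repeated value is among the largest. The sketch below uses a made-up array (not from the answers above) in which the maximum appears twice, and uses np.unique as a NumPy stand-in for the set-and-sort snippet.

```python
import numpy as np

# Hypothetical example: the maximum value, 4, appears twice.
a = np.array([[1, 4],
              [3, 4]])

# Counting duplicates: the sorted values are [1, 3, 4, 4].
with_duplicates = np.sort(a, axis=None)[-2]

# Distinct values only: np.unique returns the sorted unique values [1, 3, 4].
distinct_only = np.unique(a)[-2]

print(with_duplicates, distinct_only)  # 4 3
```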
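nums = [[ 35, 48, 63],
        [ 60, 77, 96],
        [ 91, 112, 135]]
nth = 1  # index into the list of row maxima
highs = [max(lst) for lst in nums]  # largest value of each row: [63, 96, 135]
highs[nth]  # 96; note this is a row maximum, which is not always the nth largest overall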
Using the 'unique' function is a very clean way to do it, but likely not the fastest:
import numpy as np
k = np.array([[ 35,  48,  63],
              [ 60,  77,  96],
              [ 91, 112, 135]])
i = np.unique(k)[-2]  # unique flattens, deduplicates and sorts
for the second largest.