Finding mode of unique array combination in the rows of 2d numpy array

Question:

I have a 2D numpy array and I'm trying to return the mode along axis = 0 (rows). However, I would like the most frequent unique row as a whole, not the three separate modes of the three columns, which is what scipy.stats.mode returns. The desired output in the example below would be [9,9,9], because that's the most common unique row. Thanks.

import numpy as np
from scipy import stats

arr1 = np.array([[2,3,4],[2,1,5],[1,2,3],[2,4,4],[2,8,2],[2,3,1],[9,9,9],[9,9,9]])

stats.mode(arr1, axis = 0)

output:

ModeResult(mode=array([[2, 3, 4]]), count=array([[5, 2, 2]]))
Asked By: lingyau lee


Answers:

You could use NumPy's unique function with axis=0 and ask it to return the counts.

unique_arr1, count = np.unique(arr1,axis=0, return_counts=True)
unique_arr1[np.argmax(count)]

output:

array([9, 9, 9])

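Note that np.unique returns the unique rows in sorted (lexicographic) order, so the last row is the largest row, not necessarily the most frequent one. The following returns [9,9,9] for this example only because that row also happens to sort last; in general, use the counts as shown above:

out = np.unique(arr1,axis=0)[-1]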

I do not know what you want to use this for, but note that all the counts are available, in case you want to verify the result or handle several rows that share the same maximum count.
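As a minimal sketch of handling ties, the counts can be compared against their maximum so that every row sharing the top count is returned. The helper name mode_rows and the second test array are made up for illustration; only np.unique with axis=0 and return_counts=True comes from the answer above.

```python
import numpy as np

def mode_rows(arr):
    """Return every unique row of a 2D array that has the highest count (hypothetical helper)."""
    # Unique rows come back in lexicographic order, with one count per row.
    unique_rows, counts = np.unique(arr, axis=0, return_counts=True)
    # A boolean mask keeps all rows tied for the maximum count.
    return unique_rows[counts == counts.max()]

arr1 = np.array([[2, 3, 4], [2, 1, 5], [1, 2, 3], [2, 4, 4],
                 [2, 8, 2], [2, 3, 1], [9, 9, 9], [9, 9, 9]])
single = mode_rows(arr1)

# Illustrative input with two rows tied at a count of 2.
tied_input = np.array([[1, 2, 3], [1, 2, 3], [9, 9, 9], [9, 9, 9], [0, 0, 0]])
tied = mode_rows(tied_input)

print(single)
print(tied)
```

With a single winner the result is a one-row 2D array; with ties, the tied rows appear in the sorted order that np.unique produces.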

Update
Given the additional information that this is for images (which could be big) and, most importantly, that the values along the second dimension fit together in a single integer (each value is uint8 or uint16, so a whole row fits in an int32 or int64), you can pack each pixel into one integer first. Assuming each pixel value is uint8:

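# arr: (N, 3) array of uint8 pixel values; pack each row into one integer
packed = arr.astype(np.int64) @ np.array([2**16, 2**8, 1])
pixel, count = np.unique(packed, return_counts=True)
pixel = pixel[np.argmax(count)]
# unpack with shifts (column 0 is the high byte), independent of byte order
r, g, b = (pixel >> 16) & 255, (pixel >> 8) & 255, pixel & 255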

This could result in a big speedup, since np.unique then works on a 1D array of integers instead of comparing whole rows.
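A self-contained sketch of the packing idea, assuming an (H, W, 3) uint8 image. The function name most_common_pixel and the synthetic test image are invented for illustration, and the channel order is simply whatever order the array stores.

```python
import numpy as np

def most_common_pixel(image):
    """Return the most frequent 3-channel value of an (H, W, 3) uint8 image (hypothetical helper)."""
    # Flatten to (N, 3) and widen so the shifts below cannot overflow uint8.
    pixels = image.reshape(-1, 3).astype(np.int64)
    # Pack the three 8-bit channels into one integer per pixel.
    packed = (pixels[:, 0] << 16) | (pixels[:, 1] << 8) | pixels[:, 2]
    values, counts = np.unique(packed, return_counts=True)
    best = values[np.argmax(counts)]
    # Unpack the winning integer back into its three channels.
    return np.array([(best >> 16) & 255, (best >> 8) & 255, best & 255], dtype=np.uint8)

# Synthetic 4x4 image: 12 pixels of one colour, 4 of another.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:3, :, :] = (10, 200, 30)
image[3, :, :] = (255, 0, 7)

result = most_common_pixel(image)
print(result)
```

If several pixel values tie for the top count, np.argmax picks the first one in sorted order of the packed integers.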

Answered By: amirhm