How does np.partition() interpret the argument kth?

Question:

I am trying to figure out how the np.partition function works.
For example, consider

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])

If I call np.partition(arr, kth=2), I get

np.array([-4, -3, -1, 0, 1, 4, 5, 0])

I expected that, after partitioning, the array would be split into the elements less than 1, the element 1 itself, and the elements greater than 1.
But the second zero is placed in the last position of the array, which is not where it belongs after a partition.

Asked By: artem zholus


Answers:

The documentation says:

Creates a copy of the array with its elements rearranged in such a way that
the value of the element in kth position is in the position it would be in
a sorted array. All elements smaller than the kth element are moved before
this element and all equal or greater are moved behind it. The ordering of
the elements in the two partitions is undefined.

In the example you give, you have selected the element at index 2 of the sorted array (counting from zero), which is -1, and it is indeed in the position it would occupy if the array were sorted.
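As a quick check of the documented behaviour, here is a minimal sketch using the array from the question. The variable names result and pivot are only for illustration. The exact order within each side may differ between NumPy versions, so only the guarantees are checked:

```python
import numpy as np

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])
kth = 2

result = np.partition(arr, kth)
pivot = result[kth]

# The element at index kth is the one a full sort would put there.
print(pivot == np.sort(arr)[kth])          # True
# Everything before it is no larger, everything after it is no smaller.
print(np.all(result[:kth] <= pivot))       # True
print(np.all(result[kth + 1:] >= pivot))   # True
```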

Answered By: J. P. Petersen

The docs talk of ‘a sorted array’.

Conceptually, np.partition works with respect to the sorted order of the elements in the array provided (see comment by @norok2); it does not have to perform a full sort to do so. In this case the original array is:

arr = [ 5,  4,  1,  0, -1, -3, -4,  0]

When sorted, we have:

arr_sorted = [-4, -3, -1,  0,  0,  1,  4,  5]

Hence the call np.partition(arr, kth=2) takes kth to be the element at position 2 of arr_sorted, not of arr. That element is correctly picked as -1.
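To make the difference concrete, the sketch below contrasts arr[2] with the element that kth=2 actually selects. It also shows one possible way to partition around the value 1, which is what the question expected: compute its rank by counting the strictly smaller elements. The names pivot_value, rank and around_one are illustrative only.

```python
import numpy as np

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])

# kth=2 selects the third-smallest value, not the value stored at arr[2].
print(arr[2])                    # 1
print(np.partition(arr, 2)[2])   # -1

# To partition around the value 1 instead, use its rank in sorted order:
# the number of elements strictly smaller than it.
pivot_value = arr[2]
rank = int(np.count_nonzero(arr < pivot_value))   # 5
around_one = np.partition(arr, rank)
print(around_one[rank])          # 1
```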

Answered By: Gathide

When I first read the official documentation for numpy.partition, I interpreted it the same way the OP did. So I was confused by the examples given in the documentation, but could not figure out where my understanding was wrong. I googled it and ended up here.

Since this confusion is common, the documentation should be revised. I suggest the following wording:

Creates a copy of the array with its elements rearranged in such a way that: 
the k-th element of the new array is in the position it would be in a sorted array. All elements smaller than the k-th element are moved before this element and all greater are moved behind it. The ordering of the elements in the two partitions is undefined. If there are other elements that are equal to the k-th element, these elements may appear before or behind the k-th element.
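The point about ties and undefined ordering can be illustrated without relying on what a particular NumPy version returns. The sketch below uses is_valid_partition, a hypothetical helper written for this answer (it is not part of NumPy), to check hand-written arrangements against the guarantees in the wording above:

```python
import numpy as np

def is_valid_partition(candidate, original, kth):
    """Hypothetical helper: True if candidate is an acceptable partition of original."""
    candidate = np.asarray(candidate)
    pivot = np.sort(original)[kth]
    same_elements = np.array_equal(np.sort(candidate), np.sort(original))
    return bool(
        same_elements
        and candidate[kth] == pivot
        and np.all(candidate[:kth] <= pivot)
        and np.all(candidate[kth + 1:] >= pivot)
    )

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])

# The output from the question: the second 0 sits at the end, which is allowed.
print(is_valid_partition([-4, -3, -1, 0, 1, 4, 5, 0], arr, kth=2))   # True
# A fully sorted array is just one of many valid results.
print(is_valid_partition(np.sort(arr), arr, kth=2))                  # True
# With kth=4 the pivot is 0, and the other 0 ends up before it.
print(is_valid_partition([-4, 0, -3, -1, 0, 5, 1, 4], arr, kth=4))   # True
# Putting 1 at index 2 is not a valid result for kth=2.
print(is_valid_partition([-4, -3, 1, 0, -1, 4, 5, 0], arr, kth=2))   # False
```

So the trailing 0 in the question is not misplaced: it only has to be greater than or equal to the element at index 2, which is -1.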

Answered By: Youjun Hu