How does np.partition() interpret the argument kth?
Question:
I am trying to figure out how the np.partition function works.
For example, consider
arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])
If I call np.partition(arr, kth=2), I get
np.array([-4, -3, -1, 0, 1, 4, 5, 0])
I expected that, after partitioning, the array would be split into the elements less than 1, the element 1 itself, and the elements greater than 1.
But the second zero ends up in the last position of the array, which is not where it belongs after such a partition.
Answers:
The documentation says:
Creates a copy of the array with its elements rearranged in such a way that
the value of the element in kth position is in the position it would be in
a sorted array. All elements smaller than the kth element are moved before
this element and all equal or greater are moved behind it. The ordering of
the elements in the two partitions is undefined.
In the example you give, you have selected the element at index 2 (counting from zero) of the sorted array, which is -1, and it is in the position it would occupy if the array were sorted.
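The guarantee quoted above can be checked directly. The following is a minimal sketch using the array from the question; the variable names (part, pivot, left_ok and so on) are only illustrative and do not come from the NumPy docs.

```python
import numpy as np

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])
k = 2

part = np.partition(arr, kth=k)
pivot = part[k]

# The element at index k is the one a full sort would put there.
kth_matches_sort = bool(pivot == np.sort(arr)[k])
# Everything before index k is no larger than the pivot ...
left_ok = bool(np.all(part[:k] <= pivot))
# ... and everything after index k is no smaller.
right_ok = bool(np.all(part[k + 1:] >= pivot))

print(pivot, kth_matches_sort, left_ok, right_ok)
```

All three checks hold for this input, even though the elements on each side of index 2 are not themselves in sorted order.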
The docs talk of ‘a sorted array’.
np.partition behaves as if it first sorted the elements of the array provided, although it does not actually perform a full sort (see comment by @norok2). In this case the original array is:
arr = [ 5, 4, 1, 0, -1, -3, -4, 0]
When sorted, we have:
arr_sorted = [-4, -3, -1, 0, 0, 1, 4, 5]
Hence the call np.partition(arr, kth=2) takes kth to be the element at position 2 of arr_sorted, not of arr. The element is correctly picked as -1.
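To make the difference concrete, here is a small sketch comparing the two readings of kth=2. The three-way split at the end is not something np.partition does. It is a hypothetical alternative, built from boolean masks, for the case where a split around the value arr[2] is what is actually wanted.

```python
import numpy as np

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])

# kth=2 indexes the sorted order, so the pivot is sorted[2], not arr[2].
value_in_original = arr[2]         # 1
value_in_sorted = np.sort(arr)[2]  # -1, the value np.partition puts at index 2
print(value_in_original, value_in_sorted)

# Hypothetical alternative: split around the value arr[2] itself.
pivot_value = arr[2]
smaller = arr[arr < pivot_value]
equal = arr[arr == pivot_value]
larger = arr[arr > pivot_value]
three_way = np.concatenate([smaller, equal, larger])
print(three_way)
```

With this approach both zeros land before the 1, which is the layout the question expected.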
When I first read the official documentation of numpy.partition, I interpreted it the same way the OP did. So I was confused when I read the examples given in the docs, and could not figure out where my understanding went wrong. I googled it and ended up here.
Since this confusion is frequent, the documentation should be revised. I suggest using the following:
Creates a copy of the array with its elements rearranged in such a way that:
the k-th element of the new array is in the position it would be in a sorted array. All elements smaller than the k-th element are moved before this element and all greater are moved behind it. The ordering of the elements in the two partitions is undefined. If there are other elements that are equal to the k-th element, these elements may appear before or behind the k-th element.
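The point about equal elements can be illustrated with the two zeros in the question's array. This is a sketch only, and the results dictionary is an illustrative name. For kth=3 and kth=4 the k-th element is 0 in both cases, but the other 0 is forced onto different sides, because every negative value has to sit before index k and every positive value after it.

```python
import numpy as np

arr = np.array([5, 4, 1, 0, -1, -3, -4, 0])
# Sorted order: [-4, -3, -1, 0, 0, 1, 4, 5]

results = {}
for k in (3, 4):
    part = np.partition(arr, kth=k)
    pivot = part[k]
    # Count elements equal to the pivot on each side of index k.
    ties_before = int(np.count_nonzero(part[:k] == pivot))
    ties_after = int(np.count_nonzero(part[k + 1:] == pivot))
    results[k] = (int(pivot), ties_before, ties_after)

print(results)
```

For kth=3 the remaining zero ends up behind the k-th element, and for kth=4 it ends up before it.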