How to compute percentile for an external element with given array?
Question:
I’m looking for a percentile function that accepts an array and an element and returns the closest percentile of that element.
Some examples:
percentile([1,2,3,4,5], 2) => 40%
percentile([1,2,3,4,5], 2.5) => 40%
percentile([1,2,3,4,5], 6) => 100%
Does anything like this or similar exist within python or numpy?
NumPy has np.percentile(a=[1,2,3,4,5], q=3) => 1.12, which is not what I want.
Answers:
np.percentile(a, q) tells you the q-th percentile of the array a. This is the inverse of what you want. I don’t think numpy has a function to do what you want, but it’s easy enough to make your own.
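The percentile of an element is the share of the array's elements that are less than or equal to it (this is the convention that matches your examples), so just compute that. The functions below return it as a fraction; multiply by 100 for a percentage: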
def percentile(lst: list, val) -> float:
    return sum(i <= val for i in lst) / len(lst)
If you have a numpy array, you don’t need to iterate over it, since <= will broadcast over the array:
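import numpy as np

def percentile(arr, val) -> float:
    # np.asarray lets plain lists work too; <= then compares element-wise
    return float(np.mean(np.asarray(arr) <= val))  # thanks Chrysophylaxs!
    # equivalent: (np.asarray(arr) <= val).sum() / len(arr)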
>>> percentile([1,2,3,4,5], 2)
# 0.4
>>> percentile([1,2,3,4,5], 2.5)
# 0.4
>>> percentile([1,2,3,4,5], 6)
# 1.0
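Whether you compare with < or <= only matters when the value actually occurs in the array. Here is a minimal sketch of both conventions; the names percentile_weak and percentile_strict are made up for illustration and are not from any library:

```python
def percentile_weak(values, val):
    """Fraction of elements less than or equal to val."""
    return sum(x <= val for x in values) / len(values)


def percentile_strict(values, val):
    """Fraction of elements strictly less than val."""
    return sum(x < val for x in values) / len(values)


data = [1, 2, 3, 4, 5]
print(percentile_weak(data, 2))      # 0.4 (counts 1 and 2)
print(percentile_strict(data, 2))    # 0.2 (counts only 1)
print(percentile_weak(data, 2.5))    # 0.4
print(percentile_strict(data, 2.5))  # 0.4 (2.5 is not in the data, so both agree)
```

If SciPy is available, scipy.stats.percentileofscore(a, score, kind='weak') reports the "less than or equal" quantity as a percentage, and kind='strict' reports the strict one.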
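You can also calculate it like this: sort the array, find the index of the element in the sorted array, add 1 to that index, divide by the length of the array, and multiply by 100 to get a percentage. Note that list.index raises ValueError when the element is not in the array (such as 2.5 or 6 in the question's examples), and with duplicates it returns the position of the first occurrence.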
def percentile(arr, element):
    arr = sorted(arr)
    index = arr.index(element)
    percentile = (index + 1) / len(arr) * 100
    return percentile
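The sort-and-index idea can be made to handle elements that are not in the array, and duplicates, by swapping list.index for a binary search. This is a sketch using the standard library's bisect.bisect_right instead of the lookup above; the name percentile_of is hypothetical:

```python
from bisect import bisect_right


def percentile_of(arr, element):
    """Percentage of values in arr that are less than or equal to element."""
    if len(arr) == 0:
        raise ValueError("arr must not be empty")
    ordered = sorted(arr)
    # bisect_right returns the insertion point to the right of any equal
    # values, which equals the count of items <= element.
    count = bisect_right(ordered, element)
    return 100 * count / len(ordered)


print(percentile_of([1, 2, 3, 4, 5], 2))    # 40.0
print(percentile_of([1, 2, 3, 4, 5], 2.5))  # 40.0
print(percentile_of([1, 2, 3, 4, 5], 6))    # 100.0
print(percentile_of([1, 2, 2, 2, 5], 2))    # 80.0 (all three 2s are counted)
```

If you query the same array many times, it is probably worth sorting it once outside the function so that each lookup costs only the binary search.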