Understanding the subtle difference in calculating percentile

Question

When calculating the percentile using numpy, I see some authors use:

Q1, Q3 = np.percentile(X, [25, 75])

which is clear to me. However, I also see others use:

loss = np.percentile(X, 4)

I presume 4 means dividing the 100 into 4 percentiles, but how is the loss calculated here (i.e., in the second case)?

Thank you

Asked By: Dave

Answer 1

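I don’t know where you found the second case, but it’s incorrect (or misinterpreted).

np.percentile(X, 4) simply calculates the 4th percentile. The second argument is the percentile to compute, on a scale from 0 to 100; it is not a number of groups to split the data into.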

import numpy as np

X = np.arange(0, 101)

np.percentile(X, [25, 75])
# array([25., 75.])

np.percentile(X, 4)
# 4.0
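
If the intent behind the "4" was to split the data into four equal parts, a minimal sketch of how that might look is below. The variable names (p4, quartiles, q4) are illustrative only and do not come from the code in the question. It contrasts the 4th percentile with the three quartile cut points, and shows the same 4th percentile via np.quantile, which takes a fraction between 0 and 1 instead of a percentage.

```python
import numpy as np

X = np.arange(0, 101)

# 4th percentile: the value below which roughly 4% of the data lies
p4 = np.percentile(X, 4)

# Quartiles: the three cut points that split the data into four equal parts
quartiles = np.percentile(X, [25, 50, 75])

# Same 4th percentile, expressed as a fraction for np.quantile
q4 = np.quantile(X, 0.04)

print(p4)         # 4.0
print(quartiles)  # [25. 50. 75.]
print(q4)         # 4.0
```

So whatever "loss" meant in the code you saw, np.percentile(X, 4) only gives the value at the 4% mark of X.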

Answered By: mozway

Question: