Get shape parameters of scipy.stats.beta from shape of PDF

Question:

I have the following script from the docs

import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1)


a = 1.25
b = 1.5

x = np.linspace(ss.beta.ppf(0.00, a, b),
                ss.beta.ppf(1, a, b), 100)

ax.plot(x, ss.beta.pdf(x, a, b),
       'r-', lw=5, alpha=0.6, label='beta pdf')

plt.show()

I would like to reverse the process and get a and b from the shape of the pdf. I will only have the output of

ss.beta.pdf(x, a, b)

where only x is known (specified earlier). So how do I get a and b if I only have the following data:

[0.         0.63154416 0.74719517 0.82263393 0.87936158 0.92490497
 0.96287514 0.9953117  1.02349065 1.04826858 1.07025112 1.08988379
 1.10750482 1.12337758 1.13771153 1.15067624 1.16241105 1.17303196
 1.18263662 1.19130806 1.19911747 1.20612637 1.21238825 1.21794995
 1.22285265 1.22713276 1.23082258 1.23395089 1.23654342 1.23862319
 1.24021091 1.24132519 1.2419828  1.2421989  1.24198713 1.24135982
 1.24032811 1.23890203 1.23709059 1.23490192 1.23234325 1.22942106
 1.22614106 1.2225083  1.21852712 1.21420131 1.209534   1.20452779
 1.19918472 1.19350628 1.18749347 1.18114672 1.17446601 1.16745076
 1.16009989 1.1524118  1.14438438 1.13601495 1.12730028 1.11823656
 1.10881937 1.09904366 1.0889037  1.07839304 1.06750447 1.05622997
 1.0445606  1.03248649 1.01999669 1.00707911 0.99372038 0.97990571
 0.96561876 0.95084139 0.93555349 0.91973269 0.90335407 0.88638972
 0.86880835 0.85057469 0.83164884 0.81198535 0.79153224 0.77022959
 0.74800782 0.7247854  0.70046587 0.67493372 0.64804878 0.61963821
 0.58948474 0.55730896 0.52274112 0.48527403 0.44417862 0.39833786
 0.34587496 0.2831383  0.20072303 0.        ]

After finding this answer and this blogpost I have an idea of what needs to be done:

def find_parameters(x1, p1, x2, p2):
    """Find parameters for a beta random variable X
    so that P(X > x1) = p1 and P(X > x2) = p2.
    """

    def objective(v):
        (a, b) = v
        temp = (ss.beta.cdf(x1, a, b) - p1) ** 2.0
        temp += (ss.beta.cdf(x2, a, b) - p2) ** 2.0
        return temp

    # arbitrary initial guess of (1, 1) for parameters
    xopt = ss.optimize.fmin(objective, (1, 1))
    return [xopt[0], xopt[1]]

However, this function only uses two points. I modified the solution as follows:

def find_parameters(x, p):

    def objective(v):
        (a, b) = v
        return np.sum((ss.beta.cdf(x, a, b) - p) ** 2.0)

    # arbitrary initial guess of (1, 1) for parameters
    xopt = optimize.fmin(objective, (1, 1))
    return [xopt[0], xopt[1]]

fitted_a, fitted_b = find_parameters(x=x, p=ss.beta.pdf(x, a, b))

This results in the following incorrect values:

a: 4.2467303147231366e-10

and

b: 1.9183210434237443

Please advice

Asked By: HJA24

||

Answers:

Your objective function is calculating the error to a Beta CDF (ss.beta.cdf(x, a, b)) but you are trying to fit a Beta PDF. Make this simple change and your code will work:

def find_parameters(x, p):

    def objective(v):
        (a, b) = v
        return np.sum((ss.beta.pdf(x, a, b) - p) ** 2.0)

    # arbitrary initial guess of (1, 1) for parameters
    xopt = optimize.fmin(objective, (1, 1))
    return [xopt[0], xopt[1]]
Answered By: foglerit
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.