How to extract the regression coefficient from statsmodels.api?

Question:

 result = sm.OLS(gold_lookback, silver_lookback ).fit()

After I get the result, how can I get the coefficient and the constant?

In other words, if
y = ax + c
how do I get the values of a and c?

Asked By: JOHN


Answers:

You can use the params property of a fitted model to get the coefficients.

For example, the following code:

import statsmodels.api as sm
import numpy as np
np.random.seed(1)
X = sm.add_constant(np.arange(100))
y = np.dot(X, [1,2]) + np.random.normal(size=100)
result = sm.OLS(y, X).fit()
print(result.params)

prints the NumPy array [ 0.89516052 2.00334187], which holds the estimates of the intercept and the slope, respectively.

If you want more information, call result.summary(), which returns a summary object containing three detailed tables that describe the model and the fit.

Answered By: David Dale

Cribbing from this answer, Converting statsmodels summary object to Pandas Dataframe: result.summary() is a set of tables, which you can export as HTML and then read back with pandas into a DataFrame. That lets you index the values you want directly.

So, for your case (putting the answer from the above link into one line):

import pandas as pd
df = pd.read_html(result.summary().tables[1].as_html(), header=0, index_col=0)[0]

And then

a=df['coef'].values[1]
c=df['coef'].values[0]

Answered By: Idiot Tom

Adding some detail to @IdiotTom's answer.

You may use:

tables = pd.read_html(result.summary2().as_html())  # build and parse the summary once
head, body = tables[0], tables[1]  # model information, coefficient table

Not as nice, but the info is there.

Answered By: B Furtado

If the inputs to the API are pandas objects (i.e. a pd.DataFrame for the data, or a pd.Series each for x and y), then .params is a pd.Series, so each coefficient is easily accessible by its name.

For example:

import pandas as pd
import statsmodels.api as sm
# sm.__version__ is '0.13.1'

df = pd.DataFrame({'x': [0, 1, 2, 3, 4],
                   'y': [0.1, 0.2, 0.5, 0.5, 0.8]})

sm.OLS.from_formula(formula='y~x-1', data=df).fit().params

Outputs the following pd.Series:

x    0.196667
dtype: float64

Allowing for an intercept term (by changing the formula from y~x-1 to y~x) changes the output to include the intercept under the name Intercept:

Intercept    0.08
x            0.17
dtype: float64

Answered By: Itamar Mushkin

result.params is a pandas Series, so the coefficients can be looked up by name, much like in a dictionary. When the model is fitted from a formula, the constant term is stored as Intercept, as others pointed out, and each variable’s coefficient is stored under the variable’s name. So, if your model is y ~ x, the regression coefficients are available as result.params['Intercept'] (that’s b) and result.params['x'] (that’s a) for the equation y = a*x + b.

Answered By: Hilton Fernandes