How to add a regression line and its equation to a graph

Question:

I have the input file and script below. The script is meant to add a regression line to the graph, but it fails with this error: ValueError: x and y must have same first dimension. I couldn't figure out what causes it.

How can I add the regression line and its equation to the graph?

Input file:

-5.06   -4.27
-6.69   -7.28
-3.80   -3.51
-3.88   -2.79
-0.90   -0.81
 2.10    2.59
-1.08    0.28
-5.00   -3.39
 2.67    2.92
 2.48    2.85
-5.10   -3.49
 2.88    3.09
 2.30    2.67
-3.47   -2.20
-0.90   -0.79

Script:

#!/usr/bin/python
import numpy as np
import pylab as plot
import matplotlib.pyplot as plt
import numpy, scipy, pylab, random
from matplotlib.ticker import MultipleLocator
import matplotlib as mpl
from matplotlib.ticker import MaxNLocator
from scipy import stats

with open("input.txt", "r") as f:
    x=[]
    y=[]

    for line in f:
        if not line.strip() or line.startswith('@') or line.startswith('#'): continue
        row = line.split()
        x.append(float(row[0]))
        y.append(float(row[1]))

fig = plt.figure(figsize=(2.2,2.2), dpi=300)
ax = plt.subplot(111)

plt.xlim(4, -8)
plt.ylim(4, -8)

ax.xaxis.set_major_locator(MaxNLocator(6))
ax.yaxis.set_major_locator(MaxNLocator(6))

ax.xaxis.set_minor_locator(MultipleLocator(1))
ax.yaxis.set_minor_locator(MultipleLocator(1))


#regression part
slope, intercept, r_value, p_value, std_err = stats.linregress(x,y)
line = slope*x+intercept
plt.plot(x, line, 'r', label='fitted line')
#end

plt.scatter(x,y,color=['black','black','black','black','black','black','black','black','black','black','black','black','black','black','black'], s=3.5)


plt.savefig("output.png", dpi=300)
Asked By: qasim


Answers:

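You could add this code to draw the regression line and show its equation in the legend. Here B0 is the intercept and B1 the slope of the fit, and x must be a NumPy array:

# To plot the regression line
plt.plot(x, B0 + B1*x, label='y = {:.2f} + {:.2f}*x'.format(B0, B1))
plt.legend(loc='lower right')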
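
The snippet above does not show how B0 and B1 are obtained. A minimal sketch, assuming B0 is the intercept and B1 the slope, and using np.polyfit as one possible way to fit the line (not necessarily what this answer used), on the first five rows of the question's input file:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# First five rows of the question's input file
x = np.array([-5.06, -6.69, -3.80, -3.88, -0.90])
y = np.array([-4.27, -7.28, -3.51, -2.79, -0.81])

# A degree-1 polyfit returns the coefficients highest power first:
# slope (B1), then intercept (B0)
B1, B0 = np.polyfit(x, y, 1)

fig, ax = plt.subplots()
ax.scatter(x, y, color="k", s=10)
# The legend label carries the fitted equation
ax.plot(x, B0 + B1 * x, "r", label="y = {:.2f} + {:.2f}*x".format(B0, B1))
ax.legend(loc="lower right")
```

Call fig.savefig(...) or plt.show() afterwards to see the result.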

Here is a screenshot of the visualization for the Linear Regression code I had written:

[image: screenshot of the regression plot]

Answered By: Sagar Dawda

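You cannot multiply a list by a float element-wise, so slope*x does not give one fitted value per data point, and plt.plot then complains that x and y differ in length. Create a NumPy array from the input list x instead: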

line = slope*np.array(x)+intercept
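
To see the difference, here is a small sketch using the first three x values from the input file. The slope and intercept are made-up plain floats for illustration, not the fitted values:

```python
import numpy as np

x = [-5.06, -6.69, -3.80]      # plain Python list, as built by the question's loop
slope, intercept = 2.0, 1.0    # illustrative values, not the fitted ones

# A list times a plain float is not element-wise arithmetic; Python refuses it
try:
    slope * x
    list_multiplication_failed = False
except TypeError:
    list_multiplication_failed = True

# Converting to an array gives one fitted value per data point
line = slope * np.array(x) + intercept
print(line.shape)  # (3,)
```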

Better still, read in your data with np.genfromtxt, which returns NumPy arrays directly:

x,y = np.genfromtxt("input.txt", unpack=True)

Complete example:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
from matplotlib.ticker import MaxNLocator
from scipy import stats

x,y = np.genfromtxt("input.txt", unpack=True) 

fig = plt.figure(figsize=(2.2,2.2), dpi=300)
ax = plt.subplot(111)

plt.xlim(4, -8)
plt.ylim(4, -8)

ax.xaxis.set_major_locator(MaxNLocator(6))
ax.yaxis.set_major_locator(MaxNLocator(6))

ax.xaxis.set_minor_locator(MultipleLocator(1))
ax.yaxis.set_minor_locator(MultipleLocator(1))


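# Regression part: fit y = slope*x + intercept
slope, intercept, r_value, p_value, std_err = stats.linregress(x,y)

# x is a NumPy array here, so this is evaluated element-wise
line = slope*x+intercept
# The label puts the fitted equation into the legend
plt.plot(x, line, 'r', label='y={:.2f}x+{:.2f}'.format(slope,intercept))
# End of regression part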

plt.scatter(x,y, color="k", s=3.5)
plt.legend(fontsize=9)

plt.show()

[image: resulting plot with the fitted line and its equation in the legend]

Is there a way to change the equation format in the legend?
Specifically, how can I show the values in the label equation in scientific notation?

Answered By: Baback MDian
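
For the scientific-notation question above, one possible approach is to change the format specifier in the label from f to e. A minimal sketch, with made-up slope and intercept values chosen only to show the formatting:

```python
slope, intercept = 0.000123456, -45678.9  # illustrative values, not the fitted ones

# "e" renders a float in scientific notation; the "+" flag on the intercept
# always prints its sign, which avoids a "+-" when the intercept is negative
label = "y={:.2e}x{:+.2e}".format(slope, intercept)
print(label)  # y=1.23e-04x-4.57e+04
```

The resulting string can be passed as label=label to plt.plot in the complete example.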