Adding y=x to a matplotlib scatter plot if I haven't kept track of all the data points that went in


Here’s some code that draws a scatter plot of several different series using matplotlib and then adds the line y=x:

import numpy as np, matplotlib.pyplot as plt, matplotlib.cm as cm, pylab

nseries = 10
colors = cm.rainbow(np.linspace(0, 1, nseries))

all_x = []
all_y = []
for i in range(nseries):
    x = np.random.random(12)+i/10.0
    y = np.random.random(12)+i/5.0
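    plt.scatter(x, y, color=colors[i])
    # keep track of every point so the limits of y=x can be computed later
    all_x.extend(x)
    all_y.extend(y)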

# Could I somehow do the next part (add identity_line) if I haven't been keeping track of all the x and y values I've seen?
identity_line = np.linspace(max(min(all_x), min(all_y)),
                            min(max(all_x), max(all_y)))
plt.plot(identity_line, identity_line, color="black", linestyle="dashed", linewidth=3.0)

In order to achieve this I’ve had to keep track of all the x and y values that went into the scatter plot so that I know where identity_line should start and end. Is there a way I can get y=x to show up even if I don’t have a list of all the points that I plotted? I would think that something in matplotlib can give me a list of all the points after the fact, but I haven’t been able to figure out how to get that list.

Asked By: kuzzooroo



You don’t need to know anything about your data per se. You can get away with what your matplotlib Axes object will tell you about the data.

See below:

import numpy as np
import matplotlib.pyplot as plt

# random data 
N = 37
x = np.random.normal(loc=3.5, scale=1.25, size=N)
y = np.random.normal(loc=3.4, scale=1.5, size=N)
c = x**2 + y**2

# now sort it just to make it look like it's related
x.sort()
y.sort()

fig, ax = plt.subplots()
ax.scatter(x, y, s=25, c=c, zorder=10)

Here’s the good part:

lims = [
    np.min([ax.get_xlim(), ax.get_ylim()]),  # min of both axes
    np.max([ax.get_xlim(), ax.get_ylim()]),  # max of both axes
]

# now plot both limits against each other
ax.plot(lims, lims, 'k-', alpha=0.75, zorder=0)
# square the axes and give both the same range
ax.set_aspect('equal')
ax.set_xlim(lims)
ax.set_ylim(lims)
fig.savefig('/Users/paul/Desktop/so.png', dpi=300)

Et voilà
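As for the literal question of getting the plotted points back after the fact: each scatter call adds a collection to the Axes, and its offsets can be read back with get_offsets(). The following is only a minimal sketch under that assumption. The helper name scatter_points is made up for illustration, and it treats every collection on the Axes as scatter data, which may not hold for more complex plots.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt


def scatter_points(ax):
    """Stack the (x, y) offsets of every collection drawn on ax (hypothetical helper)."""
    offsets = [np.asarray(coll.get_offsets()) for coll in ax.collections]
    return np.vstack(offsets) if offsets else np.empty((0, 2))


rng = np.random.default_rng(0)
fig, ax = plt.subplots()
for i in range(3):
    # three series, none of which are stored by the caller
    ax.scatter(rng.random(12) + i / 10.0, rng.random(12) + i / 5.0)

# recover the points and apply the same start/end rule as in the question
points = scatter_points(ax)
lo = max(points[:, 0].min(), points[:, 1].min())
hi = min(points[:, 0].max(), points[:, 1].max())
ax.plot([lo, hi], [lo, hi], color="black", linestyle="dashed", linewidth=3.0)
```

This keeps the line confined to the region covered by the data, whereas the limits-based approach above spans the whole visible area.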


Answered By: Paul H

In one line:

ax.plot([0,1],[0,1], transform=ax.transAxes)

No need to modify the xlim or ylim.
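One caveat worth checking: with transform=ax.transAxes the line runs from the lower-left to the upper-right corner of the axes box, so it coincides with y=x only when the x and y limits are the same. The sketch below maps those two corners back into data coordinates to show the difference. The helper name axes_corners_in_data_coords is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt


def axes_corners_in_data_coords(ax):
    """Map the axes-fraction points (0, 0) and (1, 1) into data coordinates."""
    axes_to_data = ax.transAxes + ax.transData.inverted()
    return axes_to_data.transform(np.array([[0.0, 0.0], [1.0, 1.0]]))


fig, ax = plt.subplots()
ax.set_xlim(0, 2)
ax.set_ylim(0, 5)
ax.plot([0, 1], [0, 1], transform=ax.transAxes)

# with different limits the diagonal ends at (2, 5), which is not on y=x
corners_unequal = axes_corners_in_data_coords(ax)

# with matching limits the same diagonal does lie on y=x
ax.set_ylim(0, 2)
corners_equal = axes_corners_in_data_coords(ax)
```

So the one-liner is safe when both axes share the same range, and otherwise it draws a diagonal rather than the identity line.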

Answered By: Edward

If you set scalex and scaley to False, it saves a bit of bookkeeping. This is what I have been using lately to overlay y=x:

xpoints = ypoints = plt.xlim()
plt.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)

or if you’ve got an axis:

xpoints = ypoints = ax.get_xlim()
ax.plot(xpoints, ypoints, linestyle='--', color='k', lw=3, scalex=False, scaley=False)

Of course, this won’t give you a square aspect ratio. If you care about that, go with Paul H’s solution.
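To see what scalex=False and scaley=False buy you, here is a small self-contained sketch that compares the axis limits before and after adding the line. The data and variable names are just for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
fig, ax = plt.subplots()
ax.scatter(rng.normal(3.5, 1.25, 37), rng.normal(3.4, 1.5, 37))

# read the limits that the scatter plot autoscaled to
xlim_before, ylim_before = ax.get_xlim(), ax.get_ylim()

# overlay y=x across the current x range without triggering autoscaling
xpoints = ypoints = xlim_before
ax.plot(xpoints, ypoints, linestyle="--", color="k", lw=3,
        scalex=False, scaley=False)

xlim_after, ylim_after = ax.get_xlim(), ax.get_ylim()
```

The limits read after the plot call should match the ones read before it, so the overlay does not stretch the view even where the line extends past the y range.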

Answered By: kilodalton

Starting with matplotlib 3.3, this has been made very simple with the axline method, which only needs a point and a slope. To plot y=x:

ax.axline((0, 0), slope=1)

You don’t need to look at your data to use this because the point you specify (i.e. here (0,0)) doesn’t actually need to be in your data or plotting range.
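For reference, a complete script built around that call might look like the sketch below. The random data, the styling keywords and the equal aspect setting are my own choices for illustration, not part of the answer, and it assumes matplotlib 3.3 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
fig, ax = plt.subplots()
ax.scatter(rng.normal(3.5, 1.25, 37), rng.normal(3.4, 1.5, 37), zorder=10)

# point-and-slope form: (0, 0) is only an anchor on the line y=x
line = ax.axline((0, 0), slope=1, color="k", linestyle="--", zorder=0)

# optional: equal aspect so the line is drawn at 45 degrees
ax.set_aspect("equal")
```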

Answered By: vantom

OK, imagine I plot a figure point by point, for example with 4 x values and 4 y values, but I would like the y axis to show 10 points. How can I give a range to the y axis, given that in point-by-point plotting the number of x and y points has to be equal?

Answered By: harshith netha