Pass labels for legend in matplotlib from a csv file

Question:

I have a csv file with a large amount of data that I am plotting as a scatter plot using pandas and matplotlib. Each data point corresponds to a different specimen I have tested. I wanted to assign each specimen a specific color/marker with the legend indicating the name of the specimen that each point corresponds to. In my csv file I have a column for specimen name, a column for force, a column for displacement, the color I assigned, and marker type. I defined this function below.

> def master_plot(df):
>     
> 
>     y = df["Force"]
>     x = df["Displacement"]
>     z = df["Volume"]
>     z1 = z.values.tolist()
>     c1 = df["Color"]
>     c1 = c1.values.tolist()
>     m = df["Marker"]
>     m = m.values.tolist()
> 
>     fig = plt.figure()
>     ax = fig.add_subplot(111)
>     
>     plt.scatter(x,y,c = c1, label = z)    
>     plt.plot( [0,25000],[0,25000])
>     
>     ax.set_aspect('equal', adjustable='box')
>     
>     plt.title("Force vs Displacement")
>     plt.ylabel("Force") 
>     plt.xlabel("Displacement")
>     
>     
>     plt.legend(bbox_to_anchor=(1.05, 1.0), loc='upper left')
> master_plot(df)

I can get the scatter plot to output the colors, but the legend just shows a list of the different samples without the color assignment. I have tried passing z as an argument to plt.legend, but that results in no output.

I'd like the legend to show the appropriate color next to each sample name.

Here is also a snippet of the csv file to show what I am passing through to my function.

> Volume,Force,Displacement,,Color,Marker, 
> 1_762-68335-L45-3x,1434.645679,.45,blue,o
> 1_762-68335-L45-3x,952.316311,.23,,blue,v
Asked By: hm23


Answers:

A couple of things I noticed: the data you provided has extra commas, which I assume is an error. Also, the specimen column appears as Volume in your data; I have called it Name in the examples below. I used the data you provided, with some additional rows, like this:

>>> df
                 Name        Force  Displacement   Color Marker
0  1_762-68335-L45-3x  1434.645679          0.45    blue      o
1  1_762-68335-L45-3x   952.316311          0.23    blue      o
2  1_762-68335-L45-4x  2234.645679          0.51   green      v
3  1_762-68335-L45-4x  2982.316311          0.22   green      v
4  1_762-68335-L45-5x  3134.645679          0.44     red      x
5  1_762-68335-L45-5x  3052.316311          0.19     red      x
6  1_762-68335-L45-6x  4334.645679          0.48  maroon      <
7  1_762-68335-L45-6x  4121.316311          0.20  maroon      <
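If the real file does contain the stray empty column, it can be dropped while loading. This is only a sketch: the sample text and the helper name load_specimens are made up for illustration, and it assumes the extra commas appear consistently in the header and in every row.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the question's header, with one stray empty column.
raw = """Volume,Force,Displacement,,Color,Marker
1_762-68335-L45-3x,1434.645679,.45,,blue,o
1_762-68335-L45-3x,952.316311,.23,,blue,v
"""


def load_specimens(source):
    """Read the CSV, drop columns that are entirely empty, rename Volume to Name."""
    df = pd.read_csv(source)
    # The empty header field becomes an all-NaN "Unnamed" column; remove it.
    df = df.dropna(axis="columns", how="all")
    return df.rename(columns={"Volume": "Name"})


df = load_specimens(io.StringIO(raw))
print(df.columns.tolist())
```

For a file on disk, pass its path to load_specimens instead of the StringIO object.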

The simplified code is below. I have used matplotlib, as I believe that is the library you prefer; Seaborn would make this a little easier.
I have now added code to set the marker as well. For each color, it picks the first marker found among the rows with that color.

import matplotlib.pyplot as plt

def master_plot(df):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    # One scatter call per color, so each color gets its own legend entry
    for color in df["Color"].unique():
        subset = df[df["Color"] == color]
        ax.scatter(subset["Displacement"], subset["Force"],
                   c=color,
                   # Join the names into a string; passing the raw array
                   # would show up in the legend as "['name']"
                   label=", ".join(subset["Name"].unique()),
                   marker=subset["Marker"].iloc[0])  # first marker for this color
    ax.plot([0, 25000], [0, 25000])
    ax.set_aspect('equal', adjustable='box')
    ax.set_title("Force vs Displacement")
    ax.set_ylabel("Force")
    ax.set_xlabel("Displacement")
    ax.legend(bbox_to_anchor=(1.05, 1.0), loc='upper left')

master_plot(df)

Output plot: [image not included]
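The loop matters because matplotlib builds the legend from labeled artists, and a single scatter call creates a single artist no matter how many colors it contains. A small check of that, using made-up points and short placeholder names rather than the real data:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, (ax_single, ax_loop) = plt.subplots(1, 2)

# One call with several colors: still one artist, so one legend entry.
ax_single.scatter([0.45, 0.51], [1434.6, 2234.6],
                  c=["blue", "green"], label="all specimens")
single_handles, single_labels = ax_single.get_legend_handles_labels()

# One call per specimen: one artist, and one legend entry, per call.
points = [(0.45, 1434.6, "blue", "3x"), (0.51, 2234.6, "green", "4x")]
for x, y, color, name in points:
    ax_loop.scatter(x, y, c=color, label=name)
loop_handles, loop_labels = ax_loop.get_legend_handles_labels()

print(len(single_handles), len(loop_handles))
```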

EDIT: new requirement

If each row has its own color/marker combination and you want ALL of the entries to appear in the legend, you can use this (slightly modified) data:

>>> df
                 Name         Force  Displacement   Color Marker
0  1_762-68335-L45-3x   2434.645679          0.45    blue      o
1  1_762-68335-L45-3x    952.316311          0.23    blue      v
2  1_762-68335-L45-4x   4234.645679          0.51   green      x
3  1_762-68335-L45-4x   6982.316311          0.22   green      <
4  1_762-68335-L45-5x   8134.645679          0.44     red      o
5  1_762-68335-L45-5x  10052.316310          0.19     red      v
6  1_762-68335-L45-6x  13334.645680          0.48  maroon      x
7  1_762-68335-L45-6x  16121.316310          0.20  maroon      <

and this code…

import matplotlib.pyplot as plt

def master_plot(df):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    # One scatter call per row, so every row gets its own legend entry.
    # Labels are attached to each artist directly instead of being kept
    # in a separate list that has to stay in sync with the plot order.
    for _, row in df.iterrows():
        ax.scatter(row["Displacement"], row["Force"],
                   c=row["Color"], marker=row["Marker"], label=row["Name"])
    ax.plot([0, 25000], [0, 25000], label="Diagonal")
    ax.set_aspect('equal', adjustable='box')
    ax.set_title("Force vs Displacement")
    ax.set_ylabel("Force")
    ax.set_xlabel("Displacement")
    ax.legend(bbox_to_anchor=(1.05, 1.0), loc='upper left')

master_plot(df)

…will give you this plot: [image not included]
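If several rows share the same name, color and marker, a per-row loop repeats that legend entry once per row. One possible variation, sketched here with made-up rows and a hypothetical function name plot_by_combination, groups the rows first so that each unique combination is drawn with one scatter call and listed once. Appending the marker to the label is my own choice for telling entries apart, not something the question asked for.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows: the first two share the same name, color and marker.
df = pd.DataFrame({
    "Name": ["1_762-68335-L45-3x"] * 3 + ["1_762-68335-L45-4x"],
    "Force": [2434.645679, 1952.316311, 952.316311, 4234.645679],
    "Displacement": [0.45, 0.40, 0.23, 0.51],
    "Color": ["blue", "blue", "blue", "green"],
    "Marker": ["o", "o", "v", "x"],
})


def plot_by_combination(df):
    fig, ax = plt.subplots()
    # sort=False keeps the groups in order of first appearance in the data.
    groups = df.groupby(["Name", "Color", "Marker"], sort=False)
    for (name, color, marker), group in groups:
        ax.scatter(group["Displacement"], group["Force"],
                   c=color, marker=marker, label=f"{name} ({marker})")
    ax.set_title("Force vs Displacement")
    ax.set_xlabel("Displacement")
    ax.set_ylabel("Force")
    ax.legend(bbox_to_anchor=(1.05, 1.0), loc="upper left")
    return ax


ax = plot_by_combination(df)
handles, labels = ax.get_legend_handles_labels()
print(labels)
```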

Answered By: Redox