Plotting pcolormesh in python from csv data

Question:

I am trying to make a pcolormesh plot in Python from my CSV file, but I am stuck on a dimension error.

My csv looks like this:

ratio    5%   10%   20%   30%   40%   50%
1.2    0.60  0.63  0.62  0.66  0.66  0.77
1.5    0.71  0.81  0.75  0.78  0.76  0.77
1.8    0.70  0.82  0.80  0.73  0.80  0.78
1.2    0.75  0.84  0.94  0.84  0.76  0.82
2.3    0.80  0.92  0.93  0.85  0.87  0.86
2.5    0.80  0.85  0.91  0.85  0.87  0.88
2.9    0.85  0.91  0.96  0.96  0.86  0.87

I want to make a pcolormesh plot where the x-axis shows the ratio, the y-axis shows the CSV header values (i.e. 0.05, 0.1, 0.2, 0.3, 0.4, 0.5), and the cells are colored by the values in the CSV from the second column onward.

I tried the following in Python:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

warnings.filterwarnings('ignore')



df = pd.read_csv('./result.csv')
xlabel = df['ratio']
ylabel = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]

plt.figure(figsize=(8, 6))
df = df.iloc[:, 1:]

plt.pcolormesh(df, xlabel, ylabel, cmap='RdBu')
plt.colorbar()
plt.xlabel('ratio')
plt.ylabel('threshold')
plt.show()

But it doesn’t work.

Can I get some help making the plot I want?

Thank you.

Asked By: Codeholic


Answers:

First off: ignoring warnings is a really bad idea, especially in code that doesn’t work as expected.

X and Y in plt.pcolormesh define the mesh, i.e. the edges of the cells, not the cells themselves. There is one more edge than there are cells, both horizontally and vertically. You’ll need to label the cell centers in a separate step.
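
The edge-versus-cell relationship can be sketched as follows. This is a minimal illustration, assuming placeholder values (not the CSV data) with 6 rows and 7 columns; the names `x_edges`, `y_edges`, `x_centers` and `y_centers` are made up for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data (not from the question): 6 rows (thresholds) x 7 columns (ratios).
values = np.arange(42).reshape(6, 7)
n_rows, n_cols = values.shape

# Edges: one more than the number of cells along each axis.
x_edges = np.arange(n_cols + 1)   # 8 edges for 7 columns
y_edges = np.arange(n_rows + 1)   # 7 edges for 6 rows

# Centers: midpoints between consecutive edges; this is where tick labels go.
x_centers = (x_edges[:-1] + x_edges[1:]) / 2   # 0.5, 1.5, ..., 6.5
y_centers = (y_edges[:-1] + y_edges[1:]) / 2   # 0.5, 1.5, ..., 5.5

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x_edges, y_edges, values, cmap='RdBu')
ax.set_xticks(x_centers)
ax.set_yticks(y_centers)
fig.colorbar(mesh, ax=ax)
```

Recent Matplotlib releases can also take cell centers directly through the `shading` option, but that interprets the coordinates as sorted positions, which the repeated 1.2 in the ratio column would not satisfy. Labeling the centers by hand avoids that issue.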

Apart from that, you would have to change the argument order: with three positional arguments, the first is X, the second is Y, and the third holds the values for the colors.
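
A small sketch of that positional order, using a hypothetical 2 x 3 grid whose x and y ranges differ so the axes are easy to tell apart. The array names and numbers are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical 2 x 3 grid, chosen only to make the axes easy to tell apart.
x_edges = np.array([0, 1, 2, 3])      # 4 edges -> 3 columns
y_edges = np.array([0, 10, 20])       # 3 edges -> 2 rows
colors = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (rows, columns) = (2, 3)

fig, ax = plt.subplots()
# Positional order is X, Y, C: the color values come last.
mesh = ax.pcolormesh(x_edges, y_edges, colors)

x_limits = ax.get_xlim()   # spans the X edges
y_limits = ax.get_ylim()   # spans the Y edges
```

The color array has one row per y interval and one column per x interval, so its shape is `(len(y_edges) - 1, len(x_edges) - 1)`.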

Also, the columns of the dataframe will be the columns of the mesh. You seem to want them to be the rows of the mesh, so the dataframe should be transposed.
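
The effect of the transpose on the shapes can be checked with pandas alone. The sketch below uses only the first three data rows from the question; the `thresholds` list comprehension is a hypothetical convenience for deriving the numeric y labels from the header, not something the original code does.

```python
from io import StringIO
import pandas as pd

# First three data rows from the question, enough to show the shapes.
csv_text = '''ratio    5%   10%   20%   30%   40%   50%
1.2    0.60  0.63  0.62  0.66  0.66  0.77
1.5    0.71  0.81  0.75  0.78  0.76  0.77
1.8    0.70  0.82  0.80  0.73  0.80  0.78'''
df = pd.read_csv(StringIO(csv_text), sep=r'\s+')

values = df.iloc[:, 1:]   # drop the 'ratio' column, keep the thresholds
print(values.shape)       # (3, 6): one row per ratio, one column per threshold
print(values.T.shape)     # (6, 3): one row per threshold, one column per ratio

# Hypothetical helper step: derive the numeric y labels from the header
# instead of typing them by hand.
thresholds = [float(name.rstrip('%')) / 100 for name in values.columns]
print(thresholds)
```

After the transpose, each mesh row corresponds to one threshold and each mesh column to one ratio, which matches the axes requested in the question.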

This is how your code could work:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from io import StringIO

df_str = '''ratio    5%   10%   20%   30%   40%   50%
1.2    0.60  0.63  0.62  0.66  0.66  0.77
1.5    0.71  0.81  0.75  0.78  0.76  0.77
1.8    0.70  0.82  0.80  0.73  0.80  0.78
1.2    0.75  0.84  0.94  0.84  0.76  0.82
2.3    0.80  0.92  0.93  0.85  0.87  0.86
2.5    0.80  0.85  0.91  0.85  0.87  0.88
2.9    0.85  0.91  0.96  0.96  0.86  0.87'''
df = pd.read_csv(StringIO(df_str), sep=r'\s+')  # delim_whitespace=True is deprecated in recent pandas
xlabel = df['ratio']
ylabel = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]

plt.figure(figsize=(8, 6))
df = df.iloc[:, 1:]

plt.pcolormesh(df.T, cmap='RdBu')
plt.xticks(np.arange(len(xlabel)) + 0.5, xlabel)
plt.yticks(np.arange(len(ylabel)) + 0.5, ylabel)
plt.colorbar()
plt.xlabel('ratio')
plt.ylabel('threshold')
plt.show()

[Figure: pcolormesh from dataframe]

Note that your code would be a lot more straightforward if you used seaborn, which builds on matplotlib and pandas to easily create statistical plots.

Seaborn’s heatmap uses the index of the dataframe to label the y-axis, and the columns to label the x-axis. So, you can set the ‘ratio’ column as index and transpose the dataframe. A colorbar will be generated by default, and optionally the cells can be annotated with their values.
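
To see which labels the heatmap would pick up, the reshaped dataframe can be inspected before plotting. This is a pandas-only sketch on a trimmed excerpt of the question's data (two rows, three threshold columns); the name `heat` is an assumption for this example.

```python
from io import StringIO
import pandas as pd

# Trimmed excerpt of the question's data, kept short for illustration.
csv_text = '''ratio    5%   10%   20%
1.2    0.60  0.63  0.62
1.5    0.71  0.81  0.75'''
df = pd.read_csv(StringIO(csv_text), sep=r'\s+')

# 'ratio' becomes the index, then the transpose moves it to the columns.
heat = df.set_index('ratio').T

print(list(heat.index))     # threshold headers, used for the y-axis
print(list(heat.columns))   # ratio values, used for the x-axis
print(heat.columns.name)    # the name 'ratio' travels with the columns
```

Because the column index keeps the name `ratio`, seaborn can use it as the x-axis label, which is presumably why the answer only sets the y label explicitly.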

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# df = pd.read_csv(...)  # reload the data: 'ratio' must still be a column here

plt.figure(figsize=(8, 6))
ax = sns.heatmap(df.set_index('ratio').T, annot=True, cmap='RdBu')
ax.set_ylabel('threshold')
plt.show()

[Figure: sns.heatmap from dataframe]

Answered By: JohanC