How to plot wide format dataframe with seaborn.relplot

Question:

I am trying to plot a line graph of dummy data for 5 cities (C1-C5).

Data frame which has been imported already

Based on what I understand, x="Year", y="Number of Employees" and hue="City". How would I set up the code for it? I have tried doing it in the following manner, but it doesn’t work!

Current Code

import seaborn as sns
import pandas as pd

Areas = r'C:\Users\Tachi\Desktop\City.xlsx'
df = pd.read_excel(Areas)
df.set_index('City', inplace=True)

sns.relplot(x="Year", y="Number of Employees",hue="City", kind="line", data=df)

Sample Data

data = {'City': ['C1', 'C2', 'C3', 'C4', 'C5'], 
        2015: [28564, 2585, 4679, 33227, 2000], 
        2016: [83659, 4429, 35834, 1447, 3454], 
        2017: [0, 453, 40903, 46826, 646], 
        2018: [39470, 8364, 29464, 36443, 8364]}
df = pd.DataFrame(data)
df.set_index('City', inplace=True)

       2015   2016   2017   2018
City                            
C1    28564  83659      0  39470
C2     2585   4429    453   8364
C3     4679  35834  40903  29464
C4    33227   1447  46826  36443
C5     2000   3454    646   8364
Asked By: Tachi

||

Answers:

  • Given the test dataframe, df, in the OP, the easiest way to plot it is to transpose it with pandas.DataFrame.transpose and pass the result to seaborn.relplot as wide-format data.
    • With wide-format data, seaborn automatically uses the dataframe index for the x-axis and the column headers for hue.
    • The same visualization can also be produced with sns.lineplot(data=df, marker='o') instead of relplot.
# transpose the dataframe
df = df.T

# display(df)
City     C1    C2     C3     C4    C5
2015  28564  2585   4679  33227  2000
2016  83659  4429  35834   1447  3454
2017      0   453  40903  46826   646
2018  39470  8364  29464  36443  8364

# plot the dataframe
sns.relplot(data=df, kind='line', marker='o')


  • The index values are int dtype, so the x-axis is treated as continuous and may be drawn with intermediate, non-integer tick values (such as 2015.5).
    • One way to deal with this is to cast the index to a str dtype before plotting.
# set the index of years to a str dtype
df.index = df.index.astype(str)

# plot the dataframe
sns.relplot(data=df, kind='line', marker='o')


Answered By: Trenton McKinney