How to plot all dataframes from a dictionary of dataframes

Question:

I have a lot of data, so here is a simplified example:

import pandas as pd

data = [[0.5, 1, 'abcnews'],
        [0.4, 1.2, 'abcnews'],
        [0.8, 1.7, 'cnn'],
        [0.9, 1.4, 'abcnews'],
        [0.4, 1.3, 'abcnews'],
        [0.75, 1.67, 'cnn']]
a = pd.DataFrame(data, columns=['cpc', 'rate_bid', 'site_target'])

data = [[0.7, 1, 'cnn'],
        [0.4, 1.2, 'abcnews'],
        [0.1, 1.4, 'cnn'],
        [0.9, 1.6, 'cnn']]
b = pd.DataFrame(data, columns=['cpc', 'rate_bid', 'site_target'])

data = [[0.4, 1.1, 'abcnews'],
        [0.5, 1, 'abcnews'],
        [0.6, 1.4, 'abcnews'],
        [0.7, 1.8, 'abcnews'],
        [0.8, 1.2, 'cnn']]
ac = pd.DataFrame(data, columns=['cpc', 'rate_bid', 'site_target'])


Now imagine a dict named d that holds 31 dataframes (a, b, c, …, ac).

The dict looks something like this:

Key         Type                    Size              Value

a           DataFrame               (6, 3)            Column names: cpc, rate_bid, site_target
b           DataFrame               (4, 3)            Column names: cpc, rate_bid, site_target
.
.
.
ac          DataFrame               (5, 3)            Column names: cpc, rate_bid, site_target
  • I would like 31 graphs, one for each dataframe in my dict.
  • Each should use this kind of plot: sns.lineplot(data=a, x='cpc', y='rate_bid', hue='site_target', legend=False)
  • How can I do that?
Asked By: baring


Answers:

  • Given the sample data, a .scatterplot is the better option.
    • Rate bid is plotted as a function of cost per click (CPC).
      • Each row is a discrete observation, so markers represent the data better than connected lines.
    • You can substitute .lineplot if you prefer.
  • You want hue='site_target', but because each dataframe is plotted separately, there is no guarantee that a site gets the same color in every plot.
    • To fix this, create a custom color map from the unique 'site_target' values across all the dataframes.
    • Since the colors are then the same everywhere, place one legend to the side of the plots instead of a legend in every plot.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import seaborn as sns
import math

# given d as the dict of dataframes
d = {'a': a, 'b': b, 'ac': ac}

# create color mapping based on all unique values of site_target
unique_site = sorted({site for v in d.values() for site in v.site_target.unique()})  # unique sites, sorted so the color assignment is the same on every run
colors = sns.color_palette('husl', n_colors=len(unique_site))  # get a number of colors
cmap = dict(zip(unique_site, colors))  # zip values to colors

# iterate through dictionary and plot
col_nums = 3  # how many plots per row
row_nums = math.ceil(len(d) / col_nums)  # how many rows of plots
plt.figure(figsize=(10, 4))  # change the figure size as needed
for i, (k, v) in enumerate(d.items(), 1):
    plt.subplot(row_nums, col_nums, i)
    p = sns.scatterplot(data=v, x='cpc', y='rate_bid', hue='site_target', palette=cmap)
    p.legend_.remove()
    plt.title(f'DataFrame: {k}')

plt.tight_layout()
# create legend from cmap
patches = [Patch(color=v, label=k) for k, v in cmap.items()]
# place legend outside of plot
plt.legend(handles=patches, bbox_to_anchor=(1.04, 0.5), loc='center left', borderaxespad=0)
plt.show()
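
As an alternative to looping over subplots by hand, the dataframes can be stacked into one long dataframe and passed to seaborn's figure-level relplot, which assigns the hue colors once for the whole figure and draws a single legend. This is only a sketch and not part of the original answer: the function name plot_dict_with_relplot and the helper column name 'frame' are made up for illustration, and it assumes every dataframe has the cpc, rate_bid and site_target columns.

```python
import pandas as pd
import seaborn as sns


def plot_dict_with_relplot(frames, col_wrap=3):
    """Draw one facet per dataframe in `frames` (a dict of name -> dataframe).

    Hypothetical helper: assumes each dataframe has the columns
    cpc, rate_bid and site_target.
    """
    # tag every row with the key of the dataframe it came from,
    # then stack everything into one long dataframe
    combined = pd.concat(
        [df.assign(frame=name) for name, df in frames.items()],
        ignore_index=True,
    )
    # relplot maps hue once for the whole figure, so a site keeps the
    # same color in every facet, and one legend is placed beside the grid
    return sns.relplot(
        data=combined, x='cpc', y='rate_bid', hue='site_target',
        col='frame', col_wrap=col_wrap, kind='scatter', height=3,
    )
```

Calling g = plot_dict_with_relplot(d) followed by plt.show() should give one panel per key of d. Passing kind='line' to relplot instead would give the line version, at the cost of less control over each individual subplot than the explicit loop offers.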


[Image: the same grid of plots drawn with .lineplot instead of .scatterplot]

Answered By: Trenton McKinney