How to display pandas.io.formats.style.Styler objects on top of each other

Question:

Here’s some data:

import numpy as np
import pandas as pd

# Seed NumPy's generator; random.seed() would not affect np.random
np.random.seed(365)

duration = np.random.exponential(scale = 5, size = 100).round(1)
numbers = np.random.normal(loc = 50, scale = 2, size = 100).round(2)
group = np.random.choice(["A", "B", "C", "D"], size = len(duration))
gender = np.random.choice(["Male", "Female"], p = [0.7, 0.3], size = len(duration))
provider = np.random.choice(["2Degrees", "Skinny", "Vodafone", "Spark"], p = [0.25, 0.25, 0.25, 0.25], size = len(duration))

df = pd.DataFrame(
    {"Duration":duration,
    "Numbers":numbers,
    "Group":group,
    "Gender":gender,
    "Provider":provider}
)

I am attempting to combine multiple pandas Styler objects into one figure.

I have all the "pieces" of the figure as individual Styler objects. I created them as data frames and then styled each one so that it has its own caption.

Here is the code I used to generate the first two "pieces" of this figure (much of the other code I used to generate the other pieces is very similar):

#Gets the number of rows and columns
pd.DataFrame({
    "Number of Rows":df.shape[0],
    "Number of Columns":df.shape[1]
}, index = [""])

#Gets the info on the data set's categorical columns
data = []

for column in df:
    if df[column].dtype == "object":
        freq = df[column].value_counts(ascending = False)
        data.append({
            "Column Name":column,
            "Unique Values":len(df[column].unique()),
            "Missing Values":df[column].isna().sum(),
            "Most Frequently Occurring":freq.index[0],
            # .iloc[0] is positional; freq[0] is a label lookup on a string index
            "Occurrences":freq.iloc[0],
            "% of Total":freq.iloc[0] / freq.sum()*100
        })
pd.DataFrame(data).style.format(precision = 1).set_caption("Categorical Columns").set_table_styles([{
    "selector": "caption",
    "props": [
        ("font-size", "16px")
    ]
}])

The figure I am attempting to create looks something like this (I made this mock-up in an Excel spreadsheet):
[image not included]

Note that the Styler objects (apart from the first data frame, which states the number of rows and columns in the data set) are stacked on top of each other with enough padding between them.

Ideally, this entire figure would be exportable to an Excel spreadsheet.

I have pretty much all the code I need; it's just putting this final part together that I need help with. Any ideas on how to tackle this?

Asked By: JoMcGee

||

Answers:

After some experimenting, I found that each "piece" of the figure must first be rendered to HTML. The resulting HTML strings can then be joined into one string, with a padding element placed between them.

For anyone who wants to create similar data summary tables in the future, here is my code:

from IPython.display import display, HTML

styles = [{"selector":"caption", "props":[("font-size", "16px"), ("font-weight", "bold")]}]

# Wrapping the chain in parentheses lets it span several lines
head = (
    pd.DataFrame({
        "Number of Rows":df.shape[0],
        "Number of Columns":df.shape[1]
    }, index = [""])
    .style
    .set_caption("Data Frame")
    .set_table_styles(styles)
    .to_html()
)


data = []

#Info obtained from categorical columns
for column in df:
    if df[column].dtype == "object":
        freq = df[column].value_counts(dropna = False, ascending = False)
        data.append({
            "Column Name":column,
            "Unique Values":len(df[column].unique()),
            "Missing Values":df[column].isna().sum(),
            "Most Frequently Occurring":freq.index[0],
            # .iloc[0] is positional; freq[0] is a label lookup on a string index
            "Occurrences":freq.iloc[0],
            "% of Total":freq.iloc[0] / freq.sum()*100,
        })
    
cat = (
    pd.DataFrame(data).style.set_caption("Categorical Columns")
    .set_table_styles(styles)
    .format(precision = 1)
    .hide(axis = "index")  # hide_index() was deprecated in pandas 1.4 and later removed
    .to_html()
)

data = []

#Info obtained from numeric columns
for column in df:
    # is_numeric_dtype covers every integer and float width (it also accepts bool columns)
    if pd.api.types.is_numeric_dtype(df[column]):
        data.append({
            "Column Name":column,
            "Unique Values":len(df[column].unique()),
            "Missing Values":df[column].isna().sum(),
            "Range":[df[column].min(), df[column].max()],
            "Mean Value":df[column].mean(),
            "Median Value":df[column].median()
        })
    
num = (
    pd.DataFrame(data).style.set_caption("Numeric Columns")
    .set_table_styles(styles)
    .format(precision = 1)
    .hide(axis = "index")  # hide_index() was deprecated in pandas 1.4 and later removed
    .to_html()
)

padding = "<div style='padding: 20px;'></div>"
figure = padding.join([head, cat, num])

display(HTML(figure))
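
The question also asks for Excel export, which the HTML approach above does not cover. One possible route, sketched here under assumptions rather than tested against the full figure, is to write each table to the same worksheet at increasing `startrow` offsets. The helper names `stacked_start_rows` and `export_stacked`, the sheet name "Summary" and the two-row gap are all hypothetical choices, and writing .xlsx files requires an Excel engine such as openpyxl to be installed.

```python
import pandas as pd

def stacked_start_rows(row_counts, gap=2):
    """Return the first sheet row of each block when blocks with the given
    heights are stacked vertically with `gap` blank rows between them."""
    starts, row = [], 0
    for height in row_counts:
        starts.append(row)
        row += height + gap
    return starts

def export_stacked(tables, path, sheet_name="Summary", gap=2):
    """Write (title, DataFrame) pairs one below the other on a single sheet.

    Hypothetical helper: each block takes one title row, one header row
    and len(frame) data rows.
    """
    heights = [len(frame) + 2 for _, frame in tables]
    starts = stacked_start_rows(heights, gap)
    with pd.ExcelWriter(path) as writer:
        for (title, frame), start in zip(tables, starts):
            # The title goes into a single cell above the table
            pd.DataFrame([title]).to_excel(
                writer, sheet_name=sheet_name, startrow=start,
                header=False, index=False,
            )
            # The table itself starts on the row below the title
            frame.to_excel(
                writer, sheet_name=sheet_name, startrow=start + 1, index=False,
            )
    return starts
```

A call would pass a list of (title, DataFrame) pairs and a target file name, for example the unstyled categorical and numeric summary frames. `Styler.to_excel` accepts the same `startrow` argument, so styled tables could probably be written the same way, though as far as I know captions are not carried over to the spreadsheet, which is why the titles are written as separate cells here.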
Answered By: JoMcGee