Multiple boxplots based on conditions

Question:

I have a dataframe with two columns. The power column holds the power consumption of the system. The component_status column splits the data in two, depending on whether the component is OFF or ON: a value of 153 means the component is ON, and a value of 150 means it is OFF.

The result I am looking for is a single plot with three boxplots, drawn with sns.boxplot. One shows the power consumption for all the data, labelled "TOTAL". The other two show the power consumption when the component was ON or OFF, labelled "COMPONENT = ON" and "COMPONENT = OFF".

The data frame example is as follows:

power | component_status
 0.5  |       150
 1.5  |       150
 2.5  |       150
 0.3  |       153
 0.5  |       153
 1.5  |       153
 2.5  |       150
 0.3  |       153

Thanks for the help.

Asked By: DMatheu


Answers:

Your first step is to build your data frame with the conditions. There are a few ways to go about this.

  1. Let’s start with an initial df1 (dataframe #1) built from the data you have given. Then, let’s add a condition column set to "Total" for every row. You can use print(df1) to see what this looks like.
  2. Then let’s copy that dataframe into df2, and let’s replace its condition values with "On" or "Off", derived from component_status.
  3. Our final dataframe df is just a concatenation of df1 and df2.
  4. Now we have a dataframe df that is ready to go in Seaborn.

Imports and DataFrame

# Set up
import pandas as pd
import numpy as np
import seaborn as sns

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df1 = pd.DataFrame(
    data=zip(power, component_status), columns=["power", "component_status"]
)

# Step 1
df1["condition"] = "Total"
# print(df1)

# Step 2
df2 = df1.copy()

df2["condition"] = np.where(df2["component_status"] == 153, "On", "Off")

# If you have several criteria, it can be easier to use np.select
# ... or just use Pandas directly:
# df2.loc[(df2['component_status'] == 153), 'condition'] = 'On'
# df2.loc[(df2['component_status'] == 150), 'condition'] = 'Off'

# Step 3
df = pd.concat([df1, df2])

df view

   power  component_status condition
0    0.5               150     Total
1    1.5               150     Total
2    2.5               150     Total
3    0.3               153     Total
4    0.5               153     Total
5    1.5               153     Total
6    2.5               150     Total
7    0.3               153     Total
0    0.5               150       Off
1    1.5               150       Off
2    2.5               150       Off
3    0.3               153        On
4    0.5               153        On
5    1.5               153        On
6    2.5               150       Off
7    0.3               153        On
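
The comment in Step 2 mentions np.select for the case of several criteria. A minimal sketch of that variant, using the same sample data; the "Unknown" fallback label is an assumption added for illustration and is not part of the original answer:

```python
import numpy as np
import pandas as pd

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df2 = pd.DataFrame({"power": power, "component_status": component_status})

# Each boolean condition is paired with the label at the same position.
conditions = [
    df2["component_status"] == 153,
    df2["component_status"] == 150,
]
labels = ["On", "Off"]

# Rows that match none of the conditions get the default label
# ("Unknown" is an assumed fallback, not from the original answer).
df2["condition"] = np.select(conditions, labels, default="Unknown")
```

With only two status codes this gives the same labels as np.where; the list form mainly pays off once more status codes need their own label.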

Plotting

# Step 4
ax = sns.boxplot(data=df, x='condition', y='power')

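
The question asks for the boxes to be labelled "TOTAL", "COMPONENT = ON" and "COMPONENT = OFF". One possible way to get those labels, sketched here as a variation on the steps above rather than as part of the original answer, is to write them straight into the condition column and pass the desired left-to-right sequence through the order parameter of sns.boxplot:

```python
import numpy as np
import pandas as pd
import seaborn as sns

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df1 = pd.DataFrame({"power": power, "component_status": component_status})

# One copy of the data labelled TOTAL, one copy labelled by component state.
total = df1.assign(condition="TOTAL")
by_status = df1.assign(
    condition=np.where(
        df1["component_status"] == 153, "COMPONENT = ON", "COMPONENT = OFF"
    )
)
df = pd.concat([total, by_status], ignore_index=True)

# order fixes the left-to-right position of the boxes on the x axis.
order = ["TOTAL", "COMPONENT = ON", "COMPONENT = OFF"]
ax = sns.boxplot(data=df, x="condition", y="power", order=order)
```

Without order, seaborn places the categories in the order they first appear in the data. In a script, calling matplotlib.pyplot.show() afterwards displays the figure.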

Answered By: a11