How can I highlight cells with categorical variables?

Question

I have a pandas dataframe called value_matrix_classification which looks as follows:

{('wind_on_share',
  'Wind-onshore power generation'): {('AIM/CGE 2.0',
   'ADVANCE_2020_WB2C'): 'high', ('AIM/CGE 2.0',
   'ADVANCE_2030_Price1.5C'): 'high', ('AIM/CGE 2.0',
   'ADVANCE_2030_WB2C'): 'high', ('IMAGE 3.0.1',
   'ADVANCE_2020_WB2C'): 'low', ('IMAGE 3.0.1',
   'ADVANCE_2030_WB2C'): 'low', ('MESSAGE-GLOBIOM 1.0',
   'ADVANCE_2020_WB2C'): 'low'},
 ('wind_off_share',
  'Wind-offshore power generation'): {('AIM/CGE 2.0',
   'ADVANCE_2020_WB2C'): nan, ('AIM/CGE 2.0',
   'ADVANCE_2030_Price1.5C'): nan, ('AIM/CGE 2.0',
   'ADVANCE_2030_WB2C'): nan, ('IMAGE 3.0.1',
   'ADVANCE_2020_WB2C'): 'low', ('IMAGE 3.0.1',
   'ADVANCE_2030_WB2C'): 'low', ('MESSAGE-GLOBIOM 1.0',
   'ADVANCE_2020_WB2C'): 'low'}}

The two columns on the right contain low, medium and high, which are categorical variables. I created them using pd.cut(value_matrix_classification, bins = 3, labels = ["low", "medium", "high"])

I’d like to style the dataframe so that high, medium, low and NaN values get a red, orange, yellow and gray background color respectively.

I wrote the following function

def highlight_cells(x):
    if x == "high":
        color = "red"
    elif x=="medium":
        color = "orange"
    elif x=="low":
        color = "yellow"
    else:
        color = "gray"
    
    return [f"background-color: {color}"]

and applied it to the dataframe

value_matrix_classification.style.apply(highlight_cells)

However, this gives ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). What would be the appropriate way to do the highlighting here?

I was able to highlight the cells with null values only using

value_matrix_classification.style.highlight_null(null_color = "gray")

I am attaching the screenshot here just for the convenience of the reader.

How can I highlight all the cells based on the given categories: low, medium and high?

Asked By: hbstha123

Answer 1

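Styler.apply passes an entire row or column (a Series) to your function, so x == "high" produces a Series of booleans rather than a single boolean. Use Styler.applymap instead, which calls the function once per cell. In pandas 2.1 and later this method is named Styler.map, and applymap is deprecated.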

See this Pandas documentation section.

Edit: you’ll also want highlight_cells to return just f"background-color: {color}", not wrapped in a list.
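
As a minimal sketch of the element-wise approach, the example below uses a small hypothetical frame (demo, standing in for value_matrix_classification) and a version of highlight_cells that returns a plain string. The preview step only uses DataFrame.apply and Series.map so the per-cell styles can be inspected without rendering; the Styler step is guarded because DataFrame.style needs the optional jinja2 dependency.

```python
import numpy as np
import pandas as pd


def highlight_cells(value):
    """Return the CSS for a single cell (not wrapped in a list)."""
    if value == "high":
        color = "red"
    elif value == "medium":
        color = "orange"
    elif value == "low":
        color = "yellow"
    else:
        color = "gray"  # NaN and any other value
    return f"background-color: {color}"


# Hypothetical stand-in for value_matrix_classification.
demo = pd.DataFrame(
    {"wind_on": ["high", "low", "medium"], "wind_off": [np.nan, "low", "low"]}
)

# Preview the CSS each cell would receive, without rendering anything.
preview = demo.apply(lambda column: column.map(highlight_cells))

try:
    styler = demo.style  # raises ImportError if jinja2 is not installed
    # Prefer Styler.map (pandas >= 2.1); fall back to Styler.applymap otherwise.
    per_cell = getattr(styler, "map", None) or styler.applymap
    styled = per_cell(highlight_cells)
except ImportError:
    styled = None
```

In a notebook, displaying styled shows the coloured table; preview is only there to make the mapping visible as plain data.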

Answered By: Angus L'Herrou

Answer 2

To add more detail, suppose you have

import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.randn(4,2), columns=list('AB'))

>>> df

          A         B
0  1.764052  0.400157
1  0.978738  2.240893
2  1.867558 -0.977278
3  0.950088 -0.151357

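To understand what is happening, compare a column against a value, here 0.2. The result is a column of booleans, not a single boolean. The error appears whenever such a Series is used where Python expects one truth value: with and, or, not, if, or while. When you have multiple criteria, each comparison returns its own boolean column.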

>>> df.B > 0.2

0     True
1     True
2    False
3    False
Name: B, dtype: bool

Now let's use that comparison in an if statement:

if df.B > 0.2:
    print("do something")

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

The condition above is equivalent to the one below, and it is not clear what the result should be. Should it be True because the Series is not zero-length? False because it contains False values? Because this is ambiguous, pandas raises a ValueError:

if pd.Series([True, True, False, False]):
    print("do something")

So we need to get those multiple values into a single bool value, depending on what we want to do.

if pd.Series([True, True, False, False]).any():  # evaluates to True
    print("I checked if there was any True value in the Series!")

I checked if there was any True value in the Series!

if pd.Series([True, True, False, False]).all():  # evaluates to False
    print("I checked if there were all True values in the Series!")
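
To tie this back to the question, here is a small sketch of what happens inside highlight_cells when Styler.apply hands it a whole column. The column below is a hypothetical stand-in for one column of value_matrix_classification; the try/except only captures the error message so the script keeps running.

```python
import pandas as pd

# Hypothetical column, similar to what Styler.apply passes to the function.
column = pd.Series(["high", "low", None], name="wind_on")

# Comparing a Series to a scalar yields one boolean per row.
mask = column == "high"

try:
    if mask:  # asks for a single truth value, which a Series cannot give
        pass
except ValueError as error:
    message = str(error)

# Reducing the mask to one boolean removes the ambiguity.
any_high = bool(mask.any())
all_high = bool(mask.all())
```

For styling, though, reducing to a single boolean is not what is wanted: each cell needs its own colour, which is why an element-wise or column-wise mapping fits better than any() or all().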

Answered By: Priya

Answer 3

A more common approach to this type of problem is to use Series.map and fillna to build a Series of styles for each column:

def highlight_cells(x):
    return 'background-color: ' + x.map(
        # Associate Values to a given colour code
        {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
    ).fillna('gray')  # Fill unmapped values with default


value_matrix_classification.style.apply(highlight_cells)

Each column is mapped to a new set of colour codes.

The steps below show how the styles are built, using only the second column for illustration; Styler.apply calls the function on every column in the subset:

value_matrix_classification.iloc[:, 1].map(
    {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
)

AIM/CGE 2.0          ADVANCE_2020_WB2C            NaN
                     ADVANCE_2030_Price1.5C       NaN
                     ADVANCE_2030_WB2C            NaN
IMAGE 3.0.1          ADVANCE_2020_WB2C         yellow
                     ADVANCE_2030_WB2C         yellow
MESSAGE-GLOBIOM 1.0  ADVANCE_2020_WB2C         yellow
Name: (wind_off_share, Wind-offshore power generation), dtype: object

Then fillna is used to replace any unmapped values with a default. Note that this is not a NaN representation, but rather the default for any value that does not appear in the mapping dictionary:

value_matrix_classification.iloc[:, 1].map(
    {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
).fillna('gray')

AIM/CGE 2.0          ADVANCE_2020_WB2C           gray  # NaN replaced with gray
                     ADVANCE_2030_Price1.5C      gray
                     ADVANCE_2030_WB2C           gray
IMAGE 3.0.1          ADVANCE_2020_WB2C         yellow
                     ADVANCE_2030_WB2C         yellow
MESSAGE-GLOBIOM 1.0  ADVANCE_2020_WB2C         yellow
Name: (wind_off_share, Wind-offshore power generation), dtype: object

Lastly, add the CSS property:

'background-color: ' + value_matrix_classification.iloc[:, 1].map(
    {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
).fillna('gray')

AIM/CGE 2.0          ADVANCE_2020_WB2C           background-color: gray  # valid css style
                     ADVANCE_2030_Price1.5C      background-color: gray
                     ADVANCE_2030_WB2C           background-color: gray
IMAGE 3.0.1          ADVANCE_2020_WB2C         background-color: yellow
                     ADVANCE_2030_WB2C         background-color: yellow
MESSAGE-GLOBIOM 1.0  ADVANCE_2020_WB2C         background-color: yellow
Name: (wind_off_share, Wind-offshore power generation), dtype: object
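
Putting the three steps together, a self-contained sketch might look like the following. The frame demo and the name COLOURS are hypothetical stand-ins, not the asker's data; DataFrame.apply is used to mimic the column-by-column calls that Styler.apply makes by default, and the Styler call itself is guarded because DataFrame.style needs the optional jinja2 dependency.

```python
import numpy as np
import pandas as pd

# Mapping from category to colour; anything else falls back to gray.
COLOURS = {"high": "red", "medium": "orange", "low": "yellow"}


def highlight_cells(column):
    """Receive one column (a Series) and return a Series of CSS strings."""
    return "background-color: " + column.map(COLOURS).fillna("gray")


# Hypothetical stand-in for value_matrix_classification.
demo = pd.DataFrame(
    {
        "wind_on_share": ["high", "high", "low"],
        "wind_off_share": [np.nan, "low", "low"],
    },
    index=["AIM/CGE 2.0", "IMAGE 3.0.1", "MESSAGE-GLOBIOM 1.0"],
)

# Same shape as demo, with one CSS string per cell.
css = demo.apply(highlight_cells)

try:
    styled = demo.style.apply(highlight_cells)  # column-wise by default
except ImportError:
    styled = None  # jinja2 is not installed
```

Because the function returns a Series with the same index as its input, Styler.apply can line the styles up with the cells, which is what the list-returning version in the question could not do.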

Answered By: Henry Ecker
