Label specific points in seaborn based on x-values


I have a dataframe where individuals have some scores. The idea is to highlight the reference individual (check) in red and the individuals with a lower score in green. Following a similar problem on StackOverflow (Adding labels in x y scatter plot with seaborn), I was able to highlight the check in red. However, I failed to highlight in green the two individuals (id_11, id_17) with a lower score. I got the error
"ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
Please, find below my code. Thank you in advance for your help.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame(
    {'Individual Name': ['id_1', 'check', 'id_3', 'id_4', 'id_5', 'id_6', 'id_7', 'id_8', 'id_9', 'id_10', 'id_11', 'id_12', 'id_13', 'id_14', 'id_15', 'id_16', 'id_17', 'id_18', 'id_19', 'id_20', 'id_21', 'id_22', 'id_23', 'id_24', 'id_25', 'id_26', 'id_27', 'id_28', 'id_29', 'id_30'],
     'feature': [0.508723818, 0.438733637, 0.718100026, 0.506722786, 0.520924985, 0.69302915, 0.659499198, 0.547989555, 0.714309067, 0.617602669, 0.35364303, 0.534064345, 0.59011931, 0.488031738, 0.511025466, 0.655582175, 0.32029745, 0.594929278, 0.562511802, 0.571763799, 0.681324482, 0.40444921, 0.628999099, 0.497668065, 0.690914914, 0.530561335, 0.798924312, 0.671025127, 0.71243462, 0.539980784],
     'score': [91.5, 89.75, 94.25, 91.75, 91.75, 93.5, 93.25, 92.25, 94.0, 93.0, 89.25, 92.0, 92.5, 91.5, 91.5, 93.5, 88.5, 92.25, 92.0, 93.25, 93.25, 90.25, 92.75, 90.75, 94.0, 92.0, 95.75, 93.75, 94.5, 92.0]})

fig, ax = plt.subplots()
sns.scatterplot(data=df, x='score', y='feature')
plt.text(x=df['score'][df['Individual Name'] == 'check'], y=df['feature'][df['Individual Name'] == 'check'], s='check', color='red')
score_of_check = df['score'][
    df['Individual Name'] == 'check']  # reference value for highlighting individuals that have a lower score
# label points if score is lower than score_of_check
for x in df['score']:
    if x < score_of_check:
        print(x)  # Even print generates the error
        plt.text(x=df['score'], y=df['feature'], s=df['Individual Name'],
                 color='green')  # Ultimately I would like to label the 2 materials, id_11 and id_17, in green
Asked By: Amilovsky



Further to JohanC’s comment, here is some code that makes it work. The key is to loop over the row positions of your dataframe (range(len(df))) rather than over the score values. The if failed because score_of_check is a one-element Series, not a number: comparing x with it produces a Series of booleans, and a Series cannot be reduced to a single True/False, hence the ValueError. Extract the single value (for example with .values[0]) before comparing. You also need to use the index to pass single-element coordinates and labels to plt.text; otherwise you pass the entire columns on every call.

for ind in range(len(df)):
    # score_of_check is a one-element Series; .values[0] extracts the number
    if df['score'][ind] < score_of_check.values[0]:
        print(ind)  # row position of an individual scoring below the check
        # pass single values, not whole columns, to plt.text
        plt.text(x=df['score'][ind], y=df['feature'][ind], s=df['Individual Name'][ind],
                 color='green')  # labels id_11 and id_17 in green
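
As an alternative to the positional loop, the same idea can be sketched with a boolean mask. This is only an illustrative variant, not part of the original answer: it uses a six-row subset of the question's data for brevity, the names is_check, lower and labelled are made up for this example, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Six rows taken from the question's dataframe (subset chosen for brevity)
df = pd.DataFrame({
    'Individual Name': ['id_1', 'check', 'id_3', 'id_11', 'id_17', 'id_22'],
    'feature': [0.508723818, 0.438733637, 0.718100026,
                0.35364303, 0.32029745, 0.40444921],
    'score': [91.5, 89.75, 94.25, 89.25, 88.5, 90.25],
})

# Boolean mask selecting the reference row
is_check = df['Individual Name'] == 'check'

# .iloc[0] turns the one-element Series into a plain number,
# so the comparison below yields one boolean per row
score_of_check = df.loc[is_check, 'score'].iloc[0]

# Rows whose score is strictly lower than the reference score
lower = df[df['score'] < score_of_check]

fig, ax = plt.subplots()
sns.scatterplot(data=df, x='score', y='feature', ax=ax)

# Label the reference individual in red
check_row = df[is_check].iloc[0]
ax.text(check_row['score'], check_row['feature'], 'check', color='red')

# Label each lower-scoring individual in green, one point per call
for x, y, name in zip(lower['score'], lower['feature'], lower['Individual Name']):
    ax.text(x, y, name, color='green')

labelled = list(lower['Individual Name'])
print(labelled)
```

Filtering first keeps the loop limited to the rows that need a label, and every ax.text call receives scalars, which avoids the ambiguous truth value error.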
Answered By: David A