Filter DataFrame by regex and match condition

Question

I have the following DataFrame:

Periodicity	Answer
M	Yes
M	YeS
Y	yeS
M	No
Y	NO
M	nO

I need to filter the DataFrame to get the rows that have a Monthly (M) periodicity and have a positive (YES, YEs, Yes, YeS, and so on) answer.

I have tried filter it with the following code:

import pandas as pd
import re

data = {'Periodicity': ['M', 'Y', 'M', 'M', 'M', 'Y', 'M', 'M'],
        'Answer': ['YES', 'Yes', 'YEs', 'NO', 'no', 'No', 'yeS', 'yeS']}
  
df = pd.DataFrame(data)

pat=r'^[Yy].*'

df_filter=df[df.Answer.str.contains(pat)]

But I dont know how to add another condition to filter the DataFrame to match the desired Periodicity. Everytime I add another filter condition, I get the following error message:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Asked By: Aragorn64

||

Source

Answer 1

Try this:

df_filter=df[df.Answer.str.contains(pat) & df.Periodicity.str.contains('M')]

Answered By: vitali_droujko

Answer 2

You can do this without regex (by using string lower method and dataframe filtering :

import pandas as pd
import re

data = {'Periodicity': ['M', 'Y', 'M', 'M', 'M', 'Y', 'M', 'M'],
        'Answer': ['YES', 'Yes', 'YEs', 'NO', 'no', 'No', 'yeS', 'yeS']}
  
df = pd.DataFrame(data)

df = df[(df['Answer'].str.lower() == 'yes') & (df['Periodicity'] == 'M')] # do this

print(df)

output:

  Periodicity Answer
0           M    YES
2           M    YEs
6           M    yeS
7           M    yeS

Answered By: mrCopiCat

Filter DataFrame by regex and match condition

Question:

Answers: