How to filter a pandas DataFrame by multiple columns
Question:
I would like to get the values from column n
where the values in a subset of the other columns are True.
For example, given this data frame:
import pandas as pd

t, f = True, False
data = [
[t, f, f, '1'],
[f, f, f, '2'],
[f, t, f, '3'],
[f, f, t, '4']
]
df = pd.DataFrame(data, columns=list("abcn"))
df as a table:
a b c n
0 True False False 1
1 False False False 2
2 False True False 3
3 False False True 4
The columns to search are a
and b
, and I wish to get the records from n
where these columns are True
. What I tried:
fcols = ("a", "b")
df[df[[*fcols]] == t].dropna(axis=0, how='all')
This gives me the right records, but with NaN
in column n
:
a b c n
0 True NaN NaN NaN
2 NaN True NaN NaN
I feel that I'm more or less close to the solution, but …
Answers:
Use any
to aggregate the booleans for your boolean indexing:
fcols = ("a", "b")
out = df[df[[*fcols]].eq(t).any(axis=1)]#.dropna(axis=0, how='all') # dropna not needed
Output:
a b c n
0 True False False 1
2 False True False 3
Intermediate indexing Series:
df[[*fcols]].eq(t).any(axis=1)
0 True
1 False
2 True
3 False
dtype: bool
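Since the question asks for the values of column n specifically, the same boolean mask can be passed to .loc to select just that column in one step. A small self-contained sketch of this idea:

```python
import pandas as pd

t, f = True, False
data = [
    [t, f, f, '1'],
    [f, f, f, '2'],
    [f, t, f, '3'],
    [f, f, t, '4'],
]
df = pd.DataFrame(data, columns=list("abcn"))

fcols = ("a", "b")
# Boolean mask: True for rows where any of the search columns is True
mask = df[[*fcols]].any(axis=1)
# Select only column n for the matching rows
n_values = df.loc[mask, 'n'].tolist()
print(n_values)  # ['1', '3']
```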
Use DataFrame.any
to test for at least one True
per row; the resulting boolean Series is then passed to boolean indexing
:
fcols = ("a", "b")
df = df[df[[*fcols]].eq(t).any(axis=1)]
# if the columns are already boolean, the comparison with True can be omitted
df = df[df[[*fcols]].any(axis=1)]
print (df)
a b c n
0 True False False 1
2 False True False 3
Details:
print (df[[*fcols]].eq(t).any(axis=1))
0 True
1 False
2 True
3 False
dtype: bool
I solved it this way:
df = df[df['a'] | df['b']]
In [5]: df
Out[5]:
a b c n
0 True False False 1
2 False True False 3
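The explicit df['a'] | df['b'] form above can be generalized to any number of columns by reducing with the OR operator, which is equivalent to the any(axis=1) solutions. A sketch, assuming the same example frame:

```python
import operator
from functools import reduce

import pandas as pd

t, f = True, False
data = [
    [t, f, f, '1'],
    [f, f, f, '2'],
    [f, t, f, '3'],
    [f, f, t, '4'],
]
df = pd.DataFrame(data, columns=list("abcn"))

fcols = ("a", "b")
# OR the boolean columns together, however many there are
mask = reduce(operator.or_, (df[c] for c in fcols))
out = df[mask]
print(out['n'].tolist())  # ['1', '3']
```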