df.loc column where name contains "x" and count rows where rows contains "y"
Question:
I have a pandas df and want to loc columns containing string "X" and then sum for every row where rows string contains "Y".
Desired results
ABC
ABC
ABX
ABX
Result
A
A
YA
A
1
A
A
YA
YA
2
I’ve tried beginning below, ,but already here and error occurs.
df[‘result’] = np.where(df.loc[:~df.columns.str.contains(‘X’)]
Error:
TypeError: cannot do slice indexing on DatetimeIndex with these indexers
Anyone know how to go about?
Answers:
You can use filter
to select the columns and stack
+str.contains
+groupby.sum
to count the values:
df['Result'] = (df.filter(like='X')
.stack()
.str.contains('Y')
.groupby(level=0).sum()
)
Or with apply
:
df['Result'] = (df.filter(like='X')
.apply(lambda s: s.str.contains('Y'))
.sum(axis=1)
)
output:
ABC ABC ABX ABX Result
0 A A YA A 1
1 A A YA YA 2
I have a pandas df and want to loc columns containing string "X" and then sum for every row where rows string contains "Y".
Desired results
ABC | ABC | ABX | ABX | Result |
---|---|---|---|---|
A | A | YA | A | 1 |
A | A | YA | YA | 2 |
I’ve tried beginning below, ,but already here and error occurs.
df[‘result’] = np.where(df.loc[:~df.columns.str.contains(‘X’)]
Error:
TypeError: cannot do slice indexing on DatetimeIndex with these indexers
Anyone know how to go about?
You can use filter
to select the columns and stack
+str.contains
+groupby.sum
to count the values:
df['Result'] = (df.filter(like='X')
.stack()
.str.contains('Y')
.groupby(level=0).sum()
)
Or with apply
:
df['Result'] = (df.filter(like='X')
.apply(lambda s: s.str.contains('Y'))
.sum(axis=1)
)
output:
ABC ABC ABX ABX Result
0 A A YA A 1
1 A A YA YA 2