Trying to write a function that searches for a few different values in each row and outputs a match count as a new column

Question

Suppose we have a DataFrame

import pandas as pd

df = pd.DataFrame({'A':['cam1','cam2','cam1','cam4'],'B':['cam2', 'cam1', 'cam4', 'cam3'],'C':['cam3','cam4', 'cam5','cam2']})

	A	B	C
0	cam1	cam2	cam3
1	cam2	cam1	cam4
2	cam1	cam4	cam5
3	cam4	cam3	cam2

I’d like to add a column that counts the number of times ‘cam1’ or ‘cam2’ appears in each row.

The desired output would look like this:

	A	B	C	Count
0	cam1	cam2	cam3	2
1	cam2	cam1	cam4	2
2	cam1	cam4	cam5	1
3	cam4	cam3	cam2	1

Is there a way to do this without using a million if else statements?

Asked By: pythonTyler

Answer 1

You can use DataFrame.aggregate() with two lambda functions to achieve what you’re describing:

df["Count"] = df.aggregate(lambda x: len(list(filter(lambda y: y in ["cam1", "cam2"], x.values))), axis="columns")

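As an alternative to the row-wise lambda, the same count can be sketched with pandas' vectorized DataFrame.isin() followed by a row-wise sum. This is a different technique from the aggregate() call above, not a restatement of it; the names targets and cam_cols are chosen here for illustration, and restricting the check to columns A, B and C is an assumption so that a previously added Count column is not scanned.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["cam1", "cam2", "cam1", "cam4"],
    "B": ["cam2", "cam1", "cam4", "cam3"],
    "C": ["cam3", "cam4", "cam5", "cam2"],
})

targets = ["cam1", "cam2"]   # values to count (illustrative name)
cam_cols = ["A", "B", "C"]   # columns to search (assumption: skip any derived columns)

# isin() returns a boolean frame; summing along axis=1 counts the True cells in each row.
df["Count"] = df[cam_cols].isin(targets).sum(axis=1)

print(df)
```

On larger frames this tends to be faster than calling a Python lambda once per row, because the membership test runs column by column inside pandas.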

Per the comment: to "weight" specific values individually, change the approach slightly so that each value is mapped to a number and the resulting numbers are summed:

df["Count"] = df.aggregate(lambda x: sum(map(lambda y: 2 if y == "cam1" else (1 if y == "cam2" else 0), x.values)), axis="columns")

If there are more than just a few values to map to weights, consider doing this more elegantly with a dict:

weights = {
  "cam1": 2,
  "cam2": 1
}

df["Count"] = df.aggregate(lambda x: sum(weights.get(y, 0) for y in x.values), axis="columns")

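The dict-based weighting can also be sketched without a per-row lambda by mapping each column through the dict with Series.map() and summing across the row. This is offered as one possible variant under stated assumptions, not as the answer's own method: cam_cols is an illustrative name, and values missing from weights are assumed to count as 0.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["cam1", "cam2", "cam1", "cam4"],
    "B": ["cam2", "cam1", "cam4", "cam3"],
    "C": ["cam3", "cam4", "cam5", "cam2"],
})

weights = {"cam1": 2, "cam2": 1}
cam_cols = ["A", "B", "C"]   # columns to search (illustrative name)

# Series.map() with a dict yields NaN for values that are not keys of the dict.
weighted = df[cam_cols].apply(lambda col: col.map(weights))

# Treat unmatched values as weight 0, add up each row, and cast back to integers.
df["Count"] = weighted.fillna(0).sum(axis=1).astype(int)

print(df)
```

With the sample frame, a row containing both cam1 and cam2 scores 3, a row with only cam1 scores 2, and a row with only cam2 scores 1.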

Answered By: esqew
