Python (pandas) – Count Occurrences in Column

Question:

I have a data frame and want to add a new column that counts the number of Participants in each row. Is there a way to do this?

Data: invoice_df

Order Id,Date,Meal Id,Company Id,Date of Meal,Participants,Meal Price,Type of Meal
839FKFW2LLX4LMBB,27-05-2016,INBUX904GIHI8YBD,LJKS5NK6788CYMUU,2016-05-31 07:00:00+02:00,['David Bishop'],469,Breakfast
97OX39BGVMHODLJM,27-09-2018,J0MMOOPP709DIDIE,LJKS5NK6788CYMUU,2018-10-01 20:00:00+02:00,['David Bishop'],22,Dinner
041ORQM5OIHTIU6L,24-08-2014,E4UJLQNCI16UX5CS,LJKS5NK6788CYMUU,2014-08-23 14:00:00+02:00,['Karen Stansell'],314,Lunch
YT796QI18WNGZ7ZJ,12-04-2014,C9SDFHF7553BE247,LJKS5NK6788CYMUU,2014-04-07 21:00:00+02:00,['Addie Patino'],438,Dinner
6YLROQT27B6HRF4E,28-07-2015,48EQXS6IHYNZDDZ5,LJKS5NK6788CYMUU,2015-07-27 14:00:00+02:00,['Addie Patino' 'Susan Guerrero'],690,Lunch
AT0R4DFYYAFOC88Q,21-07-2014,W48JPR1UYWJ18NC6,LJKS5NK6788CYMUU,2014-07-17 20:00:00+02:00,['David Bishop' 'Susan Guerrero' 'Karen Stansell'],181,Dinner
2DDN2LHS7G85GKPQ,29-04-2014,1MKLAKBOE3SP7YUL,LJKS5NK6788CYMUU,2014-04-30 21:00:00+02:00,['Susan Guerrero' 'David Bishop'],14,Dinner
FM608JK1N01BPUQN,08-05-2014,E8WJZ1FOSKZD2MJN,36MFTZOYMTAJP1RK,2014-05-07 09:00:00+02:00,['Amanda Knowles' 'Cheryl Feaster' 'Ginger Hoagland' 'Michael White'],320,Breakfast
CK331XXNIBQT81QL,23-05-2015,CTZSFFKQTY7SBZ4J,36MFTZOYMTAJP1RK,2015-05-18 13:00:00+02:00,['Cheryl Feaster' 'Amanda Knowles' 'Ginger Hoagland'],697,Lunch
FESGKOQN2OZZWXY3,10-01-2016,US0NQYNNHS1SQJ4S,36MFTZOYMTAJP1RK,2016-01-14 22:00:00+01:00,['Glenn Gould' 'Amanda Knowles' 'Ginger Hoagland' 'Michael White'],451,Dinner
YITOTLOF0MWZ0VYX,03-10-2016,RGYX8772307H78ON,36MFTZOYMTAJP1RK,2016-10-01 22:00:00+02:00,['Ginger Hoagland' 'Amanda Knowles' 'Michael White'],263,Dinner
8RIGCF74GUEQHQEE,23-07-2018,5XK0KTFTD6OAP9ZP,36MFTZOYMTAJP1RK,2018-07-27 08:00:00+02:00,['Amanda Knowles'],210,Breakfast
TH60C9D8TPYS7DGG,15-12-2016,KDSMP2VJ22HNEPYF,36MFTZOYMTAJP1RK,2016-12-13 08:00:00+01:00,['Cheryl Feaster' 'Bret Adams' 'Ginger Hoagland'],755,Breakfast
W1Y086SRAVUZU1AL,17-09-2017,8IUOYVS031QPROUG,36MFTZOYMTAJP1RK,2017-09-14 13:00:00+02:00,['Bret Adams'],469,Lunch
WKB58Q8BHLOFQAB5,31-08-2016,E2K2TQUMENXSI9RP,36MFTZOYMTAJP1RK,2016-09-03 14:00:00+02:00,['Michael White' 'Ginger Hoagland' 'Bret Adams'],502,Lunch
N8DOG58MW238BHA9,25-12-2018,KFR2TAYXZSVCHAA2,36MFTZOYMTAJP1RK,2018-12-20 12:00:00+01:00,['Ginger Hoagland' 'Cheryl Feaster' 'Glenn Gould' 'Bret Adams'],829,Lunch
DPDV9UGF0SUCYTGW,25-05-2017,6YV61SH7W9ECUZP0,36MFTZOYMTAJP1RK,2017-05-24 22:00:00+02:00,['Michael White'],708,Dinner
KNF3E3QTOQ22J269,20-06-2018,737T2U7604ABDFDF,36MFTZOYMTAJP1RK,2018-06-15 07:00:00+02:00,['Glenn Gould' 'Cheryl Feaster' 'Ginger Hoagland' 'Amanda Knowles'],475,Breakfast
LEED1HY47M8BR5VL,22-10-2017,I22P10IQQD06MO45,36MFTZOYMTAJP1RK,2017-10-22 14:00:00+02:00,['Glenn Gould'],27,Lunch
LSJPNJQLDTIRNWAL,27-01-2017,247IIVNN6CXGWINB,36MFTZOYMTAJP1RK,2017-01-23 13:00:00+01:00,['Amanda Knowles' 'Bret Adams'],672,Lunch
6UX5RMHJ1GK1F9YQ,24-08-2014,LL4AOPXDM8V5KP5S,H3JRC7XX7WJAD4ZO,2014-08-27 12:00:00+02:00,['Anthony Emerson' 'Irvin Gentry' 'Melba Inlow'],552,Lunch
5SYB15QEFWD1E4Q4,09-07-2017,KZI0VRU30GLSDYHA,H3JRC7XX7WJAD4ZO,2017-07-13 08:00:00+02:00,"['Anthony Emerson' 'Emma Steitz' 'Melba Inlow' 'Irvin Gentry'
 'Kelly Killebrew']",191,Breakfast
W5S8VZ61WJONS4EE,25-03-2017,XPSPBQF1YLIG26N1,H3JRC7XX7WJAD4ZO,2017-03-25 07:00:00+01:00,['Irvin Gentry' 'Kelly Killebrew'],471,Breakfast
795SVIJKO8KS3ZEL,05-01-2015,HHTLB8M9U0TGC7Z4,H3JRC7XX7WJAD4ZO,2015-01-06 22:00:00+01:00,['Emma Steitz'],588,Dinner
8070KEFYSSPWPCD0,05-08-2014,VZ2OL0LREO8V9RKF,H3JRC7XX7WJAD4ZO,2014-08-09 12:00:00+02:00,['Lewis Eyre'],98,Lunch
RUQOHROBGBOSNUO4,10-06-2016,R3LFUK1WFDODC1YF,H3JRC7XX7WJAD4ZO,2016-06-09 08:00:00+02:00,['Anthony Emerson' 'Kelly Killebrew' 'Lewis Eyre'],516,Breakfast
6P91QRADC2O9WOVT,25-09-2016,L2F2HEGB6Q141080,H3JRC7XX7WJAD4ZO,2016-09-26 07:00:00+02:00,"['Kelly Killebrew' 'Lewis Eyre' 'Irvin Gentry' 'Emma Steitz'
 'Anthony Emerson']",664,Breakfast
Asked By: c200402

Answers:

They seem to be plain lists, so you should be able to get the desired result by applying len to each one. Consider the following simple example:

import pandas as pd
df = pd.DataFrame({'col1': [['A'],['A','B'],['A','B','C']]})
df['col1cnt'] = df.col1.apply(len)
print(df)

which gives the output

        col1  col1cnt
0        [A]        1
1     [A, B]        2
2  [A, B, C]        3
Answered By: Daweo

You can create a new column to count the number of participants in each row as follows:

invoice_df['participant_count'] = invoice_df['Participants'].apply(lambda x: len(x))

apply calls the given function on each element of the Participants column. Here the function returns the length of each list, and that value is stored in the new column.
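
As a minimal sketch of the same idea, assuming the column really holds Python lists rather than strings, Series.str.len() should give the same counts as apply(len) without needing a lambda. The three-row frame and its Order Id values below are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical miniature of invoice_df; Participants holds real Python lists.
invoice_df = pd.DataFrame({
    "Order Id": ["A1", "B2", "C3"],
    "Participants": [
        ["David Bishop"],
        ["Addie Patino", "Susan Guerrero"],
        ["David Bishop", "Susan Guerrero", "Karen Stansell"],
    ],
})

# apply(len) calls len() on every element of the column.
via_apply = invoice_df["Participants"].apply(len)

# .str.len() also works on list-like elements, not only on strings.
via_str = invoice_df["Participants"].str.len()

invoice_df["participant_count"] = via_apply
print(invoice_df)
```

Both approaches count list elements only when the cells are lists; on a text column they return the number of characters.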

Answered By: Mats

Yes, you can do that. If the Participants column contains Python lists, you can calculate the length of the list in each row:

df['num_participants'] = df['Participants'].apply(lambda x: len(x))

However, if you read the CSV directly, the column will be text, and len will return the number of characters instead of the list length.
Because your data does not have commas separating the values in the list, you can count the apostrophes and divide by 2:

df['num_participants'] = df['Participants'].apply(lambda x: x.count("'") // 2)
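
The text-column case can be sketched end to end as follows. This is only an illustration: the two-row CSV is a cut-down excerpt in the shape of the question's data, and the regular expression assumes that no participant name contains an apostrophe. It extracts the quoted names instead of counting quote characters, which also leaves the names available as a list.

```python
import io
import pandas as pd

# Hypothetical two-row excerpt in the same shape as the question's CSV.
csv_text = """Order Id,Participants,Meal Price
839FKFW2LLX4LMBB,['David Bishop'],469
5SYB15QEFWD1E4Q4,"['Anthony Emerson' 'Emma Steitz' 'Melba Inlow' 'Irvin Gentry'
 'Kelly Killebrew']",191
"""

df = pd.read_csv(io.StringIO(csv_text))

# The column is plain text here, so its length is a character count.
char_counts = df["Participants"].str.len()

# Pull out every single-quoted name; assumes no name contains an apostrophe.
names = df["Participants"].str.findall(r"'([^']*)'")

# names holds one list per row, so .str.len() now counts participants.
df["num_participants"] = names.str.len()
print(df[["Order Id", "num_participants"]])
```

If a name such as O'Brien could appear, both this pattern and the apostrophe count would need adjusting.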
Answered By: pieterbons