How to count the number of words said by someone in a pandas dataframe

Question

I have a dataframe like this, and I’m trying to count how often a given word is said by each author.

Author              Text                   Date
Jake                hey hey my names Jake  1.04.1997
Mac                 hey my names Mac       1.02.2019
Sarah               heymy names Sarah      5.07.2001

I’ve been trying to set it up so that if I were to search for the word "hey" it would produce:

Author              Count
Jake                2
Mac                 1

Asked By: Matt

Answer 1

If df is your original dataframe:

import pandas as pd

# Count occurrences of "hey" in each row's Text, then drop rows with no match
newDF = pd.DataFrame(columns=['Author','Count'])
newDF['Author'] = df['Author']
newDF['Count'] = df['Text'].str.count("hey")
newDF.drop(newDF[newDF['Count'] == 0].index, inplace=True)
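
For anyone who wants to run this end to end, here is a minimal sketch of the same per-row idea, assuming the sample data from the question is built by hand (the name per_row is only illustrative). Note that a plain "hey" pattern also matches inside "heymy", so Sarah's row is kept.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Author": ["Jake", "Mac", "Sarah"],
    "Text": ["hey hey my names Jake", "hey my names Mac", "heymy names Sarah"],
    "Date": ["1.04.1997", "1.02.2019", "5.07.2001"],
})

# One row per original row: the author plus the number of "hey" matches
per_row = df[["Author"]].copy()
per_row["Count"] = df["Text"].str.count("hey")

# Keep only rows where the pattern matched at least once
per_row = per_row[per_row["Count"] > 0]
print(per_row)
```

This counts per row, not per author, so an author with several rows would appear several times.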

Answered By: Rodrigo Guzman

Answer 2

Use Series.str.count with aggregate sum:

df1 = df['Text'].str.count('hey').groupby(df['Author']).sum().reset_index(name='Count')
print (df1)
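  Author  Count
0   Jake      2
1    Mac      1
2  Sarah      1

Note that the plain pattern 'hey' also matches inside "heymy", which is why Sarah gets a count of 1 with the question's data.

If you need to filter out rows with 0 values, add boolean indexing (with the question's data every author has at least one match, so nothing is dropped here):

s = df['Text'].str.count('hey')
df1 = s[s.gt(0)].groupby(df['Author']).sum().reset_index(name='Count')
print (df1)
  Author  Count
0   Jake      2
1    Mac      1
2  Sarah      1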
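
The groupby step only matters once an author has more than one row. A small sketch of that case, assuming a made-up extra message for Jake (the third row is hypothetical and not from the question):

```python
import pandas as pd

# Hypothetical data: Jake appears in two rows
df = pd.DataFrame({
    "Author": ["Jake", "Mac", "Jake"],
    "Text": ["hey hey my names Jake", "hey my names Mac", "hey again"],
})

# Count matches per row, then add them up per author
totals = (
    df["Text"].str.count("hey")
    .groupby(df["Author"])
    .sum()
    .reset_index(name="Count")
)
print(totals)
```

Here Jake's two rows (2 matches and 1 match) collapse into a single row with a total of 3.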

EDIT: To match hey only as a separate word, add word boundaries \b\b like:

df1 = df['Text'].str.count(r'\bhey\b').groupby(df['Author']).sum().reset_index(name='Count')
print (df1)
  Author  Count
0   Jake      2
1    Mac      1
2  Sarah      0

s = df['Text'].str.count(r'\bhey\b')
df1 = s[s.gt(0)].groupby(df['Author']).sum().reset_index(name='Count')
print (df1)
  Author  Count
0   Jake      2
1    Mac      1
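
If the search word is not fixed, the word-boundary version can be wrapped in a small helper. This is only a sketch: count_word_by_author and its ignore_case parameter are names made up for illustration, and re.escape is used on the assumption that the word should be matched literally and not treated as a regex.

```python
import re
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Author": ["Jake", "Mac", "Sarah"],
    "Text": ["hey hey my names Jake", "hey my names Mac", "heymy names Sarah"],
})

def count_word_by_author(frame, word, ignore_case=False):
    """Count whole-word occurrences of `word` per author (hypothetical helper)."""
    # \b on both sides means "heymy" is not counted as "hey"
    pattern = rf"\b{re.escape(word)}\b"
    flags = re.IGNORECASE if ignore_case else 0
    counts = frame["Text"].str.count(pattern, flags=flags)
    totals = counts.groupby(frame["Author"]).sum()
    # Drop authors who never said the word
    return totals[totals > 0].reset_index(name="Count")

result = count_word_by_author(df, "hey")
print(result)
```

Filtering after the sum (not before) gives the same result here, and passing ignore_case=True would also count "Hey" or "HEY".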

Answered By: jezrael
