How to calculate percentage while matching columns on video_id?

Question:

I am trying to calculate the percentage of users who viewed a specific video versus those who did not. I have managed to calculate the total number of videos and the total number of videos viewed by each group.

However, when I try to calculate the percentages, it does not work.
I believe I need to match the story IDs, since the columns no longer line up after the calculation. How do I do that?

This is my formula to calculate percentages:

pd.DataFrame(df.status.eq(3).astype(int).groupby(df.story_id).sum() / df['story_id'].value_counts())

However, the results do not make sense; I believe the story_id values did not match up during the calculation.

Asked By: Sophie Martusewicz

||

Answers:

A percentage is the sum divided by the count, which is the same as the mean, so the solution can be simplified:

print (df)
   story_id  status
0         1       3
1         1       5
2         1       3
3         2       3
4         2       3
5         4       5
6         4       3
7         5       7


df1 = df.status.eq(3).groupby(df.story_id).mean().reset_index(name='perc')
print (df1)
   story_id      perc
0         1  0.666667
1         2  1.000000
2         4  0.500000
3         5  0.000000
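
As a self-contained sketch of why the mean works here, the sample data above can be rebuilt and both approaches compared. The names `viewed`, `by_sum_count`, `by_mean` and `perc` are illustrative and not from the question, and treating status 3 as "viewed" is an assumption carried over from the formula in the question. Division between two Series aligns on the index labels (here story_id), so the sum / count version and the mean version should agree:

```python
import pandas as pd

# Sample data from the answer above.
df = pd.DataFrame({
    "story_id": [1, 1, 1, 2, 2, 4, 4, 5],
    "status":   [3, 5, 3, 3, 3, 5, 3, 7],
})

# Boolean mask: True where the row has status 3 (assumed to mean "viewed").
viewed = df["status"].eq(3)

# Approach from the question: sum of matches divided by the row count per story_id.
# Series division aligns on index labels, so each story_id is divided by its own count.
by_sum_count = viewed.groupby(df["story_id"]).sum() / df["story_id"].value_counts()

# Simplified approach: the mean of a boolean mask is the fraction of True values.
by_mean = viewed.groupby(df["story_id"]).mean()

# Optional: express the fraction as a percentage in a tidy two-column frame.
perc = by_mean.mul(100).round(2).reset_index(name="perc")
print(perc)
```

For the sample data, both approaches give the same fractions per story_id, and `perc` holds 66.67, 100.0, 50.0 and 0.0 for story_id 1, 2, 4 and 5.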
Answered By: jezrael