How to calculate the percentage ratio for 'Star' column, grouped by another column's values?

Question:

It's a complex problem and I'm not sure how to phrase the question. Copy and paste the code below to see a sample of the data frame (the original has more than a million rows):

import pandas as pd

df = pd.DataFrame({
    'App ID': {
        0: 'apple',
        1: 'apple',
        2: 'apple',
        3: 'apple',
        4: 'banana',
        5: 'banana',
        6: 'banana',
        7: 'banana',
        8: 'banana',
        9: 'banana',
        10: 'banana',
        11: 'banana',
        12: 'apple',
        13: 'apple',
        14: 'apple',
        15: 'apple',
        16: 'mango',
        17: 'mango',
        18: 'mango',
        19: 'mango'
    },
    'Star': {
        0: 3,
        1: 5,
        2: 5,
        3: 4,
        4: 5,
        5: 1,
        6: 1,
        7: 1,
        8: 5,
        9: 1,
        10: 1,
        11: 1,
        12: 3,
        13: 5,
        14: 5,
        15: 4,
        16: 5,
        17: 1,
        18: 1,
        19: 1
    },
    'Text': {
        0: 'No thats okay ...I Found some network issue in my phone that is by your app is not working properly ...but now its working thanks for your Quick reaction !',
        1: 'its a great app evr for preparation for govt exams or very useful for the great knowledge! about our religious books and thoughts',
        2: "I just loved it ..I don't know if it helps in clarifying net exam or not but surely it is increasing my knowledge and because of this app I made myself somewhat busy  ",
        3: 'Its aap very useful for us. This aap give us smart idea prepration for exam.',
        4: "Have been using someapp for 2 + years now for MF and other investments. Pros 1. The interface is clean and easy to surf 2. The customer service is very good with short response time 3. Over the last two years, the number of funds on the platform have increased, giving more and better choices to the users Cons 1. You can't search the MF of your choice. You will have to 'discover' them.",
        5: "Bad application. Showing fake data. Already one year I invested. I am in loss. Don't download or invest. Showing fake data to attract customers. 80%,90% but actually it's minus 20% result. Waste of time and money.",
        6: 'Lags Too Much , App Freeze on Option Data Screen , when i open F&O Section . whenever i try to open Nifty50 Option Chain ,app starts Lagging & app forced close , Too Much App Freeze Problem, Buggy app , Lags too much , my device :- Oneplus 7 Pro 256 GB / 8 GB Ram',
        7: 'In other categories section, it shows nothing! Just goes blank!! Tried almost 20 times from last 2 days!! After a year problem is the same!!',
        8: "Have been using someapp for 2 + years now for MF and other investments. Pros 1. The interface is clean and easy to surf 2. The customer service is very good with short response time 3. Over the last two years, the number of funds on the platform have increased, giving more and better choices to the users Cons 1. You can't search the MF of your choice. You will have to 'discover' them.",
        9: "Bad application. Showing fake data. Already one year I invested. I am in loss. Don't download or invest. Showing fake data to attract customers. 80%,90% but actually it's minus 20% result. Waste of time and money.",
        10: 'Lags Too Much , App Freeze on Option Data Screen , when i open F&O Section . whenever i try to open Nifty50 Option Chain ,app starts Lagging & app forced close , Too Much App Freeze Problem, Buggy app , Lags too much , my device :- Oneplus 7 Pro 256 GB / 8 GB Ram',
        11: 'In other categories section, it shows nothing! Just goes blank!! Tried almost 20 times from last 2 days!! After a year problem is the same!!',
        12: 'No thats okay ...I Found some network issue in my phone that is by your app is not working properly ...but now its working thanks for your Quick reaction !',
        13: 'its a great app evr for preparation for govt exams or very useful for the great knowledge! about our religious books and thoughts',
        14: "I just loved it ..I don't know if it helps in clarifying net exam or not but surely it is increasing my knowledge and because of this app I made myself somewhat busy  ",
        15: 'Its aap very useful for us. This aap give us smart idea prepration for exam.',
        16: 'Amazing app. I have never faced any kind of issue while doing every transactions. It is hassle-free and smooth to use. Very nice ui design as well. No issue while navigating between multiple services. Truly amazing app and the online services offered are very useful and easy navigation between each operation. Totally hassle-free.You added some new features in every updates.',
        17: "My words may sound harsh, but yes this might be the worst responding app I have ever used. It takes way too long to load. After clicking on some options, it just keeps on loading and shows nothing. I never understood why is this happening?! Not only in my phone, I've checked in some other phones also. No difference! There are several other bugs. Development team have a LOT of works to do.",
        18: "Lost all my money paying bills here. They provided big offers, but don't fall for it. It's a scam! Paid electricity bill 3 days back, deducted the amount from bank. But it's still showing pending. There is no way we can talk to a customer service for payment purpose. There is no option to select the pending transaction when raising a dispute for payments. So I'm totally stuck and messed up. Lost almost 20% of my monthly income there. Can't sleep at night. I'm unable to contact anyone. Worst app",
        19: 'After entering OTP, the app now says "Multiple IDs found" and does not allow me to move further. In previous versions it used to show two accounts and I could choose from the list. Now I am unable to use this app at all because of this issue. It is not my fault that I have two IDs. I did not create these! Please provide the option to choose an Account ID after entering otp, just like I could earlier. Alternatively, if both the IDs can be merged, then help me with that atleast.'
    },
    'DeveloperReply': {
        0: 'Plz send issue email me we solve this problem',
        1: 'धन्यवाद',
        2: 'Dear User, thank you very much for sharing your valuable & useful feedback. If you have other feedback or suggestions, please write to us at hnds.net. We would love to hear from you!',
        3: '',
        4: 'nDear Nikita! Thank you for the positive rating. We’re really stoked about this. Thank you for investing with us.nnnnnn',
        5: 'nHey! Sorry to hear your experience was less than 5-stars. If you’re open to discussing your experience further, please write to us at [email protected]',
        6: 'nHey! Sorry to hear your experience was less than 5-stars. If you’re open to discussing your experience further, please share the screen recording of the issue at [email protected].',
        7: 'nDear Intzar Hussain Sihol! This is certainly not the experience we’d want any of our customers to have. Please fill this form and we will get in touch: htbit.ly/3ixhv9ennnnnnnnn',
        8: 'nDear Nikita! Thank you for the positive rating. We’re really stoked about this. Thank you for investing with us.nnnnnn',
        9: 'nHey! Sorry to hear your experience was less than 5-stars. If you’re open to discussing your experience further, please write to us at [email protected]',
        10: 'nHey! Sorry to hear your experience was less than 5-stars. If you’re open to discussing your experience further, please share the screen recording of the issue at [email protected].',
        11: 'nDear Intzar Hussain Sihol! This is certainly not the experience we’d want any of our customers to have. Please fill this form and we will get in touch: htpsbit.ly/3ixhv9ennnnnnnnn',
        12: 'Plz send issue email me we solve this problem',
        13: 'धन्यवाद',
        14: 'Dear User, thank you very much for sharing your valuable & useful feedback. If you have other feedback or suggestions, please write to us at hnds.net. We would love to hear from you!',
        15: '',
        16: 'Thank you so much for this 5-star review. We appreciate you being a customer and helping to share the word about us. We’re here for you anytime - Rashmi',
        17: 'Hi Anmol, we apologize for the inconvenience caused. We\'ve made a note of your feedback and fixed the same in our latest app version "7.1.1”. Please update your app & do revise your rating in case your issue is addressed. Thank you - Sagar',
        18: 'Hi Dilshad, Sorry for the inconvenience caused. However, we need more details to address your concern we suggest to visit our website www.somecompany.in -> Click on "Contact us" -> Click on "Support" -> Click on "Email us" and share the required details for a quick resolution. Mention the ref ID EX092221-07 for us to track it. Thank you - Akshay',
        19: 'Hi Jyothi, Sorry for the inconvenience caused. However, we need more details to address your concern we suggest to visit our website www.somecompany.in -> Click on "Contact us" -> Click on "Support" -> Click on "Email us" and share the required details for a quick resolution. Mention the ref ID EX102921-40 for us to track it. Thank you - Gurpreet'
    }
})

For each App ID, grouped by Star (rating: 0-5), I need the percentage of reviews that have been replied to by the developer.

Output: (something like below)

Star       Percentage
0          10%
1          40%
2          30%
3          63.66%
4          23.1%
5          87.7%
Asked By: Mystic


Answers:

Using groupby:

df_new=((df.groupby('Star').size()/df['Star'].count())*100).reset_index()
df_new.columns =['Star', 'per']

Output:

   Star   per
0     1  45.0
1     3  10.0
2     4  10.0
3     5  35.0
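
Note that this gives each star value's share of all reviews rather than the share of reviews with a developer reply. As a rough equivalent, assuming only a 'Star' column and a made-up sample frame, value_counts(normalize=True) yields the same kind of table:

```python
import pandas as pd

# Small made-up sample; only the 'Star' column matters here
sample = pd.DataFrame({'Star': [1, 1, 3, 5]})

# Share of each star value among all rows, in percent
share = (
    sample['Star']
    .value_counts(normalize=True)  # fractions instead of raw counts
    .sort_index()                  # order by star value
    .mul(100)
    .rename_axis('Star')
    .reset_index(name='per')
)
print(share)
```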
Answered By: Bhargav

You can use the groupby and aggregation function of pandas:

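# Count all reviews and the non-empty developer replies for each star rating
out = df.groupby("Star").agg({"Text": "count", "DeveloperReply": lambda x: x.ne('').sum()})
# Share of reviews with a reply, in percent
out["Percentage"] = out["DeveloperReply"] / out["Text"] * 100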
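
The snippet above groups by 'Star' only. If the reply rate is needed per app as well, one possible sketch is to take the mean of a boolean flag per ('App ID', 'Star') group. The sample frame below is made up for illustration, and an empty string is assumed to mean "no reply":

```python
import pandas as pd

# Tiny hypothetical sample; an empty string means "no developer reply"
sample = pd.DataFrame({
    'App ID': ['apple', 'apple', 'apple', 'banana'],
    'Star': [5, 5, 4, 1],
    'DeveloperReply': ['Thanks!', '', '', 'Sorry to hear that'],
})

# Boolean flag: True where a reply exists
replied = sample['DeveloperReply'].ne('')

# The mean of a boolean Series is the fraction of True values per group
pct = (
    replied.groupby([sample['App ID'], sample['Star']])
    .mean()
    .mul(100)
    .reset_index(name='Percentage')
)
print(pct)
```

This avoids a Python-level lambda, which may matter on a frame with millions of rows.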
Answered By: robinood

You could try as follows.

  • First, use df.replace to replace all empty strings in column 'DeveloperReply' with NaN values.
  • Use df.groupby on ['App ID', 'Star'], select columns 'Text' and 'DeveloperReply', and get the count for both.
  • Add the calculated percentage as a new column, and reset the index.
import pandas as pd
import numpy as np

# Treat empty replies as missing so that count() skips them
out = (
    df.replace({'DeveloperReply': ''}, np.nan)
    .groupby(['App ID', 'Star'])[['Text', 'DeveloperReply']].count()
)

out['Perc'] = out['DeveloperReply']/out['Text']*100
out.reset_index(drop=False, inplace=True)

print(out)

   App ID  Star  Text  DeveloperReply   Perc
0   apple     3     2               2  100.0
1   apple     4     2               0    0.0
2   apple     5     4               4  100.0
3  banana     1     6               6  100.0
4  banana     5     2               2  100.0
5   mango     1     3               3  100.0
6   mango     5     1               1  100.0

You can also combine df.reindex with itertools.product, if you also want to see all "App ID/Star" combinations that aren’t in the data:

from itertools import product

# Treat empty replies as missing so that count() skips them
out = (
    df.replace({'DeveloperReply': ''}, np.nan)
    .groupby(['App ID', 'Star'])[['Text', 'DeveloperReply']].count()
)

out['Perc'] = out['DeveloperReply']/out['Text']*100
# out.reset_index(drop=False, inplace=True)

out = out.reindex(list(product(df['App ID'].unique(),range(1,6))))
out.reset_index(drop=False, inplace=True)

print(out)

    App ID  Star  Text  DeveloperReply   Perc
0    apple     1   NaN             NaN    NaN
1    apple     2   NaN             NaN    NaN
2    apple     3   2.0             2.0  100.0
3    apple     4   2.0             0.0    0.0
4    apple     5   4.0             4.0  100.0
5   banana     1   6.0             6.0  100.0
6   banana     2   NaN             NaN    NaN
7   banana     3   NaN             NaN    NaN
8   banana     4   NaN             NaN    NaN
9   banana     5   2.0             2.0  100.0
10   mango     1   3.0             3.0  100.0
11   mango     2   NaN             NaN    NaN
12   mango     3   NaN             NaN    NaN
13   mango     4   NaN             NaN    NaN
14   mango     5   1.0             1.0  100.0
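
As an alternative to itertools.product, a possible sketch builds the full set of combinations with pd.MultiIndex.from_product. The sample frame and the name full_index are made up for illustration, and a 1 to 5 star range is assumed, as in the code above:

```python
import pandas as pd

# Tiny hypothetical sample; an empty string means "no developer reply"
sample = pd.DataFrame({
    'App ID': ['apple', 'apple', 'apple', 'banana'],
    'Star': [5, 5, 4, 1],
    'DeveloperReply': ['Thanks!', '', '', 'Sorry to hear that'],
})

# Reply rate in percent for the combinations that occur in the data
perc = (
    sample.assign(Replied=sample['DeveloperReply'].ne(''))
    .groupby(['App ID', 'Star'])['Replied']
    .mean()
    .mul(100)
)

# Every app paired with every star value from 1 to 5
full_index = pd.MultiIndex.from_product(
    [sample['App ID'].unique(), range(1, 6)],
    names=['App ID', 'Star'],
)

# Combinations missing from the data show up as NaN
out = perc.reindex(full_index).reset_index(name='Perc')
print(out)
```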
Answered By: ouroboros1