Python Pandas – trying to get counts and percentages for all fields – Can only get one or the other
Question:
I want to import some survey data, loop through all fields, and run counts and percentages. I’m struggling to get this to work for both a count and percentage of each value in a question.
For example, here’s what a survey question might look like:
Q1
ID | Response |
---|---|
1 | White |
2 | Black |
3 | Black |
DESIRED OUTPUT:
Q1
Black 2 66.6%
White 1 33.3%
Below is my attempt and I know it isn’t correct, but I want to show that I am trying.
import pandas as pd
import numpy as np
# Load the survey data
df = pd.read_excel('imp_survey_analyze.xlsx')
# Print the value counts for every column
for column in df.columns:
    print("\n" + column)
    print(df[column].value_counts())
Answers:
Try:
out = df['Response'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'
Finally:
out = pd.concat([df['Response'].value_counts(), out], axis=1)
out.columns = ['count', 'percentage']
Output of out:
          count percentage
Response
Black         2      66.7%
White         1      33.3%
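The snippet above handles a single column, while the question asks for every field. A minimal sketch of one way to extend it: wrap the same count-plus-percentage steps in a helper and call it per column. The helper name `count_and_percentage`, the inline sample DataFrame (standing in for the Excel file), and the choice to skip the `ID` column are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the survey file; real data would come from pd.read_excel(...)
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Response": ["White", "Black", "Black"],
})

def count_and_percentage(series):
    """Return a DataFrame with the count and percentage of each value in `series`."""
    counts = series.value_counts()
    # normalize=True gives fractions; scale to 0-100, round, and append a percent sign
    percentages = series.value_counts(normalize=True).mul(100).round(1).astype(str) + "%"
    out = pd.concat([counts, percentages], axis=1)
    out.columns = ["count", "percentage"]
    return out

# Assumption: ID is a row identifier rather than a survey question, so it is skipped
for column in df.columns.drop("ID"):
    print("\n" + column)
    print(count_and_percentage(df[column]))
```

If every column in the real file is a question, looping over `df.columns` directly should work the same way.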
import pandas as pd
data = [['ID', 'Response'],
        ['1', 'White'],
        ['2', 'Black'],
        ['3', 'Black']]
df = pd.DataFrame(data[1:], columns=data[0])
descriptive_df = df.groupby(by='Response').count()
descriptive_df['percentages'] = descriptive_df["ID"] / descriptive_df["ID"].sum()
print(descriptive_df)
>>>           ID  percentages
>>> Response
>>> Black      2     0.666667
>>> White      1     0.333333
Try this:
>>> pd.concat([df["Response"].value_counts(), df["Response"].value_counts(normalize=True)], axis=1, keys=["Count", "%"])
       Count         %
Black      2  0.666667
White      1  0.333333
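The `%` column above holds fractions (0.666667) rather than the percentages in the desired output. A small sketch, assuming the same three-row sample data, that scales the normalized counts by 100 and rounds them while keeping the column numeric; the variable names `counts`, `percent`, and `summary` are illustrative only.

```python
import pandas as pd

# Sample data mirroring the question's Q1 table
df = pd.DataFrame({"ID": [1, 2, 3], "Response": ["White", "Black", "Black"]})

counts = df["Response"].value_counts()
# normalize=True yields fractions in [0, 1]; multiply by 100 to get percentages
percent = df["Response"].value_counts(normalize=True).mul(100).round(1)

# keys= names the two resulting columns
summary = pd.concat([counts, percent], axis=1, keys=["Count", "%"])
print(summary)
```

Leaving the percentage as a number instead of a string such as '66.7%' means it can still be sorted or summed later; the percent sign can be added at display time if needed.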