How to apply value_counts(normalize=True) and value_counts() to pandas series?
Question:
I would like to show the value_counts(normalize=True)
of a series,
which works well, but I also want to show the value_counts()
(not normalized) in an additional column.
Code
import pandas as pd
cars = {'Brand': ['Honda Civic', 'Toyota Corolla', '', 'Audi A4'],
        'Price': [32000, 35000, 37000, 45000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])
df.Brand.value_counts(normalize=True)
Expected output
perc count
Toyota Corolla 0.25 1
Audi A4 0.25 1
Honda Civic 0.25 1
0.25 1
Name: Brand, dtype: float64
Question
How can I attach both pieces of information to the series?
Answers:
If you want to use value_counts,
you need to call it a second time without normalize=True
and combine the two results:
# Store the result under a new name so the original df stays available
out = pd.concat([df.Brand.value_counts(normalize=True),
                 df.Brand.value_counts()],
                axis=1,
                keys=('perc', 'count'))
print(out)
perc count
0.25 1
Honda Civic 0.25 1
Toyota Corolla 0.25 1
Audi A4 0.25 1
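In the question's data every brand occurs exactly once, so both columns look trivial. Here is a minimal sketch of the same concat approach on hypothetical data with one repeated brand (the values and the names brands and out are assumptions for illustration only, not from the question):

```python
import pandas as pd

# Hypothetical data: 'Honda Civic' occurs twice
brands = pd.Series(
    ['Honda Civic', 'Toyota Corolla', 'Honda Civic', 'Audi A4'],
    name='Brand',
)

# Both Series share the same index (the distinct brands), so concat
# with axis=1 places them side by side; keys become the column names
out = pd.concat(
    [brands.value_counts(normalize=True), brands.value_counts()],
    axis=1,
    keys=('perc', 'count'),
)
print(out)
```

Here the row for 'Honda Civic' should have perc 0.5 and count 2, while the other two brands have 0.25 and 1.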
Another idea is to create the perc
column in a separate step; DataFrame.insert
sets the position of the new column:
out = df.Brand.value_counts().to_frame('count')
# Divide by the total number of counted rows, not by the number of unique values
out.insert(0, 'perc', out['count'].div(out['count'].sum()))
print(out)
perc count
0.25 1
Honda Civic 0.25 1
Toyota Corolla 0.25 1
Audi A4 0.25 1
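Or go the other way, starting from the normalized values and multiplying by the number of non-null rows:
n = df.Brand.count()  # non-null rows, the denominator value_counts uses
out = df.Brand.value_counts(normalize=True).to_frame('perc')
out['count'] = out['perc'].mul(n).round().astype(int)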
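When the share is computed by hand from the counts, the denominator has to be the total number of counted rows. The number of distinct values only coincides with it when every value occurs once, as in the question's data. A small sketch on hypothetical data with a repeated brand (the data and the names brands, counts, total and distinct are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data: 4 rows but only 3 distinct brands
brands = pd.Series(
    ['Honda Civic', 'Toyota Corolla', 'Honda Civic', 'Audi A4'],
    name='Brand',
)

counts = brands.value_counts().to_frame('count')
total = counts['count'].sum()   # number of counted rows
distinct = len(counts)          # number of distinct brands

# Dividing by total gives shares that sum to 1;
# dividing by distinct would not
counts.insert(0, 'perc', counts['count'] / total)
print(counts)
```

Dividing by the sum of the count column also stays correct if the series contains missing values, since value_counts drops them by default.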
You can also try it this way:
pd.concat(
[
df.Brand.value_counts(),
df.Brand.value_counts(normalize=True)
],
keys=['counts', 'normalized_counts'],
axis=1,
)