Unique values of two columns for pandas dataframe
Question:
Suppose I have a pandas DataFrame with two columns:
df: Col1 Col2
1 1
1 2
1 2
1 2
3 4
3 4
Then I want to keep only the unique pairs of values (Col1, Col2) from these two columns and report their frequency:
df2: Col1 Col2 Freq
1 1 1
1 2 3
3 4 2
I thought of using df['Col1', 'Col2'].value_counts()
but it works only for one column.
Is there a function that handles multiple columns?
Answers:
You could try
df.groupby(['Col1', 'Col2']).size()
For a different visual output from jez’s answer, you can extend that solution with
pd.DataFrame(df.groupby(['Col1', 'Col2']).size().rename('Freq'))
which gives
           Freq
Col1 Col2
1    1        1
     2        3
3    4        2
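As a minimal, self-contained sketch of the approach above: the frame is rebuilt from the question's sample data, and to_frame() stands in for the pd.DataFrame(...) wrapper (an assumption for illustration, not part of the original answer; both should give a one-column frame here).

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Col1": [1, 1, 1, 1, 3, 3],
    "Col2": [1, 2, 2, 2, 4, 4],
})

# size() counts the rows in each (Col1, Col2) group and returns a Series
# indexed by a MultiIndex built from the two grouping columns.
counts = df.groupby(["Col1", "Col2"]).size().rename("Freq")

# to_frame() turns the named Series into a one-column DataFrame
# (used here in place of wrapping the Series in pd.DataFrame(...)).
freq_frame = counts.to_frame()
print(freq_frame)
```

The pair (Col1, Col2) stays in the index here, so a single count can be looked up with counts.loc[(1, 2)].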
You need groupby + size + Series.reset_index:
df = df.groupby(['Col1', 'Col2']).size().reset_index(name='Freq')
print(df)
Col1 Col2 Freq
0 1 1 1
1 1 2 3
2 3 4 2
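A runnable sketch of this answer, using the question's sample data. The result is bound to a new name, df2, so the original frame is kept; that naming is a choice made here, not part of the answer. The second half is an alternative that assumes pandas 1.1 or later, where DataFrame.value_counts is available and counts unique rows directly.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Col1": [1, 1, 1, 1, 3, 3],
    "Col2": [1, 2, 2, 2, 4, 4],
})

# groupby + size + reset_index: a flat frame with the counts in a column
# named 'Freq' and the grouping keys back as ordinary columns.
df2 = df.groupby(["Col1", "Col2"]).size().reset_index(name="Freq")
print(df2)

# Alternative (assumes pandas >= 1.1): DataFrame.value_counts counts unique
# rows over the given columns. It sorts by count by default, so sort_index()
# is used to restore the key order before flattening.
alt = (
    df.value_counts(["Col1", "Col2"])
      .sort_index()
      .reset_index(name="Freq")
)
print(alt)
```

Both variants should yield the same three rows for this data; the groupby form also works on older pandas versions.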