count the number of values in data frame's column that exist in another data frame's column
Question:
I have two data frames:
df1:
Index
Date
0
2016-03-21 20:10:00
1
2016-03-22 21:09:00
2
2016-05-03 17:05:00
df2:
Index
Date
0
2016-03-21 20:10:00
1
2016-03-21 21:00:00
2
2016-03-22 21:09:00
3
2016-05-03 17:05:00
4
2017-06-01 16:10:00
There’s probably a really simple way to do this but how would I count the number of values in the df1 Date column that are also in the df2 Date column? (These are not unique value counts)
Answers:
The simplest approach to solve your problem will be use set
intersection(find common element from set).
Eg:
df1=pd.DataFrame({"date":['2016-03-21 20:10:00','2016-03-22 21:09:00','2016-05-03 17:05:00']})
df2=pd.DataFrame({"date":['2016-03-21 20:10:00','2016-03-21 21:00:00',
'2016-03-22 21:09:00','2016-05-03 17:05:00','2017-06-01 16:10:00']})
print(len(set(df1.date) & set(df2.date))) # 3
This will just convert that specified column to python-set
and find common between them.
If you want to use Pandas then you can use pandas.merge() to get the common rows based on the columns.
df3 = pd.merge(df1, df2)
print(len(df3)) # 3
and count common rows using len
function.
I have two data frames:
df1:
Index | Date |
---|---|
0 | 2016-03-21 20:10:00 |
1 | 2016-03-22 21:09:00 |
2 | 2016-05-03 17:05:00 |
df2:
Index | Date |
---|---|
0 | 2016-03-21 20:10:00 |
1 | 2016-03-21 21:00:00 |
2 | 2016-03-22 21:09:00 |
3 | 2016-05-03 17:05:00 |
4 | 2017-06-01 16:10:00 |
There’s probably a really simple way to do this but how would I count the number of values in the df1 Date column that are also in the df2 Date column? (These are not unique value counts)
The simplest approach to solve your problem will be use set
intersection(find common element from set).
Eg:
df1=pd.DataFrame({"date":['2016-03-21 20:10:00','2016-03-22 21:09:00','2016-05-03 17:05:00']})
df2=pd.DataFrame({"date":['2016-03-21 20:10:00','2016-03-21 21:00:00',
'2016-03-22 21:09:00','2016-05-03 17:05:00','2017-06-01 16:10:00']})
print(len(set(df1.date) & set(df2.date))) # 3
This will just convert that specified column to python-set
and find common between them.
If you want to use Pandas then you can use pandas.merge() to get the common rows based on the columns.
df3 = pd.merge(df1, df2)
print(len(df3)) # 3
and count common rows using len
function.