Compare quantity of each item in dataframe in pandas
Question:
I fairly new to programming in python and using the pandas library
and I am having problems with comparing 2 dataframes with different quantities
i want to see if the quantity in data_1 is less than the quantity in data_2 for each item
import pandas as pd
data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]
df_1 = pd.Dataframe(data_1,columns = ['item','quantity'])
df_2 = pd.Dataframe(data_2,columns = ['item','quantity'])
i was trying to use pd.merge() the 2 dataframes to compare but it’s not quite what i was looking for…..
i needed a 3° dataframe with the differences only
Answers:
For such comparison item should be index and then you might concat quantity columns and compute desired column following way
import pandas as pd
data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]
df_1 = pd.DataFrame(data_1,columns = ['item','quantity'])
df_2 = pd.DataFrame(data_2,columns = ['item','quantity'])
df = pd.concat({"data1":df_1.set_index("item").quantity,"data2":df_2.set_index("item").quantity},axis=1,sort=False)
df['isless'] = df['data1'] < df['data2']
print(df)
output
data1 data2 isless
banana 10.0 1 False
orange 2.0 2 False
strawberry 3.0 5 True
melon NaN 8 False
If your looking at the difference between df_1
and df_2
, you can substract the 'quantity'
columns after seting the 'item'
as the index:
import pandas as pd
data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]
df_1 = pd.DataFrame(data_1,columns = ['item','quantity'])
df_2 = pd.DataFrame(data_2,columns = ['item','quantity'])
# Setting 'item' as the index
df_1.set_index('item', inplace=True)
df_2.set_index('item', inplace=True)
df_3 = df_1.subtract(df_2,fill_value=0)>0
df_3.columns=['df1>df2']
outputs:
df1>df2
item
banana True
melon False
orange False
strawberry False
I fairly new to programming in python and using the pandas library
and I am having problems with comparing 2 dataframes with different quantities
i want to see if the quantity in data_1 is less than the quantity in data_2 for each item
import pandas as pd
data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]
df_1 = pd.Dataframe(data_1,columns = ['item','quantity'])
df_2 = pd.Dataframe(data_2,columns = ['item','quantity'])
i was trying to use pd.merge() the 2 dataframes to compare but it’s not quite what i was looking for…..
i needed a 3° dataframe with the differences only
For such comparison item should be index and then you might concat quantity columns and compute desired column following way
import pandas as pd
data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]
df_1 = pd.DataFrame(data_1,columns = ['item','quantity'])
df_2 = pd.DataFrame(data_2,columns = ['item','quantity'])
df = pd.concat({"data1":df_1.set_index("item").quantity,"data2":df_2.set_index("item").quantity},axis=1,sort=False)
df['isless'] = df['data1'] < df['data2']
print(df)
output
data1 data2 isless
banana 10.0 1 False
orange 2.0 2 False
strawberry 3.0 5 True
melon NaN 8 False
If your looking at the difference between df_1
and df_2
, you can substract the 'quantity'
columns after seting the 'item'
as the index:
import pandas as pd
data_1 = [['banana',10],['orange',2],['strawberry',3]]
data_2 = [['banana',1],['orange',2],['strawberry',5],['melon',8]]
df_1 = pd.DataFrame(data_1,columns = ['item','quantity'])
df_2 = pd.DataFrame(data_2,columns = ['item','quantity'])
# Setting 'item' as the index
df_1.set_index('item', inplace=True)
df_2.set_index('item', inplace=True)
df_3 = df_1.subtract(df_2,fill_value=0)>0
df_3.columns=['df1>df2']
outputs:
df1>df2
item
banana True
melon False
orange False
strawberry False