Differences between column values of two pandas DataFrames

Question:

I have two pandas DataFrames, df1 and df2. df1 has two columns, 'A' and 'B', each with 10 observations; df2 has two columns, 'C' and 'D', each with 20 observations. Take the first value of column 'B' from df1 and subtract it from all 20 values of column 'D' of df2. Find the minimum of these 20 differences and place it beside the first observation of df1['B']. Repeat this process for all 10 values of df1['B'].
Kindly help.

You can use the following sample code to create sample data and solve it.

Sample data code:
import pandas as pd

data1 = [[0.1, 0.5], [0.11, 0.4]]

df1 = pd.DataFrame(data1, columns=['A', 'B'])

data2 = [[-0.6, 0.33], [0.47, 0.9], [0.21, 0.76], [-0.7, 0.01]]

df2 = pd.DataFrame(data2, columns=['C', 'D'])

Asked By: BTM


Answers:

There are a few ways to do this. One is to go through the rows of df1 and compute the minimum difference for each. Here is an example that defines a function and applies it to every row of the dataframe:

import pandas as pd
import numpy as np

# create correctly shaped dataframes with random numbers
df1 = pd.DataFrame(np.random.rand(10,2), columns=['A', 'B'])
df2 = pd.DataFrame(np.random.rand(20,2), columns=['C', 'D'])


# define a function to apply to each row of df1
def my_func(x, df1, col_d):
    index = x.name
    difference = col_d - df1['B'][index]
    return difference.min()

# add a column to df1 by applying the function above to each row of df1, passing in column 'D' of df2
df1['result'] = df1.apply(lambda x: my_func(x, df1, df2['D']), axis=1)
Answered By: tomcheney
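
As an alternative to the row-wise apply above, here is a minimal vectorized sketch using NumPy broadcasting. It is not part of the original answer. It assumes the "difference" is the signed value D - B, as in the code above, and it uses the small sample data from the question. The column names result and result_shortcut are illustrative choices only.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame([[0.1, 0.5], [0.11, 0.4]], columns=["A", "B"])
df2 = pd.DataFrame(
    [[-0.6, 0.33], [0.47, 0.9], [0.21, 0.76], [-0.7, 0.01]],
    columns=["C", "D"],
)

b = df1["B"].to_numpy()  # shape (len(df1),)
d = df2["D"].to_numpy()  # shape (len(df2),)

# Broadcasting builds a (len(df1), len(df2)) matrix where
# diffs[i, j] = d[j] - b[i]
diffs = d[np.newaxis, :] - b[:, np.newaxis]

# Minimum signed difference for each value of df1['B']
df1["result"] = diffs.min(axis=1)

# Because the same b[i] is subtracted from every d[j],
# min_j(d[j] - b[i]) equals min(d) - b[i], so no matrix is needed
df1["result_shortcut"] = df2["D"].min() - df1["B"]

print(df1)
```

If the smallest absolute difference is wanted instead (an assumption about the intent, since the question does not say), np.abs(diffs).min(axis=1) can replace diffs.min(axis=1); the shortcut does not apply in that case.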