Element-wise weighted average of multiple dataframes

Question:

Let’s say we have 3 dataframes (df1, df2, df3). I know I can get an element-wise average of the three dataframes with

list_of_dfs = [df1, df2, df3]
sum(list_of_dfs)/len(list_of_dfs)

But I need a weighted average of the three dataframes, with the weights defined in an array "W":

W = np.array([0.2, 0.3, 0.5])

So df1 will get a 20% weight, df2 30% and df3 50%.
Unfortunately, the actual number of dataframes is much larger than 3; otherwise I could simply do the following:

df1*W[0] + df2*W[1] + df3*W[2]

Any help? Thanks

Asked By: younggotti


Answers:

You can get the weighted sum with sum and zip:

sum(w*d for w, d in zip(W, list_of_dfs))

For the average, divide by the sum of weights if it’s not already equal to 1:

sum(w*d for w, d in zip(W, list_of_dfs))/sum(W)
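
To make the sum/zip approach reusable for any number of dataframes, it could be wrapped in a small helper. This is only a sketch: the function name weighted_average and the toy frames df_a and df_b are made up for illustration and are not part of the question.

```python
import pandas as pd

def weighted_average(dfs, weights):
    """Element-wise weighted average of a list of DataFrames (hypothetical helper)."""
    if len(dfs) != len(weights):
        raise ValueError("need exactly one weight per DataFrame")
    total = sum(float(w) for w in weights)
    # sum() starts from 0, and 0 + DataFrame is a DataFrame, so no start value is needed.
    return sum(float(w) * df for w, df in zip(weights, dfs)) / total

# Toy inputs (assumed for the example): weights 1 and 3 are normalized to 0.25 and 0.75.
df_a = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]])
df_b = pd.DataFrame([[10.0, 20.0], [30.0, 40.0]])
result = weighted_average([df_a, df_b], [1, 3])
print(result)
```

Because the helper divides by the sum of the weights, they do not have to add up to 1 beforehand.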

Or, with numpy (assuming the DataFrames are aligned, i.e. they share the same index and columns):

out = pd.DataFrame(np.average(np.dstack(list_of_dfs), axis=2, weights=W),
                   index=df1.index, columns=df1.columns)
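
Note that np.dstack works on the raw values and ignores labels, which is why alignment matters here. The following sketch checks alignment first and then confirms that the numpy route agrees with the sum/zip route. The variable names (dfs, aligned, via_numpy, via_sum) and the random inputs are assumptions for the example only.

```python
import numpy as np
import pandas as pd

# Assumed example data: three 4x3 frames with default integer labels.
rng = np.random.default_rng(0)
dfs = [pd.DataFrame(rng.integers(0, 10, size=(4, 3))) for _ in range(3)]
W = np.array([0.2, 0.3, 0.5])

# np.dstack only sees the values, so verify the labels match before stacking.
first = dfs[0]
aligned = all(d.index.equals(first.index) and d.columns.equals(first.columns)
              for d in dfs)

# Stack along a new third axis: shape (rows, columns, number of frames).
stacked = np.dstack([d.to_numpy() for d in dfs])
via_numpy = pd.DataFrame(np.average(stacked, axis=2, weights=W),
                         index=first.index, columns=first.columns)

# Same computation with sum/zip, for comparison.
via_sum = sum(float(w) * d for w, d in zip(W, dfs)) / W.sum()

print(aligned, np.allclose(via_numpy.to_numpy(), via_sum.to_numpy()))
```

If the frames are not aligned, the sum/zip version aligns on labels automatically (producing NaN where labels are missing from a frame), while the numpy version would silently combine values by position.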

Example output (weighted average):

     0    1    2    3    4
0  1.9  2.0  4.0  4.6  4.4
1  6.4  3.5  3.9  2.7  6.2
2  6.3  2.6  4.2  5.1  5.0
3  8.4  5.6  4.4  6.3  3.3
4  1.9  3.9  6.2  6.9  2.8

Used input:

np.random.seed(0)
df1 = pd.DataFrame(np.random.randint(0, 10, size=(5, 5)))
df2 = pd.DataFrame(np.random.randint(0, 10, size=(5, 5)))
df3 = pd.DataFrame(np.random.randint(0, 10, size=(5, 5)))
Answered By: mozway