Merging two dataframes in a loop and changing the original dataframe

Question:

Assuming I have a list of dataframes

dataframes = [a, b, c, d]

and one dataframe additionalInformation containing information I need for the merge.

Is there a way to join the dataframes in a loop and overwrite the original dataframe?

for index, df in enumerate(dataframes):
    dataframes[index] = pd.merge(df, additionalInformation, how="left", left_on="cat", right_on="cat")

However, this is not updating the dataframes.
When I do a

a.columns

The columns from additionalInformation are not merged… When I perform a

a = pd.merge(a, additionalInformation, how="left", left_on="cat", right_on="cat")

It works.
How would I merge dataframes in a loop and overwrite the original dataframe?

Asked By: four-eyes


Answers:

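pd.merge returns a new DataFrame, and assigning it to dataframes[index] only rebinds that slot of the list; the variable a still refers to the original, unmerged object. If you want to modify the original DataFrame, you have to assign to the dataframe’s content, not to the variable.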
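
To make the difference concrete, here is a minimal sketch. The sample data and the "value" and "label" columns are made up for illustration; only cat, dataframes and additionalInformation come from the question. It shows that the loop leaves a untouched, and that unpacking the list back into the names is one simple way to rebind them if modifying the original objects is not strictly required.

```python
import pandas as pd

# Hypothetical sample data; only the names "cat", "dataframes" and
# "additionalInformation" come from the question.
a = pd.DataFrame({"cat": ["x", "y"], "value": [1, 2]})
b = pd.DataFrame({"cat": ["y"], "value": [3]})
additionalInformation = pd.DataFrame({"cat": ["x", "y"], "label": ["X", "Y"]})

dataframes = [a, b]

for index, df in enumerate(dataframes):
    # pd.merge returns a new DataFrame; this only rebinds the list slot.
    dataframes[index] = pd.merge(df, additionalInformation, how="left", on="cat")

# The name a still points at the original, unmerged object.
a_unchanged_after_loop = "label" not in a.columns
list_slot_is_new_object = dataframes[0] is not a

# Rebinding the names explicitly makes them point at the merged results.
a, b = dataframes
a_merged_after_unpacking = "label" in a.columns
```

Note that unpacking only rebinds the names a and b; any other reference to the old objects still sees the unmerged data.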

If the cat values in additionalInformation are unique (so that the left merge keeps the same number of rows), you can use:

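for index, df in enumerate(dataframes):
    # Build the merged result as a separate, temporary DataFrame
    merged = pd.merge(df, additionalInformation, how="left", on="cat")
    # Write the merged columns into the existing object instead of rebinding the list slot
    dataframes[index].loc[:, merged.columns] = merged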

If the values are not unique, the merge returns more rows than the original dataframe has, and this assignment will truncate the output to the original rows.
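
If you would rather get an error than a silently truncated result, one option is to let pd.merge check the keys itself. The following is only a sketch of a variant of the loop above, not the exact code from this answer: it uses the validate="many_to_one" argument of pd.merge and adds just the new columns by position. The sample data and the "value" and "label" columns are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name "cat" comes from the question.
a = pd.DataFrame({"cat": ["x", "y", "x"], "value": [1, 2, 3]})
b = pd.DataFrame({"cat": ["y"], "value": [4]})
additionalInformation = pd.DataFrame({"cat": ["x", "y"], "label": ["X", "Y"]})

dataframes = [a, b]

for df in dataframes:
    # validate="many_to_one" raises pandas.errors.MergeError if "cat" is
    # duplicated in additionalInformation, so the row count cannot change.
    merged = pd.merge(df, additionalInformation, how="left", on="cat",
                      validate="many_to_one")
    # Add only the new columns, by position, to the existing object.
    for col in merged.columns.difference(df.columns):
        df[col] = merged[col].to_numpy()

a_labels = a["label"].tolist()
b_labels = b["label"].tolist()

# With duplicated keys on the right side, the merge is rejected up front.
duplicated = pd.DataFrame({"cat": ["x", "x"], "label": ["X1", "X2"]})
try:
    pd.merge(b, duplicated, how="left", on="cat", validate="many_to_one")
    duplicate_keys_rejected = False
except pd.errors.MergeError:
    duplicate_keys_rejected = True
```

Assigning with to_numpy() relies on a left merge keeping the rows in the order of the left dataframe, which only lines up when the keys on the right are unique; that is what the validate check is there to guarantee.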

Answered By: mozway