Merging two dataframes in a loop: changing the original dataframe in the loop
Question:
Assuming I have a list of dataframes
dataframes = [a, b, c, d]
and one dataframe additionalInformation containing information I need for the merge.
Is there a way to join the dataframes in a loop and overwrite the original dataframe?
    for index, df in enumerate(dataframes):
        dataframes[index] = pd.merge(df, additionalInformation, how="left", left_on="cat", right_on="cat")
However, this is not updating the dataframes. When I check
    a.columns
the columns from additionalInformation are not merged. When I instead run
    a = pd.merge(a, additionalInformation, how="left", left_on="cat", right_on="cat")
it works.
How would I merge dataframes in a loop and overwrite the original dataframe?
Answers:
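Assigning to dataframes[index] only replaces the list entry with a new DataFrame; the variable a still refers to the original, unmerged object. If you want to modify the original DataFrame, you have to assign to the dataframe’s content, not to the variable or the list slot.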
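A minimal sketch of that difference, using small made-up frames (the value and label columns and their contents are assumptions for illustration, not from the question):

```python
import pandas as pd

# Hypothetical sample data; only the "cat" key comes from the question.
a = pd.DataFrame({"cat": ["x", "y"], "value": [1, 2]})
additionalInformation = pd.DataFrame({"cat": ["x", "y"], "label": ["X", "Y"]})

dataframes = [a]

# pd.merge returns a new DataFrame; storing it in the list does not touch `a`.
dataframes[0] = pd.merge(a, additionalInformation, how="left", on="cat")

list_has_new_column = "label" in dataframes[0].columns  # the list entry was replaced
a_has_new_column = "label" in a.columns                 # the name `a` still points at the old object
slot_is_still_a = dataframes[0] is a                    # the list entry is a different object now
```

After the loop, the merged result is only reachable through the list, which is why a.columns in the question shows no new columns.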
If the cat values in additionalInformation are unique (so that the left merge keeps the same number of rows), you can use:
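    for index, df in enumerate(dataframes):
        # pd.merge returns a new DataFrame; df itself is untouched
        merged = pd.merge(df, additionalInformation, how="left", on="cat")
        # write the merged columns into the existing object instead of rebinding the name
        dataframes[index].loc[:, merged.columns] = merged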
If the values are not unique, the merge returns more rows than the original dataframe has, so this assignment will truncate the output.
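One possible variant of the same idea, sketched under a few assumptions: merge_in_place is a hypothetical helper name, the sample data is made up, and the new columns are copied by position with to_numpy() so that a non-default index on the original dataframe does not cause misalignment. Passing validate="many_to_one" asks pandas to raise a MergeError when the key is not unique in the right-hand frame, rather than silently producing extra rows.

```python
import pandas as pd

def merge_in_place(df, info, on="cat"):
    """Add the non-key columns of `info` to `df` itself (hypothetical helper)."""
    # validate="many_to_one" raises pandas.errors.MergeError if `on` is not unique in `info`,
    # so the result is guaranteed to have one row per row of `df`, in the same order.
    merged = pd.merge(df, info, how="left", on=on, validate="many_to_one")
    for col in merged.columns.difference(df.columns):
        # Assign by position; merged has a fresh 0..n-1 index that may not match df's index.
        df[col] = merged[col].to_numpy()

# Made-up example data; only the "cat" key comes from the question.
a = pd.DataFrame({"cat": ["x", "y", "x"], "value": [1, 2, 3]}, index=[10, 11, 12])
b = pd.DataFrame({"cat": ["y"], "value": [4]})
additionalInformation = pd.DataFrame({"cat": ["x", "y"], "label": ["X", "Y"]})

for df in [a, b]:
    merge_in_place(df, additionalInformation)
```

Because the helper mutates each object rather than rebinding a name, a and b themselves carry the label column after the loop. If mutation is not required, unpacking the merged list back into the names (a, b, c, d = dataframes) is a simpler alternative.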