Reindex dataframe inside loop

Question:

I’m trying to reindex the columns of a set of dataframes inside a loop, but this only seems to work outside the loop. See the sample code below:

import pandas as pd

data1 = [[1,2,3],[4,5,6],[7,8,9]]
data2 = [[10,11,12],[13,14,15],[16,17,18]]
data3 = [[19,20,21],[22,23,24],[25,26,27]]
index = ['a','b','c']
columns = ['d','e','f']

df1 = pd.DataFrame(data=data1,index=index,columns=columns)
df2 = pd.DataFrame(data=data2,index=index,columns=columns)
df3 = pd.DataFrame(data=data3,index=index,columns=columns)

columns2 = ['f','e','d']

for i in [df1,df2,df3]:
    i = i.reindex(columns=columns2)

print(df1)

df2 = df2.reindex(columns=columns2)

print(df2)

df1 is not reindexed as desired. However, if I reindex df2 outside of the loop, it works. Why is that?

Thanks
Andrew

Asked By: Andrew5715

||

Answers:

That happens for the same reason this happens:

a = 5
b = 6
for i in [a, b]:
    i = 4

>>> a
5

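Why? Assigning to i inside the loop only rebinds the name i to a new object. It does not change what a, b, or the list elements refer to. See this accepted answer.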
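To see the same effect with a dataframe, here is a small sketch. The names frames, before and after are only for illustration and are not from the question:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]], columns=['d', 'e', 'f'])
frames = [df1]

for i in frames:
    before = i is df1   # i and df1 refer to the same object here
    # reindex returns a new DataFrame; the assignment only rebinds the name i
    i = i.reindex(columns=['f', 'e', 'd'])
    after = i is df1    # i now refers to the new object, df1 is untouched

print(before, after)
print(list(df1.columns))
```

The original df1 keeps its column order because nothing in the loop ever assigns to the name df1 or modifies the object it points to.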

Concerning your problem, one way to go about it is to create a list of reindexed dataframes like so:

reindexed_dfs = [df.reindex(columns=columns2) for df in [df1, df2, df3]]

and then reassign df1, df2 and df3 from that list. But it’s better to just keep using your newly created list anyway.
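For what it's worth, the reassignment step could look something like the sketch below, which reuses the data from the question. The dict named dfs is an assumption added here as one possible way to keep the frames together. It is not part of the original answer.

```python
import pandas as pd

index = ['a', 'b', 'c']
columns = ['d', 'e', 'f']
columns2 = ['f', 'e', 'd']

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=index, columns=columns)
df2 = pd.DataFrame([[10, 11, 12], [13, 14, 15], [16, 17, 18]], index=index, columns=columns)
df3 = pd.DataFrame([[19, 20, 21], [22, 23, 24], [25, 26, 27]], index=index, columns=columns)

# Build the reindexed frames, then unpack them back onto the original names.
df1, df2, df3 = [df.reindex(columns=columns2) for df in [df1, df2, df3]]

print(df1)

# Alternative (hypothetical): keep the frames in a dict and replace entries by key.
# Assigning to dfs[name] changes the dict itself, unlike assigning to a loop variable.
dfs = {'df1': df1, 'df2': df2, 'df3': df3}
for name in list(dfs):
    dfs[name] = dfs[name].reindex(columns=columns)  # back to the original order
```

The difference from the loop in the question is that the assignment targets here (the names df1, df2, df3, or the dict entry dfs[name]) are the things you actually want updated, rather than a temporary loop variable.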

Answered By: Camilo Martinez M.