How to change column names from tuples to strings?
Question:
I used pd.pivot_table
on a pandas DataFrame, and the column names became tuples like ('A1', 'B1'), ('A1', 'B2')...
I want them to be strings like 'A1_B1', 'A1_B2'...
I tried
df.columns.values[i] = df.columns.values[i][0] + '_' + df.columns.values[i][1]
and tried rename as well.
When I checked df.columns.values, the column names had changed, but I cannot use the new names for indexing. I am new to Python, so I might not understand the difference between column names and column indices.
Can anyone help me? Thanks!
Answers:
You can use Index.map (called on df.columns)
for this. map returns a new Index, so assign the result back:
df.columns = df.columns.map(lambda t: t[0] + "_" + t[1])
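For a complete round trip, here is a minimal sketch under assumed data. The input frame, its column names (key, B, A1, A2) and the result name wide are made up for illustration and do not come from the question. It builds a pivot table whose column labels are tuples, then flattens them with map.

```python
import pandas as pd

# Hypothetical input data, chosen only to reproduce tuple-like column labels.
df = pd.DataFrame({
    "key": ["x", "x", "y", "y"],
    "B": ["B1", "B2", "B1", "B2"],
    "A1": [1, 2, 3, 4],
    "A2": [5, 6, 7, 8],
})

# Pivoting several value columns against "B" yields two-level column labels
# such as ('A1', 'B1'), ('A1', 'B2'), ...
wide = pd.pivot_table(df, index="key", columns="B",
                      values=["A1", "A2"], aggfunc="sum")

# Join the two parts of each label and assign the new Index back.
wide.columns = wide.columns.map(lambda t: t[0] + "_" + t[1])

print(list(wide.columns))
print(wide.loc["x", "A1_B2"])
```

After the assignment the joined strings behave like ordinary column names, so selection by 'A1_B2' works.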
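You might need to iterate over the column tuples and assign the new list to df.columns (not to df.columns.values, which cannot be assigned to).
final = []
for x in df.columns.values:
    final.append(x[0] + '_' + x[1])
df.columns = final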
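As a quick check that replacing df.columns as a whole makes the new names usable for selection, which is what the question was missing, here is a small sketch. The two-column frame is an assumed example, not the asker's data.

```python
import pandas as pd

# Assumed example frame with tuple-like (two-level) column labels.
cols = pd.MultiIndex.from_tuples([("A1", "B1"), ("A1", "B2")])
df = pd.DataFrame([[0, 1], [2, 3]], columns=cols)

# Build the joined names, then replace the whole column index at once.
final = [a + "_" + b for a, b in df.columns]
df.columns = final

# The new string labels can now be used for ordinary column selection.
print(df["A1_B2"].tolist())
```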
setup
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(8).reshape(2, 4),
    columns=[('A1', 'B1'), ('A2', 'B1'), ('A1', 'B2'), ('A2', 'B2')])
print(df)
(A1, B1) (A2, B1) (A1, B2) (A2, B2)
0 0 1 2 3
1 4 5 6 7
rename
df.rename(columns='_'.join, inplace=True)
print(df)
A1_B1 A2_B1 A1_B2 A2_B2
0 0 1 2 3
1 4 5 6 7
map
df.columns = df.columns.map('_'.join)
print(df)
A1_B1 A2_B1 A1_B2 A2_B2
0 0 1 2 3
1 4 5 6 7
Use a list comprehension:
df.columns = ['{}_{}'.format(x[0], x[1]) for x in df.columns]
print(df)
A1_B1 A2_B1 A1_B2 A2_B2
0 0 1 2 3
1 4 5 6 7
Or:
df.columns = ['_'.join(x) for x in df.columns]
print(df)
A1_B1 A2_B1 A1_B2 A2_B2
0 0 1 2 3
1 4 5 6 7
I used this approach:
mydic = dict()
for var in df.columns:
    if isinstance(var, tuple):
        mydic[var] = '{}_{}'.format(var[0], var[1])
df = df.rename(columns=mydic)
This also let me handle the fact that the second element of my tuples was an integer that had been converted to a float (with an annoying “.0” appended), by rounding it and formatting it as an integer instead:
mydic[var] = '{}_{:d}'.format(var[0], int(round(var[1])))
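The same idea can be wrapped in a small helper so that mixed labels are handled in one place. This is only a sketch: join_label is a hypothetical name, not a pandas function, and the example frame is assumed. It converts every part of a tuple to a string, and turns whole-number floats such as 1.0 into 1 first, so '_'.join does not fail on non-string parts.

```python
import pandas as pd

def join_label(label, sep="_"):
    """Join a tuple label into one string; hypothetical helper, not part of pandas."""
    if not isinstance(label, tuple):
        return str(label)  # already a flat label, leave it as a string
    parts = []
    for part in label:
        # Whole-number floats (1.0, 2.0, ...) are written without the ".0".
        if isinstance(part, float) and part.is_integer():
            part = int(part)
        parts.append(str(part))
    return sep.join(parts)

# Assumed example: the second level holds floats, as described above.
cols = pd.MultiIndex.from_tuples([("ok", 1.0), ("ok", 2.0), ("no", 3.5)])
df = pd.DataFrame([[1, 2, 3]], columns=cols)

df.columns = [join_label(c) for c in df.columns]
print(list(df.columns))
```

Floats that are not whole numbers, like 3.5 here, are kept as they are rather than rounded, which avoids silently merging two different labels.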
To keep only the second element of each tuple, use a list comprehension on the column names themselves:
df = pd.DataFrame(columns=[('ok',1),('ok',2),('ok',3)])
newcols = [x[1] for x in df.columns]
df.columns = newcols
print(df)
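If the columns are a MultiIndex, keeping a single level can also be done with get_level_values instead of a comprehension. A sketch with an assumed example frame (the values 10, 20, 30 are made up):

```python
import pandas as pd

# Assumed example: two-level column labels built explicitly as a MultiIndex.
cols = pd.MultiIndex.from_tuples([("ok", 1), ("ok", 2), ("ok", 3)])
df = pd.DataFrame([[10, 20, 30]], columns=cols)

# Keep only the second level (position 1) as the new flat column labels.
df.columns = df.columns.get_level_values(1)

print(list(df.columns))
print(df[2].tolist())
```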