Rename a single pandas DataFrame column without knowing column name
Question:
I know I can rename single pandas.DataFrame columns with:
drugInfo.rename(columns = {'col_1': 'col_1_new_name'}, inplace = True)
But I'd like to rename a column based on its position, without knowing its name (although I know dictionaries don't have positional indexes). I would like to rename column number 1 like this:
drugInfo.rename(columns = {1: 'col_1_new_name'}, inplace = True)
But in the DataFrame.columns dict there is no ‘1’ entry, so no renaming is done. How could I achieve this?
Answers:
This should work:
drugInfo.rename(columns = {list(drugInfo)[1]: 'col_1_new_name'}, inplace = True)
Example:
In [18]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': np.random.randn(5), 'b': np.random.randn(5), 'c': np.random.randn(5)})
df
Out[18]:
          a         b         c
0 -1.429509 -0.652116  0.515545
1  0.563148 -0.536554 -1.316155
2  1.310768 -3.041681 -0.704776
3 -1.403204  1.083727 -0.117787
4 -0.040952  0.108155 -0.092292
In [19]:
df.rename(columns={list(df)[1]:'col1_new_name'}, inplace=True)
df
Out[19]:
          a  col1_new_name         c
0 -1.429509      -0.652116  0.515545
1  0.563148      -0.536554 -1.316155
2  1.310768      -3.041681 -0.704776
3 -1.403204       1.083727 -0.117787
4 -0.040952       0.108155 -0.092292
It is probably more readable to index into the dataframe columns attribute:
df.rename(columns={df.columns[1]:'col1_new_name'}, inplace=True)
So for you:
drugInfo.rename(columns = {drugInfo.columns[1]: 'col_1_new_name'}, inplace = True)
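One caveat, offered as a sketch rather than as part of the answer above: rename matches by label, so if the label found at position 1 also appears on another column, every column carrying that label gets renamed. The helper below (rename_column_at is a hypothetical name, not a pandas function) rebuilds the list of labels instead, so only the chosen position changes.

```python
import pandas as pd

def rename_column_at(df, position, new_name):
    """Return a copy of df with the column at `position` renamed.

    Hypothetical helper for illustration, not part of pandas.
    """
    labels = list(df.columns)      # plain Python list of the current labels
    labels[position] = new_name    # change only the label at this position
    out = df.copy()
    out.columns = labels           # assigning a list builds a fresh Index
    return out

# A frame with a duplicated label 'b' (made-up data)
df = pd.DataFrame([[1, 2, 3]], columns=['a', 'b', 'b'])

by_label = df.rename(columns={df.columns[1]: 'x'})   # renames both 'b' columns
by_position = rename_column_at(df, 1, 'x')           # renames only position 1
```

When all column labels are unique, the two approaches give the same result, and the one-line rename is the simpler choice.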
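To change a column name by position, one could alter the underlying array of df.columns directly. Note that this writes into an Index, which pandas otherwise treats as immutable, so it may not behave the same way across versions: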
df.columns.array[1] = 'col_1_new_name'
# or
df.columns.values[1] = 'col_1_new_name'
# or
df.columns.to_numpy()[1] = 'col_1_new_name'
Each of these renames the second column in place, without referencing its current name.
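A small sketch of the before and after, with illustrative column names A, B and C. It assigns a new list of labels to df.columns instead of writing into the underlying array; that is a swapped-in technique, chosen here because assignment does not depend on the Index's array being writable.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
before = list(df.columns)          # labels before the change

labels = df.columns.tolist()       # copy the labels into a plain list
labels[1] = 'col_1_new_name'       # replace the label at position 1
df.columns = labels                # rebinds df.columns to a new Index

after = list(df.columns)           # labels after the change
```

The old name B is never mentioned in the renaming step itself; only its position is used.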
However, if a new DataFrame copy needs to be returned, the rename method is the way to go (as suggested by EdChum):
df1 = df.rename(columns={list(df)[1]: 'col_1_new_name'})
If df has many columns, instead of list(df), it might be worth it to call islice() from the standard itertools library to efficiently select a column label (e.g. the second column name):
from itertools import islice
df1 = df.rename(columns={next(islice(df, 1, 2)): 'col_1_new_name'})
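The islice() call works because iterating over a DataFrame yields its column labels in order. A quick check, using made-up column names, that it picks the same label as df.columns[1]:

```python
from itertools import islice

import pandas as pd

df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})

# islice(df, 1, 2) lazily skips one label and yields the next one,
# without first building the full list of labels the way list(df) does.
second_label = next(islice(df, 1, 2))

# rename returns a new DataFrame; df itself keeps its original labels
df1 = df.rename(columns={second_label: 'col_1_new_name'})
```

For a frame with only a handful of columns, df.columns[1] is simpler and also avoids building a list.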