Can I assign a reset index a name?
Question:
Normally, when a DataFrame undergoes reset_index(), the new column is assigned the name index or level_i, depending on the level.
Is it possible to assign the new column a name?
Answers:
You can call rename on the DataFrame returned from reset_index:
In [145]:
# create a df
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(5,3))
df
Out[145]:
          0         1         2
0 -2.845811 -0.182439 -0.526785
1 -0.112547  0.661461  0.558452
2  0.587060 -1.232262 -0.997973
3 -1.009378 -0.062442  0.125875
4 -1.129376  3.282447 -0.403731
Set the index name
In [146]:
df.index = df.index.set_names(['foo'])
df
Out[146]:
            0         1         2
foo
0   -2.845811 -0.182439 -0.526785
1   -0.112547  0.661461  0.558452
2    0.587060 -1.232262 -0.997973
3   -1.009378 -0.062442  0.125875
4   -1.129376  3.282447 -0.403731
Call reset_index and chain with rename:
In [147]:
df.reset_index().rename(columns={df.index.name:'bar'})
Out[147]:
  bar         0         1         2
0   0 -2.845811 -0.182439 -0.526785
1   1 -0.112547  0.661461  0.558452
2   2  0.587060 -1.232262 -0.997973
3   3 -1.009378 -0.062442  0.125875
4   4 -1.129376  3.282447 -0.403731
Thanks to @ayhan: alternatively, you can use rename_axis to rename the index prior to reset_index:
In [149]:
df.rename_axis('bar').reset_index()
Out[149]:
  bar         0         1         2
0   0 -2.845811 -0.182439 -0.526785
1   1 -0.112547  0.661461  0.558452
2   2  0.587060 -1.232262 -0.997973
3   3 -1.009378 -0.062442  0.125875
4   4 -1.129376  3.282447 -0.403731
Or just overwrite the index name directly first:
df.index.name = 'bar'
and then call reset_index.
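The three approaches above should give the same result. Here is a minimal, self-contained sketch to check that; the small frame and its column names x and y are made up for illustration. One thing to watch with the rename variant: as far as I know, rename silently ignores keys that are not present, so when the index has no name you need to map 'index' (the default column name) rather than df.index.name, which would be None.

```python
import numpy as np
import pandas as pd

# Illustrative frame with an unnamed default index
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])

# 1. reset first, then rename the default 'index' column
a = df.reset_index().rename(columns={"index": "bar"})

# 2. name the index on a renamed copy, then reset
b = df.rename_axis("bar").reset_index()

# 3. set the name on the index itself (this modifies df), then reset
df.index.name = "bar"
c = df.reset_index()

print(a.columns.tolist())           # ['bar', 'x', 'y']
print(a.equals(b) and b.equals(c))  # True
```

Variants 1 and 2 leave df untouched, which makes them easier to use in a method chain; variant 3 changes the index name on df itself.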
For a Series, you can specify the name of the values column directly with the name parameter. E.g.:
>>> df.groupby('s1').size().reset_index(name='new_name')
  s1  new_name
0  b         1
1  r         1
2  s         1
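The snippet above does not show how df was built, so here is a self-contained sketch with made-up data (the column v and the values in s1 are assumptions). Note that name applies to the column holding the Series values; the former index keeps its own name, s1 here.

```python
import pandas as pd

# Hypothetical data: 's1' is the grouping key, 'v' is filler
df = pd.DataFrame({"s1": ["b", "r", "s", "s"], "v": [1, 2, 3, 4]})

counts = df.groupby("s1").size()           # Series indexed by s1, no name
out = counts.reset_index(name="new_name")  # values column gets 'new_name'

print(out.columns.tolist())  # ['s1', 'new_name']
```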
You could do this (as of Jan 2020), provided the index is unnamed so that the new column gets the default name index:
df = df.reset_index().rename(columns={'index': 'bar'})
print(df)
  bar         0         1         2
0   0 -2.845811 -0.182439 -0.526785
1   1 -0.112547  0.661461  0.558452
2   2  0.587060 -1.232262 -0.997973
3   3 -1.009378 -0.062442  0.125875
4   4 -1.129376  3.282447 -0.403731
If you’re using reset_index() to go from a Series to a DataFrame, you can name the column like this:
my_series.rename('Example').reset_index()
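A small sketch of what that does, using a made-up Series: rename sets the name of the values column, while the index column still falls back to index. If you also want to name the index column, chaining rename_axis is one option (the name key is just an example).

```python
import pandas as pd

my_series = pd.Series([10, 20], index=["a", "b"])  # illustrative data

# Only the values column is named; the index column defaults to 'index'
values_named = my_series.rename("Example").reset_index()

# Name both columns in one chain
both_named = my_series.rename("Example").rename_axis("key").reset_index()

print(values_named.columns.tolist())  # ['index', 'Example']
print(both_named.columns.tolist())    # ['key', 'Example']
```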
If you are seeking one-liners that return a new DataFrame, use assign. Here’s an example:
>>> df = pd.DataFrame({"a": [4.4, 2.2], "b": 8}, index=[10, 20])
>>> df
      a  b
10  4.4  8
20  2.2  8
Assign a bar Series with the index values, but keep the original index:
>>> df.assign(bar=df.index)
      a  b  bar
10  4.4  8   10
20  2.2  8   20
Similar, but drop the index:
>>> df.assign(bar=df.index).reset_index(drop=True)
     a  b  bar
0  4.4  8   10
1  2.2  8   20
You can try the names parameter of DataFrame.reset_index(), introduced in pandas version 1.5.0:
names: int, str or 1-dimensional list, default None
    Using the given string, rename the DataFrame column which contains the index data. If the DataFrame has a MultiIndex, this has to be a list or tuple with length equal to the number of levels.
For a single index level DataFrame:
         class  max_speed                                          name   class  max_speed
falcon    bird      389.0   df.reset_index(names='name')      0  falcon    bird      389.0
parrot    bird       24.0   ------------------------------>   1  parrot    bird       24.0
lion    mammal       80.5                                     2    lion  mammal       80.5
monkey  mammal        NaN                                     3  monkey  mammal        NaN
For a multiple index level DataFrame:
               speed species                                                      classes   names  speed species
                 max    type                                                                         max    type
class  name                                                                    0     bird  falcon  389.0     fly
bird   falcon  389.0     fly   df.reset_index(names=['classes', 'names'])      1     bird  parrot   24.0     fly
       parrot   24.0     fly   -------------------------------------------->   2   mammal    lion   80.5     run
mammal lion     80.5     run                                                   3   mammal  monkey    NaN    jump
       monkey    NaN    jump
If you only care about one level of a MultiIndex:
               speed species                                                               classes  speed species
                 max    type                                                                          max    type
class  name                    df.reset_index(level='class', names='classes')      name
bird   falcon  389.0     fly   ------------------------------------------------>   falcon     bird  389.0     fly
       parrot   24.0     fly                                                       parrot     bird   24.0     fly
mammal lion     80.5     run                                                       lion     mammal   80.5     run
       monkey    NaN    jump                                                       monkey   mammal    NaN    jump
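For reference, the MultiIndex case can be reproduced with a sketch like the following. The frame is rebuilt by hand from the values shown in the diagrams, and the code assumes pandas 1.5.0 or later, since names does not exist before that.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("bird", "falcon"), ("bird", "parrot"), ("mammal", "lion"), ("mammal", "monkey")],
    names=["class", "name"],
)
columns = pd.MultiIndex.from_tuples([("speed", "max"), ("species", "type")])
df = pd.DataFrame(
    [(389.0, "fly"), (24.0, "fly"), (80.5, "run"), (np.nan, "jump")],
    index=index,
    columns=columns,
)

# One new name per index level, in level order (assumes pandas >= 1.5.0)
out = df.reset_index(names=["classes", "names"])

print(out.columns.get_level_values(0).tolist())
# ['classes', 'names', 'speed', 'species']
```

Because the columns are themselves a MultiIndex, the new names land in the top column level and the lower level is left as an empty string.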
NOTE: reset_index() does not operate in place by default; you need to assign the result back or use inplace=True, for example:
df = df.reset_index(names='name')
# or
df.reset_index(names='name', inplace=True)
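If the code has to run on pandas versions both older and newer than 1.5.0, one possible approach is a small wrapper that falls back to rename_axis. This is only a sketch: reset_index_as is a hypothetical helper name, it handles a single-level index only, and it relies on older pandas raising TypeError for the unknown names keyword.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [("bird", 389.0), ("bird", 24.0), ("mammal", 80.5), ("mammal", np.nan)],
    index=["falcon", "parrot", "lion", "monkey"],
    columns=["class", "max_speed"],
)

def reset_index_as(frame, name):
    """Move a single-level index into a column called `name` (hypothetical helper)."""
    try:
        return frame.reset_index(names=name)          # pandas >= 1.5.0
    except TypeError:
        return frame.rename_axis(name).reset_index()  # older pandas

out = reset_index_as(df, "name")

print(out.columns.tolist())  # ['name', 'class', 'max_speed']
print(df.columns.tolist())   # ['class', 'max_speed'], df itself is unchanged
```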