How to add a new row and new column to a multiindex Pandas dataframe?
Question:
I am trying to use .loc to create a new row and a new column in a MultiIndex pandas DataFrame by specifying both axes. The problem is that it creates the new index entry without the new column, and at the same time throws an obscure KeyError: 6.
How can I do this? A one-line solution would be much appreciated.
> df
side total value
city code type
NaN NTE urban ouest 0.01949 391.501656
> df.loc[(np.nan, 'NTE', 'rural'), 'population'] = 1000
KeyError: 6
> df
side total value
city code type
NaN NTE urban ouest 0.01949 391.501656
NaN NTE rural NaN NaN NaN
Now, when I try the same command again it complains the index doesn’t exist.
> df.loc[(np.nan, 'NTE', 'rural'), 'population'] = 1000
KeyError: (nan, 'NTE', 'rural')
The desired output would be this dataframe:
side total value population
city code type
NaN NTE urban ouest 0.01949 391.501656 NaN
NaN NTE rural NaN NaN NaN 1000
Answers:
The problem here is the missing values in the index. A possible workaround is to assign the row using an empty string as a placeholder and then rename the placeholder to NaN:
df.loc[('', 'NTE', 'rural'), 'population'] = 1000
print (df.index)
MultiIndex([(nan, 'NTE', 'urban'),
( '', 'NTE', 'rural')],
names=['city', 'code', 'type'])
df = df.rename({'':np.nan}, level=0)
print (df.index)
MultiIndex([(nan, 'NTE', 'urban'),
(nan, 'NTE', 'rural')],
names=['city', 'code', 'type'])
print (df)
side total value population
city code type
NaN NTE urban ouest 0.01949 391.501656 NaN
rural NaN NaN NaN 1000.0
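If the placeholder trick feels too fragile, an alternative that avoids .loc enlargement on a NaN index label altogether is to move the index into ordinary columns, append the row with pd.concat, and rebuild the index. This is a different technique from the rename hack above, offered only as a minimal sketch: the frame is reconstructed from the values shown in the question, and the names index_cols, new_row and result are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (values copied from the printed output).
index_cols = ["city", "code", "type"]
df = pd.DataFrame(
    {
        "city": [np.nan],
        "code": ["NTE"],
        "type": ["urban"],
        "side": ["ouest"],
        "total": [0.01949],
        "value": [391.501656],
    }
).set_index(index_cols)

# The new row carries its index labels as plain columns, plus the new column.
new_row = pd.DataFrame(
    {"city": [np.nan], "code": ["NTE"], "type": ["rural"], "population": [1000]}
)

# Flatten the index, append, then restore the MultiIndex.
# Columns missing on either side are filled with NaN by concat.
result = pd.concat([df.reset_index(), new_row], ignore_index=True).set_index(index_cols)

print(result)
```

Because the missing cells are filled with NaN, the population column ends up as float, as in the output above. The concat and set_index calls can be chained into a single statement if a one-liner is wanted.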