Pandas adding a row into a DataFrame
Question:
I am writing a little function that incrementally adds rows to a DataFrame using pandas.
The core goes like this:
DB = pd.DataFrame(columns=['MAN','MOD','YEAR','TYPE'])
for id, row in other_dataFrame.iterrows():
    DB = pd.concat([self.loadedDB, row.to_frame().T])
Using print, I see that the initial DB looks like this:
Empty DataFrame
Columns: [MAN, MOD, YEAR, TYPE]
Index: []
while row may look like this:
MAN     Aixam
MOD       400
YEAR     1930
TYPE      NaN
Name: 0, dtype: object
Then, after the loop, the resulting DB looks like this:
         MAN  MOD  YEAR  TYPE             MOD  YEAR  TYPE
0      Aixam  NaN   NaN   NaN             400  1930   NaN
1        BMW  NaN   NaN   NaN              I3  1930   NaN
2    Bollore  NaN   NaN   NaN         Bluecar  1930   NaN
3        BYD  NaN   NaN   NaN              e6  1930   NaN
4      Buddy  NaN   NaN   NaN             Cab  1930   NaN
5      Chery  NaN   NaN   NaN             QQ3  1930   NaN
6  Chevrolet  NaN   NaN   NaN        Spark EV  1930   NaN
7    Dynasty  NaN   NaN   NaN              IT  1930   NaN
8       Ford  NaN   NaN   NaN  Focus Electric  1930   NaN
...
while I would, of course, like to have it in this format:
         MAN             MOD  YEAR  TYPE
0      Aixam             400  1930   NaN
1        BMW              I3  1930   NaN
2    Bollore         Bluecar  1930   NaN
3        BYD              e6  1930   NaN
4      Buddy             Cab  1930   NaN
5      Chery             QQ3  1930   NaN
6  Chevrolet        Spark EV  1930   NaN
7    Dynasty              IT  1930   NaN
8       Ford  Focus Electric  1930   NaN
...
Can anyone please tell me what I am doing wrong? This is the first time I have used pandas, so the answer may be really simple, but I can't find it. Thank you.
Answers:
This should work:
DB = pd.DataFrame(columns=['MAN','MOD','YEAR','TYPE'])
for id, row in other_dataFrame.iterrows():
    # len(DB) is the next unused integer label, so this adds a new row at the end
    DB.loc[len(DB)] = [row['MAN'], row['MOD'], row['YEAR'], row['TYPE']]
This just appends a list to the DataFrame at its last position. There may be an easier way to convert the row object to a list, but I couldn't find it.
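On converting the row to a list: a pandas Series has a tolist() method, so the row can be assigned without naming each column. A minimal sketch, assuming other_dataFrame has the same four columns in the same order as DB (the sample data below is made up from the values in the question):

```python
import pandas as pd

# Hypothetical stand-in for other_dataFrame, using values from the question.
other_dataFrame = pd.DataFrame({
    'MAN': ['Aixam', 'BMW'],
    'MOD': ['400', 'I3'],
    'YEAR': [1930, 1930],
    'TYPE': [float('nan'), float('nan')],
})

DB = pd.DataFrame(columns=['MAN', 'MOD', 'YEAR', 'TYPE'])
for idx, row in other_dataFrame.iterrows():
    # row is a Series; tolist() returns its values in column order
    DB.loc[len(DB)] = row.tolist()

print(DB)
```

Note that this relies on column order matching. Growing a DataFrame one row at a time also copies data repeatedly, so for large inputs it is usually faster to collect the rows first and build the frame once.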
Your original code should work. Try printing self.loadedDB.columns and row.to_frame().T.columns to check whether there is any whitespace in the column names, which would cause them to be concatenated as separate columns.
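To illustrate the whitespace hypothesis: pd.concat aligns columns by exact label, so 'MOD' and ' MOD' are treated as different columns. The sketch below uses a made-up row whose labels carry a stray leading space; whether that is the actual cause in the question is an assumption, not something the question confirms.

```python
import pandas as pd

DB = pd.DataFrame(columns=['MAN', 'MOD', 'YEAR', 'TYPE'])

# Hypothetical row: three of the labels have a stray leading space.
row = pd.Series(
    {'MAN': 'Aixam', ' MOD': '400', ' YEAR': 1930, ' TYPE': None},
    name=0,
)
row_frame = row.to_frame().T

# Labels do not match exactly, so concat keeps them as separate columns.
bad = pd.concat([DB, row_frame])
print(list(bad.columns))

# Strip whitespace from the labels before concatenating.
row_frame.columns = row_frame.columns.str.strip()
good = pd.concat([DB, row_frame])
print(list(good.columns))
```

Printing the columns as a list shows the quotes around each label, which makes stray spaces visible where the default DataFrame display hides them.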