Autoincrementing option for Pandas DataFrame index
Question:
Is there a way to set an option for auto-incrementing the index of pandas.DataFrame when adding new rows, or to define a function for managing creation of new indices?
Answers:
You can set ignore_index=True when appending:
In [1]: df = pd.DataFrame([[1,2],[3,4]])
In [2]: row = pd.Series([5,6])
In [3]: df.append(row, ignore_index=True)
Out[3]:
   0  1
0  1  2
1  3  4
2  5  6
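DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the session above only runs on older versions. A minimal sketch of the same idea with pd.concat follows; the variable name result is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]])
row = pd.Series([5, 6])

# to_frame() turns the Series into a single column; .T transposes it
# into a single row whose columns line up with df's columns (0 and 1).
# ignore_index=True renumbers the combined rows starting from 0.
result = pd.concat([df, row.to_frame().T], ignore_index=True)
print(result)
```

As with the original append call, this returns a new DataFrame and leaves df unchanged.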
Note that the accepted answer is dangerous if your existing index is meaningful, because ignore_index=True discards the index and renumbers the rows from 0. For instance:
df = pd.DataFrame(
    [('Alice', 1010, 'sales'), ('Bob', 1011, 'service')],
    columns=['name', 'emp_id', 'dept']
).set_index('emp_id')
# here's a new employee to append, who has no id:
row = pd.Series({'name': 'Eve', 'dept': 'r&d'})
# this will wipe all the existing employee id numbers:
df.append(row, ignore_index=True)
One way around this would be to manually increment the index:
def add_new_row(df, row):
    row.name = max(df.index)+1
    return df.append(row)
# the existing ids are now preserved:
add_new_row(df, row)
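On pandas 2.0 and later, where DataFrame.append no longer exists, the same helper could be sketched with pd.concat. This is an adaptation under stated assumptions rather than the original answer's code: it assumes a non-empty DataFrame with an integer index, and the names new_id, new_row and out are introduced here for illustration.

```python
import pandas as pd

def add_new_row(df, row):
    # Assumption: df is non-empty and its index holds integers,
    # so "largest existing id plus one" is a sensible next id.
    new_id = df.index.max() + 1
    # Turn the Series into a one-row frame and label that row with the new id.
    new_row = row.to_frame().T
    new_row.index = pd.Index([new_id], name=df.index.name)
    # No ignore_index here, so the existing ids are kept as they are.
    return pd.concat([df, new_row])

df = pd.DataFrame(
    [('Alice', 1010, 'sales'), ('Bob', 1011, 'service')],
    columns=['name', 'emp_id', 'dept'],
).set_index('emp_id')
row = pd.Series({'name': 'Eve', 'dept': 'r&d'})

out = add_new_row(df, row)
print(out)
```

Unlike the version above, this sketch does not modify the row passed in by the caller, since it sets the label on a copy.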