Python Pandas DataFrame: How to add an additional index / column to existing data
Question:
I want to add an extra index level (an extra column) to my Excel sheet, which already contains data, using a Pandas DataFrame. This is what I mean:
Picture 1 (What my code outputs):
Picture 2 (What I WANT my code to output):
Here’s my code for picture 1:
import pandas as pd
# Create a Pandas dataframe from the data.
df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Is there a way to do this?
Answers:
You can use pd.MultiIndex.from_arrays, which pairs each existing index label with one new label (so new_idx needs one entry per row):
new_idx = pd.Index(['data_type_1', 'data_type_2'])
out = df.set_index(pd.MultiIndex.from_arrays([df.index, new_idx]))
out.to_excel('pandas_simple.xlsx')
print(out)
# Output
                  col 1 col 2
row 1 data_type_1     a     b
row 2 data_type_2     c     d
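As a hedged alternative to building the MultiIndex by hand, the same pairwise result can be sketched with set_index(..., append=True), which keeps the existing index and adds the new labels as a second level. The level names 'row' and 'data type' below are hypothetical, chosen only so the extra index columns get header labels; they are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

# Hypothetical level names, used only for illustration.
named = df.rename_axis('row')
new_level = pd.Index(['data_type_1', 'data_type_2'], name='data type')

# append=True keeps the existing index and adds new_level as a second level.
# new_level must have exactly one label per row of the DataFrame.
out = named.set_index(new_level, append=True)

print(out.index.nlevels)
print(list(out.index.names))

# Writing to Excel works as in the answer above; it needs an Excel engine
# such as xlsxwriter or openpyxl to be installed.
# out.to_excel('pandas_simple.xlsx')
```

Each index level is written to its own column in the sheet, so naming the levels is optional but makes the resulting header row easier to read.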
Update
If you instead want every existing row repeated once for each new label, you may prefer pd.MultiIndex.from_product:
new_idx = pd.Index(['data_type_1', 'data_type_2'])
mi = pd.MultiIndex.from_product([df.index, new_idx])
out = df.reindex(df.index.repeat(len(new_idx))).set_index(mi)
print(out)
# Output
                  col 1 col 2
row 1 data_type_1     a     b
      data_type_2     a     b
row 2 data_type_1     c     d
      data_type_2     c     d
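To make the repeated layout easier to check, here is a minimal sketch that rebuilds the frame, confirms there is one row per (original row, label) pair, and pulls out the rows for a single label with xs. The variable name subset is illustrative only, and the labels follow the snippet above.

```python
import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

new_idx = pd.Index(['data_type_1', 'data_type_2'])

# Cartesian product: every original row label paired with every new label.
mi = pd.MultiIndex.from_product([df.index, new_idx])

# Repeat each row once per new label, then attach the two-level index.
out = df.reindex(df.index.repeat(len(new_idx))).set_index(mi)

# One output row per (original row, label) pair.
print(len(out) == len(df) * len(new_idx))

# Select the rows for one label from the second level (level=1);
# xs drops that level, leaving the original row labels as the index.
subset = out.xs('data_type_1', level=1)
print(subset)
```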