Unable to write MultiIndex DataFrame to Excel with merged cells

Question:

I have a DataFrame df that looks like this:

Date        ConstraintType  Col1    Col2
2020-07-15  N-S             w1      521133
2020-07-15  N-S             w2      550260
2020-07-15  CSD             d1      522417
2020-07-15  CSD             d2      534542
2020-07-15  A               d4      534905
2020-07-15  B               d5      534904

The index of the DataFrame is:

df.index
Out[6]: 
MultiIndex([('2020-07-15',  'N-S'),
            ('2020-07-15',  'N-S'),
            ('2020-07-15',  'CSD'),
            ('2020-07-15',  'CSD'),
            ('2020-07-15', 'A'),
            ('2020-07-15', 'B')],
           names=['Date', 'ConstraintType'])

But when I write it to Excel, it appears as follows:

[image: screenshot of the actual Excel output]

I was expecting the following:

[image: screenshot of the expected Excel output]

I am using the following code:

df.to_excel(r'C:\Users\ram\Desktop\z1.xlsx', merge_cells=True)
Asked By: Zanam


Answers:

From the provided DataFrame:

  1. Use .reset_index() to move the index levels back into ordinary columns.
  2. Use .where and .shift to blank out every value in the Date and ConstraintType columns except the first occurrence in each run of repeated values.
  3. Finally, use .set_index to put both columns back on the index, now with each value appearing only once per run, and write with to_excel. With this index, merge_cells=True should work.

code:

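# Move the index levels back into ordinary columns
df = df.reset_index()
# Keep a value only where it differs from the row above; blank out repeats
df['Date'] = df['Date'].where(df['Date'] != df['Date'].shift(), '')
df['ConstraintType'] = df['ConstraintType'].where(df['ConstraintType'] != df['ConstraintType'].shift(), '')
# Restore the two-level index and write to Excel
df = df.set_index(['Date', 'ConstraintType'])
df.to_excel(r'C:\Users\ram\Desktop\z1.xlsx', merge_cells=True)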

Excel output:

[image: screenshot of the resulting Excel output]
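
The three steps above can also be wrapped in a small helper. This is only a sketch: the function name blank_repeated_labels is hypothetical (it is not part of pandas or of the answer), and the sample frame is rebuilt from the data in the question. It stops before the to_excel call so that it runs without a file path.

```python
import pandas as pd


def blank_repeated_labels(df, columns):
    """Return a copy of df where, in each listed column, a value equal to
    the value in the row directly above is replaced by an empty string.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    for col in columns:
        # True where the value repeats the previous row's value
        is_repeat = df[col].eq(df[col].shift())
        out[col] = df[col].where(~is_repeat, '')
    return out


# Sample data rebuilt from the question
index = pd.MultiIndex.from_tuples(
    [("2020-07-15", "N-S"), ("2020-07-15", "N-S"),
     ("2020-07-15", "CSD"), ("2020-07-15", "CSD"),
     ("2020-07-15", "A"), ("2020-07-15", "B")],
    names=["Date", "ConstraintType"],
)
df = pd.DataFrame(
    {"Col1": ["w1", "w2", "d1", "d2", "d4", "d5"],
     "Col2": [521133, 550260, 522417, 534542, 534905, 534904]},
    index=index,
)

# Steps 1 and 2: flatten the index, then blank the repeats
flat = blank_repeated_labels(df.reset_index(), ["Date", "ConstraintType"])
# Step 3: put the cleaned columns back on the index
result = flat.set_index(["Date", "ConstraintType"])
print(result)
```

Note that, like the code above, this blanks each column independently. If the same ConstraintType value happened to continue across a change of Date, it would still be blanked, so data with several dates may need an extra check.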

Answered By: David Erickson

In pandas, the innermost index level must label every row.
Repeated values in the innermost level therefore have to be handled manually, as shown in David Erickson's answer. Pandas automatically hides repeated values only in the outer levels; see the example below:

import pandas as pd

tuples = [["2020-07-15", "N-S"],
          ["2020-07-15", "N-S"],
          ["2020-07-15", "CSD"],
          ["2020-07-15", "CSD"],
          ["2020-07-15", "A"],
          ["2020-07-15", "B"]
         ]

index = pd.MultiIndex.from_tuples(tuples, names=['Date', 'ConstraintType'])

df = pd.DataFrame([
    ["w1", 521133],
    ["w2", 550260],
    ["d1", 522417],
    ["d2", 534542],
    ["d4", 534905],
    ["d5", 534904],
], columns=["Col1", "Col2"],
   index=index
)

print(df, '\n'*2)
print(df.swaplevel(0,1))

Returns:

                          Col1    Col2
Date       ConstraintType             
2020-07-15 N-S              w1  521133
           N-S              w2  550260
           CSD              d1  522417
           CSD              d2  534542
           A                d4  534905
           B                d5  534904


                          Col1    Col2
ConstraintType Date                   
N-S            2020-07-15   w1  521133
               2020-07-15   w2  550260
CSD            2020-07-15   d1  522417
               2020-07-15   d2  534542
A              2020-07-15   d4  534905
B              2020-07-15   d5  534904
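
The hiding shown above is a display feature only; the labels are still stored on every row. A minimal sketch that makes this visible, assuming the same sample frame as above and using the sparsify argument of DataFrame.to_string:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("2020-07-15", "N-S"), ("2020-07-15", "N-S"),
     ("2020-07-15", "CSD"), ("2020-07-15", "CSD"),
     ("2020-07-15", "A"), ("2020-07-15", "B")],
    names=["Date", "ConstraintType"],
)
df = pd.DataFrame(
    {"Col1": ["w1", "w2", "d1", "d2", "d4", "d5"],
     "Col2": [521133, 550260, 522417, 534542, 534905, 534904]},
    index=index,
)

# Sparsified rendering: repeated outer labels are hidden
sparse_text = df.to_string(sparsify=True)
# Full rendering: every index label is printed on every row
full_text = df.to_string(sparsify=False)
print(full_text)

# The underlying index always holds one Date label per row
dates = df.index.get_level_values("Date")
print(len(dates))
```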

Reset the index, blank out the repeated values in the former index columns, then save to Excel without needing to set the merge_cells option:

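# Turn the index levels into ordinary columns
df = df.reset_index(drop=False)
# Blank out each ConstraintType that repeats the row above
row_filt = df['ConstraintType'].eq(df['ConstraintType'].shift())
df.loc[row_filt, 'ConstraintType'] = ''
# Blank out each Date that repeats the row above
row_filt = df['Date'].eq(df['Date'].shift())
df.loc[row_filt, 'Date'] = ''

df.to_excel(r'C:\Users\ram\Desktop\z1.xlsx')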

Produces the following Excel:

[image: screenshot of the resulting Excel output]

Answered By: Gustav Rasmussen