ExcelWriter using openpyxl engine ignoring date_format parameter

Question:

I have read quite a few answers on this, but when I run my code I don’t get the same result.

I am using pandas 2.0.0 and openpyxl 3.1.2 on Python 3.9.

This is a reduced example of my issue, which is that I can’t get the ExcelWriter to respect my choice of date format. I am trying to append a new sheet to an existing Excel .xlsx file.

import pandas as pd
import datetime

filePath = r'c:\temp\myfile.xlsx'  # raw string, so '\t' is not read as a tab

writer = pd.ExcelWriter(filePath, mode='a', engine='openpyxl', if_sheet_exists='replace', date_format='DD/MM/YYYY')

df = pd.DataFrame([datetime.date(2023,4,7)],columns=['Date'])
df.to_excel(writer,sheet_name='Data')
writer.close()

The result in Excel is this:

[screenshot of the resulting Excel sheet; image not available]

I have explicitly set the type of the value in the dataframe to be datetime.date. I have tried using datetime_format or indeed both together but to no avail.

I have also tried xlsxwriter but it seems this engine does not allow appending to an existing workbook.

Asked By: DS_London


Answers:

This is what openpyxl has to say about Dates and Times:

Dates and times can be stored in two distinct ways in XLSX files: as
an ISO 8601 formatted string or as a single number. openpyxl supports
both representations and translates between them and Python’s datetime
module representations when reading from and writing to files. In
either representation, the maximum date and time precision in XLSX
files is millisecond precision.

What you could do is take your datetime data, convert it to a string to preserve the format, and write the string to Excel.
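
A minimal sketch of that suggestion, assuming the same one-row DataFrame as in the question. The name df_text is only illustrative, and the strftime pattern "%d/%m/%Y" is an assumption about the day/month/year layout wanted:

```python
import datetime

import pandas as pd

df = pd.DataFrame([datetime.date(2023, 4, 7)], columns=["Date"])

# Render each date as text in day/month/year order.
# The column then holds plain strings rather than date objects.
df_text = df.assign(Date=df["Date"].map(lambda d: d.strftime("%d/%m/%Y")))

# df_text.to_excel(writer, sheet_name="Data") would then write the text as-is.
```

The trade-off is that Excel receives text rather than a date value, so the column can no longer be sorted, filtered, or reformatted as a date inside the workbook.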

Answered By: Marcelo Paco

This appears to be a bug in the implementation of OpenpyxlWriter:

class OpenpyxlWriter(ExcelWriter):
    _engine = "openpyxl"
    _supported_extensions = (".xlsx", ".xlsm")

    def __init__(
        self,
        path: FilePath | WriteExcelBuffer | ExcelWriter,
        engine: str | None = None,
        date_format: str | None = None,
        datetime_format: str | None = None,
        mode: str = "w",
        storage_options: StorageOptions = None,
        if_sheet_exists: str | None = None,
        engine_kwargs: dict[str, Any] | None = None,
        **kwargs,
    ) -> None:
        # Use the openpyxl module as the Excel writer.
        from openpyxl.workbook import Workbook

        engine_kwargs = combine_kwargs(engine_kwargs, kwargs)

        super().__init__(
            path,
            mode=mode,
            storage_options=storage_options,
            if_sheet_exists=if_sheet_exists,
            engine_kwargs=engine_kwargs,
        )

To fix, add

date_format=date_format, 
datetime_format=datetime_format,

to the super().__init__() call:

        super().__init__(
            path,
            date_format=date_format, 
            datetime_format=datetime_format,
            mode=mode,
            storage_options=storage_options,
            if_sheet_exists=if_sheet_exists,
            engine_kwargs=engine_kwargs,
        )
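
If patching pandas is not an option, one possible workaround is to set the number format on the cells yourself after to_excel has written them, while the writer is still open. This is only a sketch: apply_date_number_format is a hypothetical helper, not part of pandas or openpyxl, and an in-memory buffer stands in for the real file so the example is self-contained.

```python
import datetime
import io

import pandas as pd


def apply_date_number_format(worksheet, number_format="DD/MM/YYYY"):
    """Hypothetical helper: set number_format on every date/datetime cell.

    Returns the number of cells that were changed.
    """
    changed = 0
    for row in worksheet.iter_rows():
        for cell in row:
            if isinstance(cell.value, (datetime.date, datetime.datetime)):
                cell.number_format = number_format
                changed += 1
    return changed


df = pd.DataFrame([datetime.date(2023, 4, 7)], columns=["Date"])

buffer = io.BytesIO()  # in-memory stand-in for the .xlsx file
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Data")
    # writer.sheets maps sheet names to openpyxl worksheets
    changed = apply_date_number_format(writer.sheets["Data"])
    # With the default index column in A, the date lands in B2
    applied_format = writer.sheets["Data"]["B2"].number_format
```

The same helper should work with mode='a' and if_sheet_exists='replace' on a real file path, as in the question, since it only touches the worksheet object that the writer exposes.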

Answered By: BigBen