Overwriting Excel columns while keeping format using pandas

Question:

I’m working with an xlsx-file which looks like this:

[Image: screenshot of the xlsx file]

My previous task was to modify the columns named ‘Entry 1’ and ‘Entry 2’. I have stored those columns in a separate slice of the original dataframe for a better overview. Here is a quick look at the slice:

>>> slice = df.loc[:, 'Entry 1':'Entry 2']
# code to modify the values
>>> slice

    Entry 1     Entry 2
1   Modified 1  Value 1
2   Modified 2  Value 2
3   Modified 3  Value 3 

I now want to overwrite those columns in the original dataframe with this slice. I already achieved this by using the following:

df.loc[:, 'Entry 1':'Entry 2'] = slice

Question

As you can see, the column headers have a special format. How do I overwrite the values in ‘Entry 1’ and ‘Entry 2’ in the xlsx file, excluding the header, so that the format is kept?

Asked By: Mowgli

Answers:

Full disclosure: I’m the author of the suggested library

Unfortunately, there is no out-of-the-box way in pandas to achieve that, as it does not load the styling data. You can use StyleFrame (which wraps pandas and openpyxl, which I assume you already have installed); it can read xlsx files while keeping most of the styling elements.

Using it in this case may look like the following:

from StyleFrame import StyleFrame

sf = StyleFrame.read_excel('test.xlsx', read_style=True)
# currently you have to specify each value manually,
# using slices will revert to the default style used by StyleFrame
sf.loc[0, 'Entry 1'].value = 'Modified 1'
sf.loc[1, 'Entry 1'].value = 'Modified 2'
sf.loc[2, 'Entry 1'].value = 'Modified 3'
sf.to_excel('test.xlsx').save()

Another alternative using a loop:

sf = StyleFrame.read_excel('test.xlsx', read_style=True, use_openpyxl_styles=False)
new_values = ['Modified 1', 'Modified 2', 'Modified 3']
for cell, new_value in zip(sf['Entry 1'], new_values):
    cell.value = new_value
sf.to_excel('test.xlsx').save()

Content of test.xlsx before execution:

[Image: test.xlsx before execution]

and after:

[Image: test.xlsx after execution]

Answered By: DeepSpace

Final answer

To give props to a far more extensive solution, which will suit many visitors dropping by, check this.

But for me, this simple approach was enough. All you need to do is write back to the original file, starting at row 1 (pandas counts the first row as row 0, so this skips the header row) and leaving out the header and the index. In my case, this is achieved with the following:

# It is also possible to write the dataframe without the header and index.
df4.to_excel(writer, sheet_name='Sheet1',
             startrow=1, startcol=2, header=False, index=False)
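
To make the zero-based offsets concrete, here is a minimal sketch. It writes a small dataframe to a fresh temporary file (so it does not show format preservation, only cell placement); the file name and the sample data are made up for illustration.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical throwaway file, used only to show where the values land.
path = os.path.join(tempfile.mkdtemp(), "offsets.xlsx")

df = pd.DataFrame({"Entry 1": ["Modified 1", "Modified 2"],
                   "Entry 2": ["Value 1", "Value 2"]})

with pd.ExcelWriter(path, engine="openpyxl") as writer:
    # startrow/startcol are zero-based: row 1 is Excel row 2, column 2 is Excel column C.
    df.to_excel(writer, sheet_name="Sheet1",
                startrow=1, startcol=2, header=False, index=False)

sheet = load_workbook(path)["Sheet1"]
top_left = sheet["C2"].value      # first data value
header_cell = sheet["C1"].value   # nothing was written to the header row
bottom_right = sheet["D3"].value  # last data value
```

With header=False and index=False, only the data values are written, so the existing header row (Excel row 1) is never touched.
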
Answered By: Mowgli

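You can do this by copying the dataframe to the clipboard with df.to_clipboard(index=False) and pasting it into the open workbook through Excel's COM interface (Windows only, requires pywin32):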

from win32com.client import Dispatch
import pandas as pd

xlApp = Dispatch("Excel.Application")
xlApp.Visible = 1
xlApp.Workbooks.Open(r'c:\Chadee\test.xlsx')
# Select the top-left cell as the paste target
xlApp.ActiveSheet.Cells(1, 1).Select()

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
df.to_clipboard(index=False)

xlApp.ActiveWorkbook.ActiveSheet.PasteSpecial()

Output:

Note that the cell colors are still the same

Hope that helps! 🙂

Answered By: Chadee Fouad

I know this is more than you need, but in case others are looking for a way to keep formatting: as of pandas 1.4, ExcelWriter accepts if_sheet_exists='overlay', which writes into the existing sheet without removing its old contents first.

Original Spreadsheet:

[Image: original spreadsheet]

import pandas as pd

df = pd.DataFrame({'Entry1': ['Modified 1', 'Modified 2 ', 'Modified 3'],
                   'Entry2': ['Value 1', 'Value 2','Value 2']})

with pd.ExcelWriter('Original_File.xlsx', engine='openpyxl',
                    mode='a', if_sheet_exists='overlay') as writer:
    df.to_excel(writer, sheet_name='SheetName', startrow=1,
                startcol=2, header=False, index=False)

[Image: After Overlay]

And one can see that this also works if there is formatting in the cell.

[Image: Lots of Formatting]

[Image: Keeps lots of formatting]
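
For anyone who wants to check this behaviour without the screenshots, here is a self-contained sketch. It assumes pandas 1.4 or newer with openpyxl installed; the temporary file, sheet name, and styles are invented for the demonstration. It builds a small styled workbook, overlays new values, and reads the styles back.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font, PatternFill

# Hypothetical demo file with a bold, yellow header and one italic data cell.
path = os.path.join(tempfile.mkdtemp(), "overlay_demo.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["Entry 1", "Entry 2"])
for cell in ws[1]:
    cell.font = Font(bold=True)
    cell.fill = PatternFill(fill_type="solid", start_color="FFFF00")
ws.append(["Old 1", "Value 1"])
ws.append(["Old 2", "Value 2"])
ws["A2"].font = Font(italic=True)
wb.save(path)

new_values = pd.DataFrame({"Entry 1": ["Modified 1", "Modified 2"],
                           "Entry 2": ["Value 1", "Value 2"]})

# Append mode plus 'overlay' writes into the existing sheet in place.
with pd.ExcelWriter(path, engine="openpyxl",
                    mode="a", if_sheet_exists="overlay") as writer:
    new_values.to_excel(writer, sheet_name="Sheet1",
                        startrow=1, startcol=0, header=False, index=False)

check = load_workbook(path)["Sheet1"]
header_text = check["A1"].value
header_bold = check["A1"].font.bold
header_fill = check["A1"].fill.fill_type
first_value = check["A2"].value
first_italic = check["A2"].font.italic
```

Note that header=False matters here: when pandas writes the header itself, it applies its own header style to those cells.
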

Answered By: John Anderson

So, I want to answer this with my workaround pre-pandas 1.4 because I found this page when trying to solve this problem.
I’m working in Pandas 1.3.4.

This is not the most elegant or fast solution, but it got the job done for me.

import openpyxl
import pandas as pd

# filePath is the path to the workbook; it must be defined before this point
with open(filePath, 'rb') as fid:
    DataFrame = pd.read_excel(fid, "sheetName")
dataWorkbook = openpyxl.load_workbook(filePath)
dataSheet = dataWorkbook["sheetName"]

# --> Logic for editing data here

# Iterate over the dataframe and write each value into the matching openpyxl cell
for col, header in enumerate(DataFrame):
    for row in range(len(DataFrame)):
        # row+2: openpyxl rows start at 1 and row 1 holds the header
        # col+1: openpyxl columns start at 1
        cellRef = dataSheet.cell(row=row+2, column=col+1)
        cellRef.value = DataFrame.iloc[row, col]
dataWorkbook.save(filePath)
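
If only a few columns need to change, the same idea can be narrowed down. The following is a rough sketch using openpyxl alone; overwrite_columns is a hypothetical helper (not part of openpyxl or pandas), and it assumes the header texts in the sheet are unique and sit in a single row.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font


def overwrite_columns(path, sheet_name, new_columns, header_row=1):
    """Overwrite values below the header; cell styles are left untouched.

    new_columns maps a header text to the list of replacement values.
    """
    workbook = load_workbook(path)
    sheet = workbook[sheet_name]
    # Map each header text to its 1-based column index.
    header_to_col = {cell.value: cell.column for cell in sheet[header_row]}
    for header, values in new_columns.items():
        col = header_to_col[header]  # raises KeyError if the header is missing
        for offset, value in enumerate(values, start=1):
            # Assigning .value changes the content only, not the style.
            sheet.cell(row=header_row + offset, column=col).value = value
    workbook.save(path)


# Demo on a throwaway workbook with a bold header cell.
demo_path = os.path.join(tempfile.mkdtemp(), "columns_demo.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["ID", "Entry 1", "Entry 2"])
ws["B1"].font = Font(bold=True)
ws.append([1, "old", "Value 1"])
ws.append([2, "old", "Value 2"])
wb.save(demo_path)

overwrite_columns(demo_path, "Sheet1",
                  {"Entry 1": ["Modified 1", "Modified 2"]})

result = load_workbook(demo_path)["Sheet1"]
```

Looking columns up by header text avoids hard-coding column positions, at the cost of failing when a header is renamed.
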

Disclaimer: I began learning Python in late August of this year.

Answered By: KerbFusion