Adding in multiple dataframes into an existing Excel sheet starting on specific cell references

Question

Good afternoon,

I’m working on a Python program that will take 3 separate dataframes and add them into an existing Excel file, overwriting the cell ranges in question but leaving the rest of the rows and columns unaltered.

Below is an example of the Excel file structure

Keywords	Match type	col1a	col1b	col1c	col2a	col2b	col2c	col3a	col3b	col3c	counter
not to be removed	not to be removed	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	not to be removed
not to be removed	not to be removed	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	replaced data	not to be removed

In this layout I need the first df to start at row 2, column 3, the second df at column 6 and the third df at column 9.

Currently, with the code below, I can get the data into the correct position, but all the other data gets lost in the process. I think it may be possible to open the Excel file as a dataframe and merge it with the newer dataframes, but I have had no luck with that so far.

My code is below. I am still fiddling with it, and at the time of writing the old data is read in but nothing is done with it yet.

    import pandas as pd

    DF_LastMonthDL = pd.read_csv(LastMonthDL)
    DF_Last3MonthsDL = pd.read_csv(Last3MonthsDL)
    DF_LifeTimeDL = pd.read_csv(LifeTimeDL)
    
    ########################################################## Manipulating the dataframes
    # Sorting the dataframes by index to keep ordering consistent
    # (sort_index returns a new DataFrame, so the result is assigned back)
    DF_LifeTimeDL = DF_LifeTimeDL.sort_index()
    DF_LastMonthDL = DF_LastMonthDL.sort_index()
    DF_Last3MonthsDL = DF_Last3MonthsDL.sort_index()
    
    # Removing the first columns (Keywords, Match type) as they are not needed
    metric_cols = ["Impressions", "Clicks", "CTR", "Spend(GBP)", "CPC(GBP)", "Orders", "Sales(GBP)", "ACOS", "ROAS"]
    DF_LifeTimeShrt = DF_LifeTimeDL[metric_cols]
    DF_Last3MonthsShrt = DF_Last3MonthsDL[metric_cols]
    DF_LastMonthShrt = DF_LastMonthDL[metric_cols]
    
    
    oldData = pd.read_excel(r"oldData.xlsx")
    
    
    ########################################################## Exporting into excel in set positions
    # Create a Pandas Excel writer using openpyxl as the engine.
    writer = pd.ExcelWriter('Temp.xlsx', engine='openpyxl')
    
    # Position the dataframes in the worksheet
    DF_LifeTimeShrt.to_excel(writer, sheet_name='LifeTime', startrow=2, startcol=2, header=True, index=False)
    DF_Last3MonthsShrt.to_excel(writer, sheet_name='Sheet1', startrow=2, startcol=11, header=False, index=False)
    DF_LastMonthShrt.to_excel(writer, sheet_name='Sheet1', startrow=2, startcol=20, header=False, index=False)
    
    # Close the Pandas Excel writer and output the Excel file.
    # (writer.save() was removed in pandas 2.0; close() also saves the file)
    writer.close()

Any guidance on this would be greatly appreciated.

Asked By: Ryan

Answer 1

You can do this using openpyxl.load_workbook() and updating the cells, similar to what you are doing above. Assuming the initial part is all working correctly, you just need to change the last part as below:

    import openpyxl
    from openpyxl.utils.dataframe import dataframe_to_rows

    # Open the existing workbook; cells that are not assigned below are left untouched
    wb = openpyxl.load_workbook('Temp.xlsx')

    # openpyxl counts rows and columns from 1, so r_idx+2 / c_idx+2 starts at row 3,
    # column 3, the same cell as startrow=2, startcol=2 in to_excel
    ws = wb['LifeTime']
    rows = dataframe_to_rows(DF_LifeTimeShrt, index=False, header=True)
    for r_idx, row in enumerate(rows, 1):
        for c_idx, value in enumerate(row, 1):
            ws.cell(row=r_idx+2, column=c_idx+2, value=value)

    # The other two go on 'Sheet1' without headers, at the column offsets
    # used in the question (startcol=11 and startcol=20)
    ws = wb['Sheet1']
    rows = dataframe_to_rows(DF_Last3MonthsShrt, index=False, header=False)
    for r_idx, row in enumerate(rows, 1):
        for c_idx, value in enumerate(row, 1):
            ws.cell(row=r_idx+2, column=c_idx+11, value=value)

    rows = dataframe_to_rows(DF_LastMonthShrt, index=False, header=False)
    for r_idx, row in enumerate(rows, 1):
        for c_idx, value in enumerate(row, 1):
            ws.cell(row=r_idx+2, column=c_idx+20, value=value)

    # Save the workbook; the name of the file to write to has to be given
    wb.save('Temp.xlsx')
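
The loops above repeat the same pattern, so they could be folded into one small helper. The following is only a sketch: write_df_at is a hypothetical name, not part of openpyxl, and the demo workbook and its values are made up to resemble the layout in the question.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def write_df_at(ws, df, first_row, first_col, header=False):
    """Write df into ws with its top-left value at (first_row, first_col).

    first_row and first_col are 1-based, as in openpyxl. Only the cells
    covered by the DataFrame are assigned; all other cells are left alone.
    """
    rows = dataframe_to_rows(df, index=False, header=header)
    for r_offset, row in enumerate(rows):
        for c_offset, value in enumerate(row):
            ws.cell(row=first_row + r_offset,
                    column=first_col + c_offset,
                    value=value)


# Small in-memory demo (made-up data) laid out like the sheet in the question.
demo_wb = Workbook()
demo_ws = demo_wb.active
demo_ws.append(["Keywords", "Match type", "col1a", "col1b", "col2a", "col2b", "counter"])
demo_ws.append(["kw 1", "exact", "old", "old", "old", "old", 10])
demo_ws.append(["kw 2", "broad", "old", "old", "old", "old", 20])

df_first = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df_second = pd.DataFrame({"a": [5, 6], "b": [7, 8]})

write_df_at(demo_ws, df_first, first_row=2, first_col=3)   # fills C2:D3
write_df_at(demo_ws, df_second, first_row=2, first_col=5)  # fills E2:F3
```

With a real file, the worksheet would come from openpyxl.load_workbook(path)[sheet_name] instead of a new Workbook, and the workbook would be saved afterwards with wb.save(path).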

EDIT: The advantage of load_workbook is that only the cells you assign are overwritten; the other cells are not changed, and neither is any colour or other formatting that may be present. dataframe_to_rows gives you a way to get a whole DF, row by row, into a form that openpyxl can read. From there, I am basically reading each row and column (a cell) and updating the value (ws.cell(row,col).value) with the value from the df. The disadvantage of this is that you need to go through the for loops (unlike, say, df.to_excel), but the advantage is that you can update a single cell value without disturbing anything else. Hope this explanation helps.
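
If the for loops are a concern, a different route (not the one used in this answer) is to keep df.to_excel and open the existing file in append mode. This is a sketch that assumes pandas 1.4 or later, where pd.ExcelWriter accepts if_sheet_exists="overlay" together with mode="a" and the openpyxl engine. overlay_frames is a hypothetical helper name, and the demo file and its values are made up.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook


def overlay_frames(path, sheet_name, placements):
    """Write each (df, startrow, startcol) into an existing sheet.

    startrow and startcol are 0-based, as in DataFrame.to_excel.
    """
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        for df, startrow, startcol in placements:
            df.to_excel(writer, sheet_name=sheet_name, startrow=startrow,
                        startcol=startcol, header=False, index=False)


# Demo on a throwaway workbook with made-up data.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["Keywords", "Match type", "col1a", "col2a", "counter"])
ws.append(["kw 1", "exact", "old", "old", 10])
ws.append(["kw 2", "broad", "old", "old", 20])
wb.save(demo_path)

overlay_frames(demo_path, "Sheet1", [
    (pd.DataFrame({"a": [1, 2]}), 1, 2),  # lands in C2:C3
    (pd.DataFrame({"a": [3, 4]}), 1, 3),  # lands in D2:D3
])

# Read the file back to inspect which cells changed.
result = load_workbook(demo_path)["Sheet1"]
```

Note that to_excel counts startrow and startcol from 0, while openpyxl's ws.cell counts from 1, so startrow=1, startcol=2 is cell C2.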

Answered By: Redox
