Python Openpyxl Copy Data From Rows Based on Cell Value & Paste In Specific Rows of Excel Sheet

Question:

I am trying to copy rows from one sheet based on the cell value in Column [‘A’] and paste them into another sheet starting at row 2. The paste-in sheet is an existing worksheet whose row 1 is my header row, so I want the copied data to start from row 2. I do not want to append, because the paste-in sheet has existing formula columns that would be overwritten, and with append I also lose formatting. Say Column A of my copy-from sheet holds States: I want to copy all rows where the Column [‘A’] cell.value is ‘Georgia’ and paste them starting at row 2 of sheet 2, copy the rows where the Column [‘A’] cell.value is ‘Texas’ and paste them starting at row 2 of sheet 3, and so on (each state goes to a different sheet). I am able to copy and paste the data, but I cannot get it to paste at row 2; it is pasted in whatever row the data occupies in my copy-from sheet. So if Texas starts at row 3000, my code copies from row 3000 of the copy-from sheet and pastes into row 3000 of sheet 2, meaning rows 1-2999 of sheet 2 are all empty.
The copy-from file looks like this: [image]

The paste-in file looks like this: [image]
See my code below:

import openpyxl
from openpyxl import load_workbook
from openpyxl import Workbook
from openpyxl.utils import range_boundaries
from sys import argv
script, inpath, outpath = argv


# load copy from file
wb_cpy = load_workbook(r'C:\Users\me\documents\sourcefolder\copyfromfile.xlsx')
#ws = wb_src["sheet1"] #previous inconsistency referred to in the comment
ws = wb_cpy["sheet1"] #edited fixed

# load paste in file
wb_pst = load_workbook(r'C:\Users\me\documents\sourcefolder\pasteinfile.xlsx')
#ws2 = wb_dst["sheet2"] #previous inconsistency referred to in the comment
ws2 = wb_pst["sheet2"] #edited fixed


for row in ws.iter_rows(min_col=1, max_col=1, min_row=9):
   for row2 in ws2.iter_rows(min_col=1, max_col=1, min_row=2):
       for cell in row:
           for cell2 in row2:
            if cell.value == "GEORGIA":
                ws2.cell(row=cell.row, column=1).value = ws.cell(row=cell.row, column=1).value 
                ws2.cell(row=cell.row, column=2).value = ws.cell(row=cell.row, column=2).value
                ws2.cell(row=cell.row, column=6).value = ws.cell(row=cell.row, column=6).value
              
        
wb_pst.save(r'C:\Users\me\documents\sourcefolder\pasteinfile.xlsx')
# PS: I will repeat the script for each state

I may be approaching it all wrong, but I have tried multiple other approaches with no success. I cannot get the copied data to paste starting at row 2 of the paste-in sheet.

Asked By: ohmandy


Answers:

There seem to be some inconsistencies in your code, e.g.

wb_cpy = load_workbook(r'C:\Users\me\documents\sourcefolder\copyfromfile.xlsx')
ws = wb_src["sheet1"]

Here ws references a workbook object (wb_src) that is different from the one just created (wb_cpy) and does not appear to exist anywhere in your code. The same applies to the next workbook and worksheet objects.
When writing code you should also try to avoid duplication, so reuse code where you can.
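As a rough illustration of that kind of reuse, here is a minimal sketch of a single function that could be called once per State instead of repeating the script. The function name copy_state_rows and its parameters are hypothetical; the default columns (1, 2, 6) are taken from the question's code, and the start rows are assumptions. The key point is that the destination row is tracked by its own counter rather than taken from the source row number.

```python
def copy_state_rows(ws_src, ws_dst, state, columns=(1, 2, 6),
                    src_first_row=2, dst_first_row=2):
    """Copy rows whose Column A value equals `state`; return how many rows were pasted.

    ws_src and ws_dst are expected to behave like openpyxl worksheets.
    All names and defaults here are illustrative assumptions.
    """
    dst_row = dst_first_row  # destination counter, independent of the source row number
    for src_row in ws_src.iter_rows(min_row=src_first_row, values_only=True):
        if src_row[0] != state:  # src_row[0] is the Column A value
            continue
        for col in columns:
            if col <= len(src_row):
                # Columns are 1-based in openpyxl, tuple indexes are 0-based
                ws_dst.cell(row=dst_row, column=col, value=src_row[col - 1])
        dst_row += 1  # advance only after a matching row has been pasted
    return dst_row - dst_first_row
```

With workbooks loaded as in the question, this might be called as copy_state_rows(ws, wb_pst["sheet2"], "GEORGIA"), then again with the next sheet and State. It copies values only, not formatting.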

The example code below (after the Summary) is based on the assumption in my comment, and on the States being grouped in order as shown in your example data, i.e. not jumbled together, with the States list in that same order.

The code uses a Python list of the States to search for, then copies the consecutive rows for a State to the current ‘pasteinfile.xlsx’ Sheet until the next State's data begins. It then copies the next State's data to the next ‘pasteinfile.xlsx’ Sheet, and so on for each State.
Summary
The States list is added manually here; however, it could be built from the values in Column A beforehand if these change each time. A search on Column A is made for each State in the list, starting at A2 and subsequently from the last row of the previously copied State data, i.e. after the GEORGIA rows are copied and ALABAMA is the next search, it will start from row 7, which is where the GEORGIA rows end.
When a State matches, the code sets the first row to paste into on the ‘pasteinfile.xlsx’ Sheet to row 2, then iterates through the cells in the first matched row and copies each cell value to ‘pasteinfile.xlsx’ (starting at row 2). It then checks the next row in Column A for a State match and, if true, copies that row to row 3 of ‘pasteinfile.xlsx’, and so on until the State no longer matches. At this point it loops to the next State, resets the start row back to 2 and moves to the next numbered Sheet name. The same process is repeated until all States in the list have been searched.
For each State the number in the ‘pasteinfile.xlsx’ Sheet name is incremented by 1, i.e. ‘Sheet1’, ‘Sheet2’, etc. The code starts at ‘Sheet1’; however, that can be changed to start at another number if desired.

from copy import copy  # Needed only for the optional style copying further down
from openpyxl import load_workbook
# load copy from file
wb_cpy = load_workbook('copyfromfile.xlsx')
# ws = wb_src["sheet1"]
ws = wb_cpy["Sheet1"]

# load paste in file
wb_pst = load_workbook('pasteinfile.xlsx')
# ws2 = wb_dst["sheet2"]

copyfrom_max_columns = ws.max_column

search_min_row = 2  # Row 1 is the header, so the first search starts at A2
states_list = ['GEORGIA', 'ALABAMA', 'TEXAS']  # States list to search for rows
for sheet_number, state in enumerate(states_list, 1):
    ws2 = wb_pst["Sheet" + str(sheet_number)]  # Set Sheet name for current pasted data
    paste_start_min_row = 1  # Reset for each new sheet; incremented before each paste so data starts at row 2
    for row in ws.iter_rows(max_col=1, min_row=search_min_row):  # min_col defaults to 1
        for cell in row:
            if cell.value == state:  # Search ColA for the State, when match is found proceed to copy/paste
                paste_start_min_row += 1  # Next row to paste into (2 for the first match)
                search_min_row = cell.row + 1  # The next State's search resumes after this source row
                for i in range(copyfrom_max_columns):  # Iterate the cells in the row to max column
                    # Set the copy and paste Cells
                    copy_cell = cell.offset(column=i)
                    paste_cell = ws2.cell(row=paste_start_min_row, column=i + 1)
                    # Paste the copied value to the 'pasteinfile.xlsx' Sheet
                    paste_cell.value = copy_cell.value
                    # Set the number format of the cell to same as original
                    paste_cell.number_format = copy_cell.number_format

                    ### Copy other Cell formatting if desired
                    ### Requires 'from copy import copy'
                    paste_cell.font = copy(copy_cell.font)
                    paste_cell.alignment = copy(copy_cell.alignment)
                    paste_cell.border = copy(copy_cell.border)
                    paste_cell.fill = copy(copy_cell.fill)

wb_pst.save('pasteinfile.xlsx')

The image below is an example of the Sheet for ALABAMA in ‘pasteinfile.xlsx’ (Sheet2 in this case), before and after running the code. Note I set each row in the Type column to a numeric value as a unique identifier for each row of the data.
[Image: Sheet2 output]
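The Summary mentions that the States list could be obtained from Column A rather than typed in by hand. A minimal sketch of one way to do that follows, assuming the header is in row 1 and the States are in Column A. The function name states_in_order and the first_row parameter are hypothetical.

```python
def states_in_order(ws, first_row=2):
    """Return the distinct, non-empty Column A values in first-seen order.

    `ws` is expected to behave like an openpyxl worksheet; first_row=2
    assumes row 1 is a header row.
    """
    seen = {}
    # max_col=1 with values_only=True yields one-element tuples of Column A values
    for (value,) in ws.iter_rows(min_row=first_row, max_col=1, values_only=True):
        if value is not None and value not in seen:
            seen[value] = None  # dicts keep insertion order, so first-seen order is preserved
    return list(seen)
```

The result could then replace the hard-coded list, e.g. states_list = states_in_order(ws). Because the order follows the data, it also satisfies the assumption that the list is in the same order as the rows.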

Additional Information
I have updated the code to include some style and format copying. The specific format noted is ‘number_format’, which can be copied across the same way as the value, per the code. If you need or want other formatting like font, alignment, fill, etc., these need the ‘copy’ function and you’ll need to import copy as shown in the code: from copy import copy. If you just want the number format, omit those lines and there is no need to import copy.
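To make the optional part easier to switch on and off, the per-cell copying could be wrapped in a small helper. This is only a sketch: the name copy_cell, the with_styles flag and the STYLE_ATTRS tuple are hypothetical, and the attributes listed are the same four used in the code above.

```python
from copy import copy

# Style attributes copied in the answer's code
STYLE_ATTRS = ("font", "alignment", "border", "fill")


def copy_cell(src, dst, with_styles=False):
    """Copy value and number format from src to dst; optionally copy styles too."""
    dst.value = src.value
    dst.number_format = src.number_format  # plain string, can be assigned directly
    if with_styles:
        for attr in STYLE_ATTRS:
            # Style objects are assigned as copies, as in the answer's code
            setattr(dst, attr, copy(getattr(src, attr)))
```

Inside the inner loop this would be called as copy_cell(copy_cell_src, paste_cell, with_styles=True), where copy_cell_src stands for the source cell (named copy_cell in the answer's code, so one of the two names would need changing).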

Answered By: moken