openPyXL – assign value to range of cells during unmerge

Question:

So I have excel files with several sheets in each and I’m working on script which will gather data from selected sheets if they exist in file and combine it in one, big sheet. Generally it’s working, iterating through files and if desired sheet exist it finds range of cells with data and append it to dataframe. The thing I need to do now is to add header row (column names) to Dataframe, but in sheet those are multiline headers.

To make it look the same in dataframe i need to unmerge cells in top header row and copy value from first cell to rest of them in range which was merged before).

I’m using OpenPyXL for accessing excel sheets. My function receives sheet to work on as only parameter. It looks like this:

def checkForMergedCells(sheet):
    merged = ws.merged_cell_ranges
    for mergedCell in merged:
        mc_start, mc_stop = str(mergedCell).split(':')
        cp_value = sheet[mc_start]
        sheet.unmerge_cells(mergedCell)
        cell_range = sheet[mergedCell]
        for cell in cell_range:
            cell.value = cp_value

The thing is that cell_range returns a tuple which ends up in getting error message:

AttributeError: ‘tuple’ object has no attribute ‘value’
Below you can see screencap during debug which shows values passed in each variable.

Debugger running

Asked By: Uzzy

||

Answers:

Accessing by index will generally return a tuple of tuples except if you try to get an individual cell or row. For programmatic access you should use iter_rows() or iter_cols()

You might like to spend some time looking at the utils module.

from openpyxl.utils import range_boundaries

for group in ws.merged_cell_ranges:
     min_col, min_row, max_col, max_row = range_boundaries(group)
     top_left_cell_value = ws.cell(row=min_row, column=min_col).value
     for row in ws.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
         for cell in row:
             cell.value = top_left_cell_value
Answered By: Charlie Clark

I was getting errors and a deprecation warning until I did this:

from openpyxl.utils import range_boundaries

for group in sheet.merged_cells.ranges: # merged_cell_ranges deprecated
    display(range_boundaries(group._get_range_string())) # expects a string instead of an object
    min_col, min_row, max_col, max_row = range_boundaries(group._get_range_string())
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value

None of the previous answers works.
So I elaborated this one, tested it and it worked for me.

from openpyxl.utils import range_boundaries
wb = load_workbook('Example.xlsx')

sheets = wb.sheetnames  ##['Sheet1', 'Sheet2']
for i,sheet in enumerate(sheets):
    ws = wb[sheets[i]]
    
    # you need a separate list to iterate on (see explanation #2 below)
    mergedcells =[]  
    for group in ws.merged_cells.ranges:
        mergedcells.append(group)
    
    for group in mergedcells:
        min_col, min_row, max_col, max_row = group.bounds 
        top_left_cell_value = ws.cell(row=min_row, column=min_col).value
        ws.unmerge_cells(str(group))   # you need to unmerge before writing (see explanation #1 below)
        for irow in range(min_row, max_row+1):
            for jcol in range(min_col, max_col+1): 
                ws.cell(row = irow, column = jcol, value = top_left_cell_value)

 

@Дмитро Олександрович is almost right, but I had to change a few things to fix his answer:

  1. You’ll have a AttributeError: 'MergedCell' object attribute 'value' is read-only error, because you need to unmerge merged cells before change their value. (see here: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1228)

  2. you can not iterate directly over ws.merged_cells.ranges because iterating through a ‘ranges’ list object in python and changing it (with an unmerge_cells function or a pop function for example) will result in changing only half of the objects (see here: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1085). You need to create a different list and iterate on it.

Answered By: rgiannico

The below code from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python/ is worked for me.

import openpyxl 
from openpyxl.utils import range_boundaries
wbook=openpyxl.load_workbook("openpyxl_merge_unmerge.xlsx")
sheet=wbook["unmerge_sample"]
for cell_group in sheet.merged_cells.ranges:
    min_col, min_row, max_col, max_row = range_boundaries(str(cell_group))
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    sheet.unmerge_cells(str(cell_group))
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
wbook.save("openpyxl_merge_unmerge.xlsx")
exit()
Answered By: suman

Issues with other answers

Regarding @Charlie Clark’s selected answer and the other answers which use the code from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python, you can unmerge the cells more easily without dealing with range_boundaries and those conversions.

I was also getting issues with the selected answer where some merged cells would unmerge but others would not, and some unmerged cells would fill with the data I wanted but others would not.

The issue is that worksheet.merged_cells.ranges is an iterator which means it is lazily evaluated, so when worksheet.unmerge_cells() is called, the object worksheet.merged_cells is mutated and side-effects occur when iterating the merged cell ranges again.

Better solution

In my case, I wanted to unmerge cells like this, copying the border, font, and alignment information as well:

                    +-------+------+
+-------+------+    | Date  | Time |
| Date  | Time |    +=======+======+
+=======+======+    | Aug 6 | 1:00 |
|       | 1:00 | -> +-------+------+
| Aug 6 | 3:00 |    | Aug 6 | 3:00 |
|       | 6:00 |    +-------+------+
+-------+------+    | Aug 6 | 6:00 |
                    +-------+------+

For the current latest version of openpyxl==3.0.9, I found that the following works best for me:

from copy import copy

from openpyxl import load_workbook, Workbook
from openpyxl.cell import Cell
from openpyxl.worksheet.cell_range import CellRange
from openpyxl.worksheet.worksheet import Worksheet


def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    """
    Unmerges all merged cells in the given ``worksheet`` and copies the content
    and styling of the original cell to the newly unmerged cells.

    :param worksheet: The Excel worksheet containing the merged cells.
    """

    # Must convert iterator to list to eagerly evaluate all merged cell ranges
    # before looping over them - this prevents unintended side-effects of
    # certain cell ranges from being skipped since `worksheet.unmerge_cells()`
    # is destructive.
    all_merged_cell_ranges: list[CellRange] = list(
        worksheet.merged_cells.ranges
    )

    for merged_cell_range in all_merged_cell_ranges:
        merged_cell: Cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)

        # Don't need to convert iterator to list here since `merged_cell_range`
        # is cached
        for row_index, col_index in merged_cell_range.cells:
            cell: Cell = worksheet.cell(row=row_index, column=col_index)
            cell.value = merged_cell.value

            # (Optional) If you want to also copy the original cell styling to
            # the newly unmerged cells, you must use shallow `copy()` since
            # cell style properties are proxy objects which are not hashable.
            #
            # See <https://openpyxl.rtfd.io/en/stable/styles.html#copying-styles>
            cell.alignment = copy(merged_cell.alignment)
            cell.border = copy(merged_cell.border)
            cell.font = copy(merged_cell.font)


# Sample usage
if __name__ == "__main__":
    workbook: Workbook = load_workbook(
        filename="workbook_with_merged_cells.xlsx"
    )
    worksheet: Worksheet = workbook["My Sheet"]

    unmerge_and_fill_cells(worksheet=worksheet)
    workbook.save(filename="workbook_with_unmerged_cells.xlsx")

Concise solution

Here is a shorter version without comments and not copying styles:

from openpyxl.worksheet.worksheet import Worksheet

def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    for merged_cell_range in list(worksheet.merged_cells.ranges):
        worksheet.unmerge_cells(range_string=merged_cell_range.start_cell)

        for row_col_indices in merged_cell_range.cells:
            worksheet.cell(*row_col_indices).value = merged_cell.value
Answered By: homersimpson

All the previous solutions were giving me some kind of error, probably due to different versions of openpyxl, with the current version (3.0.10) I found this solution to work for me:

for m_range in list(ws.merged_cells.ranges):
    merged_cell = m_range.start_cell
    ws.unmerge_cells(range_string=str(m_range))

    for row_col_indices in m_range.cells:
        ws.cell(*row_col_indices).value = merged_cell.value
Answered By: Robert Kolcun
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.