How do I determine whether an Excel worksheet contains no values using openpyxl?

Question:

I’m coding a Python 3.7.2 script to check for unused worksheets in an Excel file, and I need a way to determine if the current worksheet contains no values. Previously when I coded in VBA, I would find the last-used-row and last-used-column to check if the sheet was ’empty’, but the max_column and max_row functions in openpyxl take into account non-value data like formatting which I want to ignore. Is there any way to check if none of the cells of a given worksheet contain any values?

Previously when I coded in VBA, I would find the last-used-row and last-used-column to check if the sheet was ’empty’, but the max_column and max_row functions in openpyxl take into account non-value data like formatting which I want to ignore.

Ideally I’d be able to find a way to get the last_used_column and last_used_row like I used to in VBA, but I’d appreciate it if anyone could tell me of any tools I could use to build a simple function where you pass it a worksheet and it spits out True or False depending on whether it has no values or not.

Asked By: Keinga

||

Answers:

In contrast to my suggestion in a comment, ws.get_cell_collection() is deprecated, but you could iterate over the openpyxl.worksheet.worksheet module’s
values property, since (from the docs):

values
Produces all cell values in the worksheet, by row

Type: generator

With a workbook workbook.xlsx containing two worksheets, Sheet1 does contain some cell with values, but Sheet2 is visibly empty with all empty cells:

wb = load_workbook('workbook.xlsx')

for name in wb.get_sheet_names():
    isempty = False
    for row in wb.get_sheet_by_name(name).values:
        if all(value == None for value in row):
            isempty = True
        else:
            isempty = False
    if isempty:
        print(name)

outputs the empty sheet(s):

Sheet2

Although this works it is a bit more complicated than using only:

empty_sheets = [name for name in wb.get_sheet_names() if not (wb.get_sheet_by_name(name).get_cell_collection())] 

which still works albeit deprecated.

This seems to be a persistent, vexing problem with Python3 and OpenPyXL; hopefully the dev teams will address this more elegantly in future releases. In the mean time, my inelegant solution relies on the max_column max_row tests described by previous posters:

for sheet in workBook.worksheets:
    if (sheet.max_column == 1) and (sheet.max_row == 1):
    # see if cell is empty
        if sheet.cell(row=1,column=1).value == None:
            print('Sheet ' + sheet.title + ' is empty.  Ignoring ... ')
            continue

    do other stuff

I inserted this if test in a script that converted any *.xlsx? files in a source directory to CSV files to get the processing past a "TypeError: ‘tuple’" error. The script would work fine until it encountered any empty sheet in a workbook; then it would fail. Those simple few lines solved the issue.

Answered By: TCMcG008
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.