Save Excel file stream to Azure Blob Storage

Question:

I wrote a few lines of Python that open an Excel file from an Azure Blob Storage container with the openpyxl library. The code runs in Azure Functions.

After making a few modifications to the content, I create a second workbook and copy the content from the original workbook into it. I save the second workbook to a stream.
Now I want to write the second workbook back to blob storage as an Excel file (into a different container).

    import logging
    from typing import Container
    
    import azure.functions as func
    import azure.storage.blob
    import openpyxl as xl
    import io
    

    ### Start Trigger
    def main(myblob: func.InputStream):
        logging.info(f"Python blob trigger function processed blob\n"
                     f"Name: {myblob.name}\n"
                     f"Blob Size: {myblob.length} bytes")
        logging.info('Loading Workbook...')
        ### Load Excel file
        wb1 = xl.load_workbook(filename=io.BytesIO(myblob.read()))
        logging.info(wb1.sheetnames)
        
        ### Create Second Workbook
        output = io.BytesIO()
        wb2 = xl.Workbook()
        wb2.active
        wb2.create_sheet('TestSheet')
        wb2.save(output)

        ### Upload to Azure Blob Storage
        conString = ""  # connection string redacted
        blob_service_client = azure.storage.blob.BlobServiceClient.from_connection_string(conString)
        blob_client = blob_service_client.get_blob_client(container='test2', blob='test2.xlsx')
        blob_client.upload_blob(output)

The code runs without errors, but the Excel file in blob storage is corrupt, and Excel shows an error when I try to open the file:

[screenshot of the error]

Thanks for helping!

Asked By: oeliger


Answers:

The problem is here: wb2.save(output). After saving, the stream position sits at the end of the buffer, so upload_blob(output) reads nothing and uploads an empty (and therefore corrupt) blob. You can refer to the following code:

import logging
import azure.functions as func
import azure.storage.blob
import openpyxl as xl
import io
from tempfile import NamedTemporaryFile


def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob\n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")
    #logging.info(myblob.read())
    logging.info('Loading Workbook...')
    ### Load Excel file
    wb1 = xl.load_workbook(filename=io.BytesIO(myblob.read()))
    ws1 = wb1.worksheets[0]
    logging.info(wb1.sheetnames)

    ### Create Second Workbook
    #output = io.BytesIO()
    wb2 = xl.Workbook()
    ws2 = wb2.create_sheet('TestSheet')

    # calculate total number of rows and 
    # columns in source excel file
    mr = ws1.max_row
    mc = ws1.max_column
    
    # copying the cell values from source 
    # excel file to destination excel file
    for i in range (1, mr + 1):
        for j in range (1, mc + 1):
            # reading cell value from source excel file
            c = ws1.cell(row = i, column = j)
    
            # writing the read value to destination excel file
            ws2.cell(row = i, column = j).value = c.value

    ### Upload to Azure Blob Storage
    conString = ""
    blob_service_client = azure.storage.blob.BlobServiceClient.from_connection_string(conString)
    blob_client = blob_service_client.get_blob_client(container='testout', blob='test2.xlsx')
    with NamedTemporaryFile() as tmp:
        # openpyxl writes the workbook to the temp file's path;
        # tmp's own handle is still at position 0, so read() gets the whole file
        wb2.save(tmp.name)
        output = io.BytesIO(tmp.read())
        blob_client.upload_blob(output)
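
The underlying issue can be illustrated with a plain io.BytesIO, independent of openpyxl and Azure (a minimal sketch):

```python
import io

buf = io.BytesIO()
buf.write(b"workbook bytes")  # stands in for wb2.save(output)
print(buf.tell())             # 14: the position sits at the end of the data
print(buf.read())             # b'': reading from the end returns nothing
buf.seek(0)                   # rewind before handing the stream on
print(buf.read())             # b'workbook bytes'
```

Uploading the stream without rewinding it is exactly the empty-read case above, which is why the resulting blob is empty and Excel reports it as corrupt.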
Answered By: Frank Borzage

Just in case anyone comes back to this like me: for me it worked this way (using the getvalue() method):

from io import BytesIO

output = BytesIO()
wb2.save(output)
blob_client.upload_blob(output.getvalue())
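
getvalue() works here because it returns the buffer's entire contents regardless of the current stream position, while read() starts at wherever the position happens to be. A minimal sketch with plain bytes (no openpyxl involved):

```python
import io

output = io.BytesIO()
output.write(b"fake xlsx payload")  # stands in for wb2.save(output)
# getvalue() ignores the stream position, so no rewind is needed
print(output.getvalue())            # b'fake xlsx payload'
# read() starts at the current position (the end), so it returns nothing
print(output.read())                # b''
# after an explicit rewind, read() sees the same bytes as getvalue()
output.seek(0)
print(output.read() == output.getvalue())  # True
```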
Answered By: Kristian