How do I save a FastAPI UploadFile which is a zip file to disk as .zip?

Question:

I’m uploading zip files as UploadFile via FastAPI and want to save them to the filesystem asynchronously using aiofiles, like so:

async def upload(in_file: UploadFile = File(...)):
    filepath = '/path/to/out_file.zip'
    
    async with aiofiles.open(filepath, 'wb') as f:
        while buffer := await in_file.read(1024):
            await f.write(buffer)
        await f.close()

The file is created at filepath; however, it is 0 B in size, and unzip out_file.zip yields the following error:

Archive: out_file.zip
    End-of-central-directory signature not found. Either this file is not
    a zipfile, or it constitutes one disk of a multi-part archive. In the
    latter case the central directory and zipfile comment will be found on
    the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of out_file.zip or out_file.zip.zip,
        and cannot find out_file.zip.ZIP, period.

print(in_file.content_type) outputs application/x-zip-compressed and

python3 -m mimetypes out_file.zip yields type: application/zip encoding: None

I’ve spent way too much time on this inconvenience and tried several blocking alternatives like:

with open(filepath, "wb") as f:
    f.write(in_file.file.read())
    f.close()

which all resulted in the same scenario. I’m trying to achieve this with .zip files right now, but eventually I’m looking for a universal solution for saving binary files as they come in, because I’m not processing any of the files; they just need to be stored.

If someone could point out to me what I’m missing, that would be of great help.

Edit:
Before I try to write the file to my filesystem, I’m adding an Entry with some metadata to my database via Motor:

@router.post("/")
async def upload(in_file: UploadFile = File(...)):
    file_content = await in_file.read()
    file_db = {"name": in_file.filename, "size": len(file_content)}
    file_db_json = jsonable_encoder(file_db)
    added_file_db = await add_file(file_db_json) 

    filepath = '/path/to/out_file.zip'
    async with aiofiles.open(filepath, 'wb') as f:
        while buffer := await in_file.read(1024):
            await f.write(buffer)
        
    return ResponseModel(added_file_db, "upload successful")

The return in upload() confirms the upload was successful: the metadata is added to the database and the file is created in my filesystem, but it is broken as described above. I don’t know how any of this would interfere with writing the file contents to my disk, but maybe I’m wrong.

Asked By: shin


Answers:

You can save the file using aiofiles as shown below (take a look at this answer for more details):

from fastapi import FastAPI, File, UploadFile, status
from fastapi.exceptions import HTTPException
import aiofiles
import os

CHUNK_SIZE = 1024 * 1024  # adjust the chunk size as desired
app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    try:
        filepath = os.path.join('./', os.path.basename(file.filename))
        async with aiofiles.open(filepath, 'wb') as f:
            while chunk := await file.read(CHUNK_SIZE):
                await f.write(chunk)
    except Exception:
        raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, 
            detail='There was an error uploading the file')
    finally:
        await file.close()

    return {"message": f"Successfuly uploaded {file.filename}"}

Update

The recent edit in your question shows that you have already read the file contents at this line: file_content = await in_file.read(); hence, attempting to read the file contents again using await in_file.read(1024) results in zero bytes being read. Thus, either add the metadata to the database after reading and saving the file (you can use a variable to keep track of the total file length, e.g., total_len += len(buffer)), or just write the file_content to the local file, as shown below (a sketch of the first option follows after that snippet):

async def upload(file: UploadFile = File(...)):
    ...
    async with aiofiles.open(filepath, 'wb') as f:
        await f.write(file_content)
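
If you would rather keep the chunked read (the first option), a rough sketch, reusing the add_file, ResponseModel and router names from your own snippet (everything else here is illustrative and may need adjusting), would be to accumulate the length while streaming and add the database entry only afterwards:

import os
import aiofiles
from fastapi import APIRouter, File, UploadFile
from fastapi.encoders import jsonable_encoder

router = APIRouter()
CHUNK_SIZE = 1024 * 1024

@router.post("/")
async def upload(in_file: UploadFile = File(...)):
    filepath = os.path.join('./', os.path.basename(in_file.filename))
    total_len = 0
    async with aiofiles.open(filepath, 'wb') as f:
        while buffer := await in_file.read(CHUNK_SIZE):
            total_len += len(buffer)  # keep a running total of the file size
            await f.write(buffer)

    # add the metadata only after the file has been written to disk
    file_db = jsonable_encoder({"name": in_file.filename, "size": total_len})
    added_file_db = await add_file(file_db)  # add_file/ResponseModel come from your own code
    return ResponseModel(added_file_db, "upload successful")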

Update 2

For the sake of completeness, I should also mention that there is an internal "cursor" (or "file pointer") denoting the position from which the file contents will be read (or written). Calling read() moves the cursor all the way to the end of the buffer, so a subsequent read() finds zero bytes left to return. Thus, one could also use the seek() method to set the current position of the cursor to 0 (i.e., rewind the cursor to the start of the file). As per FastAPI documentation:

seek(offset): Goes to the byte position offset (int) in the file.

  • E.g., await myfile.seek(0) would go to the start of the file.
  • This is especially useful if you run await myfile.read() once and then need to read the contents again.
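
For instance, a minimal sketch of that approach applied to your endpoint (the output path is just the placeholder from your question) could be:

async def upload(in_file: UploadFile = File(...)):
    file_content = await in_file.read()  # the cursor is now at the end of the file
    file_db = {"name": in_file.filename, "size": len(file_content)}
    ...
    await in_file.seek(0)  # rewind the cursor to the start before reading again
    async with aiofiles.open('/path/to/out_file.zip', 'wb') as f:
        while buffer := await in_file.read(1024):
            await f.write(buffer)
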
Answered By: Chris