Unable to upload file to AWS S3 using Python boto3 and upload_fileobj

Question:

I am trying to fetch a WebP image, convert it to JPEG, and upload it to AWS S3 without saving the file to disk (using io.BytesIO and boto3's upload_fileobj), but with no success. The funny thing is that it works fine if I save the file to local disk and then use boto3's upload_file method.

This works:

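import io

import boto3
import requests
from PIL import Image

# url and config come from the surrounding application code.
r = requests.get(url)
if r.status_code == 200:
  file_name = "name.jpeg"
  s3 = boto3.client("s3")
  webp_file = io.BytesIO(r.content)
  im = Image.open(webp_file).convert("RGB")
  # Write the converted image to a temporary file on disk.
  im.save(
    f"{config.app_settings.image_tmp_dir}/{file_name}", "JPEG"
  )
  # Upload the file from disk by path.
  s3.upload_file(
    f"{config.app_settings.image_tmp_dir}/{file_name}",
    config.app_settings.image_S3_bucket,
    file_name,
    ExtraArgs={"ContentType": "image/jpeg"},
  )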

This does not work:

r = requests.get(url)
if r.status_code == 200:
  file_name = "name.jpeg"
  s3 = boto3.client("s3")
  webp_file = io.BytesIO(r.content)
  im = Image.open(webp_file).convert("RGB")
  # Write the converted image to an in-memory buffer instead of disk.
  jpg_file = io.BytesIO()
  im.save(
    jpg_file, "JPEG"
  )
  s3.upload_fileobj(
    jpg_file,
    config.app_settings.image_S3_bucket,
    file_name,
    ExtraArgs={"ContentType": "image/jpeg"},
  )

I can see that jpg_file has the correct size after im.save, but the object that ends up in AWS S3 is an empty file.

Asked By: Joaquim

||

Answers:

After calling im.save(jpg_file, "JPEG"), the stream’s position is left at the end of the newly written image data. Anything that reads from jpg_file starts at that position, so it finds no data, which is why S3 receives an empty object.
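
The position behaviour can be reproduced with the standard library alone. The following is a minimal sketch, not the asker's code: buf.write() with a made-up byte string stands in for im.save(jpg_file, "JPEG"), and the variable names are only illustrative.

```python
import io

buf = io.BytesIO()
buf.write(b"fake image bytes")     # stand-in for im.save(buf, "JPEG")

position_after_write = buf.tell()  # position sits at the end of the data
read_without_seek = buf.read()     # reads from the end, so nothing comes back

buf.seek(0)                        # rewind to the start of the stream
read_after_seek = buf.read()       # now the full contents are returned
```

Here read_without_seek is empty even though the buffer holds data, which matches the symptom of a correctly sized buffer producing an empty upload.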

You can use the stream’s seek() method to move the position back to the start of the stream, before s3.upload_fileobj() tries to read it:

  im.save(
    jpg_file, "JPEG"
  )

  # Reset the stream position back to the start of the stream.
  jpg_file.seek(0)

  s3.upload_fileobj(
    jpg_file,
    config.app_settings.image_S3_bucket,
    file_name,
    ExtraArgs={"ContentType": "image/jpeg"},
  )
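
As a possible alternative to rewinding, and not something the question's code does, getvalue() returns the whole buffer regardless of the current position. The sketch below again uses a made-up byte string in place of real image data; payload and fresh are hypothetical names.

```python
import io

buf = io.BytesIO()
buf.write(b"fake image bytes")   # stand-in for im.save(buf, "JPEG")

# getvalue() ignores the current position and returns every byte written.
payload = buf.getvalue()
position_unchanged = buf.tell()  # still at the end; getvalue() did not move it

# A new BytesIO built from those bytes starts at position 0,
# so a reader sees the data without an explicit seek(0).
fresh = io.BytesIO(payload)
```

Under that assumption, either fresh could be passed to s3.upload_fileobj(), or payload could be passed as the Body argument of s3.put_object(), at the cost of holding a second copy of the image in memory.
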
Answered By: Jimmy