Is it possible to get the contents of an S3 file without downloading it using boto3?
Question:
I am working on a process to dump files from a Redshift database, and would prefer not to have to download the files locally in order to process the data. I saw that Java has a StreamingObject class that does what I want, but I haven't seen anything similar in boto3.
Answers:
If you have a mybucket S3 bucket containing a beer key, here is how to fetch the value without storing it in a local file:
import boto3
s3 = boto3.resource('s3')
print(s3.Object('mybucket', 'beer').get()['Body'].read())
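Note that .read() pulls the entire object into memory. For large objects you can consume the returned StreamingBody in fixed-size chunks instead. A minimal sketch of that pattern, using io.BytesIO as a stand-in for the body so it runs without AWS credentials (the real body from get()['Body'] exposes the same read(n) interface):

```python
import io

def process_in_chunks(body, chunk_size=1024):
    """Read a file-like body in fixed-size chunks instead of all at once."""
    total = 0
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)  # replace with real per-chunk processing
    return total

# io.BytesIO stands in for the StreamingBody returned by get()['Body']
print(process_in_chunks(io.BytesIO(b"x" * 5000)))  # 5000
```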
This may or may not be relevant to what you want to do, but for my situation one thing that worked well was using tempfile:
import tempfile
import boto3
bucket_name = '[BUCKET_NAME]'
key_name = '[OBJECT_KEY_NAME]'
s3 = boto3.resource('s3')
temp = tempfile.NamedTemporaryFile()
s3.Bucket(bucket_name).download_file(key_name, temp.name)
# do what you will with your file...
temp.close()
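One caveat with the snippet above: on some platforms (notably Windows) a NamedTemporaryFile cannot be reopened by name while it is still open. A sketch of the same lifecycle with delete=False and explicit cleanup; the local write here is a stand-in for the download_file call, so it runs without AWS credentials:

```python
import os
import tempfile

# The write below stands in for s3.Bucket(bucket_name).download_file(key_name, temp_path)
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as temp:
    temp_path = temp.name
    temp.write(b"object contents")

# Reopen by name once the handle is closed, which also works on Windows
with open(temp_path, "rb") as fh:
    contents = fh.read()
print(contents)

os.remove(temp_path)  # clean up explicitly since delete=False
```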
Here is the solution I actually use:
import boto3
s3_client = boto3.client('s3')
def get_content_from_s3(bucket: str, key: str) -> str:
    """Return the contents of an S3 object as a decoded string.

    param: bucket, s3 bucket
    param: key, path to the file, e.g. folder/subfolder/file.txt
    """
    s3_file = s3_client.get_object(Bucket=bucket, Key=key)['Body'].read()
    return s3_file.decode('utf-8').strip()
smart_open is a Python 3 library for efficient streaming of very large files from/to storages such as S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or the local filesystem: https://pypi.org/project/smart-open/
import json

import boto3
import smart_open

client = boto3.client(service_name='s3',
                      aws_access_key_id=AWS_ACCESS_KEY_ID,
                      aws_secret_access_key=AWS_SECRET_KEY,
                      )
url = 's3://.............'
fin = smart_open.open(url, 'r', transport_params={'client': client})
for line in fin:
    data = json.loads(line)
    print(data)
fin.close()
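The line-by-line JSON parsing shown above works with any iterable of lines, which makes it easy to test in isolation. A quick sketch with io.StringIO standing in for the handle smart_open.open() returns:

```python
import io
import json

# io.StringIO stands in for the file handle returned by smart_open.open()
fin = io.StringIO('{"a": 1}\n{"a": 2}\n')
records = [json.loads(line) for line in fin]
fin.close()
print(records)  # [{'a': 1}, {'a': 2}]
```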