How to transfer a file from one S3 bucket to another with two different users

Question:

I need to write Python code to copy an S3 object from one bucket to another. The source bucket is in a different AWS account, and we use IAM user credentials to read from that bucket. The code runs in the same account as the destination bucket, so it has write access through an IAM role. One way I can think of is to create an S3 client with the source account's credentials, read the whole file into memory (get_object?), then create another S3 client for the destination bucket and write the previously read contents (put_object?). But this gets very inefficient as the file size grows, so I am wondering whether there is a better way, preferably one where boto3 provides an AWS-managed transfer that does not read the contents into memory.
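
For reference, a minimal sketch of the in-memory approach I describe (bucket names, keys, and credentials below are placeholders):

import boto3

# Client authenticated with the source account's IAM user (read-only access)
src = boto3.client('s3', aws_access_key_id='SRC_ACCESS_KEY',
                   aws_secret_access_key='SRC_SECRET_KEY')
# Client using the destination account's IAM role (write access)
dst = boto3.client('s3')

# Read the whole object into memory, then write it to the destination bucket
body = src.get_object(Bucket='source-bucket', Key='path/to/file')['Body'].read()
dst.put_object(Bucket='destination-bucket', Key='path/to/file', Body=body)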

PS: I cannot add or modify roles or policies in the source account to give the destination account direct read access. The source account is owned by someone else, and they only provide an IAM user that can read from the bucket.

Asked By: Siddardha


Answers:

Streaming is the standard solution for this kind of problem. You establish a source and a destination and then you stream from one to the other.

In fact, the boto3 get_object() and upload_fileobj() methods both support streams.

Your code is going to look something like this:

import boto3

# Client authenticated with the source account's IAM user credentials
src = boto3.client('s3', aws_access_key_id=src_access_key,
                   aws_secret_access_key=src_secret_key)
# Client for the destination bucket; credentials come implicitly from the IAM role
dst = boto3.client('s3')

# get_object returns a streaming body that upload_fileobj reads in chunks,
# so the whole file is never held in memory at once
src_response = src.get_object(Bucket=src_bucket, Key=src_key)
dst.upload_fileobj(src_response['Body'], dst_bucket, dst_key)
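
upload_fileobj performs a managed transfer and will switch to multipart uploads for large objects; if you need to tune chunk size or concurrency, a TransferConfig can be passed through the Config argument (the values below are only illustrative):

from boto3.s3.transfer import TransferConfig

# Illustrative tuning: 16 MB parts, up to 4 concurrent part uploads
config = TransferConfig(multipart_chunksize=16 * 1024 * 1024, max_concurrency=4)
dst.upload_fileobj(src_response['Body'], dst_bucket, dst_key, Config=config)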
Answered By: jarmod

This is just a suggestion that might provide an updated approach. Most tech articles about how to transfer S3 files from one account to another rely on the destination account to "pull" the files so that the destination account ends up owning the copied files.

However, per this article from AWS, you can now configure buckets with a Bucket owner enforced setting—and in fact this is the default for newly created buckets:

Objects in Amazon S3 are no longer automatically owned by the AWS account that uploads them. By default, any newly created buckets now have the Bucket owner enforced setting enabled.
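
If the destination bucket is older and does not already have this setting, it can be enabled through the bucket ownership controls API (a sketch with a placeholder bucket name):

import boto3

s3 = boto3.client('s3')  # destination-account credentials
s3.put_bucket_ownership_controls(
    Bucket='destination-bucket',
    OwnershipControls={'Rules': [{'ObjectOwnership': 'BucketOwnerEnforced'}]},
)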

On the destination bucket, you should be able to grant the source account's IAM user permission to "push" objects to that bucket. Then, with the appropriate S3 commands or API calls, you should be able to copy files directly from the source to the destination without having to read, buffer, and write the data through your Python client.

You might want to test and verify the permissions configuration with the AWS CLI, and then determine how to implement it in Python.
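
A rough sketch of that push approach, assuming Bucket owner enforced is active on the destination bucket; the bucket names, key, and user ARN below are placeholders:

import json
import boto3

# Step 1 (destination-account credentials): allow the source account's IAM user
# to put objects into the destination bucket. Note this replaces any existing
# bucket policy.
dst_admin = boto3.client('s3')
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:user/source-reader"},
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::destination-bucket/*",
    }],
}
dst_admin.put_bucket_policy(Bucket='destination-bucket', Policy=json.dumps(policy))

# Step 2 (source-account IAM user credentials): server-side copy, so no object
# data passes through the client
src_user = boto3.client('s3', aws_access_key_id='SRC_ACCESS_KEY',
                         aws_secret_access_key='SRC_SECRET_KEY')
src_user.copy_object(
    Bucket='destination-bucket',
    Key='path/to/file',
    CopySource={'Bucket': 'source-bucket', 'Key': 'path/to/file'},
)

Note that copy_object handles objects up to 5 GB in a single call; larger objects would need a multipart copy (upload_part_copy).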

Answered By: Martin_W