Faster way to build an S3 "folder hierarchy" than parsing filenames?

Question:

I want to make a relatively basic tool to browse a bucket in S3 as a file hierarchy rather than simply a list of filenames with slashes in them.

Currently, I am using boto to get the list of key names in a bucket and then parsing the key names to build a nested dictionary of the “folders” and files. However, that process is very slow: even just going through each key to collect the higher-level folders takes 15+ minutes.
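
For reference, the parsing approach described above could look roughly like the sketch below. The function name build_tree and the choice to map files to None are illustrative assumptions, not the asker's actual code. It also assumes no key is used both as a file and as a folder name (for example "a" and "a/b" in the same bucket).

```python
def build_tree(keys):
    """Turn 'a/b/c.txt'-style key names into a nested dict.

    Folders become dicts; files map to None (an arbitrary choice for this sketch).
    Assumes no name is both a file and a folder at the same level.
    """
    tree = {}
    for key in keys:
        parts = key.split("/")
        node = tree
        # Walk (and create as needed) one dict per folder component.
        for folder in parts[:-1]:
            node = node.setdefault(folder, {})
        # A key ending in "/" is a folder marker and has no file name.
        if parts[-1]:
            node[parts[-1]] = None
    return tree
```

This loop touches each key once, so for a large bucket most of the time is likely spent fetching the full key list rather than in the parsing itself.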

How do tools such as Cyberduck give a list of folders so quickly?

Asked By: Daniel Gorelik


Answers:

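See the S3 documentation on listing keys hierarchically: http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysHierarchy.html

listObjects() takes a parameter called delimiter. If you set it to /, the response contains only the keys at the requested level, and everything below the next / is rolled up into a list of common prefixes, which you can treat as folders. Combined with the prefix parameter, this lets you fetch one folder at a time instead of listing and parsing every key in the bucket. I think this is what you’re looking for.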
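
A minimal sketch of the delimiter-based listing, assuming boto3 rather than the older boto mentioned in the question. The helper name list_level and the bucket name "my-bucket" are hypothetical. The client is passed in as an argument so the function itself needs no credentials to import.

```python
def list_level(client, bucket, prefix=""):
    """Return (folders, files) found directly under `prefix`, one level deep.

    `client` is expected to be a boto3 S3 client (boto3.client("s3")).
    """
    paginator = client.get_paginator("list_objects_v2")
    folders, files = [], []
    # Delimiter="/" makes S3 roll deeper keys up into CommonPrefixes.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        folders.extend(cp["Prefix"] for cp in page.get("CommonPrefixes", []))
        files.extend(obj["Key"] for obj in page.get("Contents", []))
    return folders, files


def main():
    # Assumption: boto3 is installed and AWS credentials are configured.
    import boto3

    s3 = boto3.client("s3")
    folders, files = list_level(s3, "my-bucket")  # hypothetical bucket name
    for name in folders + files:
        print(name)
```

Calling main() would print the top level of the bucket. To open a folder, call list_level again with one of the returned folder prefixes (for example "static/") as prefix, so only the levels a user actually browses are ever listed.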

Answered By: yegor256

Using s3-tree might be helpful.

https://pypi.org/project/s3-tree/

Example:

$ s3-tree bucketname
bucketname
├── asset-manifest.json
├── favicon.ico
├── index.html
├── manifest.json
├── precache-manifest.e8c8442b93de34204de5f9b23fa0174b.js
├── service-worker.js
└── static
    ├── css
    │   ├── main.43b5e879.chunk.css
    │   └── main.43b5e879.chunk.css.map
    ├── js
    │   ├── 1.f6579156.chunk.js
    │   ├── 1.f6579156.chunk.js.map
    │   ├── main.36bbb0f4.chunk.js
    │   ├── main.36bbb0f4.chunk.js.map
    │   ├── runtime~main.229c360f.js
    │   └── runtime~main.229c360f.js.map
    └── media
        ├── her.37588412.png
        ├── me.e69004b8.png
        └── us.f114bc8d.jpg

4 directories, 17 files

Answered By: PKP