Python – Download File Using Requests, Directly to Memory

Question

The goal is to download a file from the internet and create a file object, or a file-like object, from it without it ever touching the hard drive. This is just for my own knowledge: I want to know whether it's possible or practical, particularly because I would like to see if I can circumvent having to code a file deletion line.

This is how I would normally download something from the web, and map it to memory:

import requests
import mmap

u = requests.get("http://www.pythonchallenge.com/pc/def/channel.zip")

with open("channel.zip", "wb") as f: # I want to eliminate this, as this writes to disk
    f.write(u.content)

with open("channel.zip", "r+b") as f: # and this as well, because it reads from disk
    mm = mmap.mmap(f.fileno(), 0)
    mm.seek(0)
    print(mm.readline())
    mm.close() # question: if I do not include this, does this become a memory leak?

Asked By: Anon

Answer 1

Your answer is u.content. The content is in the memory. Unless you write it to a file, it won’t be stored on disk.
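If an mmap-style object is still wanted, as in the question, one option is an anonymous memory map, which has no backing file. The following is a minimal sketch, not part of the original answer: the helper name bytes_to_anonymous_mmap is hypothetical, and payload is a stand-in for u.content so that the example runs without a network connection.

```python
import mmap

def bytes_to_anonymous_mmap(data):
    """Copy a non-empty bytes object into an anonymous (not file-backed) map."""
    mm = mmap.mmap(-1, len(data))  # fileno -1 requests an anonymous mapping
    mm.write(data)                 # copy the bytes into the mapping
    mm.seek(0)                     # rewind so reads start at the beginning
    return mm

payload = b"first line\nsecond line\n"  # hypothetical stand-in for u.content
mm = bytes_to_anonymous_mmap(payload)
first_line = mm.readline()
mm.close()
print(first_line)
```

With requests, the same helper could be called as bytes_to_anonymous_mmap(u.content), provided the response body is not empty.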

Answered By: poke

Answer 2

r.raw (HTTPResponse) is already a file-like object (just pass stream=True):

#!/usr/bin/env python
import sys
import requests # $ pip install requests
from PIL import Image # $ pip install pillow

url = sys.argv[1]
r = requests.get(url, stream=True)
r.raw.decode_content = True # decompress the body according to Content-Encoding
im = Image.open(r.raw) # NOTE: requires Pillow 2.8+
print(im.format, im.mode, im.size)

In general, if you have a bytestring, you can wrap it as f = io.BytesIO(r.content) to get a file-like object without touching the disk:

#!/usr/bin/env python
import io
import zipfile
from contextlib import closing
import requests # $ pip install requests

r = requests.get("http://www.pythonchallenge.com/pc/def/channel.zip")
with closing(r), zipfile.ZipFile(io.BytesIO(r.content)) as archive:
    print({member.filename: archive.read(member) for member in archive.infolist()})

You can’t pass r.raw to ZipFile() directly because the former is a non-seekable file.
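One way around the seekability limit is to read the whole stream into an io.BytesIO buffer first, since BytesIO supports seeking. The sketch below assumes a hypothetical helper, read_zip_from_stream, and builds a small archive in memory in place of a real download so that it runs offline.

```python
import io
import zipfile

def read_zip_from_stream(stream):
    """Buffer a (possibly non-seekable) binary stream in memory, then unzip it."""
    buffer = io.BytesIO(stream.read())  # BytesIO is seekable, unlike r.raw
    with zipfile.ZipFile(buffer) as archive:
        return {name: archive.read(name) for name in archive.namelist()}

# Build a tiny zip archive in memory; this stands in for the downloaded file.
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as archive:
    archive.writestr("hello.txt", "hello")
source.seek(0)

members = read_zip_from_stream(source)
print(members)
```

With a real response, the stream argument would presumably be r.raw from requests.get(url, stream=True), with r.raw.decode_content = True set as in the first example.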

“I would like to see if I can circumvent having to code a file deletion line”

tempfile can delete files automatically: f = tempfile.SpooledTemporaryFile(); f.write(u.content). The data is kept in memory until the .fileno() method is called (if some API requires a real file) or max_size is exceeded. Even if the data is written to disk, the file is deleted as soon as it is closed.
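A minimal sketch of that approach follows. The 1 MiB max_size is an arbitrary value chosen for illustration, and payload is a stand-in for u.content so that the example runs without a download.

```python
import tempfile

payload = b"example bytes\n" * 100  # hypothetical stand-in for u.content

# max_size is an arbitrary 1 MiB threshold; smaller data stays in memory.
with tempfile.SpooledTemporaryFile(max_size=1024 * 1024) as f:
    f.write(payload)
    f.seek(0)              # rewind before reading the data back
    round_trip = f.read()
# Leaving the with-block closes f; any on-disk copy is removed automatically.

print(len(round_trip))
```

Using the file as a context manager means no explicit deletion line is needed, which is what the question asked for.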

Answered By: jfs

Answer 3

This is what I ended up doing.

import io
import zipfile
import requests

u = requests.get("http://www.pythonchallenge.com/pc/def/channel.zip")
f = io.BytesIO(u.content) # in-memory, seekable file-like object (replaces Python 2's StringIO.StringIO)

def extract_zip(input_zip):
    # Return a dict mapping each member name to its raw bytes.
    input_zip = zipfile.ZipFile(input_zip)
    return {i: input_zip.read(i) for i in input_zip.namelist()}

extracted = extract_zip(f)

Answered By: Anon
