How to measure download speed and progress using requests?

Question:

I am using requests to download files, but for large files I have to check the size of the file on disk every time, because I can’t display the progress as a percentage. I would also like to know the download speed. How can I go about doing it? Here’s my code:

import requests
import sys
import time
import os

def downloadFile(url, directory) :
  localFilename = url.split('/')[-1]
  r = requests.get(url, stream=True)

  start = time.clock()
  f = open(directory + '/' + localFilename, 'wb')
  for chunk in r.iter_content(chunk_size = 512 * 1024) :
        if chunk :
              f.write(chunk)
              f.flush()
              os.fsync(f.fileno())
  f.close()
  return (time.clock() - start)

def main() :
  if len(sys.argv) > 1 :
        url = sys.argv[1]
  else :
        url = raw_input("Enter the URL : ")
  directory = raw_input("Where would you want to save the file ?")

  time_elapsed = downloadFile(url, directory)
  print "Download complete..."
  print "Time Elapsed: " + str(time_elapsed)


if __name__ == "__main__" :
  main()

I think one way to do it would be to read the file on every iteration of the for loop and calculate the percentage of progress from the Content-Length header. But that would again be an issue for large files (around 500MB). Is there any other way to do it?

Asked By: Mayank Kumar

Answers:

See here: Python progress bar and downloads

I think the code would be something like this; it should show the average speed since the start, in bytes per second:

import requests
import sys
import time

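def downloadFile(url, directory) :
  localFilename = url.split('/')[-1]
  with open(directory + '/' + localFilename, 'wb') as f:
    # time.time() is wall-clock time; time.clock() measures CPU time on Unix
    start = time.time()
    r = requests.get(url, stream=True)
    total_length = r.headers.get('content-length')
    dl = 0
    if total_length is None: # no content length header
      f.write(r.content)
    else:
      total_length = int(total_length) # header values are strings
      for chunk in r.iter_content(1024):
        dl += len(chunk)
        f.write(chunk)
        done = int(50 * dl / total_length)
        elapsed = time.time() - start
        speed = dl // elapsed if elapsed > 0 else 0
        # '\r' moves back to the start of the line so the bar redraws in place
        sys.stdout.write("\r[%s%s] %s bytes/s" % ('=' * done, ' ' * (50-done), speed))
        sys.stdout.flush()
      sys.stdout.write("\n")
  return (time.time() - start)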

def main() :
  if len(sys.argv) > 1 :
        url = sys.argv[1]
  else :
        url = raw_input("Enter the URL : ")
  directory = raw_input("Where would you want to save the file ?")

  time_elapsed = downloadFile(url, directory)
  print "Download complete..."
  print "Time Elapsed: " + str(time_elapsed)


if __name__ == "__main__" :
  main()
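
For readers on Python 3, the arithmetic behind the progress line can be isolated in a small helper. This is only a sketch of the idea above, not part of the original answer; format_progress, its parameters and the default width of 50 are hypothetical names chosen for illustration.

```python
def format_progress(downloaded, total, elapsed, width=50):
    """Return a one-line progress bar with percentage and average speed."""
    # Fraction of the file received so far, capped at 1.0
    fraction = min(downloaded / total, 1.0) if total > 0 else 0.0
    done = int(width * fraction)
    # Average speed since the start, in bytes per second
    speed = downloaded / elapsed if elapsed > 0 else 0.0
    bar = "=" * done + " " * (width - done)
    return "[%s] %3d%% %.0f bytes/s" % (bar, int(fraction * 100), speed)
```

Inside the download loop it could be called as sys.stdout.write("\r" + format_progress(dl, int(total_length), time.time() - start)), which needs no extra reads of the file on disk.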
Answered By: freeforall tousez

An improved version of the accepted answer for Python 3, using io.BytesIO (writes to memory), with the result in Mbps and support for ipv4/ipv6, size and port arguments.

import sys, time, io, requests

def speed_test(size=5, ipv="ipv4", port=80):

    if size == 1024:
        size = "1GB"
    else:
        size = f"{size}MB"
    url = f"http://{ipv}.download.thinkbroadband.com:{port}/{size}.zip"
    with io.BytesIO() as f:
        start = time.perf_counter()
        r = requests.get(url, stream=True)
        total_length = r.headers.get('content-length')
        dl = 0
        if total_length is None: # no content length header
            f.write(r.content)
        else:
            for chunk in r.iter_content(1024):
                dl += len(chunk)
                f.write(chunk)
                done = int(30 * dl / int(total_length))
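                # '\r' redraws the bar in place. Note: bytes/s divided by 100000 only
                # approximates Mbps; the exact figure would be dl * 8 / elapsed / 1e6.
                sys.stdout.write("\r[%s%s] %s Mbps" % ('=' * done, ' ' * (30-done), dl//(time.perf_counter() - start) / 100000))
    print( f"\n{size} = {(time.perf_counter() - start):.2f} seconds")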

Usage Examples:

speed_test()
speed_test(10)
speed_test(50, "ipv6")
speed_test(1024, port=8080)

Output Sample:

[==============================] 61.34037 Mbps
100MB = 17.10 seconds
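
The Mbps figure in this answer is obtained by dividing bytes per second by 100000, which is only a rough estimate. As a hedged sketch of the exact conversion, assuming 1 megabit = 1,000,000 bits, something like the following could be used instead; to_mbps is a hypothetical helper name, not part of the answer's code.

```python
def to_mbps(num_bytes, seconds):
    """Convert a byte count and a duration in seconds to megabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # 8 bits per byte, 1,000,000 bits per megabit
    return num_bytes * 8 / seconds / 1_000_000
```

In the loop this would be called as to_mbps(dl, time.perf_counter() - start).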

Available Options:

size: 5, 10, 20, 50, 100, 200, 512, 1024

ipv: ipv4, ipv6

port: 80, 81, 8080


Updated on 20221011:

  • time.perf_counter() replaced time.clock(), which has been deprecated since Python 3.3 and was removed in Python 3.8 (kudos to shiro)
Answered By: Pedro Lobito

I had a problem downloading a big file from a particular slow server:

  1. no Content-Length header,
  2. big file (42GB),
  3. no compression,
  4. slow server (<1MB/s).

With a file this big, I also had a problem with memory usage during the request. Requests doesn’t write the output to a file the way urllib does; it looks like it keeps it in memory.

Without a Content-Length header, the accepted answer writes the whole response in one go and does no monitoring at all.

So I wrote this basic method to monitor the speed during the CSV download, following just the "requests" documentation.

It needs fname (the complete output path) and link (an http or https URL), and you can specify custom headers.

import time
import requests

BLOCK=5*1024*1024
try:
    with open(fname, 'wb') as f:
        r = requests.get(link, headers=headers, stream=True)

        ## Create the iterator once and loop over it: the official documentation warns
        ## that iter_lines is not reentrant safe, so calling it repeatedly can lose data
        lines = r.iter_lines()

        ## Init the base vars, for monitoring and block management
        ## obj is a bytearray, because iter_lines returns bytes objects
        tsize = 0; obj = bytearray(); t0=time.time(); i=0;
        for line in lines:

            ## iter_lines strips the line terminator, so add a newline back
            ## (this assumes '\n' line endings), then count the bytes and buffer them
            obj.extend(line + b'\n')
            tsize+=len(line)+1

            ## When the buffered data exceeds the block size,
            if tsize > BLOCK:   
                ## Increment the block number
                i+=1;
                
                ## Calculate the speed.. this is in MB/s, 
                ## but you can easily change to KB/s, or Blocks/s
                t1=time.time()
                t=t1-t0;
                speed=round(5/t, 2);

                ## Write the block to the file.
                f.write(obj)

                ## Write stats
                print('got', i*5, 'MB ', 'block' ,i, ' @', speed,'MB/s')

                ## Reinit all the base vars, for a new block
                obj=bytearray(); tsize=0; t0=time.time()

        ## Write the last block part to the file. 
        f.write(obj)

except Exception as e:
        print("Error: ", e, 0)
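
As a possible variation on the approach above, the block accounting can be separated from the network call so it is easier to test. The sketch below is an assumption-laden illustration, not the answer's own code: save_stream, its clock and report parameters, and the 64 KiB chunk size in the usage comment are all hypothetical. It also iterates over raw chunks (as Response.iter_content yields them) rather than lines, so the bytes are written unchanged.

```python
import io
import time

def save_stream(chunks, out, block_size=5 * 1024 * 1024,
                clock=time.monotonic, report=print):
    """Write chunks to out, reporting the speed each time block_size bytes arrive."""
    total = 0        # bytes written so far
    block_bytes = 0  # bytes received since the last report
    t0 = clock()
    for chunk in chunks:
        out.write(chunk)  # write immediately, so memory use stays at one chunk
        total += len(chunk)
        block_bytes += len(chunk)
        if block_bytes >= block_size:
            elapsed = clock() - t0
            mb = 1024 * 1024
            speed = block_bytes / elapsed / mb if elapsed > 0 else 0.0
            report("got %.1f MB @ %.2f MB/s" % (total / mb, speed))
            block_bytes = 0
            t0 = clock()
    return total

# Hypothetical usage with requests (fname, link and headers as in the answer):
# r = requests.get(link, headers=headers, stream=True)
# with open(fname, 'wb') as f:
#     save_stream(r.iter_content(chunk_size=64 * 1024), f)
```

Because nothing here depends on Content-Length, it reports speed even when the server does not send that header.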
Answered By: Daniele Rugginenti