How to use progressbar module with urlretrieve

Question:

My Python 3 script downloads a number of images over the internet using urlretrieve, and I’d like to add a progress bar showing the completed percentage and download speed for each download.

The progressbar module seems like a good solution, but although I’ve looked through its examples, and example4 seems like the right one, I still can’t work out how to wrap it around urlretrieve.

I guess I should add a third parameter:

urllib.request.urlretrieve('img_url', 'img_filename', some_progressbar_based_reporthook)

But how do I properly define it?

Asked By: Vasily


Answers:

The hook is defined as:

urlretrieve(url[, filename[, reporthook[, data]]])
“The third argument, if present, is a hook function that will be called
once on establishment of the network connection and once after each block
read thereafter. The hook will be passed three arguments; a count of blocks
transferred so far, a block size in bytes, and the total size of the file.
The third argument may be -1 on older FTP servers which do not return a
file size in response to a retrieval request.”
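
To make the three hook arguments concrete, here is a minimal stdlib-only sketch that turns them into a percentage. The helper name percent_done is hypothetical and not part of urllib; the sketch assumes that a non-positive total size means "unknown" and that the last block may overshoot the real file size.

```python
def percent_done(block_num, block_size, total_size):
    """Return the completed percentage, or None when the size is unknown."""
    # Assumption: total_size <= 0 (for example -1) means no size was reported.
    if total_size <= 0:
        return None
    # block_num counts whole blocks, so the final block can overshoot the file size.
    downloaded = min(block_num * block_size, total_size)
    return downloaded * 100 / total_size
```

The first hook call arrives with a block count of 0, so this reports 0 percent before any data has been read.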

So, you can write a hook as follows:

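import progressbar

# Global variable holding the bar for the current download
pbar = None

def show_progress(count, block_size, total_size):
    global pbar
    if pbar is None:
        pbar = progressbar.ProgressBar(maxval=total_size)
        pbar.start()

    # count is the number of blocks read so far; the last block may overshoot
    downloaded = min(count * block_size, total_size)
    pbar.update(downloaded)
    if downloaded >= total_size:
        pbar.finish()
        pbar = None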

As a side note, I strongly recommend the requests library, which is a lot easier to use and lets you iterate over the response with the iter_content() method.
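
A rough sketch of what that could look like follows. The function names save_stream and download_with_requests, the report callback, the 30 second timeout and the 8192 byte chunk size are all assumptions made for illustration; only requests.get(..., stream=True), raise_for_status() and iter_content() come from the requests library itself.

```python
import io


def save_stream(chunks, out, total_size=None, report=None):
    """Write byte chunks to `out`, calling report(downloaded, total_size) after each."""
    downloaded = 0
    for chunk in chunks:
        out.write(chunk)
        downloaded += len(chunk)
        if report is not None:
            report(downloaded, total_size)
    return downloaded


def download_with_requests(url, filename, report=None, chunk_size=8192):
    """Stream `url` to `filename`; needs the third-party requests package."""
    import requests  # imported lazily so save_stream works without it

    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        # Content-Length may be missing, in which case the total is unknown.
        length = response.headers.get("Content-Length")
        total_size = int(length) if length is not None else None
        with open(filename, "wb") as out:
            chunks = response.iter_content(chunk_size=chunk_size)
            return save_stream(chunks, out, total_size, report)


# Offline demonstration with in-memory data instead of a real response.
buffer = io.BytesIO()
seen = []
written = save_stream([b"abcd", b"ef"], buffer, total_size=6,
                      report=lambda done, total: seen.append((done, total)))
```

The report callback is where a progress bar update would go, for example a call to pbar.update(done).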

Answered By: Doron Cohen

The suggestion in the other answer did not progress for me past 1%. Here is a complete implementation that works for me on Python 3:

import progressbar
import urllib.request


pbar = None


def show_progress(block_num, block_size, total_size):
    global pbar
    if pbar is None:
        pbar = progressbar.ProgressBar(maxval=total_size)
        pbar.start()

    downloaded = block_num * block_size
    if downloaded < total_size:
        pbar.update(downloaded)
    else:
        pbar.finish()
        pbar = None


urllib.request.urlretrieve(model_url, model_file, show_progress)

Answered By: Nic Dahlquist

I think a better solution is to create a class that holds all the needed state:

import urllib.request

import progressbar


class MyProgressBar:
    def __init__(self):
        self.pbar = None

    def __call__(self, block_num, block_size, total_size):
        # Create and start the bar on the first call, once the size is known
        if self.pbar is None:
            self.pbar = progressbar.ProgressBar(maxval=total_size)
            self.pbar.start()

        downloaded = block_num * block_size
        if downloaded < total_size:
            self.pbar.update(downloaded)
        else:
            self.pbar.finish()

and call it like this:

urllib.request.urlretrieve('img_url', 'img_filename', MyProgressBar())

Answered By: George C

In Python 3 you can achieve the same result without the progressbar module:

import urllib.request

# prepare progressbar
def show_progress(block_num, block_size, total_size):
    print(round(block_num * block_size / total_size * 100, 2), end="\r")

# use urlretrieve
urllib.request.urlretrieve(url, fileName, show_progress)
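
Since the question also asks for download speed, the same stdlib-only idea can be extended as sketched below. SpeedHook is a hypothetical class name, and the speed shown is a simple average since the first hook call, which is an assumption about what "speed" should mean here rather than what the progressbar module computes.

```python
import time


class SpeedHook:
    """Hypothetical reporthook that prints percentage and average speed."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock  # injectable so the timing can be faked
        self.start = None

    def status(self, block_num, block_size, total_size):
        """Build the status text for one reporthook call."""
        if self.start is None:
            self.start = self.clock()
        elapsed = self.clock() - self.start
        downloaded = block_num * block_size
        if total_size > 0:
            # The final block may overshoot the real size.
            downloaded = min(downloaded, total_size)
        speed = downloaded / elapsed / 1024 if elapsed > 0 else 0.0
        if total_size > 0:
            percent = downloaded * 100 / total_size
            return f"{percent:6.2f}%  {speed:8.1f} KiB/s"
        # Unknown size: show the byte count instead of a percentage.
        return f"{downloaded} bytes  {speed:8.1f} KiB/s"

    def __call__(self, block_num, block_size, total_size):
        print(self.status(block_num, block_size, total_size), end="\r", flush=True)
```

It would be passed as urllib.request.urlretrieve(url, fileName, SpeedHook()), with a fresh instance per download so the timer restarts for each file.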

Answered By: mimau