How to use multiprocessing with multiple arguments in Python 3?

Question:

I had a for-loop that took each URL from a set of URLs, fetched it, and did some further processing. It was taking forever, so I tried to speed it up with multiprocessing, but I am struggling to get it working.

Thank you for any assistance.

def accessAndSaveFiles(urlSet, user, verboseFlag):
    with multiprocessing.Pool(os.cpu_count()) as pool:
        pool.starmap(processURL, zip(itertools.repeat(urlSet), user, verboseFlag))

def processURL(url, user, verboseFlag):
    filePath = some_path

    img_data = requests.get(url, allow_redirects=True)
    # Use a context manager so the file is closed once the write finishes
    with open(filePath, 'wb') as f:
        f.write(img_data.content)


def main():
    ...
    accessAndSaveFiles(urlSet, user, verboseFlag)
    ...

I get an error on the “pool.starmap(processURL, zip(itertools.repeat(urlSet), user, verboseFlag))” line saying “TypeError: zip argument #3 must support iteration”. I don’t want to iterate over this item, I just want to send the same value every time.

Asked By: Ryan

Answers:

Assuming that urlSet is an iterable, you should use:

pool.starmap(processURL, zip(urlSet, itertools.repeat(user), itertools.repeat(verboseFlag)))

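This is because you want to iterate over urlSet but pass the same user and verboseFlag to every processURL call (hence repeat). In your version the arguments are the wrong way round: urlSet is repeated, and zip tries to iterate over user and verboseFlag, which is what raises the TypeError.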
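As a minimal sketch of how the corrected call fits into the function from the question: processURL below is a stand-in that only reports its arguments instead of downloading anything, and buildArgs is a hypothetical helper added here so the argument tuples can be inspected on their own.

```python
import itertools
import multiprocessing
import os


def processURL(url, user, verboseFlag):
    # Stand-in for the real download logic (assumption): just report the arguments.
    return f"{user} fetched {url} (verbose={verboseFlag})"


def buildArgs(urlSet, user, verboseFlag):
    # Hypothetical helper: one tuple per URL, with user and verboseFlag
    # repeated unchanged in every tuple. zip stops as soon as urlSet is
    # exhausted, so the endless repeat() iterators are safe here.
    return list(zip(urlSet, itertools.repeat(user), itertools.repeat(verboseFlag)))


def accessAndSaveFiles(urlSet, user, verboseFlag):
    with multiprocessing.Pool(os.cpu_count()) as pool:
        # starmap unpacks each tuple into processURL(url, user, verboseFlag).
        return pool.starmap(processURL, buildArgs(urlSet, user, verboseFlag))
```

Call accessAndSaveFiles from under an `if __name__ == "__main__":` guard; this is required on platforms where multiprocessing starts workers by spawning a fresh interpreter, such as Windows.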

For reference, take a look at the question "Python multiprocessing pool.map for multiple arguments".

When iterated over, the output of zip should look something like

[('www.google.com', 'user1', True), ('www.google.uk', 'user1', True)]

for pool.starmap to make sense of it: each tuple becomes the argument list of one processURL call.
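If building the tuples feels awkward, a different technique that gets the same result is to freeze the constant arguments with functools.partial and use plain pool.map. This is only a sketch: processURL is again a stand-in that returns its arguments rather than downloading anything.

```python
import functools
import multiprocessing
import os


def processURL(url, user, verboseFlag):
    # Stand-in for the real download logic (assumption).
    return (url, user, verboseFlag)


def accessAndSaveFiles(urlSet, user, verboseFlag):
    # Freeze the two constant arguments; only url is left to supply.
    worker = functools.partial(processURL, user=user, verboseFlag=verboseFlag)
    with multiprocessing.Pool(os.cpu_count()) as pool:
        # map passes each element of urlSet as the single remaining argument.
        return pool.map(worker, urlSet)
```

Both approaches send the same user and verboseFlag with every URL; the partial version simply avoids zip and repeat.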

Answered By: Ankur S