How to set page range stop value when it can be anything in Python

Question:

I’ve set up my first script to scrape webpages, but it currently works by defining a start and stop point in my range, like so:

for page in range(1, 4):

However, I have no idea how many pages there will be at any given moment (value will regularly change) – what should I use as my stop value in this scenario? I want to avoid the hack of putting a ridiculously high value.

Asked By: cts

Answers:

There are many ways to skin this cat, but I would use the count function in itertools, which counts up indefinitely, and break out of the loop yourself once there are no more pages:

import itertools
for page in itertools.count(1):
    ...
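To make the loop body concrete, here is a minimal sketch of how the stop condition might look. The names FAKE_SITE, fetch_page and scrape_all are made up for illustration, and the "empty page means we are done" rule is an assumption; a real site might signal the end with a 404 or a missing "next" link instead.

```python
import itertools

# Hypothetical stand-in for a real site: three pages of results.
FAKE_SITE = {1: ["a", "b"], 2: ["c"], 3: ["d", "e"]}

def fetch_page(page):
    """Hypothetical fetcher: return the items on a page, or [] past the end."""
    return FAKE_SITE.get(page, [])

def scrape_all():
    results = []
    for page in itertools.count(1):  # 1, 2, 3, ... with no upper bound
        items = fetch_page(page)
        if not items:  # assumed end-of-pages signal
            break
        results.extend(items)
    return results
```

In a real scraper, fetch_page would make the HTTP request and parse the response; only the break condition needs to change to match how the site marks its last page.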

My first thought was that it would be easy to write your own generator to do this. That thought was immediately followed by others that led me to the solution I gave above. But what if there was some sort of sequence you wanted to generate that wasn’t handled by an existing generator? Generators really are trivial to write. To illustrate, here’s a pure-Python equivalent of count (in CPython the real one is implemented in C):

def count(start=0, step=1):
    n = start
    while True:
        yield n
        n += step

About the simplest thing you could write, yes? It’s good to know about yield and how generators work in general.
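As an example of a sequence no existing generator handles, here is a rough sketch of a generator that knows when to stop by itself. iter_pages and fake_fetch are hypothetical names, and it again assumes an empty result marks the end of the pages.

```python
def iter_pages(fetch, start=1):
    """Yield (page_number, items) until fetch returns nothing."""
    page = start
    while True:
        items = fetch(page)
        if not items:  # assumed end-of-pages signal
            return     # returning ends the generator
        yield page, items
        page += 1

# Hypothetical fetcher standing in for a real HTTP request.
def fake_fetch(page):
    return {1: ["a"], 2: ["b", "c"]}.get(page, [])

collected = list(iter_pages(fake_fetch))
```

This moves the stop condition out of the loop body, so the calling code can be a plain for loop over iter_pages(...) with no break.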

Answered By: CryptoFool