Web scraping: Loop through multiple URLs

Question:

I have successfully scraped several websites individually.
However, I now want a single script so that I don't have to run each script separately every time.
I would like to build a for loop that goes through all the websites, replacing the x in the URL with a string each time.
Unfortunately, the pages are not numbered, so I cannot step through them with "for x in range"; there are only the strings shown below.

Here is my current code:

from bs4 import BeautifulSoup
import requests
import pandas as pd
    

movielist = []

for x in ... ('action', 'comedy', 'thriller', 'drama', 'sport'): # what should i insert instead of ...?
    r = requests.get(f'https://movie.com/{x}', headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    spiele = soup.find_all('div', {'class': 'row'})

The site is not real; it's just a question of how to do this.

I would be very grateful for your help. Thank you very much.

Asked By: Dumbledore

||

Answers:

Just remove the "...".
Your tuple is iterable, so you can go through every element like this:

for x in ('action', 'comedy', 'thriller', 'drama', 'sport'):
    print(f'https://movie.com/{x}')

output:

https://movie.com/action
https://movie.com/comedy
https://movie.com/thriller
https://movie.com/drama
https://movie.com/sport
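
To relate this to the "for x in range" pattern from the question: a for loop walks over any iterable directly, so no numeric index is needed. The following is only a minimal sketch; the names `genres`, `urls_by_index` and `urls` are chosen for illustration and do not come from the question.

```python
genres = ('action', 'comedy', 'thriller', 'drama', 'sport')

# Index-based version, as one would write it with range():
urls_by_index = [f'https://movie.com/{genres[i]}' for i in range(len(genres))]

# Iterating over the tuple directly builds the same URLs with less code:
urls = [f'https://movie.com/{genre}' for genre in genres]

# Both approaches produce identical lists.
print(urls == urls_by_index)
```

Direct iteration is usually preferred when the index itself is not needed.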
Answered By: bitflip

You were pretty close! Just iterate over the movielist tuple; x will be the next element on each iteration.

from bs4 import BeautifulSoup
import requests
import pandas as pd

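movielist = ('action', 'comedy', 'thriller', 'drama', 'sport')

# headers is not defined in the question; this User-Agent value is only a placeholder assumption
headers = {'User-Agent': 'Mozilla/5.0'}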

for x in movielist:
    print(f'https://movie.com/{x}')
    r = requests.get(f'https://movie.com/{x}', headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    spiele = soup.find_all('div', {'class': 'row'})

With this slight change, it will make the requests, parse each response with BeautifulSoup, and print the constructed URLs to the console:

https://movie.com/action
https://movie.com/comedy
https://movie.com/thriller
https://movie.com/drama
https://movie.com/sport

However, you should probably store the result of each iteration somewhere, for example by appending "spiele" to a list, because the variable is overwritten on every pass through the loop.
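
One possible way to keep every iteration's result is to collect them in a dictionary keyed by the URL segment. This is a rough sketch, not a definitive implementation: `scrape_genres`, `fetch_rows` and `fake_fetch_rows` are hypothetical names, and the fake fetcher stands in for the requests.get plus soup.find_all step so the example runs without network access.

```python
from typing import Callable, Dict, Iterable, List

BASE_URL = 'https://movie.com'  # placeholder site from the question, not a real endpoint


def scrape_genres(
    genres: Iterable[str],
    fetch_rows: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Call fetch_rows once per genre URL and keep every result, keyed by genre."""
    results: Dict[str, List[str]] = {}
    for genre in genres:
        url = f'{BASE_URL}/{genre}'
        # Store the result under its genre instead of overwriting one variable.
        results[genre] = fetch_rows(url)
    return results


def fake_fetch_rows(url: str) -> List[str]:
    """Hypothetical stand-in for the request and parsing step."""
    return [f'row from {url}']


results = scrape_genres(('action', 'comedy'), fake_fetch_rows)
print(results['action'])
```

In a real script, the fetcher passed in would perform the request and return the parsed rows; keeping it as a parameter also makes the loop easy to try out offline.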

Answered By: LesniakM