Scraping Table Data from Multiple URLs, but the first link is repeating

Question:

I’m looking to iterate through the URL with "count" as a variable ranging from 1 to 65.

Right now I’m close, but I’m struggling to figure out the last piece: I’m receiving the same table (from the first page) 65 times instead of 65 different tables.

import requests
import pandas as pd

url = 'https://basketball.realgm.com/international/stats/2023/Averages/Qualified/All/player/All/desc/{count}'

res = []

for count in range(1, 65):
    html = requests.get(url).content
    df_list = pd.read_html(html)
    df = df_list[-1]
    res.append(df)

    print(res)
df.to_csv('my data.csv')

Any thoughts?

Asked By: Anthony Madle


Answers:

A few errors:

  • Your URL is never formatted: the string is not an f-string, so every request goes to the literal .../{count} instead of substituting the loop variable.
  • If you want pages 1 to 65 inclusive, use range(1, 66), since range's upper bound is exclusive.
  • Unless you want to export only the last dataframe, you need to concatenate all of them first
import pandas as pd

# No count here; it is appended inside the loop
url = 'https://basketball.realgm.com/international/stats/2023/Averages/Qualified/All/player/All/desc'
res = []

for count in range(1, 66):
    # pd.read_html accepts a URL, so no separate requests call is needed
    df_list = pd.read_html(f"{url}/{count}")
    res.append(df_list[-1])

# Combine all pages into one dataframe before exporting
pd.concat(res).to_csv('my data.csv')
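
If you would rather keep the structure of the script from the question, the minimal change is to substitute count into the URL template on each iteration, for example with str.format(), since the template already contains a {count} placeholder. A minimal sketch under that assumption, reusing the URL and output filename from the question:

import requests
import pandas as pd

# Template from the question, with {count} left as a placeholder
url = 'https://basketball.realgm.com/international/stats/2023/Averages/Qualified/All/player/All/desc/{count}'

res = []
for count in range(1, 66):
    # Fill in the current page number before making the request
    html = requests.get(url.format(count=count)).content
    df_list = pd.read_html(html)
    res.append(df_list[-1])

pd.concat(res).to_csv('my data.csv')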
Answered By: Code Different