How to read a specific table from a given url?

Question:

I am new to Python and want to download per-country GDP per capita data from this website: https://worldpopulationreview.com/countries/by-gdp

I tried to read the data with pandas, but I got a "No tables found" error.
I can see the data in r.text, but pandas cannot read it as a table.
How can I solve this problem and read the data?

MWE

import pandas as pd
import requests

url = "https://worldpopulationreview.com/countries/by-gdp"

r = requests.get(url)
raw_html = r.text  # I can see the data is here, but pd.read_html says no tables found
df_list = pd.read_html(raw_html)
print(len(df_list))
Asked By: dallascow

||

Answers:

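The data is embedded as JSON in <script id="__NEXT_DATA__" type="application/json"> and is only rendered into a table by the browser. The HTML returned by requests therefore contains no <table> element for pd.read_html to find, so you have to parse the JSON instead: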

pd.json_normalize(
    json.loads(
        BeautifulSoup(
            requests.get(url).text, "html.parser"
        ).select_one("#__NEXT_DATA__").text
    )["props"]["pageProps"]["data"]
)
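
To see the mechanism without hitting the network, here is a minimal sketch that uses only the standard library. SAMPLE_HTML is a made-up, heavily trimmed stand-in for the real page, and NextDataParser and extract_records are hypothetical helper names, not part of any library. It assumes the payload has the same props / pageProps / data shape as above.

```python
import json
from html.parser import HTMLParser

# Hypothetical stand-in for the real page: a script tag with JSON, no table markup.
SAMPLE_HTML = """
<html><body>
<div id="__next"></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"data": [
  {"country": "United States", "gdpPerCapita": 61762.9},
  {"country": "China", "gdpPerCapita": 10423.4}
]}}}
</script>
</body></html>
"""


class NextDataParser(HTMLParser):
    """Collects the text found inside the script tag whose id is __NEXT_DATA__."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_records(html):
    """Return the list of country records embedded in the page's JSON payload."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    payload = json.loads("".join(parser.chunks))
    return payload["props"]["pageProps"]["data"]


records = extract_records(SAMPLE_HTML)
print("<table" in SAMPLE_HTML)  # False: nothing for pd.read_html to pick up
print(records[0]["country"])    # United States
```

The resulting list of dicts is the kind of input that pd.json_normalize accepts.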

Example

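import json

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://worldpopulationreview.com/countries/by-gdp"

# The page ships its data as JSON inside <script id="__NEXT_DATA__">
soup = BeautifulSoup(requests.get(url).text, "html.parser")
payload = json.loads(soup.select_one("#__NEXT_DATA__").text)

# Flatten the list of country records into a DataFrame
df = pd.json_normalize(payload["props"]["pageProps"]["data"])
df[["continent", "country", "pop", "imfGDP", "unGDP", "gdpPerCapita"]]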

Output

         continent                   country          pop       imfGDP           unGDP  gdpPerCapita
0    North America             United States       338290  2.08938e+13  18624475000000       61762.9
1             Asia                     China  1.42589e+06  1.48626e+13  11218281029298       10423.4
210           Asia                     Syria      22125.2            0     22163075121       1001.71
211  North America  Turks and Caicos Islands       45.703            0       917550492       20076.4
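
Note that the key path props / pageProps / data reflects the page as it was when this was written and may change if the site is rebuilt. As a hedged sketch, a small helper (dig is a hypothetical name, and the payload below is made up to mirror the shape above) can make such a change fail with a readable message instead of a bare KeyError:

```python
def dig(payload, path):
    """Follow a sequence of keys into nested dicts, failing with a readable message."""
    current = payload
    for depth, key in enumerate(path):
        if not isinstance(current, dict) or key not in current:
            walked = " -> ".join(path[:depth]) or "<root>"
            raise KeyError(f"missing key {key!r} after {walked}")
        current = current[key]
    return current


# Hypothetical payload with the same shape as the one used in the answer.
payload = {
    "props": {"pageProps": {"data": [{"country": "Syria", "gdpPerCapita": 1001.71}]}}
}

rows = dig(payload, ["props", "pageProps", "data"])
print(rows[0]["country"])  # Syria

try:
    dig(payload, ["props", "pageProps", "countries"])
except KeyError as exc:
    # The message names the missing key and the part of the path that was found.
    print(exc)
```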
Answered By: HedgeHog