Pagination not showing up in parsed content (BeautifulSoup)

Question:

I am new to Python programming and I have a problem with pagination while using Beautiful Soup. All of the parsed content shows up except the pagination elements, which are missing from the output.
The website is https://www.yellowpages.lk/Medical.php.

from bs4 import BeautifulSoup
import requests
import time
import pandas as pd
from lxml import html

url = "https://www.yellowpages.lk/Medical.php"

result = requests.get(url)
time.sleep(5)

doc = BeautifulSoup(result.content, "lxml")

time.sleep(5)

Table = doc.find('table',{'id':'MedicalFacility'}).find('tbody').find_all('tr')
Page = doc.select('.col-lg-10')

C_List = []
D_List = []
N_List = []
A_List = []
T_List = []
W_List = []
V_List = []
M_List = []

print(doc.prettify())
print(Page)
while True:

    for i in range(0,25):
        Sort = Table[i]
        
        Category = Sort.find_all('td')[0].get_text().strip()
        C_List.insert(i,Category)
        
        District = Sort.find_all('td')[1].get_text().strip()
        D_List.insert(i,District)
        
        Name = Sort.find_all('td')[2].get_text().strip()
        N_List.insert(i,Name)
        
        Address = Sort.find_all('td')[3].get_text().strip()
        A_List.insert(i,Address)
        
        Telephone = Sort.find_all('td')[4].get_text().strip()
        T_List.insert(i,Telephone)
        
        Whatsapp = Sort.find_all('td')[5].get_text().strip()
        W_List.insert(i,Whatsapp)
        
        Viber = Sort.find_all('td')[6].get_text().strip()
        V_List.insert(i,Viber)

        MoH_Division = Sort.find_all('td')[7].get_text().strip()
        M_List.insert(i,MoH_Division)

I tried using .find() with a class and .select('.class') to see whether the pagination contents would show up, but so far nothing has worked.

Asked By: Prog_Beginner


Answers:

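The pagination is more or less superfluous on that page: the full table is already present in the HTML, and JavaScript generates the pagination in the browser purely for display. Requests does not run JavaScript, so the pagination elements never appear in the response, but the full data does.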
Here is one way of getting that information in full:

from io import StringIO

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

# Show every column, at full width, when printing the DataFrame
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
}

url = 'https://www.yellowpages.lk/Medical.php'

r = requests.get(url, headers=headers)
soup = bs(r.text, 'html.parser')
# The static HTML already contains every row of the table
table = soup.select_one('table[id="MedicalFacility"]')
# Wrap the markup in StringIO: recent pandas versions deprecate passing literal HTML strings
df = pd.read_html(StringIO(str(table)))[0]
print(df)

Result in terminal:

      Category  District     Name                         Address                           Telephone   WhatsApp      Viber         MoH Division
0     Pharmacy  Gampaha      A & B Pharmacy               171 Negambo Road Veyangoda        0778081515  9.477808e+10  9.477808e+10  Aththanagalla
1     Pharmacy  Trincomalee  A A Pharmacy                 350 Main Street Kanthale          0755576998  9.475558e+10  9.475558e+10  Kanthale
2     Pharmacy  Colombo      A Baur & Co Pvt Ltd          55 Grandpass Rd Col 14            0768200100  9.476820e+10  9.476820e+10  CMC
3     Pharmacy  Colombo      A Colombo Pharmacy           Ug 93 97 Peoples Park Colombo 11  0773771446  9.477377e+10  NaN           CMC
4     Pharmacy  Trincomalee  A R Pharmacy                 Main Street Kinniya-3             0771413838  9.477500e+10  9.477500e+10  Kinniya
...   ...       ...          ...                          ...                               ...         ...           ...           ...
1968  Pharmacy  Ampara       Zam Zam Pharmacy             Main Street Akkaraipattu          0672277698  9.477756e+10  9.477756e+10  Akkaraipattu
1969  Pharmacy  Batticaloa   Zattra Pharmacy              Jummah Mosque Rd Oddamawadi-1     0766689060  9.476669e+10  NaN           Oddamavady
1970  Pharmacy  Puttalam     Zeenath Pharmacy             Norochcholei                      0728431622  NaN           NaN           Kalpitiya
1971  Pharmacy  Puttalam     Zidha Pharmacy               Norochcholei                      0773271222  NaN           NaN           Kalpitiya
1972  Pharmacy  Gampaha      Zoomcare Pharmacy & Grocery  182/B/1 Rathdoluwa Seeduwa        0768378112  NaN           NaN           Seeduwa

1973 rows × 8 columns

See the pandas, BeautifulSoup, and Requests documentation for details.
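
For anyone who prefers to keep the row-by-row Beautiful Soup approach from the question, here is a minimal sketch. It runs on a small made-up HTML sample (SAMPLE_HTML, two rows) rather than the live page, and it assumes the real table keeps the MedicalFacility id and the eight-column order shown in the result above; parse_rows and COLUMNS are illustrative names, not part of any library. It loops over every row instead of a fixed range(0, 25), and it shows that selecting markup which is absent from the static HTML simply returns an empty list.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the static HTML that requests receives.
# Assumption: the real page uses the same table id and column order.
SAMPLE_HTML = """
<table id="MedicalFacility">
  <thead>
    <tr><th>Category</th><th>District</th><th>Name</th><th>Address</th>
        <th>Telephone</th><th>WhatsApp</th><th>Viber</th><th>MoH Division</th></tr>
  </thead>
  <tbody>
    <tr><td>Pharmacy</td><td>Gampaha</td><td>A &amp; B Pharmacy</td>
        <td>171 Negambo Road Veyangoda</td><td>0778081515</td>
        <td>94778081515</td><td>94778081515</td><td>Aththanagalla</td></tr>
    <tr><td>Pharmacy</td><td>Puttalam</td><td>Zeenath Pharmacy</td>
        <td>Norochcholei</td><td>0728431622</td>
        <td></td><td></td><td>Kalpitiya</td></tr>
  </tbody>
</table>
"""

COLUMNS = ["Category", "District", "Name", "Address",
           "Telephone", "WhatsApp", "Viber", "MoH Division"]


def parse_rows(html):
    """Return one dict per table row, however many rows there are."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="MedicalFacility")
    if table is None:
        return []
    records = []
    for tr in table.find("tbody").find_all("tr"):
        # Look up the cells once per row instead of once per column
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) == len(COLUMNS):  # skip malformed rows
            records.append(dict(zip(COLUMNS, cells)))
    return records


records = parse_rows(SAMPLE_HTML)

# Markup that only JavaScript would create is not in the static HTML,
# so the selector from the question finds nothing in this sample.
pagination = BeautifulSoup(SAMPLE_HTML, "html.parser").select(".col-lg-10")

print(len(records), "rows parsed;", len(pagination), "pagination elements found")
```

The list of dicts can be passed straight to pd.DataFrame(records) if a DataFrame is wanted afterwards.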

Answered By: Barry the Platipus

If you are using pandas, a couple of lines of code are enough to put the entire table into a DataFrame, using the pandas.read_html() function:

Code:

import pandas as pd

df = pd.read_html("https://www.yellowpages.lk/Medical.php")[0]

print(df)

Output: a DataFrame containing the entire table.
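
One thing to watch with read_html: the result shown earlier prints the WhatsApp and Viber numbers as floats such as 9.477808e+10, because pandas infers a numeric type for those columns. A possible workaround, sketched here on a small made-up sample (SAMPLE_HTML, not the live page), is to pass converters so the phone columns stay as text. The column names are taken from the output above, the sample values are illustrative only, and read_html needs lxml (or bs4 plus html5lib) installed.

```python
from io import StringIO

import pandas as pd

# Made-up sample with the same column names as the real table (assumption)
SAMPLE_HTML = """
<table id="MedicalFacility">
  <thead>
    <tr><th>Name</th><th>Telephone</th><th>WhatsApp</th><th>Viber</th></tr>
  </thead>
  <tbody>
    <tr><td>A &amp; B Pharmacy</td><td>0778081515</td>
        <td>94778081515</td><td>94778081515</td></tr>
    <tr><td>A A Pharmacy</td><td>0755576998</td>
        <td>94755576998</td><td>94755576998</td></tr>
  </tbody>
</table>
"""

# Keep phone-like columns as strings so leading zeros survive and
# long numbers are not shown in scientific notation
as_text = {"Telephone": str, "WhatsApp": str, "Viber": str}

df = pd.read_html(StringIO(SAMPLE_HTML), converters=as_text)[0]
print(df)
```

Against the live page, the same converters argument could be passed alongside the URL, assuming the column headers match.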

Answered By: ScottC