How to scrape tbody from a collapsible table using the BeautifulSoup library?

Question:

I recently built a COVID-19 dashboard that scrapes data from a website with a collapsible table. Everything worked until recently, when the Heroku app started showing errors. I reran my code on my local machine, and the error occurred while scraping the tbody. I then realized that the site has changed the way the table is rendered, and my code can no longer grab it. When I view the page source, I cannot find the table body (tbody) at all, but when I inspect a table row in the browser, the tbody and all of its data are there. How can I scrape the table now?
My code:
jupyter code
The table I have to grab:
website I'm scraping

Asked By: preethh

Answers:

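The data you see on the page is loaded from an external URL via Ajax, so it is not part of the page source that BeautifulSoup receives. You can load it directly with the requests module (the json module is only needed for pretty-printing):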

import json
import requests


url = 'https://www.mohfw.gov.in/data/datanew.json'
data = requests.get(url).json()

# uncomment to print all data:
# print(json.dumps(data, indent=4))

# print some data on screen:
for d in data:
    print('{:<30} {:<10} {:<10} {:<10} {:<10}'.format(d['state_name'], d['active'], d['positive'], d['cured'], d['death']))

Prints:

Andaman and Nicobar Islands    329        548        214        5         
Andhra Pradesh                 75720      140933     63864      1349      
Arunachal Pradesh              670        1591       918        3         
Assam                          9814       40269      30357      98        
Bihar                          17579      51233      33358      296       
Chandigarh                     369        1051       667        15        
Chhattisgarh                   2803       9086       6230       53        

... and so on.
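
To confirm that a table is filled in by JavaScript, you can check whether the raw HTML contains any rows inside the tbody. The following is only a minimal sketch: RAW_HTML is a made-up stand-in for the served page source (not the actual markup of the site), and TbodyRowCounter and count_tbody_rows are hypothetical helper names. It uses the standard library's html.parser so it runs without extra packages; with BeautifulSoup the equivalent check would be along the lines of soup.find('tbody').find_all('tr').

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the served page source: the table shell is
# present, but the <tbody> is empty until JavaScript fills it in.
RAW_HTML = """
<table id="state-data">
  <thead><tr><th>State</th><th>Active</th></tr></thead>
  <tbody></tbody>
</table>
"""


class TbodyRowCounter(HTMLParser):
    """Counts <tr> tags that appear inside a <tbody>."""

    def __init__(self):
        super().__init__()
        self.in_tbody = False
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.in_tbody = True
        elif tag == "tr" and self.in_tbody:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False


def count_tbody_rows(html):
    """Return the number of table rows found inside tbody elements."""
    counter = TbodyRowCounter()
    counter.feed(html)
    return counter.rows


print(count_tbody_rows(RAW_HTML))  # 0: nothing to scrape in the static HTML
```

If the count is zero for the downloaded HTML while the browser inspector shows rows, the data is coming from a separate request, which you can usually find in the Network tab of the browser's developer tools.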
Answered By: Andrej Kesely

Try:

import requests
import pandas as pd

# The JSON endpoint returns a list of dicts, one per state
r = requests.get('https://www.mohfw.gov.in/data/datanew.json')
df = pd.DataFrame(r.json())

# Convert sno to a number so rows sort numerically rather than as text
df['sno'] = pd.to_numeric(df['sno'], errors='coerce')
df = df.sort_values('sno')
print(df.to_string())

At the time of the answer, this printed:

    sno                                state_name  active positive    cured  death new_active new_positive new_cured new_death state_code
0     0               Andaman and Nicobar Islands     329      548      214      5        403          636       226         7         35
1     1                            Andhra Pradesh   75720   140933    63864   1349      72188       150209     76614      1407         28
2     2                         Arunachal Pradesh     670     1591      918      3        701         1673       969         3         12
3     3                                     Assam    9814    40269    30357     98      10183        41726     31442       101         18
4     4                                     Bihar   17579    51233    33358    296      18937        54240     34994       309         10
5     5                                Chandigarh     369     1051      667     15        378         1079       683        18         04
6     6                              Chhattisgarh    2803     9086     6230     53       2720         9385      6610        55         22
7     7  Dadra and Nagar Haveli and Daman and Diu     412     1100      686      2        418         1145       725         2         26
8     8                                     Delhi   10705   135598   120930   3963      10596       136716    122131      3989         07
9     9                                       Goa    1657     5913     4211     45       1707         6193      4438        48         30
10   10                                   Gujarat   14090    61438    44907   2441      14300        62463     45699      2464         24

and so on…
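
The pd.to_numeric call matters because the sort key may arrive as text (state_code in the output above keeps leading zeros such as 04, which suggests string values). The sketch below illustrates the difference in plain Python; the rows list is a hypothetical sample shaped like the endpoint's records, and treating sno as a string is an assumption.

```python
# Hypothetical sample shaped like the endpoint's records, with values as text.
rows = [
    {"sno": "10", "state_name": "Goa"},
    {"sno": "2", "state_name": "Andhra Pradesh"},
    {"sno": "1", "state_name": "Andaman and Nicobar Islands"},
]

# Sorting the raw strings compares character by character, so "10" < "2".
lexical = [r["sno"] for r in sorted(rows, key=lambda r: r["sno"])]

# Converting to int first gives the intended numeric order.
numeric = [r["sno"] for r in sorted(rows, key=lambda r: int(r["sno"]))]

print(lexical)  # ['1', '10', '2']
print(numeric)  # ['1', '2', '10']
```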

Answered By: UWTD TV