Loading data from a dynamically generated HTML URL

Question

I wanted to know whether Selenium is the only library that can access data from a table in a web page, specifically the one here.

When I try to parse these sites with bs4, the tables contain only the headers and no data. It works locally with Selenium, but I don’t have Chrome (or any other browser) on the box I’m working on. Is there another way?

Asked By: Alex

Answer 1

The page you linked to loads its table data from another resource using AJAX, which is why bs4 only sees the headers (you can see the request in the Network tab of your browser’s developer tools):

https://httpd.sslmate.com/ocspwatch/problems

It’s plain JSON, so you don’t even have to scrape the HTML:

import requests

certificates = requests.get("https://httpd.sslmate.com/ocspwatch/problems").json()
for cert in certificates:
    print(cert["problem_time"], ":", cert["problem"], "(", cert["operator_name"], ")")

Output:

2023-03-10T00:11:32+00:00 : error parsing OCSP response: ocsp: error from server: unauthorized ( GoDaddy )
2023-03-10T00:14:58+00:00 : error parsing OCSP response: OCSP response contains bad number of responses ( eMudhra Technologies Limited )
2023-03-10T00:14:57+00:00 : OCSP responder does not know this certificate ( Netlock )
...
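
If installing requests on the box is also a problem, the same idea can be sketched with the standard library alone. This is only a minimal sketch, not part of the original answer: the helper names fetch_json and to_rows, the FIELDS tuple, the Accept header and the 10-second timeout are illustrative assumptions, and the field names are simply the ones the snippet above reads.

```python
import json
from urllib.request import Request, urlopen

# URL taken from the answer above.
PROBLEMS_URL = "https://httpd.sslmate.com/ocspwatch/problems"
# Keys read by the answer's snippet; the column order here is an arbitrary choice.
FIELDS = ("problem_time", "operator_name", "problem")


def fetch_json(url, timeout=10):
    """Download a URL and decode its body as JSON (hypothetical helper)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def to_rows(records, fields=FIELDS):
    """Turn a list of dicts into table rows, using '' for any missing key."""
    return [tuple(str(record.get(name, "")) for name in fields) for record in records]
```

Calling to_rows(fetch_json(PROBLEMS_URL)) should then give one tuple per certificate problem, which can be printed or written out with the csv module. Keeping the download separate from the row building means the row logic can be checked against a saved sample without touching the network.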

Answered By: Selcuk
