Web scraping returns a {value} placeholder instead of the value shown on the webpage

Question:

I was trying to get the value of some items from the Brazilian lotteries website using Beautiful Soup. The page shows one thing when I browse it and something else, not really useful, when I scrape it.

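import requests
from bs4 import BeautifulSoup
# Fetch the page first: BeautifulSoup parses markup, it does not download URLs
html = requests.get("https://loterias.caixa.gov.br/Paginas/App/Mega-Sena.aspx").text
soup = BeautifulSoup(html, 'html.parser')
soup.find_all("p", class_="value ng-binding")[0].text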

The response I get is:

{{resultado.valorEstimadoProximoConcurso | currency}}                            

What I wanted to get is the following (this is today’s value; it changes with the date):

R$ 500.000.000,00

Is there any way to retrieve the actual values?

Asked By: Tanise Brandão

||

Answers:

The issue is that you are scraping with a plain HTTP request, but the data you want is rendered by JavaScript from an external API.

An HTTP request doesn’t execute JavaScript the way a browser does, so the {{...}} template placeholder in the page is never filled in with the real value. You could use a headless browser such as Puppeteer, but it’s overkill for this purpose and I don’t recommend it.

I searched for this value in the Chrome Developer Tools under "Network" and found that the external URL they are getting the data from is:

https://servicebus2.caixa.gov.br/portaldeloterias/api/megasena/

You can see it in the JSON body under valorEstimadoProximoConcurso. You will need to parse the response as JSON rather than HTML (no need for Beautiful Soup!).
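A minimal sketch of that approach, using only the Python standard library. The helper names (extract_estimate, format_brl, fetch_estimate) are made up for illustration. It assumes the endpoint returns a JSON object whose valorEstimadoProximoConcurso field is a number. The endpoint might also require extra headers or reject automated clients; that is untested here.

```python
import json
import urllib.request

# URL found in the browser's Network tab (see above)
API_URL = "https://servicebus2.caixa.gov.br/portaldeloterias/api/megasena/"


def extract_estimate(payload: dict) -> float:
    """Read the estimated prize from a decoded API response.

    Assumption: the field holds a numeric value.
    """
    return float(payload["valorEstimadoProximoConcurso"])


def format_brl(value: float) -> str:
    """Format a number the Brazilian way, e.g. R$ 1.234,50."""
    # Format with US separators first, then swap "," and "."
    us_style = f"{value:,.2f}"
    swapped = us_style.replace(",", "X").replace(".", ",").replace("X", ".")
    return f"R$ {swapped}"


def fetch_estimate(url: str = API_URL, timeout: float = 10.0) -> float:
    """Download the JSON document and return the estimated prize."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return extract_estimate(payload)


# Usage (performs a real network request):
#     print(format_brl(fetch_estimate()))
```

Splitting the parsing from the download keeps the JSON handling easy to check against a saved response without hitting the server each time.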

Hope this helped!

Answered By: Conor Reid