How to extract a specific part of HTML using BeautifulSoup?

Question:

I am trying to extract the value of the ‘title’ attribute from the following HTML, but so far I haven’t managed to.

<div class="pull_right date details" title="22.12.2022 01:49:03 UTC-03:00">

This is my code:

from bs4 import BeautifulSoup

with open("messages.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

results = soup.find_all('div', attrs={'class':'pull_right date details'})

print(results)

And the output is a list of all the matching <div> elements in the HTML file.

Asked By: Marco Almeida

||

Answers:

To access the value of the title attribute, simply index the tag with ['title'].

If you use find_all, it returns a list of tags, so you need to pick an element by index first (e.g. [0]['title']).

For example:

from bs4 import BeautifulSoup

fp = '<html><div class="pull_right date details" title="22.12.2022 01:49:03 UTC-03:00"></div></html>'
soup = BeautifulSoup(fp, 'html.parser')

results = soup.find_all('div', attrs={'class':'pull_right date details'})

print(results[0]['title'])

Or:

results = soup.find('div', attrs={'class':'pull_right date details'})

print(results['title'])

Output:

22.12.2022 01:49:03 UTC-03:00
22.12.2022 01:49:03 UTC-03:00
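
Since the real file probably contains many such divs, here is a minimal sketch of collecting every title at once. The sample markup and the second timestamp are made up for illustration; in practice the soup would be built from messages.html as in the question. It uses .get('title') rather than ['title'], which returns None instead of raising a KeyError when a div has no title attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the contents of messages.html
html = """
<div class="pull_right date details" title="22.12.2022 01:49:03 UTC-03:00">01:49</div>
<div class="pull_right date details" title="22.12.2022 01:50:10 UTC-03:00">01:50</div>
<div class="pull_right date details">no title here</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like collection of every matching tag
divs = soup.find_all("div", attrs={"class": "pull_right date details"})

# .get() returns None for a missing attribute, so those divs are skipped
titles = [div.get("title") for div in divs if div.get("title") is not None]

for title in titles:
    print(title)
```

With the sample markup above, this prints the two timestamps and skips the div that has no title.
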
Answered By: Greg