Scraping only the date from a p tag with Python and BeautifulSoup: .split('.', "") is not working

Question:

I am scraping a date from a website using Python and BeautifulSoup, and I am facing a splitting issue: .split('.', "") is not working when I try to extract only the date from this p tag: <p class="text-xs">Oct 24, 2017 • 4 min read</p>. I don't want the dot or the "4 min read" part from this p tag. My current code:

Published_Date = soup.select_one('p[class="text-xs"]').get('datetime')
Asked By: Info Rewind

Answers:

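  1. The large dot in the text is a bullet character (•), which is different from the period (.) you are passing to the split() method, so the split never matches.

  2. So replace the bullet with another symbol, split on that symbol, and take the first element of the resulting list by indexing.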

Example:

from bs4 import BeautifulSoup

html ='''
<p class="text-xs">Oct 24, 2017 • 4 min read</p>

'''

soup = BeautifulSoup(html,'html.parser')

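# Get the visible text of the <p class="text-xs"> element
date = soup.select_one('p.text-xs').get_text(strip=True)
# Swap the bullet for '|', split on it, keep the first part, and trim the trailing space
print(date.replace('•','|').split('|')[0].strip())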

Output:

Oct 24, 2017
Answered By: Fazlul
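
As a follow-up to the answer above, here is a minimal sketch of a variation that splits on the bullet directly and also parses the result into a date object. It is not part of the original answer. The helper name extract_date and the BULLET constant are hypothetical, and the format string assumes the site always writes dates like "Oct 24, 2017" with English month abbreviations.

```python
from datetime import datetime

BULLET = "\u2022"  # the "•" character that separates the date from the read time


def extract_date(text):
    """Return the date before the bullet as a datetime.date (hypothetical helper)."""
    # Split once on the bullet, keep the left part, and trim surrounding whitespace
    date_part = text.split(BULLET, 1)[0].strip()
    # Assumed format: abbreviated month, day, comma, four-digit year
    return datetime.strptime(date_part, "%b %d, %Y").date()


# In the scraper, this string would come from
# soup.select_one('p.text-xs').get_text(strip=True)
raw = "Oct 24, 2017 \u2022 4 min read"
print(extract_date(raw).isoformat())
```

Parsing into a date object makes it easier to sort or reformat the value later. If the text does not match the assumed format, strptime raises a ValueError.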