How to find all divs whose class starts with a string in BeautifulSoup?

Question:

In BeautifulSoup, if I want to find all divs whose class is span3, I’d just do:

result = soup.findAll("div",{"class":"span3"})

However, in my case, I want to find all divs whose class starts with span3, so BeautifulSoup should find:

<div id="span3 span49">
<div id="span3 span39">

And so on…

How do I achieve this? I am familiar with regular expressions; however, I do not know how to use them with BeautifulSoup, and I did not find any help in BeautifulSoup’s documentation.

Asked By: George Chalhoub

Answers:

Well, these are id attributes you are showing:

<div id="span3 span49">
<div id="span3 span39">

In this case, you can use:

soup.find_all("div", id=lambda value: value and value.startswith("span3"))

Or:

import re

soup.find_all("div", id=re.compile("^span3"))
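
To see both id filters side by side, here is a minimal sketch. The sample HTML and the variable names are made up for illustration, and it assumes the built-in "html.parser" backend:

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the question's real page.
html = """
<div id="span3 span49">a</div>
<div id="span3 span39">b</div>
<div id="span4 span3">c</div>
<div>no id</div>
"""
soup = BeautifulSoup(html, "html.parser")

# The function receives the id value, or None when the tag has no id,
# which is why the "value and ..." guard is needed.
by_lambda = soup.find_all("div", id=lambda value: value and value.startswith("span3"))

# A compiled pattern is applied with search(), so "^" anchors it to the start.
by_regex = soup.find_all("div", id=re.compile("^span3"))

lambda_texts = [div.get_text() for div in by_lambda]
regex_texts = [div.get_text() for div in by_regex]
print(lambda_texts)  # ['a', 'b']
print(regex_texts)   # ['a', 'b']
```

The third div is skipped by both filters because its id only contains span3 later in the string.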

If this was just a typo, and you actually have class attributes that start with span3 and you really need to check that the class starts with span3, you can use the “starts-with” CSS selector:

soup.select("div[class^=span3]")

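This differs from the id case because class is special: it is a multi-valued attribute. BeautifulSoup stores it as a list of class names and tests a filter against each name individually, so a lambda or regular expression passed for class does not behave the way it does for id, where the filter sees the whole attribute string.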
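
A small sketch of what multi-valued means in practice, using made-up HTML and the "html.parser" backend (both are assumptions for illustration). It compares a regular expression passed as class_ with the CSS selector:

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration only.
html = """
<div class="span3 span49">first</div>
<div class="row span39">second</div>
<div class="row wide">third</div>
"""
soup = BeautifulSoup(html, "html.parser")

# The class attribute is exposed as a list of class names, not one string.
first_classes = soup.find("div")["class"]
print(first_classes)  # ['span3', 'span49']

# A regex given as class_ is tried against each class name, so the second
# div matches as well: its class "span39" starts with "span3".
any_class = [d.get_text() for d in soup.find_all("div", class_=re.compile("^span3"))]
print(any_class)  # ['first', 'second']

# The CSS selector tests the attribute value as a whole, so only a div
# whose class attribute begins with "span3" matches.
attr_prefix = [d.get_text() for d in soup.select("div[class^=span3]")]
print(attr_prefix)  # ['first']
```

Which of the two you want depends on whether span3 must be the first class listed or may be any of the classes.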

Answered By: alecxe

This works too:

soup.select("div[class*=span3]") # *= means "contains": span3 may appear anywhere in the class attribute
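
Note that *= is broader than ^=. A quick sketch of the difference, assuming made-up HTML and the "html.parser" backend:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration only.
html = """
<div class="span3 span49">prefix</div>
<div class="row span39">middle</div>
<div class="myspan3">embedded</div>
<div class="row">none</div>
"""
soup = BeautifulSoup(html, "html.parser")

# ^= : the class attribute value must begin with "span3".
starts = [d.get_text() for d in soup.select("div[class^=span3]")]
print(starts)  # ['prefix']

# *= : "span3" may appear anywhere in the value, even inside another name.
contains = [d.get_text() for d in soup.select("div[class*=span3]")]
print(contains)  # ['prefix', 'middle', 'embedded']
```

So *= also picks up classes such as myspan3, which may or may not be what you want.
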
Answered By: oscarAguayo