BeautifulSoup multiple class selector

Question:

I want to select all the divs which have BOTH A and B as class attributes.

The following selection

soup.findAll('div', class_=['A', 'B'])

however, selects all the divs which have EITHER A or B in their class attribute. A div may carry many other classes (C, D, etc.) in any order, but I want to select only the divs that have both A and B.

Asked By: Botond

Answers:

Use css selectors instead:

soup.select('div.A.B')
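
As a rough illustration of the difference (the sample markup below is made up for this sketch, not taken from the question), comparing the two calls side by side might look like this:

```python
from bs4 import BeautifulSoup

# Made-up sample markup, for illustration only.
html = """
<div class="A B">both</div>
<div class="B C A">both, different order</div>
<div class="A C">only A</div>
<div class="B">only B</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A list passed to class_ matches a div carrying ANY of the listed classes.
either = [d.get_text() for d in soup.find_all("div", class_=["A", "B"])]

# Chained class selectors match only divs carrying ALL of the classes,
# regardless of order or of any extra classes.
both = [d.get_text() for d in soup.select("div.A.B")]

print(either)
print(both)
```

Here `either` should contain all four divs, while `both` should contain only the first two.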
Answered By: lucasnadalutti

You can use CSS selectors instead, which is probably the best solution here.

soup.select("div.classname1.classname2")

You could also use a function.

def interesting_tags(tag):
    if tag.name == "div":
        classes = tag.get("class", [])
        return "A" in classes and "B" in classes

soup.find_all(interesting_tags)
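
If the list of required classes grows, a set-based variant of the same idea may be easier to maintain. This is only a sketch: `REQUIRED` and `has_all_required` are names invented here, and the sample markup is made up.

```python
from bs4 import BeautifulSoup

# Hypothetical set of classes that every match must carry.
REQUIRED = {"A", "B"}

def has_all_required(tag):
    # tag.get("class", []) is a list of class names, or [] when absent.
    return tag.name == "div" and REQUIRED.issubset(tag.get("class", []))

# Made-up sample markup, for illustration only.
soup = BeautifulSoup(
    '<div class="A B C">x</div><div class="A">y</div><span class="A B">z</span>',
    "html.parser",
)

matches = soup.find_all(has_all_required)
print(matches)
```

Only the first div should match: the second lacks B, and the span is not a div.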
Answered By: sytech

1. Given a tag like:

<span class="A B C D">XXXX</span>

you can select it by its classes with a CSS selector, chaining one .name per class:

spans = beautifulsoup.select('span.A.B.C.D')

2. To select by the id attribute instead, given:

<span id="A">XXXX</span>

use # in place of . in the selector passed to select:

span = beautifulsoup.select('span#A')

In short, select accepts the same selector syntax as CSS.
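
A small self-contained sketch of both selector forms, assuming made-up markup and using `select_one` when a single element is expected:

```python
from bs4 import BeautifulSoup

# Made-up sample markup, for illustration only.
soup = BeautifulSoup(
    '<span id="A">by id</span><span class="A B">by class</span>',
    "html.parser",
)

# "#" selects on the id attribute; select_one returns the first match or None.
by_id = soup.select_one("span#A")

# "." selects on class; select always returns a list.
by_class = soup.select("span.A.B")

print(by_id.get_text())
print([s.get_text() for s in by_class])
```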

Answered By: accfcx

With recent versions of BeautifulSoup, you can also use a regex to search the class attribute.

code:

import re
from bs4 import BeautifulSoup

multipleClassHtml = """
<div class="A B">only A and B</div>
<div class="A     B">class contain space</div>
<div class="A B C D">except A and B contain other class</div>
<div class="A C D">only A</div>
<div class="B D">only B</div>
<div class=" D E F">no A B</div>
"""

soup = BeautifulSoup(multipleClassHtml, 'html.parser')

bothABClassP = re.compile(r"A\s+B", re.I)
foundAllAB = soup.find_all("div", attrs={"class": bothABClassP})
print("foundAllAB=%s" % foundAllAB)

output:

foundAllAB=[<div class="A B">only A and B</div>, <div class="A B">class contain space</div>, <div class="A B C D">except A and B contain other class</div>]
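
One caveat that seems worth checking before relying on a regex here: a pattern of the form A, whitespace, B depends on the classes being adjacent and in that order. The sketch below (made-up markup, and a case-sensitive pattern chosen for this illustration) compares it with a CSS selector:

```python
import re
from bs4 import BeautifulSoup

# Made-up sample markup, for illustration only.
html = """
<div class="A B">A then B</div>
<div class="B A">B then A</div>
<div class="A C B">A and B, not adjacent</div>
"""
soup = BeautifulSoup(html, "html.parser")

# The regex is tried against each class name, then against the
# space-joined class string, so order and adjacency both matter.
pattern = re.compile(r"A\s+B")
by_regex = [d.get_text() for d in soup.find_all("div", class_=pattern)]

# The CSS selector only asks whether both classes are present.
by_css = [d.get_text() for d in soup.select("div.A.B")]

print(by_regex)
print(by_css)
```

With this markup the regex should find only the first div, while the selector finds all three.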

Answered By: crifan

table = soup.find_all("tr", class_=["odd", "even"])

Try it this way, and make sure the quotes and brackets are placed correctly; they confused me. Note that a list passed to class_ matches rows that have either class, not only rows that have both.

Answered By: Suraj Kadam