How to use an OR operator between classes in BeautifulSoup findAll?
Question:
I am having trouble parsing HTML. I am working with a website that has some items in a list with different class names. What I'm trying to do is find them all in a single findAll call, like this:
page_soup.findAll("li", {"Class" : "Class1" or "Class2"})
I want to have “OR” between my classes.
Sample html:
<ol class="products-list" id="products">
<li class="item odd">
</li>
<li class="item even">
</li>
<li class="item last even">
</li>
</ol>
Answers:
Full working sample:
from bs4 import BeautifulSoup
text = """
<body>
<ul>
<li class="Class1">Class 1</li>
<li class="Class2">Class 2</li>
<div class="Class1 special">Class 1 in div</div>
<div class="Class2 special">Class2 in div</div>
</ul>
</body>"""
soup = BeautifulSoup(text,"lxml")
result = soup.find_all(lambda tag: tag.name == 'li' and
( tag.get('class') == ['Class1'] or tag.get('class') == ['Class2'] ))
print(result)
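Note that the lambda above compares the whole class list, so it only matches tags whose class attribute is exactly Class1 or exactly Class2. An element such as <li class="item odd"> from the question would be skipped. A minimal sketch of a looser variant follows, assuming that any overlap between a tag's classes and a set of wanted classes should count as a match. The names wanted and has_wanted_class are illustrative only, not part of BeautifulSoup, and the fourth <li> is added just to show a non-match.

```python
from bs4 import BeautifulSoup

html = """
<ol class="products-list" id="products">
<li class="item odd"></li>
<li class="item even"></li>
<li class="item last even"></li>
<li class="item other"></li>
</ol>"""

soup = BeautifulSoup(html, "html.parser")

# Classes we want to match with OR semantics (hypothetical name).
wanted = {"odd", "even"}

def has_wanted_class(tag):
    # tag.get("class", []) is a list of class names, or [] if the tag has none.
    return tag.name == "li" and bool(wanted.intersection(tag.get("class", [])))

result = soup.find_all(has_wanted_class)
print(len(result))  # 3: the "item other" entry is not matched
```

Passing a named function instead of a lambda keeps the condition readable once it grows beyond one comparison.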
Use select(), which is faster than findAll():
page_soup=BeautifulSoup(html,'html.parser')
for item in page_soup.select(".odd,.even"):
    print(item)
Code here:
from bs4 import BeautifulSoup
html='''<ol class="products-list" id="products">
<li class="item odd">
</li>
<li class="item even">
</li>
<li class="item last even">
</li>
</ol>
'''
page_soup=BeautifulSoup(html,'html.parser')
for item in page_soup.select(".odd,.even"):
    print(item)
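The selector ".odd,.even" matches any tag carrying either class, not only list items. If only <li> elements are wanted, as in the question, the tag name can be put into each part of the selector. A small sketch, assuming an extra <div class="odd"> that is added here purely to show the difference:

```python
from bs4 import BeautifulSoup

html = """
<div class="odd">not a list item</div>
<ol class="products-list" id="products">
<li class="item odd"></li>
<li class="item even"></li>
<li class="item last even"></li>
</ol>"""

page_soup = BeautifulSoup(html, "html.parser")

# The comma acts as OR between the two selectors.
any_tag = page_soup.select(".odd, .even")
only_li = page_soup.select("li.odd, li.even")

print(len(any_tag))  # 4: the div is included
print(len(only_li))  # 3: list items only
```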
from bs4 import BeautifulSoup
html='''
<ol class="products-list" id="products">
<li class="item odd"></li>
<li class="item even"></li>
<li class="item last even"></li>
</ol>'''
soup = BeautifulSoup(html, 'lxml')
data = soup.find_all('li', class_=['odd', 'even'])
print(data)
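As for why the attempt in the question does not behave like an OR: Python evaluates "Class1" or "Class2" before BeautifulSoup ever sees it, and the result is just the first string. Passing a list is what gives the "any of these" behaviour. A short sketch of the difference, assuming the lowercase attribute name class and the question's sample HTML:

```python
from bs4 import BeautifulSoup

html = """
<ol class="products-list" id="products">
<li class="item odd"></li>
<li class="item even"></li>
<li class="item last even"></li>
</ol>"""

soup = BeautifulSoup(html, "html.parser")

# `or` between two non-empty strings returns the first one.
value = "odd" or "even"
print(value)  # odd

# So this call only ever looks for class "odd".
only_odd = soup.find_all("li", {"class": "odd" or "even"})

# A list of values matches a tag that has any one of them.
either = soup.find_all("li", {"class": ["odd", "even"]})

print(len(only_odd), len(either))  # 1 3
```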