Custom attributes in BeautifulSoup?
Question:
I am trying to use Beautiful Soup to target a `div` with a non-standard attribute. Here’s the `div`:
`<div data-asin="099655596X" data-index="1" class="sg-col-20-of-24 s-result-item sg-col-0-of-12 sg-col-28-of-32 sg-col-16-of-20 sg-col sg-col-32-of-36 sg-col-12-of-16 sg-col-24-of-28" data-cel widget="search_result_1">`
I need to find all `div` elements that have the `data-asin` attribute, and get the ASIN value as well. Beautiful Soup appears to support this, but what I am doing isn’t working. Here’s my code that doesn’t work:
`rows = soup.find_all(attrs={"data-asin": "value"})`
How do I need to write the Beautiful Soup call in Python 3.7 to find all of these `div` elements?
Answers:
Use a CSS selector to get that.
from bs4 import BeautifulSoup
html = '''
<div data-asin="099655596X" data-index="1" class="sg-col-20-of-24 s-result-item sg-col-0-of-12 sg-col-28-of-32 sg-col-16-of-20 sg-col sg-col-32-of-36 sg-col-12-of-16 sg-col-24-of-28" data-cel widget="search_result_1">
'''
soup = BeautifulSoup(html, 'html.parser')
# Match div elements whose data-asin equals this exact value
items = soup.select('div[data-asin="099655596X"]')
for item in items:
    print(item['data-asin'])
Output:
099655596X
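The selector above matches one specific ASIN. To match every `div` that carries the attribute, whatever its value, a presence selector such as `div[data-asin]` should do it. A minimal sketch, assuming a small sample document (only the first ASIN comes from the question; the other elements are made up for illustration):

```python
from bs4 import BeautifulSoup

# Sample markup invented for illustration; only the first ASIN is from the question.
html = """
<div data-asin="099655596X" data-index="1">first</div>
<div data-asin="B000000000" data-index="2">second</div>
<div data-index="3">no asin here</div>
"""

soup = BeautifulSoup(html, "html.parser")

# [data-asin] with no operator matches any div that has the attribute,
# regardless of its value.
asins = [div["data-asin"] for div in soup.select("div[data-asin]")]
print(asins)
```

The third `div` has no `data-asin` attribute, so it is skipped.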
Or match on the end of the value with the `$=` operator:
from bs4 import BeautifulSoup
html = '''
<div data-asin="099655596X" data-index="1" class="sg-col-20-of-24 s-result-item sg-col-0-of-12 sg-col-28-of-32 sg-col-16-of-20 sg-col sg-col-32-of-36 sg-col-12-of-16 sg-col-24-of-28" data-cel widget="search_result_1">
'''
soup = BeautifulSoup(html, 'html.parser')
# $= matches div elements whose data-asin ends with "X"
items = soup.select('div[data-asin$="X"]')
for item in items:
    print(item['data-asin'])
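The `$=` operator is one of several standard CSS attribute operators: `^=` matches a prefix and `*=` matches a substring. A short sketch comparing them; the helper `asins_for` and the second sample element are hypothetical, added only for illustration:

```python
from bs4 import BeautifulSoup

# Sample markup; the second element is invented for illustration.
html = """
<div data-asin="099655596X"></div>
<div data-asin="0123456789"></div>
"""
soup = BeautifulSoup(html, "html.parser")

def asins_for(selector):
    """Hypothetical helper: return the data-asin of each element the selector matches."""
    return [div["data-asin"] for div in soup.select(selector)]

ends_with_x = asins_for('div[data-asin$="X"]')      # value ends with "X"
starts_with_09 = asins_for('div[data-asin^="09"]')  # value starts with "09"
contains_655 = asins_for('div[data-asin*="655"]')   # value contains "655"

print(ends_with_x, starts_with_09, contains_655)
```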
Have you tried also specifying the tag? I’ve had no issue when specifying the tag in the syntax, i.e.:
`rows = soup.find_all("div", attrs={"data-asin": "value"})`
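Note that a string in `attrs` is compared literally, so `{"data-asin": "value"}` only finds tags whose `data-asin` is exactly the text `value`. Passing `True` instead matches the attribute with any value, which seems to be what the question is after. A minimal sketch, with sample markup that is assumed for illustration:

```python
from bs4 import BeautifulSoup

# Sample markup invented for illustration; only the first ASIN is from the question.
html = """
<div data-asin="099655596X" data-index="1"></div>
<span data-asin="B000000000"></span>
<div data-index="2"></div>
"""
soup = BeautifulSoup(html, "html.parser")

# The literal string "value" matches nothing in this document.
literal = soup.find_all("div", attrs={"data-asin": "value"})

# True means "the attribute is present, with any value".
rows = soup.find_all("div", attrs={"data-asin": True})
asins = [row["data-asin"] for row in rows]

print(len(literal), asins)
```

The `span` is excluded because the tag name is restricted to `div`; dropping the first argument would include it.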