Custom attributes in BeautifulSoup?

Question:

I am trying to use Beautiful Soup to target a DIV with a non-standard attribute. Here’s the DIV:

`<div data-asin="099655596X" data-index="1" class="sg-col-20-of-24 s-result-item sg-col-0-of-12 sg-col-28-of-32 sg-col-16-of-20 sg-col sg-col-32-of-36 sg-col-12-of-16 sg-col-24-of-28" data-cel-widget="search_result_1">`

I need to find all DIVs that have the data-asin attribute, and get the ASIN value as well. BS appears to support this, but what I am doing isn’t working. Here’s my code that doesn’t work:

`rows = soup.find_all(attrs={"data-asin": "value"})`

How should I write my Beautiful Soup call in Python 3.7 to find all of these DIVs?

Asked By: krypterro


Answers:

Use a CSS selector to get that.

from bs4 import BeautifulSoup

html = '''
<div data-asin="099655596X" data-index="1" class="sg-col-20-of-24 s-result-item sg-col-0-of-12 sg-col-28-of-32 sg-col-16-of-20 sg-col sg-col-32-of-36 sg-col-12-of-16 sg-col-24-of-28" data-cel-widget="search_result_1">
'''
soup = BeautifulSoup(html, 'html.parser')
# Match DIVs whose data-asin equals this exact value
items = soup.select('div[data-asin="099655596X"]')
for item in items:
    print(item['data-asin'])

Output:

099655596X

Or match on the end of the value with the `$=` (ends-with) selector:

from bs4 import BeautifulSoup

html = '''
<div data-asin="099655596X" data-index="1" class="sg-col-20-of-24 s-result-item sg-col-0-of-12 sg-col-28-of-32 sg-col-16-of-20 sg-col sg-col-32-of-36 sg-col-12-of-16 sg-col-24-of-28" data-cel-widget="search_result_1">
'''
soup = BeautifulSoup(html, 'html.parser')
# Match DIVs whose data-asin value ends with "X"
items = soup.select('div[data-asin$="X"]')
for item in items:
    print(item['data-asin'])
Answered By: KunduK
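
If the goal is every DIV that carries a data-asin attribute, whatever its value, a presence selector (`div[data-asin]`) may be closer to what the question asks. A minimal sketch, assuming a trimmed-down version of the markup; the second ASIN and the shortened class names are made up for illustration:

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the search results page (hypothetical data).
html = '''
<div data-asin="099655596X" data-index="1" class="s-result-item"></div>
<div data-asin="B000000001" data-index="2" class="s-result-item"></div>
<div data-index="3" class="s-result-item"></div>
'''
soup = BeautifulSoup(html, "html.parser")

# [data-asin] with no value matches any DIV that has the attribute at all.
rows = soup.select("div[data-asin]")
asins = [row["data-asin"] for row in rows]
print(asins)
```

The third DIV has no data-asin attribute, so it is not selected.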

Have you tried also specifying the tag? I’ve had no issue when specifying the tag, i.e.:

`rows = soup.find_all("div", attrs={"data-asin": "value"})`
Answered By: John Martin
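
A note on why the call in the question finds nothing: `attrs={"data-asin": "value"}` only matches tags whose data-asin is literally the string "value". Passing `True` instead matches any tag that has the attribute. A minimal sketch, assuming simplified markup (the second ASIN is hypothetical) and assuming that DIVs with an empty data-asin should be skipped:

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the search results page (hypothetical data).
html = '''
<div data-asin="099655596X" data-index="1"></div>
<div data-asin="" data-index="2"></div>
<div data-asin="B000000001" data-index="3"></div>
<div data-index="4"></div>
'''
soup = BeautifulSoup(html, "html.parser")

# The literal string "value" matches none of these DIVs.
literal_matches = soup.find_all("div", attrs={"data-asin": "value"})

# True means "the attribute is present", regardless of its value.
tagged_rows = soup.find_all("div", attrs={"data-asin": True})

# Keep only non-empty ASINs (an assumption about what is wanted).
found_asins = [row["data-asin"] for row in tagged_rows if row["data-asin"]]
print(len(literal_matches), found_asins)
```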