Multiple span tags under one parent div always return only the first record
Question:
I have multiple span tags with the same class names under one parent div. However, my BeautifulSoup loop always returns only the first name/value pair; the remaining ones are never printed.
Note: all of the span class names repeat, as shown below. Any suggestions?
<div class="product_constant_fields">
    <div class="width-auto">
        <span class="field_name">Item #:</span>
        <span class="field_value">AB11223344</span>
    </div>
    <div class="width-auto">
        <span class="field_name">Brand:</span>
        <span class="field_value">Johns</span>
    </div>
    <div class="width-auto">
        <span class="field_name">UPC#:</span>
        <span class="field_value">12345678901234</span>
    </div>
    <div class="width-auto">
        <span class="field_name">UNSPSC:</span>
        <span class="field_value">12345678</span>
    </div>
    <div class="width-auto">
        <span class="field_name">ManufacturerNo:</span>
        <span class="field_value">1234567</span>
    </div>
    <div class="width-auto">
        <span class="field_name">Alternate MFG #:</span>
        <span class="field_value">87654321</span>
    </div>
</div>
# Parsers tried, one at a time
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, 'html.parser')
soup = BeautifulSoup(response.text, 'html5lib')
results = []
product_attributes = soup.find_all(class_="product_constant_fields")
for item in product_attributes:
    parsed = {}
    field_name = item.find(class_="field_name")
    parsed["field_name"] = field_name.text
    field_value = item.find(class_="field_value")
    parsed["field_value"] = field_value.text
    results.append(parsed)
print(results)
**I am getting only the first record as output:**
[{'field_name': 'Item #:', 'field_value': 'AB11223344'}]
**Expected Output:**
[
{'field_name': 'Item #:', 'field_value': 'AB11223344'},
{'field_name': 'Brand:', 'field_value': 'Johns'},
{'field_name': 'UPC#:', 'field_value': '12345678901234'},
{'field_name': 'UNSPSC:', 'field_value': '12345678'},
{'field_name': 'ManufacturerNo:', 'field_value': '1234567'},
{'field_name': 'Alternate MFG #:', 'field_value': '87654321'}
]
Answers:
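find_all(class_="product_constant_fields") matches only the single outer div, so the loop runs once and item.find() returns just the first matching span. Loop over the inner width-auto divs instead. Change: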
product_attributes = soup.find_all(class_="product_constant_fields")
to:
product_attributes = soup.select(".product_constant_fields .width-auto")
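To see why the original loop ran only once, it can help to count what each call matches. The following is a minimal sketch, assuming a trimmed two-row copy of the markup from the question; the variable names outer and rows are illustrative only.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (two rows only, for brevity)
html_doc = """
<div class="product_constant_fields">
  <div class="width-auto">
    <span class="field_name">Item #:</span>
    <span class="field_value">AB11223344</span>
  </div>
  <div class="width-auto">
    <span class="field_name">Brand:</span>
    <span class="field_value">Johns</span>
  </div>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all() on the outer class matches the single wrapper div,
# so a loop over it runs once and item.find() returns the first span only.
outer = soup.find_all(class_="product_constant_fields")

# The descendant selector matches every inner row instead.
rows = soup.select(".product_constant_fields .width-auto")

print(len(outer), len(rows))
```

With this trimmed markup the wrapper is matched once and the selector matches one element per row, which is why iterating over the rows yields every name/value pair.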
Complete code:
from bs4 import BeautifulSoup
html_doc = '''
<div class="product_constant_fields">
<div class="width-auto">
<span class="field_name">Item #:</span>
<span class="field_value">AB11223344</span>
</div>
<div class="width-auto">
<span class="field_name">Brand:</span>
<span class="field_value">Johns</span>
</div>
<div class="width-auto">
<span class="field_name">UPC#:</span>
<span class="field_value">12345678901234</span>
</div>
<div class="width-auto">
<span class="field_name">UNSPSC:</span>
<span class="field_value">12345678</span>
</div>
<div class="width-auto">
<span class="field_name">ManufacturerNo:</span>
<span class="field_value">1234567</span>
</div>
<div class="width-auto">
<span class="field_name">Alternate MFG #:</span>
<span class="field_value">87654321</span>
</div>
</div>'''
soup = BeautifulSoup(html_doc, 'html.parser')
# Select each inner row rather than the single outer container
product_attributes = soup.select(".product_constant_fields .width-auto")
results = []
for item in product_attributes:
    parsed = {}
    field_name = item.find(class_="field_name")
    parsed["field_name"] = field_name.text
    field_value = item.find(class_="field_value")
    parsed["field_value"] = field_value.text
    results.append(parsed)
print(results)
Prints (wrapped here for readability):
[
{'field_name': 'Item #:', 'field_value': 'AB11223344'},
{'field_name': 'Brand:', 'field_value': 'Johns'},
{'field_name': 'UPC#:', 'field_value': '12345678901234'},
{'field_name': 'UNSPSC:', 'field_value': '12345678'},
{'field_name': 'ManufacturerNo:', 'field_value': '1234567'},
{'field_name': 'Alternate MFG #:', 'field_value': '87654321'}
]
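If a lookup by field name is more convenient than a list of records, the same rows could be folded into one dictionary. This is only a sketch: the helper name parse_fields, the stripping of the trailing colon, and the choice to skip incomplete rows are assumptions and not part of the original answer.

```python
from bs4 import BeautifulSoup


def parse_fields(html):
    """Return {field name: field value} for each row (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for row in soup.select(".product_constant_fields .width-auto"):
        name = row.find(class_="field_name")
        value = row.find(class_="field_value")
        # Skip rows missing either span instead of raising AttributeError
        if name is None or value is None:
            continue
        # get_text(strip=True) trims whitespace; rstrip(":") drops the trailing colon
        key = name.get_text(strip=True).rstrip(":")
        fields[key] = value.get_text(strip=True)
    return fields


# Two complete rows from the question plus one deliberately incomplete row
sample = """
<div class="product_constant_fields">
  <div class="width-auto">
    <span class="field_name">Item #:</span>
    <span class="field_value">AB11223344</span>
  </div>
  <div class="width-auto">
    <span class="field_name">Brand:</span>
    <span class="field_value">Johns</span>
  </div>
  <div class="width-auto">
    <span class="field_name">UPC#:</span>
  </div>
</div>
"""

fields = parse_fields(sample)
print(fields)
```

The None check matters on real pages, where a row may lack one of the two spans; without it, calling .text on a missing element raises AttributeError.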