Python Selenium: string formatting in a web element locator
Question:
How do I iterate over multiple div web elements? I want to collect EPA Registration numbers (e.g. 12455-61, 12455-61-3240) using a for loop, but I get an error saying the web element cannot be located when I use string formatting.
Here’s the HTML:
<div class="product">
<div class="mainSection"><p><span class="reportLabel">EPA Registration #:</span> 12455-61 <a class="link-no-underline" href="http://iaspub.epa.gov/apex/pesticides/f?p=PPLS:102:::NO::P102_REG_NUM:12455-61" target="_blank"><img src="/images/minimal_epalogo.png" alt="EPA logo"></a></p><p><span class="reportLabel">Registration Type:</span>
<div class="product">
<div class="mainSection"><p><span class="reportLabel">EPA Registration #:</span> 12455-61-3240 </p><p><span class="reportLabel">Registration Type:</span>Section 3 - Regular</p><p><span class="reportLabel">State Product Number:
Here’s a portion of my code:
products = driver.find_elements(By.CLASS_NAME, "product")
no_of_results = driver.find_element(By.XPATH, '//*[@id="searchResults"]/p[5]')
# //*[@id="searchResults"]/div[1]/div[1]/p[1]/text() <-- this is the XPATH for EPA# 12455-61,
# //*[@id="searchResults"]/div[2]/div[1]/p[1]/text() <-- this is the XPATH for EPA# 12455-61-3240
for i in range(1, int(no_of_results.text[-1])):
    for product in products:
        try:
            """ Find number of products found """
            no_of_results = driver.find_element(By.XPATH, '//*[@id="searchResults"]/p[5]')
            "Results:", no_of_results.text[-1]
            EPA_reg_no = driver.find_element(By.XPATH, f'//*[@id="searchResults"]/div[%s]/div[1]/p[1]/text()') % (i)
            "EPA Registration Number:", EPA_reg_no
            print(EPA_reg_no)
            if driver.find_element(By.ID, "searchResults").is_enabled: # No search results found
                pesticide_table = None
            else:
                # more code
        except Exception as e:
            print(e)
            continue
I get the error here:
EPA_reg_no = driver.find_element(By.XPATH, '//*[@id="searchResults"]/div[%s]/div[1]/p[1]/text()') % (i)
I also tried doing it like this:
EPA_reg_no = driver.find_element(By.XPATH, '//*[@id="searchResults"]/div[{}]/div[1]/p[1]/text()').format(i)
Answers:
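The format call is in the wrong place. In both attempts, % (i) and .format(i) are applied to the WebElement returned by driver.find_element(), not to the XPath string, so Selenium receives the literal placeholder (div[%s] or div[{}]) and cannot evaluate the locator. Format the string first, then pass the result to find_element().
The trailing /text() also has to go: a Selenium XPath locator must resolve to an element, not a text node. Select the p element and read its .text property instead:
EPA_reg_no = driver.find_element(By.XPATH, '//*[@id="searchResults"]/div[{}]/div[1]/p[1]'.format(i)).text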
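Two other details in the question's loop are worth a look: no_of_results.text[-1] reads only the last character of the summary (so a count of 12 would be read as 2), and range(1, n) stops at n - 1, so the last result is skipped. Here is a minimal sketch of building the XPath strings up front. It assumes the /text() step is dropped so the locator resolves to the p element, and that the summary paragraph ends with the result count; the "Results: 12" text and both helper names are hypothetical, not taken from the page.

```python
import re
from typing import List

# Template for the paragraph holding the registration number.
# No trailing /text(): Selenium locators must resolve to elements.
XPATH_TEMPLATE = '//*[@id="searchResults"]/div[{}]/div[1]/p[1]'


def parse_result_count(summary: str) -> int:
    """Return the last integer found in the summary text, or 0 if none."""
    numbers = re.findall(r"\d+", summary)
    return int(numbers[-1]) if numbers else 0


def build_xpaths(count: int) -> List[str]:
    """Build one XPath per result; XPath positions start at 1."""
    return [XPATH_TEMPLATE.format(i) for i in range(1, count + 1)]


if __name__ == "__main__":
    # "Results: 12" is a made-up summary string used for illustration.
    for xpath in build_xpaths(parse_result_count("Results: 12")):
        print(xpath)
```

Each string can then be passed to driver.find_element(By.XPATH, xpath), with .text read on the returned element.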
find_element – Returns the first matching web element if the locator discovers multiple web elements
find_elements – Returns a list of multiple matching web elements
So you should be using find_elements to locate and return multiple elements.
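One way to combine the two calls, as a rough sketch: get every div.product with find_elements, then call find_element on each product with a relative XPath, so no index has to be formatted into the locator. The function name is hypothetical, and the relative path assumes the page structure shown in the question (div.product > div.mainSection > p). The strategy strings are written out so the sketch runs without importing Selenium; they equal By.CLASS_NAME and By.XPATH, which is what real code would normally use.

```python
from typing import Any, List

# Strategy strings, equal to selenium's By.CLASS_NAME and By.XPATH.
# Real code would normally import By and use those constants instead.
CLASS_NAME = "class name"
XPATH = "xpath"


def collect_registration_numbers(driver: Any) -> List[str]:
    """Return the text after 'EPA Registration #:' for each div.product."""
    numbers = []
    for product in driver.find_elements(CLASS_NAME, "product"):
        # The leading "." scopes the search to this product element.
        paragraph = product.find_element(
            XPATH, ".//div[@class='mainSection']/p[1]"
        )
        # Keep only what follows the first colon; empty if there is none.
        numbers.append(paragraph.text.partition(":")[2].strip())
    return numbers
```

With a live session this would be called as collect_registration_numbers(driver).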
The code below prints the text of every paragraph that contains an EPA Registration # label:
elements = driver.find_elements(By.XPATH, "//span[text()='EPA Registration #:']/parent::p")
for element in elements:
    print(element.text)
# use this for loop if you want to split and keep only the values
for element in elements:
    # strip() removes the space left after the colon
    print(element.text.split(':')[1].strip())
Result:
EPA Registration #: 12455-61
EPA Registration #: 12455-61-3240
12455-61
12455-61-3240
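If splitting on the colon feels fragile, a regular expression is one alternative for pulling the number out of the element text. This is only a sketch: the pattern assumes a registration number is two or three hyphen-separated groups of digits, which matches the two examples in the question but is not an official format definition, and extract_reg_no is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed shape: digits-digits, optionally followed by -digits.
REG_NO_PATTERN = re.compile(r"\b\d+-\d+(?:-\d+)?\b")


def extract_reg_no(text: str) -> Optional[str]:
    """Return the first registration-number-like token in text, or None."""
    match = REG_NO_PATTERN.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for line in ("EPA Registration #: 12455-61",
                 "EPA Registration #: 12455-61-3240"):
        print(extract_reg_no(line))
```

In the loop above, print(extract_reg_no(element.text)) would replace the split call.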