How to extract information from li tags with Selenium
Question:
I have a website I want to scrape, but the information I am looking for is contained in "li" tags that have no class or id. All of the "li" tags sit inside a "ul" tag that also has no class or id. There are about 25 "li" tags in that one "ul". How do I iterate over the "li" tags to get the information contained in all 25 of them? I want to do this with Selenium.
I want to extract the text contained in the "div" inside each "li". For example, the first "div" has ‘1,000,000 PPE Solutions’. I want to extract that text for all the "li" tags.
Answers:
Try the XPath below in Selenium:
//div[contains(@class, 'tradename')]/..
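The trailing `/..` in that expression steps from each matched div up to its parent, which here would be the li. A minimal offline sketch of that behaviour, using the standard library's ElementTree on made-up markup: the HTML sample and the second company name are assumptions, since the real page source was not posted. ElementTree has no `contains()`, so the sketch uses an exact class match instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the real page source was not posted in the question.
SAMPLE = """
<div class="item-list">
  <ul>
    <li><div class="tradename"><a href="/a">1,000,000 PPE Solutions</a></div></li>
    <li><div class="tradename"><a href="/b">Second Company</a></div></li>
  </ul>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# Match every div with class "tradename", then step up to its parent with "/..".
parents = root.findall(".//div[@class='tradename']/..")

# Join all nested text of each parent <li>.
texts = ["".join(li.itertext()).strip() for li in parents]
print(texts)
```

In Selenium the original expression would be passed as-is, for example to `driver.find_elements(By.XPATH, ...)`, and `.text` read from each returned element.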
You can try this. The following locators will list all the li tags:
XPATH:
.//div[@class='item-list']/ul/li
CSS_SELECTOR:
.item-list ul li
Also, which information within the li tag do you want to access? To answer that, you need to post the URL or the complete HTML source.
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://your_target_site.com")
# find_elements (plural) returns a list of every matching element
my_li_elements = driver.find_elements(By.TAG_NAME, "li")
my_li_elements_text = [li.text for li in my_li_elements]
or
my_ul = driver.find_element(By.CSS_SELECTOR, "ul")
my_li_elements = my_ul.find_elements(By.CSS_SELECTOR, "li")
for li in my_li_elements:
    text = li.text
    print(text)
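If the text you want sits in a div inside each li, the loop can look inside every li in turn. A minimal sketch, assuming the `div.item-list > ul > li` selector matches your page; the function name `texts_per_li` and both default selectors are hypothetical. The locator strategy is passed as the plain string "css selector", which is the value behind `By.CSS_SELECTOR`.

```python
CSS = "css selector"  # the string value behind selenium's By.CSS_SELECTOR


def texts_per_li(driver, li_selector="div.item-list > ul > li", inner_selector="div"):
    """Return one text per <li>: the first inner match's text, else the li's own text."""
    results = []
    for li in driver.find_elements(CSS, li_selector):
        # find_elements returns an empty list instead of raising when nothing matches
        inner = li.find_elements(CSS, inner_selector)
        text = inner[0].text if inner else li.text
        results.append(text.strip())
    return results
```

After `driver.get(...)`, calling `texts_per_li(driver)` should give a list with one string per li, in page order.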
I hope this helps.
The required information is within the <a> tag, which is inside each individual <li> tag. The <li> tags share a parent <ul> tag within:
<div class="item-list">
Solution
To extract the text from all the <li> tags, ideally you should use WebDriverWait with visibility_of_all_elements_located() so the elements are visible before you read them. Combined with a list comprehension, you can use either of the following locator strategies:
- Using CSS_SELECTOR:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.item-list > ul li div.field-content > a[href]")))])
- Using XPATH:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='item-list']/ul//li//div[@class='field-content']/a[@href]")))])
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC