Python – extract image source address from a Selenium div element

Question:

I am trying to extract the image source addresses (the src attribute of each img tag) from the following page markup:

<div class="product-row">
  <div class="product-item">
  <div class="product-picture"><img src="https://t3a.coupangcdn.com/thumbnails/remote/212x212ex/image/vendor_inventory/6ca9/2e097d911efc291473d0c47052cdc8f42d7b7b8f2a3ebbb0ccc974d76fe4.jpg" alt="product"><div><button type="button" class="ant-btn hover-btn btn-open-detail">
  </div></div>
  <div class="product-item">
  <div class="product-picture">
  <img src="https://thumbnail11.coupangcdn.com/thumbnails/remote/212x212ex/image/retail/images/239519218793467-6edc7d92-4165-4476-a528-fa238ffeeeb6.jpg" alt="product"><div></div></div>

I tried to get it in the following way:

ele = driver.find_elements_by_xpath("//div[@class='product-picture']/img")
print(ele)

Output:

<selenium.webdriver.remote.webelement.WebElement (session="d9fd08b93bd5dd83fe520826c1f6fd77", element="27ef8c33-624d-4166-9dc7-3a355c4dcc32")>
<selenium.webdriver.remote.webelement.WebElement (session="d9fd08b93bd5dd83fe520826c1f6fd77", element="a6d77107-fecf-4c84-a048-9b4bda39b9df")>
<selenium.webdriver.remote.webelement.WebElement (session="d9fd08b93bd5dd83fe520826c1f6fd77", element="1f62cb8b-df58-4f06-afe6-6c60cb572527")>

This prints WebElement objects, but I want the image source address of every <div class="product-picture"> element on the page as a string. Is there a way to extract it as a string?

Asked By: anfwkdrn


Answers:

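Use the get_attribute('src') method to read the src value. Note that find_elements_by_xpath returns a list of elements, so get_attribute has to be called on each element rather than on the list itself:

images = driver.find_elements_by_xpath("//div[@class='product-picture']/img")
srcs = [img.get_attribute('src') for img in images]
Answered By: Fazlul
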
from selenium.webdriver.common.by import By

images = driver.find_elements(By.XPATH, "//div[@class='product-picture']/img")
for img in images:
    print(img.get_attribute("src"))

This will give you the expected output:

https://t3a.coupangcdn.com/thumbnails/remote/212x212ex/image/vendor_inventory/6ca9/2e097d911efc291473d0c47052cdc8f42d7b7b8f2a3ebbb0ccc974d76fe4.jpg
https://thumbnail11.coupangcdn.com/thumbnails/remote/212x212ex/image/retail/images/239519218793467-6edc7d92-4165-4476-a528-fa238ffeeeb6.jpg
Answered By: Himanshuman
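
If the addresses are needed as a list of strings rather than printed output, the loop above can be wrapped in a small helper. The following is only a sketch: the names collect_image_srcs and FakeImage are hypothetical and not part of Selenium. The helper relies only on each element having a get_attribute method, so the FakeImage stand-in lets it run without a browser session.

```python
def collect_image_srcs(elements):
    """Return the src string of every element that has a non-empty src."""
    srcs = []
    for element in elements:
        src = element.get_attribute("src")
        if src:  # skip elements whose src is None or an empty string
            srcs.append(src)
    return srcs


class FakeImage:
    """Hypothetical stand-in for a WebElement, used only for this demo."""

    def __init__(self, src):
        self._src = src

    def get_attribute(self, name):
        # Only the src attribute is modelled here.
        return self._src if name == "src" else None


if __name__ == "__main__":
    demo = [FakeImage("https://example.com/a.jpg"), FakeImage(None)]
    print(collect_image_srcs(demo))
```

With a real session, the same helper would be called as collect_image_srcs(driver.find_elements(By.XPATH, "//div[@class='product-picture']/img")), assuming driver is an already started WebDriver.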

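You are using deprecated syntax: the find_element_by_* and find_elements_by_* methods were deprecated in Selenium 4 and removed in Selenium 4.3. Please see Python Selenium warning "DeprecationWarning: find_element_by_* commands are deprecated"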

A more robust way of locating elements that are likely to be lazy-loaded is to wait explicitly until they are present:

# Wait up to 10 seconds for at least one matching img element to be present
images = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='product-picture']/img")))
for i in images:
    print(i.get_attribute('src'))

You will also need the following imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Selenium docs can be found at https://www.selenium.dev/documentation/

Answered By: Barry the Platipus