How to extract the dynamic images from the link using Selenium and Python?
Question:
I am trying to download the images from the link below. The images are rendered dynamically, so how can I download them?
Right now I am trying to get the image URLs, but because the images are rendered dynamically I am only able to fetch the first link.
Here is the code:
import os
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
executable_path = "/usr/bin/chromedriver"
chrome_options = Options()
os.environ["webdriver.chrome.driver"] = executable_path
driver = webdriver.Chrome(executable_path=executable_path, chrome_options=chrome_options)
driver.get("https://www.macfarlanepartners.com/projects/park-fifth-mid-rise/")
driver.maximize_window()
print("Entered")
elements = driver.find_elements_by_xpath("""/html/body/div/div/div[3]/div[1]/div[1]""")
time.sleep(5)
for i in elements:
    image = i.find_element_by_tag_name("img")
    img_src = image.get_attribute("src")
    print(img_src)
driver.close()
Output:
I am getting only the first image link:
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-1-480x240.jpg
Also, is there a way to automatically find the img tags and their src attributes to download the images, instead of writing a separate XPath for each website?
Answers:
I think you need a second loop:
for i in elements:
    # find_elements (plural) returns every <img> inside the matched element
    images = i.find_elements_by_tag_name("img")
    for img in images:
        img_src = img.get_attribute("src")
        print(img_src)
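A note for newer Selenium releases: the find_elements_by_* helpers used above were removed in Selenium 4.3, so the same nested lookup has to go through find_elements(by, value). Below is a minimal sketch of that, not a tested drop-in for this site. The function name collect_img_srcs is made up for illustration, and the locator strings "xpath" and "tag name" are the values behind By.XPATH and By.TAG_NAME.

```python
def collect_img_srcs(driver, container_xpath):
    """Return the src of every <img> under each element matched by container_xpath.

    `driver` is expected to be a Selenium WebDriver, but anything with a
    compatible find_elements(by, value) method will do.
    """
    srcs = []
    # "xpath" is the value of By.XPATH, "tag name" the value of By.TAG_NAME
    for container in driver.find_elements("xpath", container_xpath):
        for img in container.find_elements("tag name", "img"):
            src = img.get_attribute("src")
            if src:  # skip <img> tags that have no src attribute
                srcs.append(src)
    return srcs
```

With a real driver this would be called as collect_img_srcs(driver, "/html/body/div/div/div[3]/div[1]/div[1]") after the page has loaded.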
You are able to retrieve only the first link because your locator strategy, find_elements_by_xpath("""/html/body/div/div/div[3]/div[1]/div[1]"""), returns only one element. That is why the for loop iterates only once, and find_element_by_tag_name("img") then extracts the src attribute of only the first <img> tag.
A better solution is to include the <img> tag in the XPath, as follows:
# The XPath now ends in /img, so every matched element is an <img> tag
elements = driver.find_elements_by_xpath("""/html/body/div/div/div[3]/div[1]/div[1]/img""")
for i in elements:
    img_src = i.get_attribute("src")
    print(img_src)
You will be able to retrieve three links, as follows:
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-1-480x240.jpg
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-2-480x240.jpg
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-3-1-480x240.jpg
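On the second part of the question (finding the img tags without a site-specific XPath, and actually downloading the files), one possible approach is to parse the rendered page source and collect every <img> src. The sketch below assumes Python 3 and uses only the standard library. The names ImgSrcParser, find_image_urls and download_images are made up for illustration, and sites that lazy-load images may keep the real URL in another attribute, which this does not handle.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve


class ImgSrcParser(HTMLParser):
    """Collect the src of every <img> tag, in document order, without duplicates."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL
            absolute = urljoin(self.base_url, src)
            if absolute not in self.srcs:
                self.srcs.append(absolute)


def find_image_urls(html, base_url):
    """Return absolute URLs for all <img> tags found in the given HTML string."""
    parser = ImgSrcParser(base_url)
    parser.feed(html)
    return parser.srcs


def download_images(urls, target_dir):
    """Save each URL into target_dir under its own file name; return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for url in urls:
        filename = os.path.basename(urlparse(url).path)
        if not filename:  # URL path ends in "/", nothing sensible to name the file
            continue
        path = os.path.join(target_dir, filename)
        urlretrieve(url, path)
        saved.append(path)
    return saved
```

With Selenium, the inputs would come from the driver once the page has rendered: find_image_urls(driver.page_source, driver.current_url), and the resulting list can then be passed to download_images together with a target directory of your choice.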