Wrong location and size of element returned by Selenium in Python

Question:

I am trying to find the location of tables in a webpage where I do not have the ID/XPATH/CLASSNAME of the table. To pick the right one, I compute a similarity score between the table I want and each table present in the webpage. That part works, but element.size and element.location return an incorrect size and location for the table. Is there a fix, or am I doing something wrong in the following code?

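from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
# Selenium 4 expects the keyword `options`; `chrome_options` is the legacy name
driver = webdriver.Chrome(options=chrome_options)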
driver.get(URL)
fn = lambda X: driver.execute_script('return document.body.parentNode.scroll' + X)
driver.set_window_size(1024, fn('Height'))
driver.save_screenshot("sample.png")
tables = driver.find_elements(By.TAG_NAME,"table")
for table in tables:
   table_str = table.get_attribute("innerHTML")
   similarity_tables = similarity(my_table_words,table_str)
   if(similarity_tables>90):
       th = table.size['height']
       tw = table.size['width']
       tx = table.location['x']
       ty = table.location['y']

Using this code I am able to locate the correct/desired table but the location and size of the element returned is incorrect.

Asked By: user31934

||

Answers:

The table probably takes a while to load, so its size and location are read before the layout has settled.
Selenium is designed for automating dynamic web pages, so there are several ways to deal with this.
Here are the approaches I would try.

time.sleep()

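import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)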
driver.get(URL)
fn = lambda X: driver.execute_script('return document.body.parentNode.scroll' + X)
driver.set_window_size(1024, fn('Height'))
time.sleep(10) # <------------------------------------------------
driver.save_screenshot("sample.png")
tables = driver.find_elements(By.TAG_NAME,"table")
for table in tables:
   table_str = table.get_attribute("innerHTML")
   similarity_tables = similarity(my_table_words,table_str)
   if(similarity_tables>90):
       time.sleep(10) # <------------------------------------------------
       th = table.size['height']
       tw = table.size['width']
       tx = table.location['x']
       ty = table.location['y']
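
A fixed sleep can wait too long or not long enough. As an alternative sketch (not part of the original code), you could poll the element's geometry until two consecutive readings agree. The name wait_for_stable_rect and its parameters are hypothetical; read_rect is any zero-argument callable that returns the current rectangle, for example lambda: table.rect in Selenium 4.

```python
import time


def wait_for_stable_rect(read_rect, timeout=10.0, interval=0.25):
    """Poll read_rect() until two consecutive readings are equal.

    read_rect is any zero-argument callable returning a dict such as
    {'x': ..., 'y': ..., 'width': ..., 'height': ...}.
    Raises TimeoutError if the readings keep changing for `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    previous = read_rect()
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = read_rect()
        if current == previous:
            # Two identical readings in a row: treat the layout as settled.
            return current
        previous = current
    raise TimeoutError("element geometry did not settle in time")
```

Inside the loop this could replace the second sleep, for example rect = wait_for_stable_rect(lambda: table.rect), after which rect['x'], rect['y'], rect['width'] and rect['height'] hold the settled values.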

location_once_scrolled_into_view

You can try scrolling the page to the table before trying to get its location and size.

table.location_once_scrolled_into_view
th = table.size['height']
tw = table.size['width']
tx = table.location['x']
ty = table.location['y']

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)
driver.get(URL)
## remove
# fn = lambda X: driver.execute_script('return document.body.parentNode.scroll' + X)
# driver.set_window_size(1024, fn('Height'))
driver.save_screenshot("sample.png")
tables = driver.find_elements(By.TAG_NAME,"table")
for table in tables:
   table_str = table.get_attribute("innerHTML")
   similarity_tables = similarity(my_table_words,table_str)
   if(similarity_tables>90):
       table.location_once_scrolled_into_view # <-----------------------
       th = table.size['height']
       tw = table.size['width']
       tx = table.location['x']
       ty = table.location['y']
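
One thing to keep in mind here: as far as I understand, table.location is relative to the whole page, while a screenshot taken after scrolling shows only the current viewport. If the goal is to crop the table out of such a screenshot, the scroll offset has to be subtracted first. The helper below is a minimal sketch of that arithmetic; viewport_crop_box is a hypothetical name, and the scroll offsets are assumed to come from something like driver.execute_script('return [window.scrollX, window.scrollY]').

```python
def viewport_crop_box(location, size, scroll_x=0, scroll_y=0):
    """Return (left, top, right, bottom) of an element inside the viewport.

    location and size are page-relative dicts shaped like the values of
    WebElement.location and WebElement.size; scroll_x and scroll_y are the
    current scroll offsets of the page in CSS pixels.
    """
    left = location["x"] - scroll_x
    top = location["y"] - scroll_y
    return (left, top, left + size["width"], top + size["height"])
```

The resulting tuple has the (left, top, right, bottom) layout that image libraries such as Pillow accept for cropping, assuming the screenshot uses one image pixel per CSS pixel.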

Use non-headless mode

Be aware that the location and size of an element in a headless browser may differ from those in a non-headless browser.

chrome_options = Options()
## remove
# chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)
driver.get(URL)
fn = lambda X: driver.execute_script('return document.body.parentNode.scroll' + X)
driver.set_window_size(1024, fn('Height'))
driver.save_screenshot("sample.png")
tables = driver.find_elements(By.TAG_NAME,"table")
for table in tables:
   table_str = table.get_attribute("innerHTML")
   similarity_tables = similarity(my_table_words,table_str)
   if(similarity_tables>90):
       th = table.size['height']
       tw = table.size['width']
       tx = table.location['x']
       ty = table.location['y']
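
If you need to stay headless, one thing worth trying is pinning the window size and the device scale factor so both modes lay the page out the same way. This is only a sketch under the assumption that those two settings cause the difference; build_chrome_args is a hypothetical helper, while --window-size and --force-device-scale-factor are Chromium command-line switches.

```python
def build_chrome_args(headless=True, width=1024, height=768, scale=1):
    """Build a list of Chrome arguments with a fixed window size and scale."""
    args = [
        "--no-sandbox",
        "--disable-dev-shm-usage",
        # Same initial window size in headless and non-headless runs.
        f"--window-size={width},{height}",
        # Same ratio between CSS pixels and device pixels in both modes.
        f"--force-device-scale-factor={scale}",
    ]
    if headless:
        args.insert(0, "--headless")
    return args
```

Each entry can then be passed to the options object, for example: for arg in build_chrome_args(): chrome_options.add_argument(arg).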

If nothing else works

If you have tried all of the above and the values are still off, extract the size and location yourself and adjust them manually.
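
As one possible way of measuring the element yourself, you can ask the browser directly with getBoundingClientRect() and scale the result by devicePixelRatio when you need screenshot pixels rather than CSS pixels. This is a sketch, not a guaranteed fix: measure_in_browser and RECT_SCRIPT are hypothetical names, and driver only needs to provide execute_script(script, *args) the way a Selenium WebDriver does.

```python
# JavaScript run in the page; arguments[0] is the element passed from Python.
RECT_SCRIPT = """
const r = arguments[0].getBoundingClientRect();
return {
  x: r.left + window.scrollX,
  y: r.top + window.scrollY,
  width: r.width,
  height: r.height,
  ratio: window.devicePixelRatio
};
"""


def measure_in_browser(driver, element):
    """Return (css_rect, pixel_rect) for element, both page-relative.

    css_rect is in CSS pixels; pixel_rect is the same rectangle multiplied
    by the page's devicePixelRatio and rounded to whole pixels.
    """
    raw = driver.execute_script(RECT_SCRIPT, element)
    ratio = raw.get("ratio") or 1
    css_rect = {key: raw[key] for key in ("x", "y", "width", "height")}
    pixel_rect = {key: round(value * ratio) for key, value in css_rect.items()}
    return css_rect, pixel_rect
```

Comparing css_rect with table.location and table.size should show whether Selenium and the browser disagree, or whether the mismatch only appears once the values are applied to the screenshot.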

Hope this helps.

Answered By: Min