Locating an element using Python and Selenium via innerHTML

Question:

I’m new to Selenium and I’m trying to write my first real script using the package for Python.

I’m using:

  • Windows 10
  • Python 3.10.5
  • Selenium 4.3.0

So far I’ve been able to do everything I need with different selectors, like ID, name, XPATH etc.

However, I’ve run into a case where I need to find a specific element by its innerHTML.

The issue I’m facing is that I need to find an element with the innerHTML-value of "Changed" as seen in the HTML below.

The first challenge is that the element has no unique ID, name or other attribute to identify it, and there are many "dlx-treeview-node" elements.
The second challenge is that an absolute XPATH won’t work because the element changes position depending on where you are on the website (the number of "dlx-treeview-node" elements changes), so an index-based XPATH selects the wrong element depending on where I am.

I can successfully get the name by using the below XPATH, "get_attribute" and printing to console, which is why I know it’s innerHTML and not innerText, but as mentioned this will change depending on where I am on the website.

I would really appreciate any help I can get to solve this challenge and to learn more about the use of Selenium with Python.

Code trials:

select_filter_name = wait.until(EC.element_to_be_clickable((By.XPATH, "/html/body/div/app-root/dlx-select-filter-attribute-dialog/dlx-dialog-window/div/div[2]/div/div/div[5]/div/div/dlx-view-column-selector-component/div[1]/dlx-treeview/div/dlx-treeview-nodes/div/dlx-treeview-nodes/div/dlx-treeview-node[16]/div/div/div/div[2]/div/dlx-text-truncater/div")))
filter_name = select_filter_name.get_attribute("innerHTML")
print(filter_name)

HTML:

<dlx-treeview-node _nghost-nrk-c188="" class="ng-star-inserted">
  <div _ngcontent-nrk-c188="" dlx-droppable="" dlx-draggable="" dlx-file-drop="" class="d-flex flex-column position-relative dlx-hover on-hover-show-expandable-menu bg-control-active bg-control-hover">
    <div _ngcontent-nrk-c188="" class="d-flex flex-row ml-2">
      <div _ngcontent-nrk-c188="" class="d-flex flex-row text-nowrap expand-horizontal" style="padding-left: 15px;">
        <!---->
        <div _ngcontent-nrk-c188="" class="d-flex align-self-center ng-star-inserted" style="min-width: 16px; margin-left: 3px;">
          <!---->
        </div>
        <!---->
        <div _ngcontent-nrk-c188="" class="d-flex flex-1 flex-no-overflow-x" style="padding: 3.5px 0px;">
          <div class="d-flex flex-row justify-content-start flex-no-overflow-x align-items-center expand-horizontal ng-star-inserted">
            <!---->
            <dlx-text-truncater class="overflow-hidden d-flex flex-no-overflow-x ng-star-inserted">
              <div class="text-truncate expand-horizontal ng-star-inserted">Changed</div>
              <!---->
              <!---->
            </dlx-text-truncater>
            <!---->
          </div>
          <!---->
          <!---->
          <!---->
        </div>
      </div>
      <!---->
      <!---->
    </div>
  </div>
  <!---->
  <dlx-attachment-content _ngcontent-nrk-c188="">
    <div style="position: fixed; z-index: 10001; left: -10000px; top: -10000px; pointer-events: auto;">
      <!---->
      <!---->
    </div>
  </dlx-attachment-content>
</dlx-treeview-node>

Edit-1:

NOTE: I’m not sure I’m using the correct terms for HTML, so please correct me if I’m wrong.

I’ve realised that I have a follow-up question:

How do I search for the text as described, but only searching in the "dlx-treeview-node" (there’s about 100 of these)? So basically searching in the "children" of these.

The question is because I’ve learned that there are more elements with the specific text I’m searching for in other places.

Edit-2/solution:

I ended up finding my own solution before I received answers, so I’m writing it here in case it helps anyone else.
I marked one of the replies as the answer because it came the closest to what I needed.

The final code ended up like this (first searching the nodes – then searching the children for the specific innerHTML):

select_filter_name = wait.until(EC.element_to_be_clickable((By.XPATH, "//dlx-treeview-node[.//div[text()='Changed']]")))
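A minimal sketch of how this locator could be reused for other labels. The helper names treeview_node_xpath and find_treeview_node are hypothetical (they are not part of Selenium), the 20-second timeout is an arbitrary choice, and the sketch assumes the label contains no single quote and that driver is an already-created WebDriver.

```python
def treeview_node_xpath(label: str) -> str:
    """Build an XPath for the dlx-treeview-node that has a descendant <div> with exactly this text."""
    if "'" in label:
        # XPath 1.0 has no escape sequence for quotes; this sketch simply refuses them.
        raise ValueError("label must not contain a single quote")
    return f"//dlx-treeview-node[.//div[text()='{label}']]"


def find_treeview_node(driver, label: str, timeout: float = 20):
    """Wait until the matching node is clickable and return it."""
    # Imported here so the XPath helper above can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, treeview_node_xpath(label))
    return WebDriverWait(driver, timeout).until(EC.element_to_be_clickable(locator))
```

With a live session, find_treeview_node(driver, "Changed") would be the equivalent of the line above. Note that .//div searches all descendants of each node, not only its direct children.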
Asked By: BHR

||

Answers:

Just run this code on your page and you will get a list of XPaths, one for every div whose text contains Changed.

# Define the XPath helper on window so that the next execute_script call can reach it
driver.execute_script("window.getXPathOfElement = function (elt) { var path = ''; for (; elt && elt.nodeType === 1; elt = elt.parentNode) { var idx = 1; for (var sib = elt.previousElementSibling; sib; sib = sib.previousElementSibling) { if (sib.tagName === elt.tagName) idx++; } var xname = elt.tagName.toLowerCase(); if (idx > 1) xname += '[' + idx + ']'; path = '/' + xname + path; } return path; };")

# Get the XPaths of all div nodes whose text contains "Changed"
xpaths = driver.execute_script("return Array.from(document.querySelectorAll('div')).filter(el => el.textContent.includes('Changed')).map(node => window.getXPathOfElement(node));")

Write-up

  • The first execute adds a JavaScript function called getXPathOfElement to the page. This function accepts an HTML element node and returns the XPath string for that node.

  • The second execute collects every div whose text contains Changed (this includes ancestor divs, because textContent covers the text of all descendants), then calls getXPathOfElement on each one and returns the results as a list of XPath strings.

The JS is quite simple and harmless.

Tips

  • Check that xpaths contains at least one entry.
  • Index into xpaths, for example xpaths[0], or loop over it to make your changes.
  • Each entry is an XPath that can be used like a normal selector.
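The tips above can be sketched as follows. Because textContent includes the text of all descendants, a check such as textContent.includes('Changed') also matches every ancestor div, so the list usually holds several nested paths; taking the path with the most steps is one way to pick the innermost element. The helper name deepest_xpath is hypothetical and the sample paths are made up for illustration.

```python
from typing import Optional, Sequence


def deepest_xpath(xpaths: Sequence[str]) -> Optional[str]:
    """Return the XPath with the most steps, or None when the list is empty."""
    if not xpaths:
        return None
    # Each "/" starts one step, so the path with the most slashes is the most deeply nested.
    return max(xpaths, key=lambda path: path.count("/"))


# Made-up sample of what execute_script might hand back for nested divs.
sample = ["/html/body/div", "/html/body/div/div[2]", "/html/body/div/div[2]/div"]
target = deepest_xpath(sample)
print(target)  # /html/body/div/div[2]/div
```

With a real driver, target could then be passed to driver.find_element(By.XPATH, target). If the text occurs in several unrelated branches of the page, the deepest path is not necessarily the one you want, so the candidates should still be checked.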

good luck

Edit 1

execute_script() synchronously executes JavaScript in the current window/frame.

Answered By: Dean Van Greunen

Assuming the text Changed is unique among the <div> elements in the HTML DOM, you can locate the element with either of the following XPath-based locator strategies:

  • Using xpath and text():

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Changed']")))
    
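  • Using xpath and contains() (matching on the element’s own text node, because contains(., 'Changed') would also match every ancestor <div>):

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(text(), 'Changed')]")))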
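Both locators wrap the search text in single quotes, which breaks when the text itself contains a quote. XPath 1.0 has no escape character, so a common workaround is to build the string with concat(). The sketch below shows one way to do that; xpath_literal is a hypothetical helper name, not a Selenium function.

```python
def xpath_literal(text: str) -> str:
    """Return text as an XPath 1.0 string literal, whatever quotes it contains."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote kinds occur: split on single quotes and glue the pieces
    # back together with concat(), inserting each single quote as "'".
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{part}'" for part in parts) + ")"


label = "Changed"
locator_xpath = f"//div[text()={xpath_literal(label)}]"
print(locator_xpath)  # //div[text()='Changed']
```

The resulting string can be passed to By.XPATH in either of the two waits above in place of the hard-coded expression.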
    
Answered By: undetected Selenium