Differentiate by section class in Selenium

Question:

Im facing a problem in Python selenium,
I would like to print on my code the following data, an email address: [email protected]

I just need a hint, that’s all…

HTML:

<section tabindex="-1" class="pv-profile-section pv-contact-info artdeco-container-card">
<!---->
<h2 class="text-body-large-open mb4">
   InformaciĆ³n de contacto
</h2>
<div class="pv-profile-section__section-info section-info" tabindex="-1">
<section class="pv-contact-info__contact-type ci-vanity-url">
   <li-icon aria-hidden="true" type="linkedin-bug" class="pv-contact-info__contact-icon" size="medium">
      <svg  rel="nofollow noreferrer">enter image description here

My desired ouput:

[email protected]

Asked By: Lorenzo Castagno
||

Answers:

  1. Using XPath locator, you can use the below code:
email_element = driver.find_element(By.XPATH, "//h3[contains(text(),'Email')]//following::a[1]")
print (email_element.text)

Result:

[email protected]

XPath expression explanation: Below XPath expression will locate the first <a> node which is located immediately after the <h3> node containing text Email.

//h3[contains(text(),'Email')]//following::a[1]
  1. Using CLASS locator:

In your code, just change below:

contact_info = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'pv-contact-info__ci-container')))

To:

contact_info = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'pv-contact-info__contact-type.ci-email')))

Full working code:

contact_info = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'pv-contact-info__contact-type.ci-email')))

# Find the email element within the contact info section
email_element = contact_info.find_element(By.CSS_SELECTOR, 'a.pv-contact-info__contact-link')

email = email_element.get_attribute('innerHTML')
print(email)

Result:

[email protected]
Answered By: Shawn

To extract the text [email protected] instead of presence_of_element_located() you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following locator strategies:

  • Using XPATH, following-sibling and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h3[contains(., 'Email') and contains(@class, 'pv-contact-info__header')]//following-sibling::div[1]/a"))).text)
    
  • Using XPATH, following and get_attribute("innerHTML"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h3[contains(., 'Email') and contains(@class, 'pv-contact-info__header')]//following::div[1]/a"))).get_attribute("innerHTML"))
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python


References

Link to useful documentation:

Answered By: undetected Selenium