In Playwright for Python, how do I get elements relative to ElementHandle (children, parent, grandparent, siblings)?
Question:
In playwright-python I know I can get an elementHandle using querySelector().
Example (sync):
from playwright import sync_playwright
with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch()
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
How do I get an element relative to this elementHandle, i.e. the parent, grandparent, sibling, or child handles?
Answers:
Original answer:
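Using querySelector() / querySelectorAll() with an XPath (XML Path Language) selector lets you retrieve a related elementHandle (or, for querySelectorAll(), a list of handles). When these methods are called on an elementHandle, the XPath expression is evaluated relative to that element. Generally speaking, XPath can be used to navigate through elements and attributes in an XML document.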
from playwright import sync_playwright
with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
        parent = element.querySelector('xpath=..')
        grandparent = element.querySelector('xpath=../..')
        siblings = element.querySelectorAll('xpath=following-sibling::*')
        children = element.querySelectorAll('xpath=child::*')
        browser.close()
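The '..' and '../..' selectors above follow a pattern: each additional '..' step climbs one more level. As a minimal sketch, a small helper could build such a selector for any number of levels. The name ancestor_selector is hypothetical and not part of Playwright:

```python
def ancestor_selector(levels: int) -> str:
    """Build a Playwright XPath selector that climbs `levels` parents.

    Hypothetical helper (not part of Playwright): 1 -> parent, 2 -> grandparent.
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    # Each '..' step moves from the current node to its parent.
    return "xpath=" + "/".join([".."] * levels)


print(ancestor_selector(1))  # xpath=..
print(ancestor_selector(2))  # xpath=../..
print(ancestor_selector(3))  # xpath=../../..
```

The returned string can be passed to element.querySelector() in place of the hard-coded selectors, for example element.querySelector(ancestor_selector(3)) for the great-grandparent.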
Update (2022-07-22):
It seems that browser.newPage() is no longer available: in newer versions of Playwright the method is called browser.new_page() (note the snake_case name). Optionally, create a browser context first (and close it afterwards) and call new_page() on that context. The way the children, parent, grandparent, and siblings are accessed stays the same.
from playwright import sync_playwright
with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto('https://duckduckgo.com/')
        element = page.querySelector('input[id="search_form_input_homepage"]')
        parent = element.querySelector('xpath=..')
        grandparent = element.querySelector('xpath=../..')
        siblings = element.querySelectorAll('xpath=following-sibling::*')
        children = element.querySelectorAll('xpath=child::*')
        context.close()
        browser.close()
The accepted answer targets an older version of Playwright. In current versions, import from playwright.sync_api and use the snake_case method names (query_selector, query_selector_all):
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto('https://duckduckgo.com/')
        element = page.query_selector('input[id="search_form_input_homepage"]')
        parent = element.query_selector('xpath=..')
        grandparent = element.query_selector('xpath=../..')
        siblings = element.query_selector_all('xpath=following-sibling::*')
        children = element.query_selector_all('xpath=child::*')
        context.close()
        browser.close()
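Note that following-sibling::* only matches siblings that come after the element. If siblings on both sides are wanted, the preceding-sibling axis can be queried as well. A rough sketch under that assumption follows; get_relatives is a hypothetical helper, and it relies only on the query_selector() / query_selector_all() methods used above:

```python
def get_relatives(element):
    """Collect handles related to `element` via relative XPath selectors.

    Hypothetical helper: `element` is assumed to be a Playwright
    ElementHandle (anything exposing query_selector / query_selector_all).
    """
    return {
        "parent": element.query_selector("xpath=.."),
        "grandparent": element.query_selector("xpath=../.."),
        "children": element.query_selector_all("xpath=child::*"),
        # preceding-sibling::* matches siblings before the element,
        # following-sibling::* matches those after it.
        "preceding_siblings": element.query_selector_all("xpath=preceding-sibling::*"),
        "following_siblings": element.query_selector_all("xpath=following-sibling::*"),
    }
```

With the element from the snippet above, relatives = get_relatives(element) gives a dict in which relatives["parent"] is a single handle and relatives["children"] is a list of handles.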