Python web scraping: how to print paragraph after a specific class element in browser?
Question:
I am new to scraping in Python (I am using the PyCharm interface with Python 3.10).
I have spent hours trying to troubleshoot this, but nothing seems to work. My HTML has the format shown in the attached picture.
Ideally I want to print the first sentence/paragraph after each set of three dots that appears in the browser (the dots separate sections), i.e. the underlined sentences in the attached picture. Among other things, I have tried the following:
WebDriverWait(browser, timeout=10).until(
EC.presence_of_element_located((By.CLASS_NAME, "wp-block-separator has-css-opacity is-style-dots")))
and then getting the text with an XPath, but unfortunately this does not work. Any suggestions would be very much appreciated. Thank you very much!
Answers:
As I remember, Selenium expects a single class with By.CLASS_NAME and converts it to a CSS selector by adding a dot at the beginning, but "wp-block-separator has-css-opacity is-style-dots" is three classes, which would need a dot before every class.
To fix this, you may need to put the dots between the classes manually (without a dot before the first class):
(By.CLASS_NAME, "wp-block-separator.has-css-opacity.is-style-dots")
or you may have to use a CSS selector with a dot before every class, including the first:
(By.CSS_SELECTOR, ".wp-block-separator.has-css-opacity.is-style-dots")
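To go from the separator to the text the question asks for, something like the following sketch might work. It assumes the paragraphs are `<p>` elements that are siblings of the separator element, which cannot be confirmed because the HTML is only shown in a picture. The helper names `classes_to_css`, `first_paragraph_xpath` and `print_paragraphs_after_separators` are made up for this example; `browser` is the WebDriver instance from the question.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn an HTML class attribute value into a compound CSS selector."""
    return "".join("." + name for name in class_attr.split())


def first_paragraph_xpath(class_attr: str) -> str:
    """Build an XPath for the first <p> sibling after each matching element.

    Assumption: the wanted text sits in <p> elements that follow the
    separator at the same nesting level.
    """
    conditions = " and ".join(
        f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"
        for name in class_attr.split()
    )
    return f"//*[{conditions}]/following-sibling::p[1]"


def print_paragraphs_after_separators(browser, class_attr, timeout=10):
    """Wait for a separator, then print the paragraph after each one."""
    # Imported here so the two helpers above can be used without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Wait until at least one separator is present in the DOM.
    WebDriverWait(browser, timeout).until(
        EC.presence_of_element_located(
            (By.CSS_SELECTOR, classes_to_css(class_attr))
        )
    )
    # One paragraph per separator: the first <p> sibling that follows it.
    for paragraph in browser.find_elements(
        By.XPATH, first_paragraph_xpath(class_attr)
    ):
        print(paragraph.text)
```

It could then be called as `print_paragraphs_after_separators(browser, "wp-block-separator has-css-opacity is-style-dots")`. If the paragraphs turn out to be nested differently, the `following-sibling::p[1]` part of the XPath would need adjusting.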