Web scraping with Selenium and Python
Question:
I’m writing a Python script that uses Selenium to scrape a betting site for upcoming games. I keep getting this long error.
Here’s the code.
import os
from selenium import webdriver
from selenium.webdriver.common.by import By
os.environ['PATH'] += r"C:/Users/dub6m/OneDrive/Documents/chromedriver_win32"
driver = webdriver.Chrome()
driver.get('https://www.888sport.com/tennis/')
athlete1 = driver.find_element(By.CLASS_NAME, 'featured-matches-widget__event-text featured-matches-widget__event-competitor')
print(athlete1.text)
This was meant to print the first athlete in a tennis match.
It threw this error.
Traceback (most recent call last):
  File "C:\Users\dub6m\OneDrive\Desktop\betbot\888sports.py", line 8, in <module>
element = driver.find_element(By.CLASS_NAME, 'featured-matches-widget__event-text featured-matches-widget__event-competitor')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dub6m\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 831, in find_element
    return self.execute(Command.FIND_ELEMENT, {"using": by, "value": value})["value"]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dub6m\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 440, in execute
    self.error_handler.check_response(response)
  File "C:\Users\dub6m\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 245, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".featured-matches-widget__event-text featured-matches-widget__event-competitor"}
(Session info: chrome=112.0.5615.50)
Stacktrace:
Backtrace:
GetHandleVerifier [0x0070DCE3+50899]
(No symbol) [0x0069E111]
(No symbol) [0x005A5588]
(No symbol) [0x005D08F9]
(No symbol) [0x005D0AFB]
(No symbol) [0x005FF902]
(No symbol) [0x005EB944]
(No symbol) [0x005FE01C]
(No symbol) [0x005EB6F6]
(No symbol) [0x005C7708]
(No symbol) [0x005C886D]
GetHandleVerifier [0x00973EAE+2566302]
GetHandleVerifier [0x009A92B1+2784417]
GetHandleVerifier [0x009A327C+2759788]
GetHandleVerifier [0x007A5740+672048]
(No symbol) [0x006A8872]
(No symbol) [0x006A41C8]
(No symbol) [0x006A42AB]
(No symbol) [0x006971B7]
BaseThreadInitThunk [0x77007BA9+25]
RtlInitializeExceptionChain [0x77A7BB3B+107]
RtlClearBits [0x77A7BABF+191]
Answers:
Try doing it this way:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.888sport.com/tennis/')
# By.CLASS_NAME accepts exactly one class name, so pass only the competitor class
athlete1 = driver.find_element(By.CLASS_NAME, 'featured-matches-widget__event-competitor')
print(athlete1.text)
driver.quit()
Output:
Purcell, M
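The problem is that By.CLASS_NAME expects a single class name, and you’ve passed it two: ‘featured-matches-widget__event-text’ and ‘featured-matches-widget__event-competitor’. Selenium turns the value into the CSS selector shown in the error message, where the space makes the second name a descendant tag selector rather than a class, so nothing matches. Turn it into a proper CSS selector: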
driver.find_element(By.CSS_SELECTOR, '.featured-matches-widget__event-text.featured-matches-widget__event-competitor')
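where each leading . denotes a class name, and chaining the two with no space between them matches an element that carries both classes.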
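A space-separated class attribute copied from the browser's inspector can be converted into that kind of compound selector mechanically. The helper below is a minimal sketch; classes_to_css is a hypothetical name rather than part of Selenium, and it assumes the class names contain no characters that would need CSS escaping.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn 'a b c' (an HTML class attribute) into the CSS selector '.a.b.c'."""
    names = class_attr.split()
    if not names:
        raise ValueError("expected at least one class name")
    # No space between the parts: the element must carry every class.
    return "".join(f".{name}" for name in names)


selector = classes_to_css(
    "featured-matches-widget__event-text featured-matches-widget__event-competitor"
)
print(selector)
# .featured-matches-widget__event-text.featured-matches-widget__event-competitor
```

The resulting string is what you would pass as the second argument to driver.find_element(By.CSS_SELECTOR, ...).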
You should also add a wait so that the element is actually present and visible before you try to scrape it.
NOTE: As of Selenium 4.6, you no longer need to specify the path to ChromeDriver. Selenium now ships with Selenium Manager, which downloads and sets up the matching driver for you.
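If you are stuck on an older Selenium release and still have to put the driver directory on PATH yourself, note that the question's os.environ['PATH'] += ... appends the directory without a separator, so it fuses with the last existing entry. A minimal sketch of a safer append follows; append_to_path is a hypothetical helper, and the directories in the example are placeholders.

```python
import os


def append_to_path(current_path: str, new_dir: str, sep: str = os.pathsep) -> str:
    """Return current_path with new_dir appended as a separate entry."""
    if not current_path:
        return new_dir
    if new_dir in current_path.split(sep):
        # Already listed; leave PATH unchanged.
        return current_path
    # Strip a trailing separator so we never produce an empty entry.
    return current_path.rstrip(sep) + sep + new_dir


# Placeholder values, using the Windows separator explicitly for illustration.
print(append_to_path("C:/Windows;C:/Python311", "C:/drivers/chromedriver_win32", sep=";"))
# C:/Windows;C:/Python311;C:/drivers/chromedriver_win32
```

In a real script the call would look like os.environ["PATH"] = append_to_path(os.environ.get("PATH", ""), driver_dir), with driver_dir pointing at your own ChromeDriver folder.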
The full working code is
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

driver = webdriver.Chrome()
driver.get('https://www.888sport.com/tennis/')
# Wait up to 10 seconds for the element to become visible
wait = WebDriverWait(driver, 10)
# Both classes chained with no space: the element must have both
athlete1 = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.featured-matches-widget__event-text.featured-matches-widget__event-competitor')))
print(athlete1.text)
driver.quit()
and it prints
Purcell, M
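To see why an explicit wait helps, it can be useful to picture what it does: it calls a condition repeatedly until the condition returns something truthy or a timeout expires. The sketch below is a simplified stand-in written in plain Python, not Selenium's actual implementation. wait_until and fake_find are hypothetical names, LookupError stands in for "element not found yet", and the page is simulated so the example runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except LookupError:
            # Stand-in for "element not found yet": ignore and retry.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Simulated page: the element only "appears" on the third lookup.
attempts = {"count": 0}


def fake_find():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("not rendered yet")
    return "Purcell, M"


print(wait_until(fake_find, timeout=2.0, poll_interval=0.01))
# Purcell, M
```

A bare find_element call corresponds to calling the condition once and failing immediately, which is why it can raise NoSuchElementException on a page that has not finished rendering.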