Scraping PDFs from this website


I am trying to scrape PDFs with Python 2.7 from this website:

I want to scrape the main PDF, the one that appears next to the blue "MotoGP Race Classification 2017" text, for each of the many categories (Event).

After that, I want to do the same for the other years as well. So far I have:

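import re
from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3 only; on Python 2.7 use: from urllib2 import urlopen

url = ""  # page URL (not given here)
r = urlopen(url).read().decode('utf8')
soup = BeautifulSoup(r, 'html.parser')

# Find every double-quoted string that ends in ".pdf"
match = re.findall(r'"(.*?\.pdf)"', r)
pdf_url = "" + match[0]  # site prefix (not given here) plus the first match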

The links have this format:

So I should add the thing "?" after the character. The main problem is how to switch from event to event to get all the links in this format.

Asked By: Gotey



Based on the description you have provided above, this is how you can get those PDF links:

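from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = ""  # page URL (not given here)

driver = webdriver.Chrome()
driver.get(url)
wait = WebDriverWait(driver, 10)

# Select each event in turn, then wait for that event's PDF link to appear
for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#event option"))):
    item.click()  # assumed step: choosing the option loads that event's link
    elem = wait.until(EC.presence_of_element_located((By.CLASS_NAME, "padleft5")))
    print(elem.get_attribute("href"))  # assumed: the PDF URL is in the href

driver.quit()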


Partial output:
Answered By: SIM
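
As a complement to the Selenium approach, here is a minimal sketch of pulling PDF links out of HTML that has already been rendered (for example, the string in driver.page_source) using only the standard library. The class name PdfLinkParser, the function extract_pdf_links, the sample markup, and the example.com base URL are all hypothetical and chosen for illustration; the real page's structure may differ. The check ignores anything after a "?" in the link, so hrefs that carry a query string are still recognised as PDFs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect href values of <a> tags whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any "?query" suffix when checking the file extension.
        if href and href.split("?", 1)[0].lower().endswith(".pdf"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_pdf_links(html, base_url):
    """Return absolute URLs of all PDF links found in the given HTML."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup and base URL, for illustration only.
sample = (
    '<a class="padleft5" href="/files/results.pdf?v=1">PDF</a>'
    '<a href="/about">About</a>'
)
print(extract_pdf_links(sample, "https://example.com/en/"))
```

This only sees links that are present in the HTML it is given, so on a page that fills in the link with JavaScript it would need the rendered source rather than the raw response from urlopen.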