How to scrape a web page with a button/menu-item option value?

Question:

In particular, I'm trying to scrape this web site.

I would like to set the button's menu item to "50" rows per page.

My current code is as follows:

Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//[@class='btn btn-default dropdown-toggle']")))).select_by_visible_text('50')

Where is my mistake? Can you help me?

Thank you in advance for your time!

Asked By: Agustos Imola

Answers:

You can try this simpler code, which doesn't need Selenium and instead calls the site's data API directly with requests.

Note the limit argument at the end of the query string, which sets the page size to 50 rows, as you want. To scrape the next 50 items, increase offset to 50, then 100, 150, and so on, until you have all the available data.

import requests
import pandas as pd

# offset=0&limit=50 at the end of the query string requests the first 50 rows
url = "https://whalewisdom.com/filer/holdings?id=berkshire-hathaway-inc&q1=-1&type_filter=1,2,3,4&symbol=&change_filter=&minimum_ranking=&minimum_shares=&is_etf=0&sc=true&sort=current_mv&order=desc&offset=0&limit=50"
raw = requests.get(url)

# The response body is JSON; the table rows are under the "rows" key
data = raw.json()
df = pd.DataFrame(data["rows"])
df.head()

Output:

    symbol  permalink   security_type   name    sector  industry    current_shares  previous_shares     shares_change   position_change_type    ...     percent_ownership   quarter_first_owned     quarter_id_owned    source_type     source_date     filing_date     avg_price   recent_price    quarter_end_price   id
0   AAPL    aapl    SH  Apple Inc   INFORMATION TECHNOLOGY  COMPUTERS & PERIPHERALS     8.909234e+08    8.871356e+08    3787856.0   addition    ...     5.5045625   Q1 2016     61  13F     2022-03-31  2022-05-16  36.6604     160.01  174.61  None
1   BAC     bac     SH  Bank of America Corp. (North Carolina National...   FINANCE     BANKS   1.010101e+09    1.010101e+09    0.0     None    ...     12.5371165  Q3 2017     67  13F     2022-03-31  2022-05-16  25.5185     33.04   41.22   None
2   AXP     axp     SH  American Express Co     FINANCE     CONSUMER FINANCE    1.516107e+08    1.516107e+08    0.0     None    ...     20.1326115  Q1 2001     1   13F     2022-03-31  2022-05-16  39.3110     151.60  187.00  None
3   CVX     cvx     SH  Chevron Corp. (Standard Oil of California)  ENERGY  INTEGRATED OIL & GAS    1.591781e+08    3.824504e+07    120933081.0     addition    ...     8.1014366   Q4 2020     80  13F     2022-03-31  2022-05-16  125.3424    159.14  162.83  None
4   KO  ko  SH  Coca Cola Co.   CONSUMER STAPLES
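
The offset paging described above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the names URL_TEMPLATE, fetch_rows and iter_rows are hypothetical, and the stop condition (a page with fewer rows than the limit means there is no more data) is an assumption about how the API behaves.

```python
# Same query string as in the answer, with offset and limit left as placeholders.
URL_TEMPLATE = (
    "https://whalewisdom.com/filer/holdings?id=berkshire-hathaway-inc&q1=-1"
    "&type_filter=1,2,3,4&symbol=&change_filter=&minimum_ranking="
    "&minimum_shares=&is_etf=0&sc=true&sort=current_mv&order=desc"
    "&offset={offset}&limit={limit}"
)


def fetch_rows(offset, limit):
    """Fetch one page of rows from the data API (needs network access)."""
    import requests  # imported lazily so iter_rows can be tried offline

    response = requests.get(URL_TEMPLATE.format(offset=offset, limit=limit))
    response.raise_for_status()
    return response.json()["rows"]


def iter_rows(fetch_page, limit=50):
    """Yield rows page by page, raising the offset by `limit` each time.

    Assumption: a page shorter than `limit` (or empty) is the last one.
    """
    offset = 0
    while True:
        rows = fetch_page(offset, limit)
        yield from rows
        if len(rows) < limit:
            break
        offset += limit
```

If the assumption holds, something like pd.DataFrame(list(iter_rows(fetch_rows))) would collect every page into a single DataFrame.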
Answered By: petezurich

You're passing a non-select node to the Select class, which only works with <select> elements. This won't work.

Click the dropdown button and then the menu item instead:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//*[@class='btn btn-default dropdown-toggle']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.LINK_TEXT, "50"))).click()
Answered By: JaSON

This should work for your case:

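from selenium import webdriver
from selenium.webdriver.common.by import By

# s is a selenium.webdriver.firefox.service.Service instance created beforehand
driver = webdriver.Firefox(service=s)
driver.get('https://whalewisdom.com/filer/fisher-asset-management-llc#tabholdings_tab_link')

# Open the rows-per-page dropdown, then click the "50" menu item
button = driver.find_element(By.CSS_SELECTOR, '.btn-group.dropdown')
button.click()
element = driver.find_element(By.XPATH, '//li[@role="menuitem"]/a[contains(text(), "50")]')
element.click()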
Answered By: Himanshuman