How to extract links from a website in Python?

Question:

I am trying to scrape the website below. As a first step, I would like to get the links to the pages whose text I want to extract. However, when I run the following, I get an empty list:

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.federalreserve.gov/newsevents/speeches.htm'
r = BeautifulSoup(requests.get(url).content, features="lxml")

r.select('.itemTitle')  # returns an empty list


Can anyone tell me what I am doing wrong?

Thanks

Asked By: Rollo99

||

Answers:

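The list of speeches is not part of the HTML that requests receives; the page loads it from a separate JSON endpoint, which is why the selector finds nothing. You could request the JSON from that endpoint directly and, since you already import pandas, convert it into a DataFrame.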

import json

import pandas as pd
import requests

# json.loads accepts the raw response bytes; each JSON record becomes one row
pd.DataFrame(
    json.loads(requests.get('https://www.federalreserve.gov/json/ne-speeches.json').content)
)

Output:
d t s lo l a o v video updateDate
0 3/29/2023 8:30:00 AM Brief Remarks Vice Chair for Supervision Michael S. Barr At the National Community Reinvestment Coalition Just Economy Conference, Washington, D.C. (via prerecorded video) /newsevents/speech/barr20230329a.htm no No nan
1 3/27/2023 5:00:00 PM Implementation and Transmission of Monetary Policy Governor Philip N. Jefferson At the H. Parker Willis Lecture, Washington and Lee University, Lexington, Virginia /newsevents/speech/jefferson20230327a.htm no No nan
2 3/14/2023 5:20:00 PM The Innovation Imperative: Modernizing Traditional Banking Governor Michelle W. Bowman At the Independent Community Bankers of America ICBA Live 2023 Conference, Honolulu, Hawaii /newsevents/speech/bowman20230314a.htm no No nan
3 3/9/2023 10:00:00 AM Supporting Innovation with Guardrails: The Federal Reserve’s Approach to Supervision and Regulation of Banks’ Crypto-related Activities Vice Chair for Supervision Michael S. Barr At the Peterson Institute for International Economics, Washington, D.C. /newsevents/speech/barr20230309a.htm no https://www.youtube.com/user/PetersonInstitute No nan
4 3/3/2023 3:00:00 PM Panel on “Design Issues for Central Bank Facilities in the Future” Governor Michelle W. Bowman At The Chicago Booth Initiative on Global Markets Workshop on Market Dysfunction, Chicago, Illinois /newsevents/speech/bowman20230303a.htm no No nan
...
973 1/18/2017 3:00:00 PM The Goals of Monetary Policy and How We Pursue Them Chair Janet L. Yellen At the Commonwealth Club, San Francisco, California /newsevents/speech/yellen20170118a.htm no Yes nan
974 1/17/2017 10:00:00 AM Monetary Policy in a Time of Uncertainty Governor Lael Brainard At the Brookings Institution, Washington, D.C. /newsevents/speech/brainard20170117a.htm no Yes nan
975 1/12/2017 7:00:00 PM Welcoming Remarks Chair Janet L. Yellen At the Conversation with the Chair: A Teacher Town Hall Meeting, Washington, D.C. /newsevents/speech/yellen20170112a.htm no Yes nan
976 1/7/2017 11:15:00 AM Low Interest Rates and the Financial System Governor Jerome H. Powell At the 77th Annual Meeting of the American Finance Association, Chicago, Illinois /newsevents/speech/powell20170107a.htm no No nan
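
One way to confirm why the original selector comes back empty is to count how many elements carry the class in the HTML the server actually returns. If the count is zero there while the elements are visible in the browser, the list is most likely filled in by JavaScript after the page loads, which requests does not execute. The sketch below uses only the standard library; the helper names (ClassCounter, count_class) and both sample HTML strings are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in html have class_name among their classes."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical samples: what a server might send vs. what a browser might render.
static_html = '<div id="article"><div class="row"></div></div>'
rendered_html = (
    '<div class="itemTitle">'
    '<a href="/newsevents/speech/barr20230329a.htm">Brief Remarks</a></div>'
)

print(count_class(static_html, "itemTitle"))
print(count_class(rendered_html, "itemTitle"))
```

To run the same check against the live page, pass requests.get(url).text to count_class instead of the sample strings.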
Answered By: HedgeHog

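An approach without pandas, using only requests and the standard library. The filter keeps only the characters in string.printable, so the response is reduced to printable ASCII before it is parsed, and records without an 'l' key are reported instead of raising an error: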

import json
import string

import requests

url = "https://www.federalreserve.gov/json/ne-speeches.json"
speeches = json.loads(
    "".join(filter(lambda x: x in string.printable, requests.get(url).text))
)
for speech in speeches:
    try:
        print(f"https://www.federalreserve.gov{speech['l']}")
    except KeyError:
        print("No link :(")

Output:

https://www.federalreserve.gov/newsevents/speech/barr20230329a.htm
https://www.federalreserve.gov/newsevents/speech/jefferson20230327a.htm
https://www.federalreserve.gov/newsevents/speech/bowman20230314a.htm
https://www.federalreserve.gov/newsevents/speech/barr20230309a.htm
https://www.federalreserve.gov/newsevents/speech/bowman20230303a.htm
https://www.federalreserve.gov/newsevents/speech/waller20230302a.htm
https://www.federalreserve.gov/newsevents/speech/jefferson20230227a.htm
https://www.federalreserve.gov/newsevents/speech/jefferson20230224a.htm
https://www.federalreserve.gov/newsevents/speech/cook20230216a.htm
https://www.federalreserve.gov/newsevents/speech/bowman20230215a.htm
https://www.federalreserve.gov/newsevents/speech/bowman20230213a.htm
https://www.federalreserve.gov/newsevents/speech/waller20230210a.htm

...
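
Note that filtering on string.printable also discards legitimate non-ASCII characters, such as the curly apostrophe in "Reserve’s". If the reason for filtering is a UTF-8 byte order mark at the start of the response (an assumption; the answer does not say so), decoding the bytes with the utf-8-sig codec would remove only that mark. A minimal sketch under that assumption follows; parse_speeches, speech_links, and the sample payload are hypothetical names and data for illustration.

```python
import json
from urllib.parse import urljoin

BASE_URL = "https://www.federalreserve.gov"


def parse_speeches(raw):
    """Parse JSON bytes, dropping a leading UTF-8 BOM if one is present."""
    # "utf-8-sig" strips the BOM but leaves every other character intact.
    return json.loads(raw.decode("utf-8-sig"))


def speech_links(records, base=BASE_URL):
    """Return absolute URLs for every record that has an 'l' (link) field."""
    return [urljoin(base, rec["l"]) for rec in records if "l" in rec]


# Made-up payload: a BOM, one speech record, and one record without a link.
sample_records = [
    {"t": "The Federal Reserve’s Approach", "l": "/newsevents/speech/barr20230309a.htm"},
    {"updateDate": ""},
]
sample = ("\ufeff" + json.dumps(sample_records, ensure_ascii=False)).encode("utf-8")

for link in speech_links(parse_speeches(sample)):
    print(link)
```

With the live endpoint, the bytes would come from requests.get(url).content rather than the sample payload.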
Answered By: baduker