Starting out with Python and Web Scraping… I don't quite understand why this isn't working?

Question:

I used the following code to try to insert a new line after each comma in the printed HTML text, to separate the links I was trying to find using BeautifulSoup (they appeared as one block of text with commas between the different links, and I wanted to separate them). I tried this, and it doesn't seem to do anything, and I don't know why.

file = requests.get(url) 
UsualError = file.text
Extractor = BeautifulSoup(UsualError)
run = print(Extractor.find_all('link'))
for text in run: 
    if ',':
        +"/n";
print(run)

I tried other methods as well, but I don't think they were entirely right, and I'm not too sure how to go about this. If someone could point out what I suspect is extremely obvious, you'll be helping someone get to grips with something 🙂

Asked By: Arete'sSBar

||

Answers:

The links themselves probably contain no commas. find_all returns the links in a Python list, and the commas you see are only how Python separates the entries when it prints a list. They are not part of the link text, so there is nothing to replace.
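To illustrate the point about commas, here is a minimal sketch that uses a plain list of strings as a stand-in for the scraped results. The sample URLs are made up for illustration. It shows that the commas belong to the printed form of the list, and that joining the items with a newline gives one entry per line.

```python
# Hypothetical sample data standing in for the scraped links.
links = ["https://example.com/a.css", "https://example.com/b.css"]

# The text form of a list adds brackets, quotes and commas;
# none of these characters are inside the items themselves.
as_list_text = str(links)

# Joining the items with a newline puts each one on its own line.
# Note the backslash: "\n" is the newline escape, while "/n" is
# just two ordinary characters.
one_per_line = "\n".join(links)

print(as_list_text)
print(one_per_line)
```

Joining works here because every item is already a string; a list of tag objects would need each item converted with str() first.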

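The main issue with your code is run = print(Extractor.find_all('link')): you are assigning the return value of print() to run. print() always returns None, so run holds None rather than the list of links, and the for loop cannot iterate over it.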
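A small sketch of that behaviour, using a short list of placeholder strings instead of scraped tags (the variable names here are illustrative, not from the question):

```python
# print() writes to the screen and returns None,
# so `result` holds None, not the list that was printed.
result = print(["first", "second"])

# Looping over None raises a TypeError, which is caught here
# so that the sketch runs to the end.
try:
    for item in result:
        pass
    loop_error = None
except TypeError as exc:
    loop_error = exc

print(result is None)
print(type(loop_error).__name__)
```

Keeping the list in a variable first and printing it on a separate line avoids the problem.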

If you want to see all the contents as such:

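import requests
from bs4 import BeautifulSoup

# url is assumed to be defined earlier, as in the question
file = requests.get(url)
UsualError = file.text
Extractor = BeautifulSoup(UsualError, "html.parser")
run = Extractor.find_all('link')
for text in run:
    # each <link> tag is printed on its own line
    print(text)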

If you want to see only the hyperlinks:

file = requests.get(url)
UsualError = file.text
Extractor = BeautifulSoup(UsualError, "html.parser")
run = Extractor.find_all('link')
for text in run:
    # print only the href attribute of each tag
    print(text.get('href'))

If you want to store only href links in the list run:

file = requests.get(url)
UsualError = file.text
Extractor = BeautifulSoup(UsualError, "html.parser")
run = Extractor.find_all('link')
run = [text.get('href') for text in run]

# run now contains only the href values
# optionally, print the whole list:
# print(run)
# the output shows commas between entries because that is how Python displays a list
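
One detail worth keeping in mind with the href list: a tag's get method returns None when the attribute is missing, so the list can contain None entries. The sketch below uses plain dicts as stand-ins for tags (an assumption made to avoid network access; dict.get behaves the same way for a missing key), and the sample values are hypothetical.

```python
# Plain dicts stand in for scraped tags in this sketch.
fake_tags = [
    {"rel": "stylesheet", "href": "https://example.com/style.css"},
    {"rel": "preconnect"},  # no href attribute
    {"rel": "icon", "href": "https://example.com/favicon.ico"},
]

# Same pattern as above: .get returns None where href is missing.
hrefs = [tag.get("href") for tag in fake_tags]

# Drop the missing values before joining, since join needs strings.
present = [h for h in hrefs if h is not None]

print("\n".join(present))
```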
Answered By: rajkumar_data