BeautifulSoup produces a long traceback when used

Question:

I’ve installed BeautifulSoup (named bs4) in my pythonProject folder, which is the same folder as the Python file I am running. The .py file contains the following code, and as input I am using the URL of a simple page with one link, which the code is supposed to retrieve.

URL used as url input: http://data.pr4e.org/page1.htm

.py code:

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))

Though I could be wrong, it appears to me that bs4 imports correctly, because my IDE suggests BeautifulSoup when I begin typing it. After all, it is installed in the same directory as the .py file. However, it produces the following error output when I run it with the URL given above:

Traceback (most recent call last):
  File "C:\Users\Thomas\PycharmProjects\pythonProject\main.py", line 16, in <module>
    soup = BeautifulSoup(html, 'html.parser')
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\__init__.py", line 215, in __init__
    self._feed()
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\__init__.py", line 241, in _feed
    self.endData()
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\__init__.py", line 315, in endData
    self.object_was_parsed(o)
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\__init__.py", line 320, in object_was_parsed
    previous_element = most_recent_element or self._most_recent_element
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\element.py", line 1001, in __getattr__
    return self.find(tag)
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\element.py", line 1238, in find
    l = self.find_all(name, attrs, recursive, text, 1, **kwargs)
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\element.py", line 1259, in find_all
    return self._find_all(name, attrs, text, limit, generator, **kwargs)
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\element.py", line 516, in _find_all
    strainer = SoupStrainer(name, attrs, text, **kwargs)
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\element.py", line 1560, in __init__
    self.text = self._normalize_search_value(text)
  File "C:\Users\Thomas\PycharmProjects\pythonProject\bs4\element.py", line 1565, in _normalize_search_value
    if (isinstance(value, str) or isinstance(value, collections.Callable) or hasattr(value, 'match')
AttributeError: module 'collections' has no attribute 'Callable'

Process finished with exit code 1

The lines referred to in the error messages are in files inside bs4 that were downloaded as part of it. I haven’t edited or even touched any of the bs4 files. Can anyone help me figure out why bs4 isn’t working?

Asked By: TGajend

||

Answers:

Are you using Python 3.10? It looks like your copy of the BeautifulSoup library still uses the deprecated aliases to the Collections Abstract Base Classes (such as collections.Callable), which were removed in Python 3.10. More info here: https://docs.python.org/3/whatsnew/3.10.html#removed
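
To confirm that this is what is happening, a small check along the following lines may help. It is only a sketch: the helper name legacy_alias_present is made up for illustration and is not part of bs4 or the standard library.

```python
import collections
import collections.abc
import sys


def legacy_alias_present(name="Callable"):
    """Return True if the old top-level alias, e.g. collections.Callable, still exists."""
    return hasattr(collections, name)


# The supported location, collections.abc, is where the ABCs live in Python 3.
print("collections.abc.Callable available:", hasattr(collections.abc, "Callable"))
# Expected to be False on Python 3.10 and later, unless something has patched it back in.
print("collections.Callable available:", legacy_alias_present())
print("Python version:", sys.version_info[:2])
```

If the second line prints False, any library code that still reads collections.Callable will fail with the AttributeError shown in the question.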

A quick fix is to paste these 2 lines just below your imports:

# Importing collections.abc explicitly guarantees the submodule is loaded
import collections.abc
collections.Callable = collections.abc.Callable

Answered By: Andrey Merzlyakov
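
An old library may reference more than one of the removed aliases, so a slightly more general variant of the two-line patch above could look like the sketch below. The function name restore_legacy_abc_aliases and the list of names are assumptions chosen for illustration; only names that are actually missing from collections are patched in.

```python
import collections
import collections.abc

# Hypothetical selection of aliases that older code commonly expects on
# the top-level collections module; extend or trim as needed.
LEGACY_NAMES = (
    "Callable", "Hashable", "Iterable", "Mapping", "MutableMapping",
    "Sequence", "MutableSequence", "Set", "MutableSet",
)


def restore_legacy_abc_aliases(names=LEGACY_NAMES):
    """Re-create missing collections.<Name> aliases from collections.abc.

    Returns the list of names that were actually added.
    """
    restored = []
    for name in names:
        if not hasattr(collections, name):
            setattr(collections, name, getattr(collections.abc, name))
            restored.append(name)
    return restored


# Call this before the code path that uses the old library runs.
print("Restored aliases:", restore_legacy_abc_aliases())
```

This is a workaround rather than a cure. Replacing the copied bs4 folder with a current beautifulsoup4 release, which should no longer rely on the removed aliases, is probably the more durable fix.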

Andrey, I cannot comment yet, but I tried your fix. I'm using Thonny, and Python 3.10 in the terminal. After adding the two lines (the collections import and the Callable assignment), I get another error in Thonny that isn't shown in the terminal. When I run the program in the terminal, it simply seems to do nothing. In Thonny, it reports: Module has no attribute "Callable"

Answered By: lctsolutions