How to get rid of BeautifulSoup user warning?

Question:

After I installed BeautifulSoup, this warning appears whenever I run my Python script from the command line:

D:\Application\python\lib\site-packages\beautifulsoup4-4.4.1-py3.4.egg\bs4\__init__.py:166:
UserWarning: No parser was explicitly specified, so I'm using the best 
available HTML parser for this system ("html.parser"). This usually isn't a
problem, but if you run this code on another system, or in a different
virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

I have no idea why it appears or how to get rid of it.

Asked By: jellyfishhuang


Answers:

The solution to your problem is stated in the warning message itself. Code like the following does not specify which parser (HTML, XML, etc.) to use:

BeautifulSoup( ... )

To make the warning go away, specify which parser you’d like to use, like so:

BeautifulSoup( ..., "html.parser" )

You can also install a third-party parser, such as lxml or html5lib, if you’d like.
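
To check that naming the parser really removes the warning, something like the following minimal sketch should work. The sample markup and the variable names are made up for illustration, and the check relies on the warning text starting with "No parser was explicitly specified", as in the message quoted in the question.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical sample markup, only for illustration.
markup = "<p>Hello, <b>world</b></p>"

# Record every warning raised while the soup is built.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soup = BeautifulSoup(markup, "html.parser")  # parser named explicitly

# Keep only the "no parser specified" warning, matched by its message text.
parser_warnings = [
    w for w in caught
    if "No parser was explicitly specified" in str(w.message)
]

print(soup.b.get_text())     # world
print(len(parser_warnings))  # 0
```

Dropping the "html.parser" argument from the call should make the list non-empty again when the code runs as a script.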

Answered By: Ethan Bierlein

The documentation recommends that you install and use lxml for speed:

BeautifulSoup(html, "lxml")

If you’re using a version of Python 2 earlier than 2.7.3, or a version
of Python 3 earlier than 3.2.2, it’s essential that you install lxml
or html5lib, because Python’s built-in HTML parser is just not very good in
older versions.

Installing the lxml parser

  • Ubuntu (Debian-based)

    apt-get install python-lxml    # python3-lxml for Python 3
    
  • Fedora (RHEL-based)

    dnf install python-lxml    # python3-lxml for Python 3
    
  • Using PIP

    pip install lxml
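
If the same script has to run on machines where lxml may or may not be installed, one possible approach is to ask for lxml first and fall back to the built-in parser. This is only a sketch: make_soup is a hypothetical helper name, and it assumes bs4 raises FeatureNotFound when the requested parser is missing. Keep in mind that the two parsers can build slightly different trees for malformed HTML.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    """Parse markup with lxml if available, else with the built-in parser.

    make_soup is a hypothetical helper, not part of bs4.
    """
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # lxml is not installed on this system.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p>Hello</p>")
print(soup.p.get_text())  # Hello
```

Either way the parser is named explicitly, so the warning does not appear.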
    
Answered By: Gayan Weerakutti

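To use the html5lib parser, install it first:

pip install html5lib

then pass 'html5lib' as the parser argument to the BeautifulSoup constructor:

import bs4

# req1 is assumed to be an HTTP response object, e.g. from requests.get(...)
htmlDoc = bs4.BeautifulSoup(req1.text, 'html5lib')
print(htmlDoc)

Answered By: Wilson Wu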

In my opinion, the previous posts did not answer the question.

Yes, as everyone said, you can remove the warning by specifying the parser.
And as the documentation points out, doing so is best practice for both performance and consistency.

But in some cases you just want to silence the warning, hence this post.

  • since BeautifulSoup 4 rev 460, the warning message does not appear in interactive (REPL) mode
  • there are more general answers on controlling Python warnings at: How to disable Python warnings? (TL;DR: set PYTHONWARNINGS=ignore or run Python with -Wignore)
  • suppressing the warning explicitly (bs4 ≥ rev 569) by adding this to your code:
    import warnings
    from bs4 import GuessedAtParserWarning
    warnings.filterwarnings('ignore', category=GuessedAtParserWarning)
    
  • cheating by letting bs4 think you provided the parser, i.e.:
    bs4.BeautifulSoup(
      your_markup,
      builder=bs4.builder_registry.lookup(*bs4.BeautifulSoup.DEFAULT_BUILDER_FEATURES)
    )
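
If you would rather not silence the warning for the whole program, a narrower variant of the filterwarnings approach is to ignore it only around the call that triggers it. A minimal sketch, assuming a bs4 version that exports GuessedAtParserWarning; parse_quietly is a hypothetical helper name:

```python
import warnings

from bs4 import BeautifulSoup, GuessedAtParserWarning


def parse_quietly(markup):
    """Build a soup without naming a parser and without the warning.

    parse_quietly is a hypothetical helper, not part of bs4.
    """
    with warnings.catch_warnings():
        # The filter applies only inside this with-block.
        warnings.simplefilter("ignore", category=GuessedAtParserWarning)
        return BeautifulSoup(markup)


soup = parse_quietly("<p>hi</p>")
print(soup.p.get_text())  # hi
```

The previous warning filters are restored when the with-block exits, so the warning still shows up for other calls in the program.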
    
Answered By: bufh