What is a beautiful soup bound method?

Question:

I’m experimenting with http://robobrowser.readthedocs.org/en/latest/readme.html, a new Python library based on the Beautiful Soup library. I’m trying to test it out by opening an HTML page and returning it within a Django app, but I can’t figure out how to do this most simple task. My Django app contains:

def index(request):    

    p=str(request.POST.get('p', False)) # p='https://www.yahoo.com/'
    browser = RoboBrowser(history=True)
    browser.open(p)
    html = browser.find_all
    return HttpResponse(html)

When I look at the output HTML, I see:

<bound method BeautifulSoup.find_all of 
    <!DOCTYPE html>
    <html>
    ......................
        <head>
    ...............
        </body>
    </html>
>

What is a beautiful soup bound method? How can I get the straight html?

Asked By: user1592380

Answers:

It’s a method object, bound to the BeautifulSoup object. You didn’t call it.

Its representation is a little confusing because the repr() of the BeautifulSoup parse tree is included, which is simply the tree rendered as an HTML source string.
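
To see why the output looks that way, here is a minimal sketch that needs no third-party packages. The Tree class is a hypothetical stand-in for a parsed document, not the real BeautifulSoup class; it only mimics the two relevant behaviours (a repr() that shows the source, and a find_all method).

```python
class Tree:
    """Hypothetical stand-in for a parsed document (not the real BeautifulSoup class)."""

    def __init__(self, source):
        self.source = source

    def __repr__(self):
        # Like a BeautifulSoup tree, use the document source as the repr.
        return self.source

    def find_all(self):
        # Placeholder result; a real parser would search the tree here.
        return ["<p>hello</p>"]


tree = Tree("<html><p>hello</p></html>")

not_called = tree.find_all   # no parentheses: a bound method object
called = tree.find_all()     # parentheses: the method actually runs

print(repr(not_called))      # the repr embeds repr(tree), i.e. the HTML source
print(called)
```

The bound method keeps a reference to the object it belongs to (available as not_called.__self__), which is why the whole document shows up inside the angle brackets.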

To get to the underlying BeautifulSoup parse tree, you can use browser.state.parsed; use str() to turn that back into a source string:

html = str(browser.state.parsed)
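
The same str() behaviour can be tried on a plain BeautifulSoup object without RoboBrowser. This is only a sketch: it assumes the bs4 package is installed, and the sample markup and variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample document, parsed with the standard-library parser backend.
source = "<html><body><p>Hi</p></body></html>"
soup = BeautifulSoup(source, "html.parser")

html = str(soup)                  # the parse tree rendered back to a source string
paragraphs = soup.find_all("p")   # calling the method returns the matching tags

print(html)
print(paragraphs)
```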

Alternatively, you can still access the original requests response object with:

browser.state.response

which means that the original downloaded HTML is found as:

html = browser.state.response.content

Answered By: Martijn Pieters

BeautifulSoup is a Python package used for parsing HTML and XML documents. It creates a parse tree for parsed pages, which can be used for web scraping.

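BeautifulSoup has many methods, such as find_all(), that allow us to search a parse tree. A search only covers the tree (or subtree) it is called on. Note that "bound" in "bound method" means the method is attached to a particular object; it does not mean a search went out of bounds.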

.next_sibling and .previous_sibling are attributes used for navigating between page elements that are on the same level of the parse tree.
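
As a rough illustration of sibling navigation, assuming the bs4 package is installed (the markup below is made up, and it deliberately has no whitespace between tags so that the siblings are tags rather than text nodes):

```python
from bs4 import BeautifulSoup

# Three tags on the same level inside one <p> element.
soup = BeautifulSoup("<p><b>one</b><i>two</i><u>three</u></p>", "html.parser")

middle = soup.find("i")
before = middle.previous_sibling   # the <b> tag
after = middle.next_sibling        # the <u> tag
first = soup.find("b")

print(before, after)
print(first.previous_sibling)      # nothing precedes <b> on this level
```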

Answered By: Md Shayon