Incredibly basic lxml questions: getting HTML/string content of lxml.etree._Element?
Question:
This is such a basic question that I actually can’t find it in the docs :-/
In the following:
img = house_tree.xpath('//img[@id="mainphoto"]')[0]
How do I get the HTML of the <img/> tag?
I’ve tried adding html_content() but get AttributeError: 'lxml.etree._Element' object has no attribute 'html_content'.
Also, if it were a tag with some content inside (e.g. <p>text</p>), how would I get the content (e.g. text)?
Many thanks!
Answers:
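I suppose it will be as simple as:
from lxml.etree import tostring
# tostring() serializes the element itself (tag included) and returns bytes by default
img_html = tostring(img)
As for getting the content from inside <p> for some selected element el:
content = el.text_content()
Note that text_content() exists only on elements parsed with lxml.html; on a plain lxml.etree._Element, use el.text or ''.join(el.itertext()) instead.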
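To make the difference between the two parsers concrete, here is a minimal sketch. The sample markup and the variable names are made up for illustration (the original house_tree is not shown in the question), so treat it as one possible way to do this rather than the definitive approach.

```python
from lxml import etree, html

# Hypothetical sample markup standing in for the page from the question
doc = '<div><img id="mainphoto" src="a.jpg"><p>some <b>bold</b> text</p></div>'

# Case 1: parsed with lxml.html, so elements are HtmlElement objects
h_tree = html.fromstring(doc)
h_img = h_tree.xpath('//img[@id="mainphoto"]')[0]
h_img_html = html.tostring(h_img, encoding="unicode")  # str instead of bytes
h_p_text = h_tree.xpath('//p')[0].text_content()       # all nested text

# Case 2: parsed with lxml.etree, so elements are plain _Element objects
e_tree = etree.fromstring(doc, etree.HTMLParser())
e_img = e_tree.xpath('//img[@id="mainphoto"]')[0]
e_img_html = etree.tostring(e_img, method="html", encoding="unicode")
e_p = e_tree.xpath('//p')[0]
e_p_text = "".join(e_p.itertext())  # equivalent of text_content()
e_p_first = e_p.text                # only the text before the first child

print(h_img_html)
print(h_p_text)
print(e_img_html)
print(e_p_text)
```

If the tree in the question was built with lxml.etree (as the AttributeError suggests), the second half is the relevant one; switching the parsing step to lxml.html would make text_content() available.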