lxml

retrieving xml element value by searching the element by substring in its name

Question: I need to retrieve XML element values by searching for elements by a substring of their name, e.g. I need the value of every element in an XML file whose name contains client. I found a way to find …

Total answers: 1
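
One possible approach to this kind of lookup, sketched under assumptions: the sample document and its element names below are made up for illustration, and the XPath functions `contains()` and `local-name()` are standard XPath 1.0, which lxml supports. The match is case-sensitive.

```python
from lxml import etree

# Hypothetical sample document; the element names are assumptions for illustration.
xml = b"""<root>
  <client_id>42</client_id>
  <order><clientName>Acme</clientName><total>10</total></order>
</root>"""

root = etree.fromstring(xml)

# local-name() ignores any namespace prefix; contains() does the substring match.
matches = root.xpath("//*[contains(local-name(), 'client')]")

# Map each matching element name to its text value.
values = {el.tag: el.text for el in matches}
print(values)
```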

What is the fastest way to extract content from XML document using LXML?

Question: I’m using LXML to extract information from a bunch of XML files. I’m wondering whether the way I’m approaching this task is the most efficient. Right now I use the xpath() method in LXML to identify the specific targets and then …

Total answers: 2

Cannot install Scrapy on macOS M1 13.0.1 – lxml installation error: '/usr/bin/clang' failed with exit code 1

Question: I’m trying to install Scrapy on my MacBook M1 with macOS Ventura 13.0.1, but it throws an error while trying to install lxml. Installing collected packages: lxml, jmespath, itemadapter, idna, filelock, cssselect, charset-normalizer, cffi, certifi, attrs, …

Total answers: 2

Merge and manipulate xslt file using python lxml

Question: I’m a newbie in Python and I have a difficult task to cope with. Suppose we have two XSLT files, the first one is like this: <xsl:stylesheet version="1.0"> <xsl:function name="grp:MapToCD538A_var107"> <xsl:param name="var106_cur" as="node()"/> </xsl:function> <xsl:template match="/"> <CD123> <xsl:attribute name="xsi:schemaLocation" namespace="http://www.w3.org/2001/XMLSchema-instance"/> <xsl:for-each select="(./ns0:CD538C)[fn:not(fn:exists(*:ExportOperation[fn:namespace-uri() eq '']/*:requestRejectionReasonCode[fn:namespace-uri() eq '']))]"> <SynIde …

Total answers: 2

XPath on lxml's iterparse matches elements outside its scope

Question: I have huge corpora that I am parsing with lxml, so I am using iterparse which makes it easy to read XML on-the-fly. By using iterparse(fh, tag="your_tag") we can efficiently iterate over nodes in large files. I wish to do some XPath matching for each …

Total answers: 1
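
The excerpt is truncated, so the actual cause is not shown here. One common reason for this symptom is an absolute XPath (starting with `//`), which is evaluated from the document root rather than from the element that iterparse yields. A minimal sketch, assuming a made-up corpus with hypothetical `doc` and `w` elements:

```python
import io
from lxml import etree

# Hypothetical corpus; the element names are assumptions for illustration.
xml = b"<corpus><doc id='1'><w>a</w></doc><doc id='2'><w>b</w><w>c</w></doc></corpus>"

counts = {}
for _event, elem in etree.iterparse(io.BytesIO(xml), tag="doc"):
    # ".//w" is relative to elem; "//w" would search from the document root.
    counts[elem.get("id")] = len(elem.xpath(".//w"))
    # Free the element's content once it has been processed.
    elem.clear()

print(counts)
```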

parsing xml with namespace from request with lxml in python

Question: I am trying to get some text out of a table from an online XML file. I can find the tables: from lxml import etree import requests main_file = requests.get('https://training.gov.au/TrainingComponentFiles/CUA/CUAWRT601_R1.xml') main_file.encoding = 'utf-8-sig' root = etree.fromstring(main_file.content) tables = root.xpath('//foo:table', namespaces={"foo": "http://www.authorit.com/xml/authorit"}) print(tables) But I …

Total answers: 1
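
A sketch of how cell text might be pulled out once the tables are found, assuming the child elements live in the same namespace as `table`. The inline document, including its `tr` and `td` elements, is a made-up stand-in for the remote file; only the namespace URI and the `foo` prefix come from the question.

```python
from lxml import etree

# Hypothetical stand-in for the downloaded file; its structure is an assumption.
xml = b"""<doc xmlns="http://www.authorit.com/xml/authorit">
  <table><tr><td>Unit code</td><td>CUAWRT601</td></tr></table>
</doc>"""

root = etree.fromstring(xml)
ns = {"foo": "http://www.authorit.com/xml/authorit"}

# Every step of the path needs the prefix, not just the first one.
cells = root.xpath("//foo:table//foo:td", namespaces=ns)

# itertext() gathers text from the cell and any nested elements.
texts = ["".join(td.itertext()).strip() for td in cells]
print(texts)
```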

xml parsing with extra '\n' and whitespaces using lxml library

Question: I wrote a Python program with the lxml library to parse an XML file using XPath. The values and XPath are all correct, but it returns many '\n' characters and whitespace, just like the XML file’s formatting. Here is my code: from lxml import …

Total answers: 1
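
The question's code is cut off, so this is only a sketch of two common ways to drop formatting whitespace from text nodes, using a made-up `item` element: collapsing whitespace in Python, or letting XPath's standard `normalize-space()` function do it.

```python
from lxml import etree

# Hypothetical pretty-printed input; the element name is an assumption.
xml = b"<root>\n  <item>\n    hello   world\n  </item>\n</root>"
root = etree.fromstring(xml)

# The raw text node still carries the newlines and indentation.
raw = root.xpath("//item/text()")[0]

# Option 1: collapse all runs of whitespace in Python.
cleaned = " ".join(raw.split())

# Option 2: normalize-space() trims and collapses inside the XPath itself.
normalized = root.xpath("normalize-space(//item)")

print(repr(raw), cleaned, normalized)
```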

Xpath returns empty array – lxml

Question: I’m trying to write a program that scrapes https://www.tcgplayer.com/ to get a list of Pokemon TCG prices based on a specified list: from lxml import etree, html import requests import string def clean_text(element): all_text = element.text_content() cleaned = ' '.join(all_text.split()) return cleaned page = requests.get("http://www.tcgplayer.com/product/231462/pokemon-first-partner-pack-pikachu?xid=pi731833d1-f2cc-4043-9551-4ca08506b43a&page=1&Language=English") tree = html.fromstring(page.content) …

Total answers: 1