Prevent lxml from creating self-closing tags
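I have an (old) tool that does not understand self-closing tags such as <STATUS/>. So we need to serialize our XML files with opening/closing tags, like this: <STATUS></STATUS>.
Currently I have: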
>>> from lxml import etree
>>> para = """<ERROR>The status is <STATUS></STATUS>.</ERROR>"""
>>> tree = etree.XML(para)
>>> etree.tostring(tree)
'<ERROR>The status is <STATUS/>.</ERROR>'
How can I serialize with opening/closing tags?
<ERROR>The status is <STATUS></STATUS>.</ERROR>
Solution
>>> from lxml import etree
>>> para = """<ERROR>The status is <STATUS></STATUS>.</ERROR>"""
>>> tree = etree.XML(para)
>>> for status_elem in tree.xpath("//STATUS[string() = '']"):
... status_elem.text = ""
>>> etree.tostring(tree)
'<ERROR>The status is <STATUS></STATUS>.</ERROR>'
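The XPath approach above targets only STATUS elements. As a rough sketch, assuming every empty element in the document should get an explicit closing tag, the expression can be generalized to `//*[not(node())]`, which selects elements that have no child nodes at all. The sample document and the names `doc` and `result` are made up for illustration.

```python
from lxml import etree

# Hypothetical sample document with two empty elements (<A/> and <D/>).
doc = etree.XML("<ROOT><A/><B>text</B><C><D/></C></ROOT>")

# Select every element without any child node (no text, no sub-elements)
# and give it an empty text node so lxml writes an explicit closing tag.
for empty in doc.xpath("//*[not(node())]"):
    empty.text = ""

# encoding="unicode" returns a str instead of bytes.
result = etree.tostring(doc, encoding="unicode")
print(result)
```

Elements that already contain text or child elements are not matched, so they are left untouched.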
-
It seems the <STATUS> tag has its text attribute set to None:
>>> tree[0]
<Element STATUS at 0x11708d4d0>
>>> tree[0].text
>>> tree[0].text is None
True
If you set the text attribute of the <STATUS> tag to an empty string, you should get what you are looking for:
>>> tree[0].text = ''
>>> etree.tostring(tree)
'<ERROR>The status is <STATUS></STATUS>.</ERROR>'
With this in mind, you can probably walk the DOM tree and fix up the text attributes before writing out your XML. Something like this:
# prevent creation of self-closing tags
for node in tree.iter():
    if node.text is None:
        node.text = ''
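The tree walk can also be wrapped in a small helper. This is only a sketch: the function name `serialize_without_self_closing` is invented for this example and is not part of lxml. It passes `tag=etree.Element` to `iter()` so that comments and processing instructions are skipped, and it only touches elements that have no children, since only those can be written as self-closing tags.

```python
from lxml import etree


def serialize_without_self_closing(root):
    """Return the tree as a str with explicit open/close tags.

    Hypothetical helper; note that it modifies the tree in place.
    """
    # tag=etree.Element restricts the walk to real elements,
    # skipping comments and processing instructions.
    for node in root.iter(tag=etree.Element):
        # Only childless elements with no text serialize as <TAG/>.
        if node.text is None and len(node) == 0:
            node.text = ""
    # encoding="unicode" returns a str instead of bytes.
    return etree.tostring(root, encoding="unicode")


tree = etree.XML("<ERROR>The status is <STATUS/>.</ERROR>")
output = serialize_without_self_closing(tree)
print(output)
```

Because the helper mutates the tree it is given, copy the tree first (for example with `copy.deepcopy`) if the original must stay unchanged.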