ElementTree: Element.remove() skips elements during iteration
I have this XML input file:
<?xml version="1.0"?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>2</third-num>
      <third-def>object002</third-def>
      <third-len>426</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>
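My goal is to remove any second-level element whose <third-def> does not have the value I want (in the code below, anything other than object001). To do that, I wrote the following code: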
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

inputfile = 'inputfile.xml'
tree = ET.parse(inputfile)
root = tree.getroot()
elem = tree.find('First')
for elem2 in tree.iter(tag='second'):
    if elem2.find('third-def').text == 'object001':
        pass
    else:
        elem.remove(elem2)
        #elem2.clear()
My problem is with elem.remove(elem2): it skips every other second-level element. This is the output of the code:
<?xml version="1.0" ?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>
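Now, if I uncomment the elem2.clear() line, the script visits every element as expected, but the output is not what I want, because every removed second-level element is left behind as an empty tag: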
<?xml version="1.0" ?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second/>
    <second/>
  </First>
</zero>
Does anyone know why my element.remove() statement is wrong?
-
You are looping over a live tree:
for elem2 in tree.iter(tag='second'):
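and then changing the tree while iterating. The iteration "counter" is not told that the number of elements has changed, so after looking at element 0 and removing it, the iterator moves on to element number 1. But what was element number 1 is now element number 0, so it is never visited. Capture a list of all the elements first, then loop over that: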
for elem2 in tree.findall('.//second'):
.findall() returns a list of results, which is not updated as you alter the tree. Now the iteration won't skip the last element:
>>> print(ET.tostring(root).decode())
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
  </First>
</zero>
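Putting the pieces together, a minimal self-contained sketch of the corrected loop might look like the following. It assumes the sample document is parsed from an inline string instead of inputfile.xml (so the example runs on its own), and it calls findall() on the <First> element directly; the names XML, parent and remaining are illustrative and not from the question.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, inlined so the sketch runs on its own.
XML = """<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>2</third-num>
      <third-def>object002</third-def>
      <third-len>426</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>"""

root = ET.fromstring(XML)
parent = root.find('First')

# findall() returns a list snapshot, so removing children from `parent`
# inside the loop does not shift the sequence being iterated.
for second in parent.findall('second'):
    if second.find('third-def').text != 'object001':
        parent.remove(second)

remaining = [s.find('third-def').text for s in parent.findall('second')]
print(remaining)  # ['object001']
```

Only the <second> element holding object001 is left, and no empty <second/> tags remain, because the elements are removed from their parent instead of being cleared.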
This behaviour is not limited to ElementTree trees; see Loop "Forgets" to Remove Some Items.
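The same effect can be reproduced on a plain Python list, which may make the skipping easier to see. The sketch below is only an illustration; the function names remove_odd_buggy and remove_odd_safe are made up for this example.

```python
def remove_odd_buggy(items):
    """Mutates the list while iterating over it, so some items are skipped."""
    for item in items:
        if item % 2:
            items.remove(item)
    return items


def remove_odd_safe(items):
    """Iterates over a copy, so removals do not shift the loop position."""
    for item in list(items):
        if item % 2:
            items.remove(item)
    return items


print(remove_odd_buggy([1, 3, 5, 7]))  # [3, 7]
print(remove_odd_safe([1, 3, 5, 7]))   # []
```

In the buggy version, removing 1 shifts 3 into index 0, so the loop's next step (index 1) lands on 5 and never examines 3; the same happens to 7. Every other element survives, which matches the pattern seen in the question.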