Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Python's lxml and the iterparse method

Say I have this sample XML.

<result>
    <field k='field1'>
        <value h='1'><text>text_value1</text></value>
    </field>
    <field k='field2'>
        <value><text>text_value2</text></value>
    </field>
    <field k='field3'>
        <value><text>some_text</text></value>
    </field>
</result>

Using Python's lxml, how can I get the value of each field for every result set? Basically, I want to iterate over every result set, then iterate over every field in that result set and print the text data.

This is what I have so far:

context = etree.iterparse(contentBuffer, tag='result')
for action, elem in context:
    print elem.tag, elem.data

Any help would be greatly appreciated.

EDIT: Here is the code that I came up with. It seems a bit clunky to call getparent() twice to read the attribute that corresponds to each text value. Is there a better way to do this?

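for action, elem in context:
    # './/text' searches below the current result only; '//text' would search the whole document
    texts = elem.xpath('.//text')
    print("result set:")
    for item in texts:
        # text -> value -> field, hence the two getparent() calls
        field = item.getparent().getparent().attrib['k']
        value = item.text
        print("%s = %s" % (field, value))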


1 Answer


How about:

import io
import lxml.etree as ET

content='''
<result>
    <field k='field1'>
        <value h='1'><text>text_value1</text></value>
    </field>
    <field k='field2'>
        <value><text>text_value2</text></value>
    </field>
    <field k='field3'>
        <value><text>some_text</text></value>
    </field>
</result>'''

contentBuffer=io.BytesIO(content.encode('utf-8'))  # BytesIO needs bytes, not str, on Python 3
context = ET.iterparse(contentBuffer,tag='result')
for action, elem in context:
    fields=elem.xpath('field/@k')
    values=elem.xpath('field/value/text/text()')
    for field,value in zip(fields,values):
        print('{f} = {v}'.format(f=field,v=value))

which yields

field1 = text_value1
field2 = text_value2
field3 = some_text
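
One caveat with the zip() approach above: the two XPath queries run independently, so if some field had no text element the names and values could drift out of step. A possible alternative, sketched here under assumptions rather than offered as the definitive way, is to walk each field element and read its attribute and text together, which also avoids the getparent() calls from the question. The helper name iter_field_values is made up for this illustration, and the fallback to the standard library's ElementTree is only there so the sketch runs where lxml is not installed.

```python
import io

try:
    import lxml.etree as ET
except ImportError:
    # Assumption: fall back to the stdlib API, which offers the same calls used below
    import xml.etree.ElementTree as ET

content = b"""<result>
    <field k='field1'>
        <value h='1'><text>text_value1</text></value>
    </field>
    <field k='field2'>
        <value><text>text_value2</text></value>
    </field>
    <field k='field3'>
        <value><text>some_text</text></value>
    </field>
</result>"""


def iter_field_values(source):
    """Yield (field name, text) pairs for every field of every result element."""
    for _event, elem in ET.iterparse(source, events=('end',)):
        # Filtering on the tag by hand works with both libraries;
        # with lxml, passing tag='result' to iterparse does the same job.
        if elem.tag != 'result':
            continue
        for field in elem.iterfind('field'):
            # findtext returns None when the field has no value/text child
            yield field.get('k'), field.findtext('value/text')


pairs = list(iter_field_values(io.BytesIO(content)))
for name, value in pairs:
    print('{f} = {v}'.format(f=name, v=value))
```

Because the name and the text are read from the same field element, a field with a missing value shows up as None instead of shifting the values that follow it.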
