Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
875 views
in Technique[技术] by (71.8m points)

algorithm - Python file parsing: Build tree from text file

I have an indented text file that will be used to build a tree. Each line represents a node, and indents represent depth as well as node the current node is a child of.

For example, a file might look like

ROOT
   Node1
      Node2
         Node3
            Node4
   Node5
   Node6

Which indicates that ROOT contains three children: 1, 5, and 6, Node1 has one child: 2, and Node2 has one child: 3, etc.

I have come up with a recursive algorithm and have programmed it and it works, but it's kind of ugly and especially treats the example above very crudely (when going from node 4 to node 5)

It uses "indent count" as the basis for recursion, so if the number of indents = current depth + 1, I would go one level deeper. But this means when I read a line with less indents, I have to come back up one level at a time, checking the depth each time.

Here is what I have

def _recurse_tree(node, parent, depth):
    tabs = 0
    
    while node:
        tabs = node.count("")
        if tabs == depth:
            print "%s: %s" %(parent.strip(), node.strip())
        elif tabs == depth + 1:
            node = _recurse_tree(node, prev, depth+1)
            tabs = node.count("")
            
            #check if we have to surface some more
            if tabs == depth:
                print "%s: %s" %(parent.strip(), node.strip())
            else:
                return node
        else:
            return node
        
        prev = node
        node = inFile.readline().rstrip()
        
inFile = open("test.txt")
root = inFile.readline().rstrip()
node = inFile.readline().rstrip()
_recurse_tree(node, root, 1)

Right now I am just printing out the nodes to verify that the parent node is correct for each line, but maybe there is a cleaner way to do it? Especially the case in the elif block when I'm coming back from each recursion call.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

If you don't insist on recursion, this works too:

from itertools import takewhile

is_tab = ''.__eq__

def build_tree(lines):
    lines = iter(lines)
    stack = []
    for line in lines:
        indent = len(list(takewhile(is_tab, line)))
        stack[indent:] = [line.lstrip()]
        print stack

source = '''ROOT
Node1
Node2
Node3
Node4
Node5
Node6'''

build_tree(source.split('
'))

Result:

['ROOT']
['ROOT', 'Node1']
['ROOT', 'Node1', 'Node2']
['ROOT', 'Node1', 'Node2', 'Node3']
['ROOT', 'Node1', 'Node2', 'Node3', 'Node4']
['ROOT', 'Node5']
['ROOT', 'Node6']

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...