Constructing a tree view from a nested list with duplicate subtrees (using anytree/treelib)

Question:

I have a nested list like the following:

lst = [['a', 'b', 'e'],      # this e is the branch of b
       ['a', 'f', 'e'],      # this e is the branch of f,
       ['a', 'h', 'i i i']]  # string with spaces

and I want to construct a tree like:

a
├── b
│   └── e
├── f
│   └── e
└── h
    └── i i i

I want to use one of two packages: treelib or anytree. I’ve read many posts and tried several approaches, but couldn’t get any of them to work.

Update:

I came up with the following method, but it still has two problems:

  1. The vertical order of the branches (e.g., “b”, “f”, “h”) is not guaranteed when the outer list contains many lists.
  2. “e” as a branch of “f” does not show up.

from treelib import Node, Tree

# make list flat
lst = sum([i for i in lst], [])

tree = Tree()
tree_dict = {}

# create root node
tree_dict[lst[0]] = tree.create_node(lst[0])

for index, item in enumerate(lst[1:], start=1):
    if item not in tree_dict.keys():
        parent_node = tree_dict[lst[index-1]]
        tree_dict[item] = tree.create_node(item, parent=parent_node)

tree.show()

Asked By: steven


Answers:

I looked into anytree and came up with this:

from anytree import Node, RenderTree

lst = [["a", "b", "c", "e"], ["a", "b", "f"], ["a", "b", "c", "g", "h"], ["a", "i"]]


def list_to_anytree(lst):
    root_name = lst[0][0]
    root_node = Node(root_name)
    nodes = {root_name: root_node}  # keeping a dict of the nodes
    for branch in lst:
        assert branch[0] == root_name
        for parent_name, node_name in zip(branch, branch[1:]):
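            # reuse the node if this name was seen before, otherwise create it
            node = nodes.get(node_name)
            if node is None:
                node = nodes[node_name] = Node(node_name)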
            parent_node = nodes[parent_name]
            if node.parent is not None:
                assert node.parent.name == parent_name
            else:
                node.parent = parent_node
    return root_node


anytree = list_to_anytree(lst)
for pre, fill, node in RenderTree(anytree):
    print(f"{pre}{node.name}")

There is not much happening here: I convert your list to anytree nodes (asserting along the way that the list representation is valid) and keep a dictionary, nodes, of the nodes created so far.

The output is indeed:

a
├── b
│   ├── c
│   │   ├── e
│   │   └── g
│   │       └── h
│   └── f
└── i
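
Because this version keys nodes by name alone, it relies on every name having exactly one parent. With the data from the question, "e" appears under both "b" and "f", so the assert above would fire. A small anytree-free sketch of that collision (parent_by_name and conflicts are illustrative names, not part of either library):

```python
lst = [["a", "b", "e"], ["a", "f", "e"], ["a", "h", "i i i"]]

parent_by_name = {}  # node name -> name of the first parent seen for it
conflicts = []       # (node name, first parent, conflicting parent)

for branch in lst:
    # walk consecutive (parent, child) pairs of each branch
    for parent_name, node_name in zip(branch, branch[1:]):
        known_parent = parent_by_name.setdefault(node_name, parent_name)
        if known_parent != parent_name:
            conflicts.append((node_name, known_parent, parent_name))

print(conflicts)
```

Any entry in conflicts marks a name that cannot be stored under a single dictionary key.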

If several nodes share the same name, the dict above cannot be used, because a name no longer identifies a single node. Instead, walk down from the root node and look up each name among the children of the current node:

def list_to_anytree(lst):
    root_name = lst[0][0]
    root_node = Node(root_name)
    for branch in lst:
        parent_node = root_node
        assert branch[0] == parent_node.name
        for cur_node_name in branch[1:]:
            cur_node = next(
                (node for node in parent_node.children if node.name == cur_node_name),
                None,
            )
            if cur_node is None:
                cur_node = Node(cur_node_name, parent=parent_node)
            parent_node = cur_node
    return root_node

Your example:

lst = [
    ["a", "b", "e"],  # this e is the branch of b
    ["a", "f", "e"],  # this e is the branch of f,
    ["a", "h", "i i i"],
]

anytree = list_to_anytree(lst)
for pre, fill, node in RenderTree(anytree):
    print(f"{pre}{node.name}")

This then gives:

a
├── b
│   └── e
├── f
│   └── e
└── h
    └── i i i
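
If you would rather keep a dictionary than scan the children, one option is to key it by the full path from the root instead of by name, so the two "e" nodes get different keys. Below is a minimal library-free sketch of that idea; build_children and render are hypothetical helpers written for this illustration, and the same path keys could presumably serve as node identifiers in anytree or treelib. Children are stored in first-seen order, so the branches come out in the order they appear in the list.

```python
def build_children(paths):
    """Map each path prefix (a tuple) to the ordered list of its child names."""
    children = {}
    for path in paths:
        for depth in range(1, len(path) + 1):
            prefix = tuple(path[:depth])
            children.setdefault(prefix, [])  # register the node itself
            if depth > 1:
                siblings = children[prefix[:-1]]
                if prefix[-1] not in siblings:
                    siblings.append(prefix[-1])  # keep first-seen order
    return children


def render(children, prefix, indent=""):
    """Return the subtree below `prefix` as a list of text lines."""
    lines = []
    names = children[prefix]
    for i, name in enumerate(names):
        last = i == len(names) - 1
        lines.append(indent + ("└── " if last else "├── ") + name)
        child_indent = indent + ("    " if last else "│   ")
        lines.extend(render(children, prefix + (name,), child_indent))
    return lines


lst = [["a", "b", "e"], ["a", "f", "e"], ["a", "h", "i i i"]]
children = build_children(lst)
text_lines = ["a"] + render(children, ("a",))
print("\n".join(text_lines))
```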

Answered By: hiro protagonist