How to convert a list of edges to a tree in python?
Question:
I am having a list of edges that has the following format:
edges=[[1,4],[1,3],[1,2],[3,5],[3,6],[3,7]]
Here in each edge the first element is the parent node and the second is a child node i.e, in
[1,4]—->(1 is the parent node and 4 is child node)
I have to create a function that returns the pointer to the root of the tree. At first I tried creating a dictionary but after creating I am unable to proceed.
Please provide any ideas of how to implement this?
Answers:
Assuming it is always a tree (and thus we have no two separate graphs) the task is the as determining which number never appears in the second position.
So:
- Get a list of all numbers (nodes) which we call
possible_roots
- Iterate the list of your edges and remove the child node from our above list
possible_roots
- If it is a tree, there must only be one element left in
possible_roots
. This is the root of your tree.
There are many ways to create a tree data structure… moreover Python does not have a pointer data type, so the root of a tree would be an object.
Here is one way to do it:
First define a Node class:
class Node():
def __init__(self, data=None):
self.data = data
self.children = []
Then the main algorithm:
def create_tree(edges):
# Get all the unique keys into a set
node_keys = set(key for keys in edges for key in keys)
# Create a Node instance for each of them, keyed by their key in a dict:
nodes = { key: Node(key) for key in node_keys }
# Populate the children attributes from the edges
for parent, child in edges:
nodes[parent].children.append(nodes[child])
# Remove the child from the set, so we will be left over with the root
node_keys.remove(child)
# Get the root from the set, which at this point should only have one member
for root_key in node_keys: # Just need one
return nodes[root_key]
Run it as follows:
# Example run
edges = [[1,4],[1,3],[1,2],[3,5],[3,6],[3,7]]
root = create_tree(edges)
If you want to quickly verify the tree’s shape, add this method to the Node
class:
def __repr__(self, indent=""):
return (indent + repr(self.data) + "n"
+ "".join(child.__repr__(indent+" ")
for child in self.children))
And use it to print the tree:
print(root)
This print
method is just a very simple way to visualise the tree. With a bit more code you can also draw the branches of the tree, but this is enough to debug the code.
You need to find out which vertex has no parent. This can be done by building a set
of all vertices, then discarding vertices that have a parent.
Alternatively, this can be done from building on the one hand, the set of all parent vertices, and on the other hand, the set of all child vertices; then taking the set different parents - children
.
Then there are three possibilities:
- There are no vertices left. This means that your directed graph contained a cycle, and there is no root. Example:
[[0,1], [1,2], [2,0]]
.
- There is more than one vertex left. This means that your directed graph contains more than one "root". Example:
[[0,2], [1,2]]
.
- There is exactly one vertex left. This must be the root.
# FIRST METHOD
def get_root(dag):
candidates = set(parent for (parent,_) in dag)
for _,child in dag:
candidates.discard(child)
assert(len(candidates) == 1) # or handle len(candidates) == 0 and len(candidates) > 1 with an if/elif/else
return candidates.pop()
# SECOND METHOD
def get_root(dag):
parents, children = map(frozenset, zip(*dag))
candidates = parents - children
root, = candidates # will raise exception if len(candidates) != 1
return root
Testing:
print( get_root([[1,4],[1,3],[1,2],[3,5],[3,6],[3,7]]) )
# 1
print( get_root([[0,2], [1,2]]) )
# ValueError: too many values to unpack (expected 1)
print( get_root([[0,1], [1,2], [2,0]]) )
# ValueError: not enough values to unpack (expected 1, got 0)
I am having a list of edges that has the following format:
edges=[[1,4],[1,3],[1,2],[3,5],[3,6],[3,7]]
Here in each edge the first element is the parent node and the second is a child node i.e, in
[1,4]—->(1 is the parent node and 4 is child node)
I have to create a function that returns the pointer to the root of the tree. At first I tried creating a dictionary but after creating I am unable to proceed.
Please provide any ideas of how to implement this?
Assuming it is always a tree (and thus we have no two separate graphs) the task is the as determining which number never appears in the second position.
So:
- Get a list of all numbers (nodes) which we call
possible_roots
- Iterate the list of your edges and remove the child node from our above list
possible_roots
- If it is a tree, there must only be one element left in
possible_roots
. This is the root of your tree.
There are many ways to create a tree data structure… moreover Python does not have a pointer data type, so the root of a tree would be an object.
Here is one way to do it:
First define a Node class:
class Node():
def __init__(self, data=None):
self.data = data
self.children = []
Then the main algorithm:
def create_tree(edges):
# Get all the unique keys into a set
node_keys = set(key for keys in edges for key in keys)
# Create a Node instance for each of them, keyed by their key in a dict:
nodes = { key: Node(key) for key in node_keys }
# Populate the children attributes from the edges
for parent, child in edges:
nodes[parent].children.append(nodes[child])
# Remove the child from the set, so we will be left over with the root
node_keys.remove(child)
# Get the root from the set, which at this point should only have one member
for root_key in node_keys: # Just need one
return nodes[root_key]
Run it as follows:
# Example run
edges = [[1,4],[1,3],[1,2],[3,5],[3,6],[3,7]]
root = create_tree(edges)
If you want to quickly verify the tree’s shape, add this method to the Node
class:
def __repr__(self, indent=""):
return (indent + repr(self.data) + "n"
+ "".join(child.__repr__(indent+" ")
for child in self.children))
And use it to print the tree:
print(root)
This print
method is just a very simple way to visualise the tree. With a bit more code you can also draw the branches of the tree, but this is enough to debug the code.
You need to find out which vertex has no parent. This can be done by building a set
of all vertices, then discarding vertices that have a parent.
Alternatively, this can be done from building on the one hand, the set of all parent vertices, and on the other hand, the set of all child vertices; then taking the set different parents - children
.
Then there are three possibilities:
- There are no vertices left. This means that your directed graph contained a cycle, and there is no root. Example:
[[0,1], [1,2], [2,0]]
. - There is more than one vertex left. This means that your directed graph contains more than one "root". Example:
[[0,2], [1,2]]
. - There is exactly one vertex left. This must be the root.
# FIRST METHOD
def get_root(dag):
candidates = set(parent for (parent,_) in dag)
for _,child in dag:
candidates.discard(child)
assert(len(candidates) == 1) # or handle len(candidates) == 0 and len(candidates) > 1 with an if/elif/else
return candidates.pop()
# SECOND METHOD
def get_root(dag):
parents, children = map(frozenset, zip(*dag))
candidates = parents - children
root, = candidates # will raise exception if len(candidates) != 1
return root
Testing:
print( get_root([[1,4],[1,3],[1,2],[3,5],[3,6],[3,7]]) )
# 1
print( get_root([[0,2], [1,2]]) )
# ValueError: too many values to unpack (expected 1)
print( get_root([[0,1], [1,2], [2,0]]) )
# ValueError: not enough values to unpack (expected 1, got 0)