NetworkX: getting all possible paths in a DAG

Question:

I am trying to split a directed acyclic graph (DAG) into direction-connected paths, relying on connectivity:

Example graph

When I compute the weakly and strongly connected subgraphs, here is what I get:

Weak connectivity :
['16', '17'], ['3', '41', '39', '42']
Strong connectivity :
['17'], ['16'], ['39'], ['41'], ['3'], ['42']

I understand the weak connectivity result, but not the strong connectivity one: I would expect 3 subgraphs, namely [16, 17], [42, 39] and [3, 41, 39].

What am I missing here? Why those single-node lists? How can I get the expected result?

Here is the code:

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()
G.add_edges_from([('16', '17'), ('3', '41'), ('41', '39'), ('42', '39')])

print("Weak connectivity : ")
for subgraph in (G.subgraph(c).copy() for c in nx.weakly_connected_components(G)) :
    print(subgraph.nodes)
print("Strong connectivity : ")
for subgraph in (G.subgraph(c).copy() for c in nx.strongly_connected_components(G)) :
    print(subgraph.nodes)

nx.draw_networkx(G, pos=nx.circular_layout(G))
plt.show()
Asked By: Arkeen

||

Answers:

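According to the definition of a strongly connected graph, the result you get is correct. Your graph is acyclic, so no two distinct nodes can reach each other in both directions, and every node ends up in its own strongly connected component.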

DEFINITION: strongly connected graph

A directed graph G=(V,E) is said to be strongly connected if every vertex v in V is reachable from every other vertex in V.

Answered By: sentence

What you’re missing is the definition of strongly connected:

[A directed graph] is strongly connected, diconnected, or simply
strong if it contains a directed path from u to v and a directed path
from v to u for every pair of vertices u, v. The strong components are
the maximal strongly connected subgraphs.

You have no strong connection between any two nodes of the graph shown, let alone the 3-node subgraph you list. You can, indeed, traverse 3 -> 41 -> 39, but there is no path back to 41, let alone 3. That graph is, therefore, not strongly connected.
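
To make the difference concrete, here is a small sketch rather than a definitive recipe. It runs nx.strongly_connected_components on the graph from the question, then on a copy with one extra edge. The back edge ('39', '3') is a hypothetical addition for illustration only; it is not part of the original graph.

```python
import networkx as nx

# The graph from the question: a DAG, so no node can ever be reached again
# once it has been left.
G = nx.DiGraph()
G.add_edges_from([('16', '17'), ('3', '41'), ('41', '39'), ('42', '39')])

dag_components = sorted(sorted(c) for c in nx.strongly_connected_components(G))
print(dag_components)  # every component holds exactly one node

# Hypothetical back edge (an assumption, not in the original graph) that
# closes the cycle 3 -> 41 -> 39 -> 3.
H = G.copy()
H.add_edge('39', '3')

cyclic_components = sorted(sorted(c) for c in nx.strongly_connected_components(H))
print(cyclic_components)  # '3', '39' and '41' now share one component
```

Only once a directed path leads back from 39 to 3 do the three nodes form a single strong component. Node 42 still stays alone, because nothing leads back to it.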

Answered By: Prune

So, thanks to the comments and answers, I realised that “connectivity” was a false lead for what I want to achieve. To be clear: I want to get every possible path from each starting node to each ending node reachable from it, in a directed acyclic graph.

So I ended up writing my own solution, which is quite simple to understand, but probably not the best in terms of performance or style (pythonic / networkx). Improvement suggestions are welcome 🙂

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()
G.add_edges_from([('16', '17'), ('3', '41'), ('41', '39'), ('42', '39')])

# Roots have no incoming edges; leaves have no outgoing edges.
roots = []
leaves = []
for node in G.nodes:
    if G.in_degree(node) == 0:  # it's a root
        roots.append(node)
    elif G.out_degree(node) == 0:  # it's a leaf
        leaves.append(node)

# Try every (root, leaf) pair; unreachable pairs simply yield no path.
for root in roots:
    for leaf in leaves:
        for path in nx.all_simple_paths(G, root, leaf):
            print(path)

nx.draw_networkx(G, pos=nx.circular_layout(G))
plt.show()

(If there is a built-in function in networkx, I clearly missed it)

Answered By: Arkeen

@Arkeen, your solution looks very close to the example in the networkx documentation for all_simple_paths, which reads:

    Iterate over each path from the root nodes to the leaf nodes in a
    directed acyclic graph passing all leaves together to avoid unnecessary
    compute::

        >>> G = nx.DiGraph([(0, 1), (2, 1), (1, 3), (1, 4)])
        >>> roots = (v for v, d in G.in_degree() if d == 0)
        >>> leaves = [v for v, d in G.out_degree() if d == 0]
        >>> all_paths = []
        >>> for root in roots:
        ...     paths = nx.all_simple_paths(G, root, leaves)
        ...     all_paths.extend(paths)
        >>> all_paths
        [[0, 1, 3], [0, 1, 4], [2, 1, 3], [2, 1, 4]]

I think this works fine if the graph has only a few components. If the number of components is large, this approach spends most of its time trying to connect roots to leaves that lie in other components. It still works, though.
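
One possible way to avoid those cross-component searches is to collect roots and leaves per weakly connected component, so that each root is only matched against leaves it could actually reach. This is a sketch under that assumption, not something taken from the networkx documentation, and the helper name root_to_leaf_paths is made up for illustration. It does not treat isolated nodes specially (a node that is both root and leaf); how all_simple_paths handles a source that is also a target may depend on your networkx version.

```python
import networkx as nx


def root_to_leaf_paths(G):
    """Yield every root-to-leaf path of a DAG, one weak component at a time.

    Hypothetical helper for illustration; not part of networkx.
    """
    for component in nx.weakly_connected_components(G):
        roots = [n for n in component if G.in_degree(n) == 0]
        leaves = [n for n in component if G.out_degree(n) == 0]
        for root in roots:
            # Passing all leaves of this component as targets means each
            # root is searched from only once.
            yield from nx.all_simple_paths(G, root, leaves)


G = nx.DiGraph([('16', '17'), ('3', '41'), ('41', '39'), ('42', '39')])
paths = sorted(root_to_leaf_paths(G))
for path in paths:
    print(path)
```

On the graph from the question this gives the three paths the asker expected: 16 to 17, 3 to 41 to 39, and 42 to 39. Whether the extra pass over the components pays off depends on the graph; with only a handful of components the documentation example is likely simpler and just as fast.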

Answered By: brocla