How to divide IDs into two groups (Python Pandas DataFrame)?

Question:

I have a dataset like this about the installation (ON) and removal (OFF) of equipment.
A, B, C, D are the IDs of independent pieces of equipment. I want to divide these IDs into two groups according to some rules.


In this data, when I remove B, I install A. After removing A, I install C. Likewise, I install T after removing C.
In the same way, when I remove D, I install F, and I install H after removing F.

My hypothesis is that there are two groups of equipment. For example, we can say that:

Group 1 : B-A-C-T

Group 2 : D-F-H

import pandas as pd

ON = ['A', 'C', 'F', 'T', 'H']
OFF = ['B', 'A', 'D', 'C', 'F']
df = pd.DataFrame({'ON': ON, 'OFF': OFF})

Maybe I can try something with a dictionary, but I have no idea how.

I want two lists as a result:

Group 1 = ['B','A','C','T']
Group 2 = ['D','F','H']

Asked By: stat_man

||

Answers:

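Using a network library like networkx can simplify the problem. What you want is to find all paths from the root nodes (IDs that were removed but never installed) to the leaf nodes (IDs that were installed but never removed).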

# pip install networkx
import networkx as nx
import itertools

# Create a directed graph from Pandas edges list
G = nx.from_pandas_edgelist(df, source='OFF', target='ON', create_using=nx.DiGraph)

# Find all roots and leaves
roots = [node for node, degree in G.in_degree if degree == 0]
leaves = [node for node, degree in G.out_degree if degree == 0]

# Get all possible paths between roots and leaves
paths = []
for root, leaf in itertools.product(roots, leaves):
    for path in nx.all_simple_paths(G, root, leaf):
        paths.append(path)

Output:

>>> paths
[['B', 'A', 'C', 'T'], ['D', 'F', 'H']]
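
If you would rather avoid the extra dependency, the same chains can probably be built with a plain dictionary, as the question hinted. This is only a sketch under some assumptions: each ID is removed at most once (no branching), the chains contain no cycles, and the names successor, starts and groups are illustrative, not part of any library. It uses the ON and OFF lists from the question; the df['ON'] and df['OFF'] columns could be passed to zip in the same way.

```python
ON = ['A', 'C', 'F', 'T', 'H']
OFF = ['B', 'A', 'D', 'C', 'F']

# Map each removed ID to the ID installed in its place.
# Assumption: every ID appears at most once in OFF.
successor = dict(zip(OFF, ON))

# A chain starts at an ID that was removed but never installed.
installed = set(ON)
starts = [node for node in OFF if node not in installed]

# Follow the mapping from each start until an ID has no successor.
# Assumption: no cycles, otherwise this loop would never terminate.
groups = []
for start in starts:
    chain = [start]
    while chain[-1] in successor:
        chain.append(successor[chain[-1]])
    groups.append(chain)

print(groups)
# [['B', 'A', 'C', 'T'], ['D', 'F', 'H']]
```

For data with branching (one ID replaced by several) or cycles, the networkx approach above is the safer choice.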

Visualization:

import matplotlib.pyplot as plt

nx.draw_networkx(G)
plt.show()


Answered By: Corralien