Pandas "rolling" groupby

Question:

Assuming this is my df:

    group     connected_to
0     1              1
1     2              0
2     2              1
3     2              2
4     3              5
5     4              4
6     3              7 
7     5              5

What I want is the minimal group value within each set of connected rows.

Row 0 is connected to row 1, so they are in the same group. Row 2 is also connected to row 1, so it joins the group. Row 3 is connected to row 2, which has already joined, so it joins as well, and so on.
Row 4 is not connected to any row in the first group, so it starts a new group. The output should look like this:

    group     connected_to   minimal_group
0     1              1            1
1     2              0            1
2     2              1            1
3     2              2            1
4     3              5            3
5     4              4            3
6     3              7            3 
7     5              5            3

I implemented it using a for loop inside a while loop, which is a really ugly solution.
Is there a more elegant way to do it in pandas?

Asked By: Binyamin Even


Answers:

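Use networkx to find the connected components of the rows, then take the minimum group within each component: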

import networkx as nx

# Move the row index into a regular column named 'index'
df1 = df.reset_index()

# Build an undirected graph with one edge per row: index <-> connected_to
g = nx.from_pandas_edgelist(df1, 'index', 'connected_to')

connected_components = nx.connected_components(g)

# Find the component id of the nodes
node2id = {}
for cid, component in enumerate(connected_components):
    for node in component:
        node2id[node] = cid

# Map each row to its component id, then take the minimal group per component
df['minimal_group'] = df1.groupby(df1['index'].map(node2id))['group'].transform('min')
print(df)
   group  connected_to  minimal_group
0      1             1              1
1      2             0              1
2      2             1              1
3      2             2              1
4      3             5              3
5      4             4              3
6      3             7              3
7      5             5              3
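
For comparison, and not part of the answer above, the same chaining can be sketched without networkx using a small union-find over plain lists. The helper name minimal_groups is hypothetical, and the sketch assumes the rows are labelled 0..n-1 so that each connected_to value is the position of another row, as in the sample data.

```python
def minimal_groups(groups, connected_to):
    """Return, for each row, the smallest group value in its connected component.

    Assumption: rows are labelled 0..n-1 and connected_to[i] is the position
    of the row that row i links to (as in the question's sample data).
    """
    n = len(groups)
    parent = list(range(n))  # every row starts as its own component

    def find(i):
        # Walk up to the component root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge each row with the row it points at.
    for row, target in enumerate(connected_to):
        root_a, root_b = find(row), find(target)
        if root_a != root_b:
            parent[root_b] = root_a

    # Track the smallest group value seen for each component root.
    smallest = {}
    for row, value in enumerate(groups):
        root = find(row)
        smallest[root] = min(value, smallest.get(root, value))

    return [smallest[find(row)] for row in range(n)]


groups = [1, 2, 2, 2, 3, 4, 3, 5]
connected_to = [1, 0, 1, 2, 5, 4, 7, 5]
print(minimal_groups(groups, connected_to))  # [1, 1, 1, 1, 3, 3, 3, 3]
```

With a default RangeIndex, the result could be assigned back with df['minimal_group'] = minimal_groups(df['group'].tolist(), df['connected_to'].tolist()). For a frame with any other index, the connected_to values would first have to be translated to row positions.
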
Answered By: jezrael