# Getting Unique Values and their Following Pairs from a List of Tuples

## Question:

I have a list of tuples like this:

``````[
('a', 'AA'), # pair 1

('d', 'AA'), # pair 2
('d', 'a'),  # pair 3
('d', 'EE'), # pair 4

('b', 'BB'), # pair 5
('b', 'CC'), # pair 6
('b', 'DD'), # pair 7

('c', 'BB'), # pair 8
('c', 'CC'), # pair 9
('c', 'DD'), # pair 10

('c', 'b'),  # pair 11

('d', 'FF'), # pair 12

]
``````

Each tuple in the list above shows a similar pair of items (or duplicate items). I need to create a dictionary in which keys will be one of the unique item from the tuples and values will be lists filled with all the other items that the key occurred in conjunction with. For example, ‘a’ is similar to ‘AA'(pair 1), which in turn is similar to ‘d'(pair 2) and ‘d’ is similar to ‘EE’ and ‘FF’ (pairs 4 and 12). same is the case with other items.

My expected output is:

``````{'a':['AA', 'd', 'EE', 'FF'], 'b':['BB', 'CC', 'DD', 'c']}
``````

According to the tuples, `['a', 'AA', 'd', 'EE', 'FF']` are similar; so, any one of them can be the key, while the remaining items will be it’s values. so, output can also be: `{'AA':['a', 'd', 'EE', 'FF'], 'c':['BB', 'CC', 'DD', 'b']}`. So, key of the output dict can be anything from the duplicate pairs.

How do I do this for a list with thousands of such tuples in a list?

You can use 3 dictionaries, one for the output (`out`), one to track the seen values (`seen`), and one to map the equal keys (`mapper`):

``````out = {}
seen = {}
mapper = {}

for a, b in l:
if b in seen:
out.setdefault(seen[b], []).append(a)
mapper[a] = seen[b]
else:
out.setdefault(mapper.setdefault(a, a), []).append(b)
seen[b] = a

# remove duplicates
out = {k: list(dict.fromkeys(x for x in v if x!=k))
for k, v in out.items()}
``````

Output:

``````{'a': ['AA', 'd', 'EE', 'FF'],
'b': ['BB', 'CC', 'DD', 'c']}
``````

#### graph approach

Or you might want to approach this using a graph:

``````import networkx as nx

G = nx.from_edgelist(l)

out = {(x:=list(g)): x[1:] for g in nx.connected_components(G)}
``````

Output:

``````{'EE': ['d', 'a', 'FF', 'AA'],
'CC': ['b', 'c', 'BB', 'DD']}
``````

Graph: As said in the comment, you seem to want to find the equivalence classes on the given relationship.

``````ecs = []
for a, b in data:
a_ec = next((ec for ec in ecs if a in ec), None)
b_ec = next((ec for ec in ecs if b in ec), None)
if a_ec:
if b_ec:
# Found equivalence classes for both elements, everything is okay
if a_ec is not b_ec:
# We only need one of them though
ecs.remove(b_ec)
a_ec.update(b_ec)
else:
# Add the new element to the found equivalence class
else:
if b_ec:
# Add the new element to the found equivalence class