# Nearest neighbor for list of arrays

## Question:

`I have a list of arrays like this (in x, y coordinates):

``````coordinates= array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]), array([[1500, 1960],
[1620, 2200],
[1505, 1975]]), array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]), array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]])]
``````

I want the arrays to be sorted by the nearest neighbor. So, the last coordinate of the first array should be close to the first coordinate of the next array and so forth.

I extracted all of the first and last coordinates and put them in two lists using python numpy. Then tried to use sklearn.neighbors NearestNeighbors, but it didn’t work.`

expected output:

``````coordinates= array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]), array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]]), array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]), array([[1500, 1960],
[1620, 2200],
[1505, 1975]])]
``````

This looks like a `traveling_salesman_problem`. Compute the pairwise distance between all nodes, make the self distance NaN/Inf, then get the shortest path:

``````from numpy import array
import networkx as nx

coordinates= [array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]),
array([[1500, 1960],
[1620, 2200],
[1505, 1975]]),
array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]),
array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]])]

# extract starts and ends
starts = np.vstack([a[0]  for a in coordinates])
ends   = np.vstack([a[-1] for a in coordinates])

# compute pairwise distances, except self
dist = np.sqrt(((starts[:, None] - ends)**2).sum(-1))
np.fill_diagonal(dist, np.nan)

# build the graph
G = nx.from_numpy_array(np.round(dist, 1), create_using=nx.DiGraph)
G.remove_edges_from(nx.selfloop_edges(G))

# find shortest path (NB. cycle could be False)
path = nx.approximation.traveling_salesman_problem(G, cycle=True)
# [1, 0, 3, 2, 1]

out = [coordinates[i] for i in path[1:]]
``````

If you don’t necessarily want a cycle, use `cycle=False` and `out = [coordinates[i] for i in path]`.

Output:

``````[array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]),
array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]]),
array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]),
array([[1500, 1960],
[1620, 2200],
[1505, 1975]])]
``````

Intermediate `dist`:

``````array([[          nan, 1248.05849222,  753.67433285,  446.01008957],
[1151.34703717,           nan,  520.21630117,  930.12095988],
[ 669.79474468,  525.09522946,           nan,  410.48751504],
[ 396.01136347,  940.65137006,  416.47328846,           nan]])
``````

Graph (with distances as edge labels and the shortest path in red):

Categories: questions
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.