Finding closest values in a list of dictionary keys Python
Question:
Given a point:
a=[X,Y,Z]
I am essentially trying to find the closest 3 points to that given point from a list of dictionaries.
A simplified example of the kind of data it needs to be compared against is given in the form:
points=[{'Point':1,'co-ordinate':[0,1,2]},{'Point':2,'co-ordinate':[0,1,3]},{'Point':3,'co-ordinate':[1,1,2]}] etc.
Any ideas or suggestions?
Answers:
You can keep a reverse lookup table, where you store each coordinate as the key (as a tuple, since lists are not hashable) and the point number as the value. This is easy to implement. Then you can iterate over the keys and apply the distance formula to each coordinate.
As you know, the distance formula is:
dist = sqrt((x1 - x2)**2 + (y1 - y2)**2 + (z1 - z2)**2)
Note: It looks like you have 3 different dictionaries in that list.
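A minimal sketch of that reverse-lookup idea, assuming no two points share the same coordinates. The names lookup, distance and closest_ids are made up for illustration and do not come from the question:

```python
from math import sqrt

points = [
    {'Point': 1, 'co-ordinate': [0, 1, 2]},
    {'Point': 2, 'co-ordinate': [0, 1, 3]},
    {'Point': 3, 'co-ordinate': [1, 1, 2]},
]

# Reverse lookup: coordinate tuple -> point number.
# Assumption: coordinates are unique, otherwise later entries overwrite earlier ones.
lookup = {tuple(p['co-ordinate']): p['Point'] for p in points}

def distance(c1, c2):
    """Euclidean distance between two 3-D coordinates."""
    (x1, y1, z1), (x2, y2, z2) = c1, c2
    return sqrt((x1 - x2)**2 + (y1 - y2)**2 + (z1 - z2)**2)

def closest_ids(a, n=3):
    """Return the point numbers of the n coordinates nearest to a."""
    nearest = sorted(lookup, key=lambda c: distance(a, c))[:n]
    return [lookup[c] for c in nearest]
```

Calling closest_ids([0, 0, 0], 2) then returns the numbers of the two nearest points rather than the dictionaries themselves.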
Closest implies that you define a distance function. For a point in space, the 2-norm (Euclidean norm) is usually used. Let’s first code a function that computes that norm between two points. Since we will probably want to apply it across an iterable (or, because I foresee something, use it as a key function), we make it a closure, which is handy for finding the closest value.
from math import sqrt
def norm2(ptA):
    # Returns a function that measures the distance from ptA to any other point
    def norm2_close(ptB):
        xA, yA, zA = ptA
        xB, yB, zB = ptB
        return sqrt((xA-xB)**2 + (yA-yB)**2 + (zA-zB)**2)
    return norm2_close
Now, we can do
>>> normFromA = norm2([1, 2, 3])
>>> normFromA([3, 2, 1])
2.8284271247461903
>>> normFromA([4, 5, 6])
5.196152422706632
Very well. But we still need to get the minimum from your list of dicts. There are many possibilities, but as we wrote a nice closure, let’s just modify it to suit our needs:
def norm2InDict(ptA):
    def norm2InDict_close(dict_for_ptB):
        xA, yA, zA = ptA
        xB, yB, zB = dict_for_ptB['co-ordinate']
        return sqrt((xA-xB)**2 + (yA-yB)**2 + (zA-zB)**2)
    return norm2InDict_close
and let Python do the boring work:
>>> min(points, key=norm2InDict([1, 2, 3]))
{'co-ordinate': [0, 1, 3], 'Point': 2}
To understand what happens: Python loops through the elements of the list (each dictionary), applies the key function to each of them (which computes the norm 2), compares the resulting keys and returns the element that has the smallest key. Right. And what if I want the three closest elements, and not a single one? Well, the documentation tells us we can use the heapq module for that (I add some points to the list, for more fun):
>>> import heapq
>>> points=[
...     {'Point':1,'co-ordinate':[0,1,2]},
...     {'Point':2,'co-ordinate':[0,1,3]},
...     {'Point':3,'co-ordinate':[1,1,2]},
...     {'Point':4,'co-ordinate':[2,5,2]},
...     {'Point':5,'co-ordinate':[1,0,2]},
...     {'Point':6,'co-ordinate':[1,2,2]}
... ]
>>> heapq.nsmallest(3, points, key=norm2InDict([1, 2, 3]))
[{'co-ordinate': [1, 2, 2], 'Point': 6}, {'co-ordinate': [0, 1, 3], 'Point': 2}, {'co-ordinate': [1, 1, 2], 'Point': 3}]
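On Python 3.8 or later, the standard library's math.dist computes the same Euclidean distance, so the closure can be replaced by a short key function. A sketch under that version assumption; the helper name closest_points is hypothetical:

```python
import heapq
from math import dist  # math.dist is available from Python 3.8 onwards

points = [
    {'Point': 1, 'co-ordinate': [0, 1, 2]},
    {'Point': 2, 'co-ordinate': [0, 1, 3]},
    {'Point': 3, 'co-ordinate': [1, 1, 2]},
    {'Point': 4, 'co-ordinate': [2, 5, 2]},
    {'Point': 5, 'co-ordinate': [1, 0, 2]},
    {'Point': 6, 'co-ordinate': [1, 2, 2]},
]

def closest_points(a, points, n=3):
    """Return the n dictionaries whose 'co-ordinate' is nearest to a."""
    return heapq.nsmallest(n, points, key=lambda p: dist(a, p['co-ordinate']))

nearest = closest_points([1, 2, 3], points)
print([p['Point'] for p in nearest])
```

For the point [1, 2, 3] this selects points 6, 2 and 3, the same three as the closure-based call above.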
You could sort the list of points based on a distance function then use the first one.
import math
a=[0,0,0]
def dist(p0,p1):
    return math.sqrt((p1[0]-p0[0])**2+(p1[1]-p0[1])**2+(p1[2]-p0[2])**2)
points=[{'Point':1,'co-ordinate':[0,1,2]},{'Point':2,'co-ordinate':[0,1,3]},{'Point':3,'co-ordinate':[1,1,2]},]
sorted_by_dist = sorted(points,key=lambda p:dist(a,p['co-ordinate']))
closest = sorted_by_dist[0]
furthest = sorted_by_dist[-1]
Learn about the built-in sorted function, and look in particular at its key option.
Once you know about the sorted function, you can just sort your list of dictionaries and, as key, supply the function to sort by. Thus, let us say that you have the point p as
p = [2,3,4] # or any other list value ...
Then a function that takes this point and another and returns a distance can be written as:
# Note that there is no need for the NumPy dependency. I do this only for
# brevity. You can use the dist function which was previously mentioned.
import numpy as np
def dist(p1, p2):
    p1, p2 = np.array(p1), np.array(p2)
    # np.sqrt and np.sum keep this snippet free of a separate math import
    return np.sqrt(np.sum((p1 - p2)**2))
Now you can just sort the list, and take the first 3 points as:
pointList = sorted(points, key = lambda x: dist(x['co-ordinate'], p) )[:3]
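As the comment above notes, NumPy is optional here. A sketch of the same sort-and-slice approach in pure Python, written so that it works for coordinates of any length; the names dist_any and closest_three are illustrative, not taken from the answers:

```python
from math import sqrt

def dist_any(p1, p2):
    """Euclidean distance between two equal-length coordinate sequences."""
    return sqrt(sum((x - y)**2 for x, y in zip(p1, p2)))

def closest_three(p, points):
    """Sort the dictionaries by distance to p and keep the first three."""
    return sorted(points, key=lambda d: dist_any(d['co-ordinate'], p))[:3]

points = [
    {'Point': 1, 'co-ordinate': [0, 1, 2]},
    {'Point': 2, 'co-ordinate': [0, 1, 3]},
    {'Point': 3, 'co-ordinate': [1, 1, 2]},
]
p = [2, 3, 4]
pointList = closest_three(p, points)
```

Because sorted is stable, points that are equally far from p keep their original relative order in the result.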