Django sort by distance

Question:

I have the following model:

class Vacancy(models.Model):
    lat = models.FloatField('Latitude', blank=True)
    lng = models.FloatField('Longitude', blank=True)

How should I make a query to sort by distance (distance is infinity)?

Working on PosgreSQL, GeoDjango if it is required.

Asked By: Rukomoynikov

||

Answers:

Note: Please check cleder’s answer below which mentions about
deprecation issue (distance -> annotation) in Django versions.

First of all, it is better to make a point field instead of making lat and lnt separated:

from django.contrib.gis.db import models

location = models.PointField(null=False, blank=False, srid=4326, verbose_name='Location')

Then, you can filter it like that:

from django.contrib.gis.geos import Point
from django.contrib.gis.measure import D

distance = 2000 
ref_location = Point(1.232433, 1.2323232)

res = YourModel.objects.filter(
    location__distance_lte=(
        ref_location,
        D(m=distance)
    )
).distance(
    ref_location
).order_by(
    'distance'
)
Answered By: cem

Here is a solution that does not require GeoDjango.

from django.db import models
from django.db.models.expressions import RawSQL


class Location(models.Model):
    latitude = models.FloatField()
    longitude = models.FloatField()
    ...


def get_locations_nearby_coords(latitude, longitude, max_distance=None):
    """
    Return objects sorted by distance to specified coordinates
    which distance is less than max_distance given in kilometers
    """
    # Great circle distance formula
    gcd_formula = "6371 * acos(least(greatest(
    cos(radians(%s)) * cos(radians(latitude)) 
    * cos(radians(longitude) - radians(%s)) + 
    sin(radians(%s)) * sin(radians(latitude)) 
    , -1), 1))"
    distance_raw_sql = RawSQL(
        gcd_formula,
        (latitude, longitude, latitude)
    )
    qs = Location.objects.all() 
    .annotate(distance=distance_raw_sql) 
    .order_by('distance')
    if max_distance is not None:
        qs = qs.filter(distance__lt=max_distance)
    return qs

Use as follow:

nearby_locations = get_locations_nearby_coords(48.8582, 2.2945, 5)

If you are using sqlite you need to add somewhere

import math
from django.db.backends.signals import connection_created
from django.dispatch import receiver


@receiver(connection_created)
def extend_sqlite(connection=None, **kwargs):
    if connection.vendor == "sqlite":
        # sqlite doesn't natively support math functions, so add them
        cf = connection.connection.create_function
        cf('acos', 1, math.acos)
        cf('cos', 1, math.cos)
        cf('radians', 1, math.radians)
        cf('sin', 1, math.sin)
        cf('least', 2, min)
        cf('greatest', 2, max)
Answered By: rphlo

the .distance(ref_location) is removed in django >=1.9 you should use an annotation instead.

from django.contrib.gis.db.models.functions import Distance
from django.contrib.gis.measure import D
from django.contrib.gis.geos import Point

ref_location = Point(1.232433, 1.2323232, srid=4326)
yourmodel.objects.filter(location__distance_lte=(ref_location, D(m=2000)))                                                     
    .annotate(distance=Distance("location", ref_location))                                                                
    .order_by("distance")

also you should narrow down your search with the dwithin operator which uses the spatial index, distance does not use the index which slows your query down:

yourmodel.objects.filter(location__dwithin=(ref_location, 0.02))
    .filter(location__distance_lte=(ref_location, D(m=2000)))
    .annotate(distance=Distance('location', ref_location))
    .order_by('distance')

see this post for an explanation of location__dwithin=(ref_location, 0.02)

Answered By: cleder

If you don’t want/have no opportunity to use gis, here is sollution (haversine distance fomula writter in django orm sql):

lat = 52.100
lng = 21.021

earth_radius=Value(6371.0, output_field=FloatField())

f1=Func(F('latitude'), function='RADIANS')
latitude2=Value(lat, output_field=FloatField())
f2=Func(latitude2, function='RADIANS')

l1=Func(F('longitude'), function='RADIANS')
longitude2=Value(lng, output_field=FloatField())
l2=Func(longitude2, function='RADIANS')

d_lat=Func(F('latitude'), function='RADIANS') - f2
d_lng=Func(F('longitude'), function='RADIANS') - l2

sin_lat = Func(d_lat/2, function='SIN')
cos_lat1 = Func(f1, function='COS')
cos_lat2 = Func(f2, function='COS')
sin_lng = Func(d_lng/2, function='SIN')

a = Func(sin_lat, 2, function='POW') + cos_lat1 * cos_lat2 * Func(sin_lng, 2, function='POW')
c = 2 * Func(Func(a, function='SQRT'), Func(1 - a, function='SQRT'), function='ATAN2')
d = earth_radius * c

Shop.objects.annotate(d=d).filter(d__lte=10.0)

PS
change models, change filter to order_by, change keyword and parametrize

PS2
for sqlite3, you should ensure, that there are available function SIN, COS, RADIANS, ATAN2, SQRT

Answered By: 404pio

Best practice for this changes quite quickly, so I’ll answer with what I think is most up-to-date as of 2020-01-18.

With GeoDjango

Using geography=True with GeoDjango makes this much easier. It means everything is stored in lng/lat, but distance calculations are done in meters on the surface of the sphere. See the docs

from django.db import models
from django.contrib.gis.db.models import PointField

class Vacancy(models.Model):
    location = PointField(srid=4326, geography=True, blank=True, null=True)

Django 3.0

If you have Django 3.0, you can sort your whole table using the following query. It uses postgis’ <-> operator, which means sorting will use the spacial index and the annotated distance will be exact (for Postgres 9.5+). Note that “sorting by distance” implicitly requires a distance from something. The first argument to Point is the longitude and the second is latitude (the opposite of the normal convention).

from django.contrib.gis.db.models.functions import GeometryDistance
from django.contrib.gis.geos import Point

ref_location = Point(140.0, 40.0, srid=4326)
Vacancy.objects.order_by(GeometryDistance("location", ref_location))

If you want to use the distance from the reference point in any way, you’ll need to annotate it:

Vacancy.objects.annotate(distance=GeometryDistance("location", ref_location))
    .order_by("distance")

If you have a lot of results, calculating the exact distance for every entry will still be slow. You should reduce the number of results with one of the following:

Limit the number of results with queryset slicing

The <-> operator won’t calculate exact distance for (most) results it won’t return, so slicing or paginating the results is fast. To get the first 100 results:

Vacancy.objects.annotate(distance=GeometryDistance("location", ref_location))
    .order_by("distance")[:100]

Only get results within a certain distance with dwithin

If there is a maximum distance that you want results for, you should use dwithin. The dwithin django query uses ST_DWithin, which means it’s very fast. Setting geography=True means this calculation is done in meters, not degrees. The final query for everything within 50km would be:

Vacancy.objects.filter(location__dwithin=(ref_location, 50000))
    .annotate(distance=GeometryDistance("location", ref_location))
    .order_by("distance")

This can speed up queries a bit even if you are slicing down to a few results.

The second argument to dwithin also accepts django.contrib.gis.measure.D objects, which it converts into meters, so instead of 50000 meters, you could just use D(km=50).

Filtering on distance

You can filter directly on the annotated distance, but it will duplicate the <-> call and be a fair amount slower than dwithin.

Vacancy.objects.annotate(distance=GeometryDistance("location", ref_location))
    .filter(distance__lte=50000)
    .order_by("distance")

Django 2.X

If you don’t have Django 3.0, you can still sort your whole table using Distance instead of GeometryDistance, but it uses ST_Distance, which might be slow if it is done on every entry and there are a lot of entries. If that’s the case, you can use dwithin to narrow down the results.

Note that slicing will not be fast because Distance needs to calculate the exact distance for everything in order to sort the results.

Without GeoDjango

If you don’t have GeoDjango, you’ll need a sql formula for calculating distance. The efficiency and correctness varies from answer to answer (especially around the poles/dateline), but in general it will be fairly slow.

One way to speed queries up is to index lat and lng and use mins/maxes for each before annotating the distance. The math is quite complicated because the bounding “box” isn’t exactly a box. See here: How to calculate the bounding box for a given lat/lng location?

Answered By: Jonathan Richards

On Django 3.0 there will be a GeometryDistance function, which works the same way as Distance, but uses the <-> operator instead, which uses spatial indexes on ORDER BY queries, eliminating the need for a dwithin filter:

from django.contrib.gis.db.models.functions import GeometryDistance
from django.contrib.gis.geos import Point

ref_location = Point(140.0, 40.0, srid=4326)
Vacancy.objects.annotate(
    distance=GeometryDistance('location', ref_location)
).order_by('distance')

If you want to use it before Django 3.0 is released, you could use something like this:

from django.contrib.gis.db.models.functions import GeoFunc
from django.db.models import FloatField
from django.db.models.expressions import Func

class GeometryDistance(GeoFunc):
   output_field = FloatField()
   arity = 2
   function = ''
   arg_joiner = ' <-> '
   geom_param_pos = (0, 1)

   def as_sql(self, *args, **kwargs):
       return Func.as_sql(self, *args, **kwargs)
Answered By: Francisco

WITHOUT POSTGIS

If you don’t want to change your models i.e, keep lat and lng as seperate fields and even don’t want to use too much Geodjango and want to solve this problem with some basic code then here is the solution;

origin = (some_latitude, some_longitude) #coordinates from where you want to measure distance
distance = {} #creating a dict which will store the distance of users.I am using usernames as keys and the distance as values.
for m in models.objects.all():
    dest = (m.latitude, m.longitude)
    distance[m.username] = round(geodesic(origin, dest).kilometers, 2) #here i am using geodesic function which takes two arguments, origin(coordinates from where the distance is to be calculated) and dest(to which distance is to be calculated) and round function rounds off the float to two decimal places

#Here i sort the distance dict as per value.So minimum distant users will be first.
s_d = sorted(distance.items(), key=lambda x: x[1]) #note that sorted function returns a list of tuples as a result not a dict.Those tuples have keys as their first elements and vaues as 2nd.

new_model_list = []
for i in range(len(s_d)):
    new_model_list.append(models.objects.get(username=s_d[i][0]))

Now the new_model_list will contain all the users ordered in distance.By iterating over it, you will get them ordered on the basis of distance.

WITH POSTGIS

Add a point field in your models;

from django.contrib.gis.db import models

class your_model(models.Model):
    coords = models.PointField(null=False, blank=False, srid=4326, verbose_name='coords')

Then in views.py;

from django.contrib.gis.db.models.functions import Distance
from .models import your_model

user = your_model.objects.get(id=some_id) # getting a user with desired id

sortedQueryset = your_model.objects.all().annotate(distance=Distance('coords', user.coords, spheroid=True)).order_by('distance')

Distance function takes first parameter as the field from the database against which we have to calculate the distance (here coords).
2nd parameter is the coordinates from which the distance is to be calculated.

Spheroid specifies the accuracy of the distance. By setting this to True, it will give more accurate distance else less accurate as for Spheroid = False, it treats points as the points on a sphere (which is wrong for earth).

Answered By: Irfan wani

in views.py use CustomHaystackGEOSpatialFilter for filter_backends :

class LocationGeoSearchViewSet(HaystackViewSet):
    index_models = [yourModel]
    serializer_class = LocationSerializer
    filter_backends = [CustomHaystackGEOSpatialFilter]

in filters.py define CustomHaystackGEOSpatialFilter and override apply_filters method so you can order the distance and limit your result count like :

class CustomHaystackGEOSpatialFilter(HaystackGEOSpatialFilter):
    # point_field = 'location'
   def apply_filters(self, queryset, applicable_filters=None, applicable_exclusions=None):
        if applicable_filters:
            queryset = queryset.dwithin(**applicable_filters["dwithin"]).distance(
                **applicable_filters["distance"]).order_by("distance")[:100]
        return queryset
Answered By: Mahmoud Masri
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.