Fastest way to get all the points between two (X,Y) coordinates in python

Question:

So I have a shapely LineString:

print np.round(shapely_intersecting_lines.coords).astype(np.int) 
>>> array([[ 1520, -1140],
           [ 1412,  -973]])

This can be interpreted as a numpy array as well as seen above.

I want to get all the points in between, that is I want to get the points of the line in between as integer values. The output should be something like this:

array([[ 1520, -1140],
       [ 1519, -1139],
       [ 1519, -1138],
       ..., 
       [ 1413,  -975],
       [ 1412,  -974],
       [ 1412,  -973]], dtype=int32)

I posted this earlier in gis.stackexchange hoping there was a solution in shapely that was efficient. The solution was good at first, however, the solution is now too slow as I run this over 50000 times in my code. On my computer each loop takes about 0.03s resulting in over a day of running. It is too slow for what I need here and was hoping to see if anyone knows of a vectorized solution to this.

Asked By: jlcv

||

Answers:

Using generators so that way you will save memory

import numpy as np
import math

d = np.array(
    [
        [1520,-1140],
        [1412,-973]
    ],dtype=float);
rise = d[0,1]-d[1,1];
run = d[0,0]-d[1,0];
slope = rise/run;
print slope

fromPt = d[0];
sign = 1
if (slope<0):
    sign = -1;
points = ([d[0,0]+sign*i,math.floor(d[0,1]+(sign*i*slope))] for i in xrange(1+int(math.ceil(abs(d[0,0]-d[1,0])))))
for pt in points:
    print pt
Answered By: Robert I

Here’s Bresenhams line algorithm as a generator. If you want a list just call list() on the output:

def line(x0, y0, x1, y1):
    deltax = x1-x0
    dxsign = int(abs(deltax)/deltax)
    deltay = y1-y0
    dysign = int(abs(deltay)/deltay)
    deltaerr = abs(deltay/deltax)
    error = 0
    y = y0
    for x in range(x0, x1, dxsign):
        yield x, y
        error = error + deltaerr
        while error >= 0.5:
            y += dysign
            error -= 1
    yield x1, y1
Answered By: Turksarama

Bresenham may be smart but I’m pretty sure brute force vectorization is faster. I’ve written two variants – the first is easier to read, the second is faster (80 us vs 50 us).

Update Fixed a bug (thanks @Varlor) and added an nd variant.

import numpy as np
from timeit import timeit

def connect(ends):
    d0, d1 = np.abs(np.diff(ends, axis=0))[0]
    if d0 > d1: 
        return np.c_[np.linspace(ends[0, 0], ends[1, 0], d0+1, dtype=np.int32),
                     np.round(np.linspace(ends[0, 1], ends[1, 1], d0+1))
                     .astype(np.int32)]
    else:
        return np.c_[np.round(np.linspace(ends[0, 0], ends[1, 0], d1+1))
                     .astype(np.int32),
                     np.linspace(ends[0, 1], ends[1, 1], d1+1, dtype=np.int32)]


def connect2(ends):
    d0, d1 = np.diff(ends, axis=0)[0]
    if np.abs(d0) > np.abs(d1): 
        return np.c_[np.arange(ends[0, 0], ends[1,0] + np.sign(d0), np.sign(d0), dtype=np.int32),
                     np.arange(ends[0, 1] * np.abs(d0) + np.abs(d0)//2,
                               ends[0, 1] * np.abs(d0) + np.abs(d0)//2 + (np.abs(d0)+1) * d1, d1, dtype=np.int32) // np.abs(d0)]
    else:
        return np.c_[np.arange(ends[0, 0] * np.abs(d1) + np.abs(d1)//2,
                               ends[0, 0] * np.abs(d1) + np.abs(d1)//2 + (np.abs(d1)+1) * d0, d0, dtype=np.int32) // np.abs(d1),
                     np.arange(ends[0, 1], ends[1,1] + np.sign(d1), np.sign(d1), dtype=np.int32)]


def connect_nd(ends):
    d = np.diff(ends, axis=0)[0]
    j = np.argmax(np.abs(d))
    D = d[j]
    aD = np.abs(D)
    return ends[0] + (np.outer(np.arange(aD + 1), d) + (aD//2)) // aD


ends = np.array([[ 1520, -1140],
                 [ 1412,  -73]])

ends_4d = np.array([[  100, -302, 101, -49],
                    [ -100,  -45, 112, 100]])

print(connect(ends))
print(connect_nd(ends_4d))


assert np.all(connect(ends)==connect2(ends))
assert np.all(connect(ends)==connect_nd(ends))
assert np.all(connect(ends)==connect(ends[:, ::-1])[:, ::-1])
assert np.all(connect(ends)==connect(ends[::-1])[::-1])

print(timeit('f(ends)', globals={'f': connect, 'ends': ends}, number=10000)*100, 'us')
print(timeit('f(ends)', globals={'f': connect2, 'ends': ends}, number=10000)*100, 'us')
print(timeit('f(ends)', globals={'f': connect_nd, 'ends': ends}, number=10000)*100, 'us')

Sample output:

[[ 1520 -1140]
 [ 1520 -1139]
 [ 1520 -1138]
 ..., 
 [ 1412   -75]
 [ 1412   -74]
 [ 1412   -73]]
[[ 100 -302  101  -49]
 [  99 -301  101  -48]
 [  98 -300  101  -48]
 ..., 
 [ -98  -47  112   99]
 [ -99  -46  112   99]
 [-100  -45  112  100]]
78.8237597000034 us
48.02509490000375 us
62.78072760001123 us
Answered By: Paul Panzer

Good base code to start:


def points_between(x1, y1, x2, y2):
    
    d0 = x2 - x1
    d1 = y2 - y1
    
    count = max(abs(d1+1), abs(d0+1))

    if d0 == 0:
        return (
            np.full(count, x1),
            np.round(np.linspace(y1, y2, count)).astype(np.int32)
        )

    if d1 == 0:
        return (
            np.round(np.linspace(x1, x2, count)).astype(np.int32),
            np.full(count, y1),  
        )

    return (
        np.round(np.linspace(x1, x2, count)).astype(np.int32),
        np.round(np.linspace(y1, y2, count)).astype(np.int32)
Answered By: Demetry Pascal

here is a short solution using np library

    import numpy as np
    x = [1,2,3]
    y = [5,6,7]
    ks,cs = np.meshgrid(x,y)
    np.vstack((ks.flatten(),cs.flatten())).T
Answered By: Ahmed Saleh