Java vs. Python specific code-fragment performance improvement

Question:

I have a question regarding the performance of a specific piece of code in Java and Python.

Algorithm:
I am generating random N-dimensional points and then for all points under a certain distance threshold of each other I do some processing. The processing itself is not important here as it does not impact total execution time. Generating the points also takes a fraction of a second in both cases, so I am only interested in the part that does the comparing.

Execution times:
For a fixed input of 3000 points and 2 dimensions, Java does this in 2 to 4 seconds while Python takes anywhere between 15 and 200 seconds.

I am a little skeptical about the Python execution times. Is there anything I’m missing in this Python code? Are there any suggestions for algorithmic improvements (e.g. pre-allocating/reusing memory, a way of lowering the Big-O complexity, etc.)?


Java

double random_points[][] = new double[number_of_points][dimensions];
for(int i = 0; i < number_of_points; i++)
  for(int d = 0; d < dimensions; d++)
    random_points[i][d] = Math.random();

double p1[], p2[];
for(int i = 0; i < number_of_points; i++)
{
  p1 = random_points[i];
  for(int j = i + 1; j < number_of_points; j++)
  {
    p2 = random_points[j];

    double sum_of_squares = 0;
    for(int d = 0; d < dimensions; d++)
      sum_of_squares += (p2[d] - p1[d]) * (p2[d] - p1[d]);

    double distance = Math.sqrt(sum_of_squares);
    if(distance > SOME_THRESHOLD) continue;

    //...else do something with p1 and p2

  }
}

Python 3.2

random_points = [[random.random() for _d in range(0,dimensions)] for _n in range(0,number_of_points)]

for i, p1 in enumerate(random_points):
  for j, p2 in enumerate(random_points[i+1:]):
    distance = math.sqrt(sum([(p1[d]-p2[d])**2 for d in range(0,dimensions)]))
    if distance > SOME_THRESHOLD: continue

    #...else do something with p1 and p2
Asked By: Vuk


Answers:

You might want to consider using numpy.

I’ve just tried the following:

import numpy
from scipy.spatial.distance import pdist
D=2
N=3000
p=numpy.random.uniform(size=(N,D))
dist=pdist(p, 'euclidean')

The last line computes the distance matrix (one entry for every pair of points, which is what the inner loops in your code compute). On my computer it takes about 0.07s.
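If you also need to know which pairs fall under your threshold (not just the distances), one way is to map the condensed output of pdist back to index pairs with numpy.triu_indices. A minimal sketch, assuming SOME_THRESHOLD is your cut-off:

import numpy
from scipy.spatial.distance import pdist

SOME_THRESHOLD = 0.01  # hypothetical cut-off, substitute your own
N, D = 3000, 2
p = numpy.random.uniform(size=(N, D))

dist = pdist(p, 'euclidean')               # condensed vector of length N*(N-1)/2
i_idx, j_idx = numpy.triu_indices(N, k=1)  # pair (i_idx[k], j_idx[k]) corresponds to dist[k]
close = dist <= SOME_THRESHOLD             # boolean mask of pairs within the threshold
for i, j in zip(i_idx[close], j_idx[close]):
    pass  # ...do something with p[i] and p[j]

The ordering of numpy.triu_indices(N, k=1) matches the ordering of pdist's condensed output, so no square-form matrix needs to be built.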

The main disadvantage of this method is that it requires O(n^2) memory for the distance matrix. If that’s a problem, the following may be a better alternative:

for i in range(1, N):
  # v[k] is the difference between point k and point k + i
  v = p[:N-i] - p[i:]
  dist = numpy.sqrt(numpy.sum(numpy.square(v), axis=1))
  for j in numpy.nonzero(dist > 1.4)[0]:
    print(j, i + j)

For N=3000, this takes ~0.33s on my computer.

Answered By: NPE

If I take 30K points and 5 dimensions, which is roughly 100x as much work, the following still completes in about 1.5 seconds:

// needs: import java.util.Arrays; import java.util.Comparator;
int number_of_points = 30000;
int dimensions = 5;
double SOME_THRESHOLD = 0.1;

long start = System.nanoTime();
double random_points[][] = new double[number_of_points][dimensions];
for (int i = 0; i < number_of_points; i++)
    for (int d = 0; d < dimensions; d++)
        random_points[i][d] = Math.random();

double p1[], p2[];
Comparator<double[]> compareX = new Comparator<double[]>() {
    @Override
    public int compare(double[] o1, double[] o2) {
        return Double.compare(o1[0], o2[0]);
    }
};
// sort by the first coordinate so candidate neighbours are contiguous
Arrays.sort(random_points, compareX);

double[] key = new double[dimensions];
int count = 0;
for (int i = 0; i < number_of_points; i++) {
    p1 = random_points[i];
    key[0] = p1[0] + SOME_THRESHOLD;
    // binary search for the first point whose x exceeds p1[0] + SOME_THRESHOLD;
    // no point at or beyond 'index' can be within the threshold of p1
    int index = Arrays.binarySearch(random_points, key, compareX);
    if (index < 0) index = ~index;
    NEXT: for (int j = i + 1; j < index; j++) {
        p2 = random_points[j];

        double sum_of_squares = 0;
        for (int d = 0; d < dimensions; d++) {
            sum_of_squares += (p2[d] - p1[d]) * (p2[d] - p1[d]);
            // stop early once the partial sum already exceeds the squared threshold
            if (sum_of_squares > SOME_THRESHOLD * SOME_THRESHOLD)
                continue NEXT;
        }

        //...else do something with p1 and p2
        count++;
    }
}
long time = System.nanoTime() - start;
System.out.println("Took " + time / 1000 / 1000 + " ms. count= " + count);

Prints

Took 1549 ms. count= 20197
Answered By: Peter Lawrey

Does speed really matter? Here are a couple obvious speed-ups:

  1. Don’t compute the square root. Just square your threshold and compare against the squared threshold.

  2. Sort your points along one dimension. When two points i and j are farther apart than your threshold on that dimension alone, every later j is farther still, so you can break out of the inner loop and continue the outer one.

There may be other algorithmic speed-ups; even with the above, the worst case is still O(n^2 * d). A quick Python sketch of both ideas follows below.
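To make that concrete in the question's own language, here is a minimal, untested Python sketch of both ideas (squared threshold plus sort-and-prune on the first coordinate); the variable names mirror the question's and SOME_THRESHOLD is a hypothetical cut-off:

import random

number_of_points, dimensions = 3000, 2
SOME_THRESHOLD = 0.01                       # hypothetical cut-off
threshold_sq = SOME_THRESHOLD ** 2          # compare squared distances, no sqrt needed

points = [[random.random() for _ in range(dimensions)]
          for _ in range(number_of_points)]
points.sort(key=lambda p: p[0])             # sort along the first coordinate

for i, p1 in enumerate(points):
    for j in range(i + 1, number_of_points):
        p2 = points[j]
        if p2[0] - p1[0] > SOME_THRESHOLD:  # every later point is even farther in x
            break
        d_sq = sum((a - b) ** 2 for a, b in zip(p1, p2))
        if d_sq <= threshold_sq:
            pass                            # ...do something with p1 and p2

As in Peter Lawrey's Java version above, the sort only pays off when the threshold is small relative to the spread of the points; otherwise the inner loop still visits most pairs.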

Answered By: rlibby