Sorting points on multiple lines

Question

Suppose we have two lines on a graph. (I just noticed that I inverted the numbers on the Y axis by mistake; it should run from 11 down to 1.)

Two lines on a graph

We only care about the points where the lines cross whole-number X values.

Two lines on a graph with intersections

We need to order these points from highest Y value to lowest Y value regardless of their position on the X axis (Note I did these pictures by hand so they may not line up perfectly).

Two lines on a graph with intersections and ordering

I have a couple of questions:

1) I have to assume this is a known problem, but does it have a particular name?

2) Is there a known optimal solution when dealing with tens of billions (or hundreds of millions) of lines? Our current process of manually calculating each point and then comparing it to a giant list requires hours of processing. Even though we may have a hundred million lines, we typically only want the top 100 or 50,000 results; some lines are so far “below” the others that calculating their points is unnecessary.

Asked By: samwise

Answer 1

This isn’t really complicated; it is just a "normal" sorting problem.
Sorting usually requires a large amount of computing time, but in your case you don’t need complex sorting techniques.

The values on both graphs rise or fall steadily; there are no "jumps". You can use this to your advantage. The basic algorithm:

identify whether each graph is rising or falling.
write a generator that yields the values: from left to right if rising, from right to left if falling.
get the first value from both graphs
insert the lower one into the result list
get a new value from the graph that had the lower value
repeat the last two steps until one generator is "empty"
append the leftover items from the other generator.
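
The steps above could be sketched roughly like this in Python. It assumes each line is given as a slope and an intercept and that the whole-number X values come from a `range`; the names `points` and `merge_ascending` are made up for this illustration and are not from the answer.

```python
def points(slope, intercept, xs):
    """Yield y = slope * x + intercept for each x in xs, in ascending y order.

    xs is assumed to be an ascending range of whole-number x values.
    """
    # A rising line is walked left to right, a falling one right to left,
    # so both generators produce their values from lowest to highest.
    ordered = xs if slope >= 0 else reversed(xs)
    for x in ordered:
        yield slope * x + intercept


def merge_ascending(gen_a, gen_b):
    """Merge two ascending generators into one ascending list."""
    result = []
    a = next(gen_a, None)
    b = next(gen_b, None)
    # Take the lower head value and refill from the generator it came from.
    while a is not None and b is not None:
        if a <= b:
            result.append(a)
            a = next(gen_a, None)
        else:
            result.append(b)
            b = next(gen_b, None)
    # One generator is empty: append the leftovers of the other.
    if a is not None:
        result.append(a)
        result.extend(gen_a)
    if b is not None:
        result.append(b)
        result.extend(gen_b)
    return result


rising = points(1, 0, range(1, 4))      # yields 1, 2, 3
falling = points(-2, 10, range(1, 4))   # yields 4, 6, 8
merged = merge_ascending(rising, falling)
```

This produces the points from lowest to highest; for the highest-to-lowest order asked for in the question, reverse the list or flip the generator directions and the comparison. For more than two lines, the standard library's `heapq.merge` performs the same kind of merge over any number of sorted iterables.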

Answered By: Klaus D.

Answer 2

Your data structure is a set of tuples
```
lines = {(y0, Δy0), (y1, Δy1), ...}
```
You need only the ntop points, hence build a set containing only
the top ntop yi values, with a single pass over the data
```
top_points = choose(lines, ntop)
```
EDIT: to choose the ntop points we have to keep track of the smallest
one. That is useful information, so let’s also return this value
from choose. We also need to initialize decremented
```
top_points, smallest = choose(lines, ntop)
decremented = top_points
```
and start a loop…
```
while True:
```

Generate a set of decremented values

~~decremented = {(y-Δy, Δy) for y, Δy in top_points}~~

    decremented = {(y-Δy, Δy) for y, Δy in decremented if y>smallest}
    if not decremented: break

Generate a set of candidates

    candidates = top_points.union(decremented)

Generate a new set of top points
```
    new_top_points, smallest = choose(candidates, ntop)
```
The following is no longer necessary:

~~check if new_top_points == top_points~~

~~if new_top_points == top_points: break~~

~~top_points = new_top_points~~

~~of course we are in a loop…~~

The difficult part is the choose function, but I think that the
answer to the question
"How can I sort 1 million numbers, and only print the top 10 in Python?"
could help you.
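
For what it's worth, here is one way `choose` and the loop might look, as a rough sketch rather than a tuned implementation. It uses `heapq.nlargest` from the standard library and relies on tuples comparing by their first element (the y value) first. The function name `top_n` and the line index added as a third tuple element are assumptions of this sketch; the index only keeps equal points from different lines distinct inside a set. It also assumes every Δy is positive and that there are at least `ntop` lines.

```python
import heapq


def choose(points, ntop):
    """Return the ntop largest points as a set, plus the smallest y among them."""
    top = heapq.nlargest(ntop, points)  # tuples compare by y first
    return set(top), top[-1][0]


def top_n(lines, ntop):
    """lines: iterable of (y, dy), where y is the line's highest point and dy > 0."""
    # Tag every point with its line index (an addition of this sketch).
    start = {(y, dy, i) for i, (y, dy) in enumerate(lines)}
    top_points, smallest = choose(start, ntop)
    decremented = top_points
    while True:
        # Step down only the points that are still above the current cutoff.
        decremented = {(y - dy, dy, i) for y, dy, i in decremented if y > smallest}
        if not decremented:
            break
        # Keep the best ntop of the old top points and the new candidates.
        top_points, smallest = choose(top_points | decremented, ntop)
    return sorted(top_points, reverse=True)


result = top_n([(10, 4), (9, 2), (3, 1), (1, 1)], 3)
# Each entry of result is (y, dy, line index), from highest y to lowest.
```

`heapq.nlargest` only keeps `ntop` items in memory at a time, so the first call is a single pass over all the lines, and later calls only look at roughly `2 * ntop` candidates.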

Answered By: gboffi
