What is this odd sorting algorithm?

Question:

Some answer originally had this sorting algorithm:

for i from 0 to n-1:
    for j from 0 to n-1:
        if A[j] > A[i]:
            swap A[i] and A[j]

Note that both i and j go the full range and thus j can be both larger and smaller than i, so it can make pairs both correct and wrong order (and it actually does do both!). I thought that’s a mistake (and the author later called it that) and that this would jumble the array, but it does appear to sort correctly. It’s not obvious why, though. But the code simplicity (going full ranges, and no +1 as in bubble sort) makes it interesting.

Is it correct? If so, why does it work? And does it have a name?

Python implementation with testing:

from random import shuffle

for _ in range(3):
    n = 20
    A = list(range(n))
    shuffle(A)
    print('before:', A)

    for i in range(n):
        for j in range(n):
            if A[j] > A[i]:
                A[i], A[j] = A[j], A[i]

    print('after: ', A, 'n')

Sample output (Try it online!):

before: [9, 14, 8, 12, 16, 19, 2, 1, 10, 11, 18, 4, 15, 3, 6, 17, 7, 0, 5, 13]
after:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 

before: [5, 1, 18, 10, 19, 14, 17, 7, 12, 16, 2, 0, 6, 8, 9, 11, 4, 3, 15, 13]
after:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 

before: [11, 15, 7, 14, 0, 2, 9, 4, 13, 17, 8, 10, 1, 12, 6, 16, 18, 3, 5, 19]
after:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 

Edit: Someone pointed out a very nice brand new paper about this algorithm. Just to clarify: We’re unrelated, it’s a coincidence. As far as I can tell it was submitted to arXiv before that answer that sparked my question and published by arXiv after my question.

Asked By: no comment

||

Answers:

To prove that it’s correct, you have to find some sort of invariant. Something that’s true during every pass of the loop.

Looking at it, after the very first pass of the inner loop, the largest element of the list will actually be in the first position.

Now in the second pass of the inner loop, i = 1, and the very first comparison is between i = 1 and j = 0. So, the largest element was in position 0, and after this comparison, it will be swapped to position 1.

In general, then it’s not hard to see that after each step of the outer loop, the largest element will have moved one to the right. So after the full steps, we know at least the largest element will be in the correct position.

What about all the rest? Let’s say the second-largest element sits at position i of the current loop. We know that the largest element sits at position i-1 as per the previous discussion. Counter j starts at 0. So now we’re looking for the first A[j] such that it’s A[j] > A[i]. Well, the A[i] is the second largest element, so the first time that happens is when j = i-1, at the first largest element. Thus, they’re adjacent and get swapped, and are now in the "right" order. Now A[i] again points to the largest element, and hence for the rest of the inner loop no more swaps are performed.

So we can say: Once the outer loop index has moved past the location of the second largest element, the second and first largest elements will be in the right order. They will now slide up together, in every iteration of the outer loop, so we know that at the end of the algorithm both the first and second-largest elements will be in the right position.

What about the third-largest element? Well, we can use the same logic again: Once the outer loop counter i is at the position of the third-largest element, it’ll be swapped such that it’ll be just below the second largest element (if we have found that one already!) or otherwise just below the first largest element.

Ah. And here we now have our invariant: After k iterations of the outer loop, the k-length sequence of elements, ending at position k-1, will be in sorted order:

After the 1st iteration, the 1-length sequence, at position 0, will be in the correct order. That’s trivial.

After the 2nd iteration, we know the largest element is at position 1, so obviously the sequence A[0], A[1] is in the correct order.

Now let’s assume we’re at step k, so all the elements up to position k-1 will be in order. Now i = k and we iterate over j. What this does is basically find the position at which the new element needs to be slotted into the existing sorted sequence so that it’ll be properly sorted. Once that happens, the rest of the elements "bubble one up" until now the largest element sits at position i = k and no further swaps happen.

Thus finally at the end of step N, all the elements up to position N-1 are in the correct order, QED.

Answered By: Lagerbaer

I’m not too sure if the above algorithm has an explicit name, but from some quick output analysis it just looks like an inefficient implementation of insertion sort, where the sorted region is from indices 0 to i inclusive after running iteration i.

Print Debugging

This can be verified by inspection if we put a print statement right after the inner loop:

for i from 0 to n-1:
    for j from 0 to n-1:
        if A[j] > A[i]:
            swap A[i] and A[j]
    print(A) <- add here
A = [5, 5, 0, 9, 2]
0.  [9, 5, 0, 5, 2]
1.  [5, 9, 0, 5, 2]
2.  [0, 5, 9, 5, 2]
3.  [0, 5, 5, 9, 2]
4.  [0, 2, 5, 5, 9]

Proof

We can prove this by induction on i, the outer loop. After having run iteration i, indices 0 to i inclusive of A, or A[0:i] is sorted, with A[i] = max(A).

Base Case: i = 0

For i = 0, the maximum of A will be stored at index 0. This pretty much follows by inspection of the algorithm.

Inductive Step: i > 0

Our inductive hypothesis is that A[0:i-1] is sorted and that A[i - 1] = max(A). What happens in iteration i? Basically, we’re determining where A[i] should be placed in the sorted region (handled by the inner loop), then readjusting it.

Subcase 1: A[i] < A[j] for some 0 <= j <= i - 1

From the above algorithm, A[j] will be swapped with Ap = A[i]. Notice that from our hypothesis, A[0:i-1] was sorted. So, it follows that for the rest of the indices from j + 1 <= i we’ll be reordering our sorted region after inserting Ap. It follows that A[0:i] will be sorted when j = i.

Subcase 2: A[i] >= A[j] for all 0 <= j <= i - 1

No swaps happen in this case, and it follows that A[0:i] is sorted from the A[0:i-1] being sorted and the fact that A[i] >= A[i - 1].

Other case: j > i

Notice that, after j reaches index i, the maximum of A will be back at index i. So, for the rest of the inner loop, no swaps will be made. So, it follows that A[0:i] will be sorted.

Because the above holds for all i < n = len(A), we can conclude that running iteration n - 1 will effectively sort A[0:n-1] = A.

Verification/Improvement

From the above proof, we saw that the check for j > i was redundant. To make the algorithm more efficient and more in-tune with the usual insertion sort, we can run the below code that will also sort the array.

for i from 0 to n-1:
    for j from 0 to i: <- claim this line can be changed
        if A[j] > A[i]:
            swap A[i] and A[j]
Answered By: wLui155

It’s a rather strange sort — definitely not a Bubble Sort. At the end of each for i in range(n) iteration, the ith element will now contain the largest element of the list A. That is, we swap element i with element j whenever element j is greater than element i. Clearly by the end of the algorithm the last element will be the largest.

The key point is this: At the end of iteration i, every element to the left of position i (lower values of i) must have values that are less than or equal, i.e. A[j] <= A[j+1] for j < i.

The following program attempts to demonstrate this:

from random import shuffle

def assert_partial_sort_order(A, i):
    """
    Assert A[j] <= A[j+1] for j < i
    """
    for j in range(i):
        assert A[j] <= A[j+1]

def sort(A):
    n = len(A)
    print('sort before:', A)
    n_swaps = 0
    for i in range(n):
        print('i =', i)
        for j in range(n):
            if A[j] > A[i]:
                A[i], A[j] = A[j], A[i]
                print('    swapping for j =', j)
                print('    A =', A)
        assert_partial_sort_order(A, i)
    print('sort after:', A, 'n')


n = 10
A = list(range(n))
shuffle(A)
sort(A)

Prints:

sort before: [7, 0, 4, 2, 8, 9, 5, 1, 3, 6]
i = 0
    swapping for j = 4
    A = [8, 0, 4, 2, 7, 9, 5, 1, 3, 6]
    swapping for j = 5
    A = [9, 0, 4, 2, 7, 8, 5, 1, 3, 6]
i = 1
    swapping for j = 0
    A = [0, 9, 4, 2, 7, 8, 5, 1, 3, 6]
i = 2
    swapping for j = 1
    A = [0, 4, 9, 2, 7, 8, 5, 1, 3, 6]
i = 3
    swapping for j = 1
    A = [0, 2, 9, 4, 7, 8, 5, 1, 3, 6]
    swapping for j = 2
    A = [0, 2, 4, 9, 7, 8, 5, 1, 3, 6]
i = 4
    swapping for j = 3
    A = [0, 2, 4, 7, 9, 8, 5, 1, 3, 6]
i = 5
    swapping for j = 4
    A = [0, 2, 4, 7, 8, 9, 5, 1, 3, 6]
i = 6
    swapping for j = 3
    A = [0, 2, 4, 5, 8, 9, 7, 1, 3, 6]
    swapping for j = 4
    A = [0, 2, 4, 5, 7, 9, 8, 1, 3, 6]
    swapping for j = 5
    A = [0, 2, 4, 5, 7, 8, 9, 1, 3, 6]
i = 7
    swapping for j = 1
    A = [0, 1, 4, 5, 7, 8, 9, 2, 3, 6]
    swapping for j = 2
    A = [0, 1, 2, 5, 7, 8, 9, 4, 3, 6]
    swapping for j = 3
    A = [0, 1, 2, 4, 7, 8, 9, 5, 3, 6]
    swapping for j = 4
    A = [0, 1, 2, 4, 5, 8, 9, 7, 3, 6]
    swapping for j = 5
    A = [0, 1, 2, 4, 5, 7, 9, 8, 3, 6]
    swapping for j = 6
    A = [0, 1, 2, 4, 5, 7, 8, 9, 3, 6]
i = 8
    swapping for j = 3
    A = [0, 1, 2, 3, 5, 7, 8, 9, 4, 6]
    swapping for j = 4
    A = [0, 1, 2, 3, 4, 7, 8, 9, 5, 6]
    swapping for j = 5
    A = [0, 1, 2, 3, 4, 5, 8, 9, 7, 6]
    swapping for j = 6
    A = [0, 1, 2, 3, 4, 5, 7, 9, 8, 6]
    swapping for j = 7
    A = [0, 1, 2, 3, 4, 5, 7, 8, 9, 6]
i = 9
    swapping for j = 6
    A = [0, 1, 2, 3, 4, 5, 6, 8, 9, 7]
    swapping for j = 7
    A = [0, 1, 2, 3, 4, 5, 6, 7, 9, 8]
    swapping for j = 8
    A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
sort after: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Answered By: Booboo

It’s a correct but odd and inefficient insertion sort.

Let’s first visualize it by printing A after each complete inner loop iteration. Example:

before: [1, 12, 13, 8, 15, 18, 19, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [19, 1, 12, 8, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 19, 12, 8, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 12, 19, 8, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 8, 12, 19, 13, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 8, 12, 13, 19, 15, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 8, 12, 13, 15, 19, 18, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 8, 12, 13, 15, 18, 19, 16, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 8, 12, 13, 15, 16, 18, 19, 7, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 7, 8, 12, 13, 15, 16, 18, 19, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 7, 8, 11, 12, 13, 15, 16, 18, 19, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 6, 7, 8, 11, 12, 13, 15, 16, 18, 19, 14, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 6, 7, 8, 11, 12, 13, 14, 15, 16, 18, 19, 3, 2, 9, 5, 4, 0, 10, 17]
        [1, 3, 6, 7, 8, 11, 12, 13, 14, 15, 16, 18, 19, 2, 9, 5, 4, 0, 10, 17]
        [1, 2, 3, 6, 7, 8, 11, 12, 13, 14, 15, 16, 18, 19, 9, 5, 4, 0, 10, 17]
        [1, 2, 3, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 5, 4, 0, 10, 17]
        [1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 4, 0, 10, 17]
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 0, 10, 17]
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 18, 19, 10, 17]
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 17]
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
after:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] 

Graphically, but updating after every swap, the red bar shows index i (code here):

Visualization

For i = 0, it actually does more or less jumble the list, but it brings the largest value to A[0] (or one largest value, if there are multiple). From then on, this value will act as a sentinel.

Due to this initial jumbling, it would be hard (and pointless) to state an invariant based on the initial state of the array. Instead, let’s define A0 to be the state of the array after the outer loop for i = 0:

Invariant: After the outer loop for some i:

  • The overall largest value is at A[i]
  • A[0 to i] contain A0[0 to i] in sorted order.

Proof by induction:

Base case i = 0 is trivial.

Cases i > 0: Before the outer loop with this i, we have the sentinel (overall largest value) at A[i-1], and A[0 to i-1] contains A0[0 to i-1] in sorted order. Now the inner loop goes over all elements and we swap A[i] and A[j] whenever A[j] > A[i]. Let’s look at an example row from above again:

[1, 7, 8, 12, 13, 15, 16, 18, 19, 11, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]

The 19 is the sentinel, the part up to it is sorted, and i is the index of the 11. What happens when we go with j from 0 to the end? The values 1, 7 and 8 are not larger than the 11, so nothing happens. The 12 is larger, so it’ll get swapped with the 11:

[1, 7, 8, 11, 13, 15, 16, 18, 19, 12, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]

Now it’s the 12 that’s at index i. And next it gets compared to 13. Since 13 is larger, they’re swapped:

[1, 7, 8, 11, 12, 15, 16, 18, 19, 13, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]

This continues, always swapping, until we swap with the sentinel:

[1, 7, 8, 11, 12, 13, 16, 18, 19, 15, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 18, 19, 16, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 16, 19, 18, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]
[1, 7, 8, 11, 12, 13, 15, 16, 18, 19, 6, 14, 3, 2, 9, 5, 4, 0, 10, 17]

At this point the insertion of the 11 into the sorted portion is complete. And the sentinel moved to index i, where it then prevents any further swaps during the inner loop (i.e., swaps of it with elements further right). In general, for a new value (like the 11), the numbers smaller than it stay where they are, and the numbers larger than it are swapped out and back in one place to the right.

When we’re all done, we’ve had the outer loop with i = n-1. The invariant then tells us that A[0 to n-1] contain A0[0 to n-1] in sorted order. I.e., the array indeed ends up correctly sorted.

Why I call it an insertion sort: After the jumbling i = 0, if you look at the array after every full inner loop, it’s indistinguishable from insertion sort (see my big visualization block at the top). For example, the 11 simply got inserted into the sorted part on its left. It just differs by how that insertion happens. Normal insertion sort "bubbles" the 11 leftwards until its correct place, not even looking at the even smaller numbers. This algorithm here instead searches the insertion point, starting from the very left, and then inserts the 11 there and "bubbles" the larger numbers up to the sentinel rightwards. And then continues with fruitless further comparisons against the sentinel now sitting at A[i].

Update: smusamashah shared their fantastic visualization tool, which nicely lets us compare this algorithm with the other algorithms that this was claimed to be. Click the checkmarks on the right for Bubble Sort, Insertion Sort and Selection Sort, then click Start. You’ll again see our sort is very much like insertion sort, and not at all like the others. And if you instead do Exchange Sort (sadly not included) you’ll see that’s more like Selection Sort (but slower because it swaps more and the tool shows all swaps).

Answered By: no comment

I think it is informative to rewrite it with the inefficiencies removed.

In pseudo-code:

Find the largest element.  Move it to the location A[0].
For i from 1 upto n:
  For j from 0 upto i:
    swap A[i] and A[j] if A[j] is greater than A[i]
    Invariant: A[0...j] is sorted
    Invariant: A[j+1...i-1] is sorted
    Invariant: A[i] is larger than anything in A[0...j]

  Invariant: A[0...i] contains the same elements sorted, but now sorted

This skips two things. First, it skips the mostly meaningless initial "jumbling" of the array, and it skips the never-does-anything comparison of the largest element and the items after it.

On the i=5 loop, we might start with:

                      i
 10  20  30  40  99  25

j then scans:

[10] 20  30  40  99  25
 10 [20] 30  40  99  25
 10  20 [30] 40  99  25

at this point, A[j] (aka 30) is greater than A[i] (aka 25). They swap and j advances:

 10  20  25 [40] 99  30
 10  20  25  30 [99] 40
 10  20  25  30  40 [99]

In the original algorithm, the scan of j up to n continues. This does nothing, because the largest element is already at A[i], so A[j] is never greater than A[i].

The only remaining quirk is the "jumble" it does to move the highest element to A[0].

Looking at how this works:

A[0] is compared with A[1..n-1] in succession. Any element larger than A[0] is swapped in. Once the largest element is found, the remaining comparisons don’t do anything.

It isn’t anything super special.

The final thing to think about is what the circuit — the infinitely parallel version, where we only care about information dependencies, essential sequencing not incidental — of this algorithm looks like. Let X be a circuit that reads from the top left and top right, and outputs the higher element to the bottom right and lower to the bottom left. Let Y instead output the higher element to the bottom left, and the lower to the bottom right.

1 2 3 4 5 6 7
 Y  | | | | |
|  Y  | | | |
| |  Y  | | |
| | |  Y  | |
| | | |  Y  |
| | | | |  Y
| | | | | | *

then we rotate * back down to the first column.

So each element that is larger than anything left of it is moved right until it finds a larger element. Finally, the largest element is put into A[0] (as it has no larger element to stop at).

On later iterations

1 2 3 * 5 6 7 8

5 is "logically" moved to the far left of the values:

5 1 2 3 * 6 7 8
 X  | |
|  X  |
| |  X

and then swapped into position with a circuit like that.

If you track the * (the largest element) you’ll note that we already know all of its comparisons. And while we move it to location 0 in step 1, we then proceed to slide it (with no needed comparisons) all the way to the end.

So removing the * from consideration

1 2 3 4 5 6 7 8

gets wired up as:

5 1 2 3 4 6 7 8
 X  | | |
|  X  | |
| |  X  |
| | |  X

then we put 6 in

6 5 1 2 3 4 6 7 8
 X  | | | |
|  X  | | |
| |  X  | |
| | |  X  |
| | | |  X

Each step we take the new element, stuff it in at the left, then do a chain of swaps. We can take that reversal as unimportant and get this wiring diagram:

1 2 3 4 5 6 7
 X  | | | | |
|  X  | | | | 
 X   X  | | |
|  X   X  | | 
 X   X   X  |
|  X   X   X
 X   X   X  |
|  X   X  | |
 X   X  | | |
|  X  | | | |
 X  | | | | |

which is basically what you get when you do this to a bubble sort. In this algorithm, the inner loops exist diagonally in the above circuit. Meanwhile, it a bubble sort, the inner loops exists horizontally in the above circuit.

In conclusion, I think this is a relativistic variant of bubble sort; it is a bubble sort, with a slanted time-space axis. But the dependency graph for each computation ends up looking almost identical, with some relatively unimportant differences (the find largest element, and the order flipping bits).

Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.