Why isn't there a bucket sort library (or is there?)

Question:

I’ve been studying algorithms and just came across bucket sort. Although it can only be used in a few cases, it looks too efficient not to be implemented in a standard library, since it can sort a list in O(n) time. So my question is: why isn’t there a standard library that supports bucket sort, or any other counting-sort-like algorithm such as radix sort, in most languages? I’ve checked the Java, Python, and C++ libraries, and they don’t seem to support any sorting algorithm other than comparison-based ones.

Although such an algorithm requires the list to contain integers in a specific range, it doesn’t seem impossible to implement. Java, for example, could have an interface similar to Comparator that returns an integer in a given range, which would be used as the index for sorting. So what is the reason O(n) sorting algorithms are not used? Or is there a library that actually uses bucket sort that I just missed? Sorry if this is a silly question; I just thought there must be a reason O(n) algorithms go unused.
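The interface idea above can be sketched roughly as follows. This is only an illustration, not an existing JDK API: the class KeyIndexedSort and the method sortByKey are hypothetical names, and the standard ToIntFunction stands in for the "Comparator-like" interface that maps each element to an integer in a known range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper for illustration; nothing like this ships with the JDK.
final class KeyIndexedSort {
    /**
     * Stable counting sort. keyOf must return a value in [0, keyRange)
     * for every element. Takes O(n + keyRange) time and extra memory.
     */
    static <T> List<T> sortByKey(List<T> items, ToIntFunction<? super T> keyOf, int keyRange) {
        int[] start = new int[keyRange + 1];

        // Histogram: start[k + 1] counts the elements whose key is k.
        for (T item : items) {
            int k = keyOf.applyAsInt(item);
            if (k < 0 || k >= keyRange) {
                throw new IllegalArgumentException("key out of range: " + k);
            }
            start[k + 1]++;
        }

        // Prefix sums: start[k] becomes the first output index for key k.
        for (int k = 0; k < keyRange; k++) {
            start[k + 1] += start[k];
        }

        // Scatter in input order, so elements with equal keys keep their order.
        @SuppressWarnings("unchecked")
        T[] out = (T[]) new Object[items.size()];
        for (T item : items) {
            out[start[keyOf.applyAsInt(item)]++] = item;
        }
        return new ArrayList<>(Arrays.asList(out));
    }
}
```

A call might look like KeyIndexedSort.sortByKey(words, String::length, 4) for a list of strings that are all shorter than four characters. Note that the cost depends on keyRange as well as n, which is one reason this only pays off when the key range is small relative to the input.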

Asked By: cobaltblue


Answers:

Java has a static method Arrays.sort that could in principle be implemented as a radix sort for the overloads that accept integer types (https://docs.oracle.com/javase/10/docs/api/java/util/Arrays.html#sort(int[])). They chose to implement it with quicksort. There’s no justification given, but I imagine (1) radix sort requires additional memory, whereas quicksort doesn’t, and (2) the difference between n log n and n is blunted by the fact that quicksort has good cache locality, whereas radix sort has less.
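To make the memory point concrete, here is a minimal sketch of a least-significant-digit radix sort for int[]. It is not the JDK's code; IntRadixSort is a hypothetical name, and the choice of one byte per pass is an assumption made for brevity. The buffer array is the O(n) extra memory mentioned above, and the scatter step writes to 256 different regions of the destination, which is where the weaker cache behavior comes from.

```java
// Hypothetical illustration of LSD radix sort; not taken from the JDK.
final class IntRadixSort {
    /** Sorts signed ints in ascending order using four byte-wide passes. */
    static void sort(int[] a) {
        int[] buffer = new int[a.length]; // the extra O(n) memory
        int[] src = a;
        int[] dst = buffer;

        for (int shift = 0; shift < 32; shift += 8) {
            // On the top byte, flip the sign bit so negatives sort first.
            int flip = (shift == 24) ? 0x80 : 0;
            int[] start = new int[257];

            // Histogram of the current byte.
            for (int x : src) {
                start[(((x >>> shift) & 0xFF) ^ flip) + 1]++;
            }
            // Prefix sums give the first output index of each bucket.
            for (int b = 0; b < 256; b++) {
                start[b + 1] += start[b];
            }
            // Stable scatter into the other array.
            for (int x : src) {
                dst[start[((x >>> shift) & 0xFF) ^ flip]++] = x;
            }

            int[] t = src;
            src = dst;
            dst = t;
        }
        // Four passes is an even number of swaps, so the result ends up in a.
    }
}
```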

Answered By: David Eisenstat

Definitely an omission. JGraphT has an implementation, but only for Integer arrays. If you want to sort an array of objects of any user-defined class, you will need to go through the key one bit at a time. Thus the class will have to provide a method like getIthKeyBit(). I’m surprised that this was ignored.
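A bit-at-a-time sort over user-defined objects could look roughly like the sketch below. Everything here is hypothetical: the BitKeyed interface and its exact signature are assumptions built around the getIthKeyBit() name above, Ticket is a made-up example class, and the algorithm shown is a binary most-significant-bit radix sort (in place, not stable) for unsigned keys of a known bit width.

```java
// Hypothetical interface; the signature is an assumption for illustration.
interface BitKeyed {
    /** Returns bit i of the key, where bit 0 is the least significant. */
    boolean getIthKeyBit(int i);
}

// Made-up example type with a small non-negative integer key.
final class Ticket implements BitKeyed {
    final int key;

    Ticket(int key) {
        this.key = key;
    }

    @Override
    public boolean getIthKeyBit(int i) {
        return ((key >>> i) & 1) == 1;
    }
}

final class BitRadixSort {
    /** Sorts by unsigned key, examining bits keyBits-1 down to 0. */
    static <T extends BitKeyed> void sort(T[] a, int keyBits) {
        sort(a, 0, a.length, keyBits - 1);
    }

    private static <T extends BitKeyed> void sort(T[] a, int lo, int hi, int bit) {
        if (hi - lo < 2 || bit < 0) {
            return;
        }
        // Partition [lo, hi): elements with a 0 bit first, then those with a 1 bit.
        int i = lo;
        int j = hi - 1;
        while (i <= j) {
            if (!a[i].getIthKeyBit(bit)) {
                i++;
            } else {
                T t = a[i];
                a[i] = a[j];
                a[j] = t;
                j--;
            }
        }
        // Recurse on each half using the next less significant bit.
        sort(a, lo, i, bit - 1);
        sort(a, i, hi, bit - 1);
    }
}
```

The recursion depth is bounded by keyBits, and the total work is proportional to n times keyBits, so this only beats a comparison sort when the keys are short.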

Answered By: sarnath

Java

Arrays.sort(byte[]), Arrays.sort(short[]) and Arrays.sort(char[])

At least since Java 7, these overloads have been implemented using an in-place counting sort with O(n) time and O(1) space complexity (for arrays above a size threshold, as the code shows):

static void sort(short[] a, int low, int high) {
  if (high - low > MIN_SHORT_OR_CHAR_COUNTING_SORT_SIZE) {
    countingSort(a, low, high);
  } else {
    sort(a, 0, low, high);
  }
}

private static void countingSort(short[] a, int low, int high) {
  int[] count = new int[NUM_SHORT_VALUES];

  /*
   * Compute a histogram with the number of each values.
   */
  for (int i = high; i > low; ++count[a[--i] & 0xFFFF]);

  /*
   * Place values on their final positions.
   */
  if (high - low > NUM_SHORT_VALUES) {
    for (int i = MAX_SHORT_INDEX; --i > Short.MAX_VALUE; ) {
      int value = i & 0xFFFF;

      for (low = high - count[value]; high > low;
           a[--high] = (short) value
      );
    }
  } else {
    for (int i = MAX_SHORT_INDEX; high > low; ) {
      while (count[--i & 0xFFFF] == 0);

      int value = i & 0xFFFF;
      int c = count[value];

      do {
        a[--high] = (short) value;
      } while (--c > 0);
    }
  }
}

Counting sort implementation

Arrays.sort(int[]), Arrays.sort(long[]), Arrays.sort(float[]) and Arrays.sort(double[])

There is a ticket JDK-8266431 and a related pull request https://github.com/openjdk/jdk/pull/3938 that intend to switch to radix sort for large arrays.

Efficiency of radix sort

For fixed-width integer keys, radix sort is faster than quicksort or any other comparison-based sorting algorithm, both in theory and in practice. There are multiple benchmarks that demonstrate this:

Radix sort is three times faster than quicksort in Java
Radix sort is 4-6 times faster than quicksort in C++

https://erik.gorset.no/2011/04/radix-sort-is-faster-than-quicksort.html
https://probablydance.com/2016/12/02/investigating-radix-sort/

Answered By: Denis Stafichuk