Bucket Sort is a non-comparison-based sorting algorithm. Its core idea is to distribute the elements to be sorted into several “buckets”, sort the elements within each bucket locally, and then merge the elements from all buckets in order to obtain the final sorted array. This sorting algorithm is particularly suitable for scenarios where the data distribution is relatively uniform, such as when the data consists of integers with a small range, or when the data can be mapped to a smaller integer range. In the ideal case (uniform data distribution and a small number of elements in each bucket), its time complexity can reach O(n), and the space complexity is O(n).

Steps of Bucket Sort

  1. Determine the Number and Range of Buckets: First, clarify the range of the data to be sorted. For example, if the data consists of integers with a maximum value of max, the number of buckets can be set to max + 1, where each bucket corresponds to an integer range (from 0 to max). For floating-point data, a mapping relationship can be used to convert it into integer buckets (e.g., multiplying by a multiple to convert to an integer).

  2. Create Buckets: Based on the number of buckets, create a container (e.g., an array of ArrayLists), where each container corresponds to a bucket and is used to store elements within that bucket’s range.

  3. Distribute Elements to Buckets: Traverse the array to be sorted and place each element into the corresponding bucket based on its value. For example, the bucket index for an element num is num (since bucket index 0 corresponds to element 0, index 1 corresponds to element 1, etc.).

  4. Sort Elements Within Each Bucket: Sort the elements in each bucket. Since the number of elements within each bucket is usually small, a simple sorting algorithm (e.g., Insertion Sort, Bubble Sort) can be used, or the Collections.sort() method provided by Java can be directly called.

  5. Merge the Results of Buckets: Traverse each bucket in order and add the elements within each bucket to the result array sequentially, resulting in the final sorted array.

Example Demonstration

Suppose we want to sort the array [3, 1, 4, 1, 5, 9, 2, 6, 5]. The specific steps are as follows:

  1. Determine the Bucket Range: The elements in the array are integers from 0 to 9, so the number of buckets is 9 + 1 = 10 (indices 0 to 9).

  2. Create Buckets: Create an ArrayList array of length 10, where each element is an ArrayList used to store elements within the corresponding range.

  3. Distribute Elements to Buckets: Traverse the array and place each element into the corresponding index bucket:
    - Element 3 → Bucket 3 → [3]
    - Element 1 → Bucket 1 → [1]
    - Element 4 → Bucket 4 → [4]
    - Element 1 → Bucket 1 → [1, 1]
    - Element 5 → Bucket 5 → [5]
    - Element 9 → Bucket 9 → [9]
    - Element 2 → Bucket 2 → [2]
    - Element 6 → Bucket 6 → [6]
    - Element 5 → Bucket 5 → [5, 5]

  4. Sort Elements Within Each Bucket: Sort the elements in each bucket (no changes occur here since the number of elements is small):
    - Bucket 1: [1, 1] (already sorted)
    - Bucket 5: [5, 5] (already sorted)
    - Other buckets (e.g., Buckets 0, 2, 3, 4, 6, 9) contain only one element and do not require sorting.

  5. Merge Results: Traverse the buckets in order (0 to 9) and collect elements to form the sorted array: [1, 1, 2, 3, 4, 5, 5, 6, 9].

Java Code Implementation

import java.util.ArrayList;
import java.util.Collections;

public class BucketSort {

    public static void bucketSort(int[] arr) {
        // Handle boundary cases: empty array or single-element array needs no sorting
        if (arr == null || arr.length <= 1) {
            return;
        }

        // Step 1: Determine the range of buckets (assuming non-negative integers with max value 'max')
        int max = arr[0];
        for (int num : arr) {
            if (num > max) {
                max = num;
            }
        }

        // Step 2: Create buckets (number of buckets = max + 1, each bucket corresponds to a range 0~max)
        ArrayList<Integer>[] buckets = new ArrayList[max + 1];
        for (int i = 0; i <= max; i++) {
            buckets[i] = new ArrayList<>();
        }

        // Step 3: Distribute elements into corresponding buckets
        for (int num : arr) {
            int bucketIndex = num; // The bucket index for element 'num' is 'num'
            buckets[bucketIndex].add(num);
        }

        // Step 4: Sort elements within each bucket
        for (ArrayList<Integer> bucket : buckets) {
            Collections.sort(bucket); // Use Collections.sort for simplified sorting
        }

        // Step 5: Merge elements from all buckets into the original array to get the sorted result
        int index = 0; // Index for the original array
        for (ArrayList<Integer> bucket : buckets) {
            for (int num : bucket) { // Traverse elements within the bucket
                arr[index++] = num; // Overwrite the original array with elements in order
            }
        }
    }

    // Test code
    public static void main(String[] args) {
        int[] arr = {3, 1, 4, 1, 5, 9, 2, 6, 5};
        System.out.println("Before sorting: " + java.util.Arrays.toString(arr));
        bucketSort(arr);
        System.out.println("After sorting: " + java.util.Arrays.toString(arr));
    }
}

Code Explanation

  • Boundary Handling: If the array is empty or has only one element, return immediately without sorting.
  • Determine Bucket Range: Traverse the array to find the maximum value max, so the number of buckets is max + 1, each corresponding to an integer range (0 to max).
  • Create Buckets: Use an array of ArrayLists to represent buckets, where each ArrayList stores elements within the corresponding range.
  • Distribute Elements: Traverse the original array and place each element into the bucket with the index equal to its value (i.e., element num goes to bucket num).
  • Sort Within Buckets: Sort each bucket’s ArrayList using Collections.sort() to ensure elements within the bucket are ordered.
  • Merge Results: Traverse each bucket in order (0 to max), collect elements, and overwrite the original array to obtain the final sorted result.

Advantages and Disadvantages of Bucket Sort

  • Advantages: When the data is uniformly distributed and the range is small, the time complexity is close to O(n), making it highly efficient. Each element only needs one distribution and one merge operation, and the sorting cost within buckets is low.
  • Disadvantages: Requires additional space to store buckets. When the data range is large (e.g., elements from 0 to 10^9), the number of buckets becomes excessive, leading to space waste. If the data distribution is uneven (e.g., most elements are concentrated in a few buckets), the sorting time within buckets may increase, potentially degrading to O(n^2).

Bucket Sort is suitable for scenarios where the data is uniformly distributed and the range is controllable. By reasonably designing the number and range of buckets, the sorting efficiency can be significantly improved.

Xiaoye