Sorting Data Structure

Arranging the data in ascending or descending order is known as sorting. Sorting is very important from the point of view of our practical life.

The best example of sorting can be phone numbers in our phones. If, they are not maintained in an alphabetical order we would not be able to search any number effectively.

Why is Sorting Important?

  • Efficient Searching: Searching algorithms like Binary Search require sorted data.
  • Better Readability: Sorted data is easier to understand and use.
  • Data Optimization: Used in databases, file systems, and memory management.
  • Improves Algorithm Performance: Many algorithms assume or perform better with sorted input.

Types of Sorting

Sorting algorithms can be broadly classified into Internal Sorting and External Sorting based on where the data to be sorted is stored during the sort operation.

Internal Sorting

Internal sorting is performed entirely in the main memory (RAM). All the data that needs to be sorted must fit into memory.

  • Fast and efficient for small to medium-sized datasets.
  • Relies on in-memory data structures like arrays and lists.

Common Internal Sorting Algorithms

  1. Bubble Sort
  2. Insertion Sort
  3. Selection Sort
  4. Merge Sort (in-memory)
  5. Quick Sort
  6. Heap Sort

Example: Sorting a list of 10,000 student records in a program that has enough RAM to store all of them.

Advantages:

  • Faster due to high-speed memory access.
  • Simpler implementation.

Disadvantages

  • Not suitable for very large datasets that exceed RAM.

External Sorting

External sorting is used when the data is too large to fit into main memory and must be stored in external memory (like hard disks or SSDs) during sorting.

  • Designed for handling massive data sets (e.g., terabytes of data).
  • Involves reading/writing to disk in chunks (blocks or pages).
  • Sorting is done in phases, often using merge-based techniques.

Common External Sorting Algorithm:

  1. External Merge Sort (also known as Multi-way Merge Sort)

How It Works (Example: External Merge Sort):

  • Divide the data into chunks that fit into RAM.
  • Sort each chunk using an internal sorting algorithm.
  • Store the sorted chunks (runs) on disk.
  • Merge these sorted runs into a single sorted file using a k-way merge algorithm.

Example: Sorting a 100 GB log file on a machine with only 8 GB of RAM.

Advantages

  • Can handle huge data sets beyond RAM size.
  • Efficient when combined with optimized disk I/O strategies.

Disadvantages

  • Slower due to disk access times.
  • More complex implementation.

Comparison Table

FeatureInternal SortingExternal Sorting
Memory UsageUses RAM onlyUses disk and RAM
Dataset SizeSmall to mediumVery large
SpeedFasterSlower due to I/O
Algorithms UsedQuick, Merge, HeapExternal Merge Sort
ComplexitySimplerMore complex (I/O handling)
Use CaseIn-memory applicationsDatabases, Big Data

Application of Sorting

  1. The sorting is useful in database applications for arranging the data in desired ORDER.
  2. In the dictionary like applications the data is arranged in sorted order.
  3. For searching the element from the list of elements the sorting is required
  4. For checking the uniqueness of the element the sorting is required.
  5. For finding the closest pair from the list of elements the sorting is required.

Sorting Category

Here's a comprehensive list of sorting algorithms, categorized by technique and characteristics.

I. Comparison-Based Sorting Algorithms

These algorithms determine the order of elements based on comparisons (e.g., a > b, a < b).

A. Simple (Elementary) Sorting Algorithms

  1. Bubble Sort
  2. Selection Sort
  3. Insertion Sort
  4. Gnome Sort
  5. Cocktail Shaker Sort (Bidirectional Bubble Sort)
  6. Comb Sort
  7. Odd-Even Sort (Brick Sort)

B. Efficient Comparison-Based Sorting Algorithms

  1. Merge Sort
  2. Quick Sort
  3. Heap Sort
  4. Shell Sort
  5. Tim Sort (Hybrid of Merge + Insertion Sort used in Python, Java)
  6. Intro Sort (Hybrid of Quick + Heap + Insertion, used in C++ STL)
  7. Tree Sort (Using Binary Search Tree)
  8. Patience Sort

II. Non-Comparison-Based Sorting Algorithms

These algorithms do not compare elements directly and usually work on integers or keys.

  1. Counting Sort
  2. Radix Sort
  3. Bucket Sort
  4. Pigeonhole Sort
  5. Flash Sort
  6. Postman Sort

III. Hybrid Sorting Algorithms

These combine multiple techniques to optimize performance.

  1. Tim Sort (Merge + Insertion)
  2. Intro Sort (Quick + Heap + Insertion)
  3. Block Sort
  4. Library Sort (Gapped Insertion Sort)

IV. Parallel and External Sorting Algorithms

Used for large datasets or parallel/distributed systems.

A. Parallel Sorting

  1. Bitonic Sort
  2. Odd-Even Merge Sort
  3. Parallel Merge Sort
  4. Sample Sort

B. External Sorting

  1. External Merge Sort
  2. Polyphase Merge Sort
  3. Balanced Multi-way Merge Sort

V. Specialized and Theoretical Sorting Algorithms

  1. Bead Sort (natural sorting, not efficient in practice)
  2. Pancake Sort (used in theoretical computer science)
  3. Stooge Sort (inefficient, educational)
  4. Bogo Sort (randomized, inefficient, educational)
  5. Sleep Sort (joke algorithm based on timing)
  6. Cycle Sort (in-place and minimal memory writes)

Popular Sorting Algorithms

  1. Bubble sort
  2. Insertion sort
  3. Radix Sort
  4. Quick sort
  5. Merge sort
  6. Heap sort
  7. Selection sort
  8. Shell Sort

Sorting deals with sorting the data stored in the memory, whereas external sorting deals with sorting the data stored in files.

In bubble sorting, consecutive adjacent pairs of elements in the array are compared with each other.

Insertion sort works by moving the current data element past the already sorted values and repeatedly interchanging it with the preceding value until it is in the correct place.

Selection sort works by finding the smallest value and placing it in the first position. It then finds the second smallest value and places it in the second position. This procedure is repeated until the whole array is sorted.

Merge sort is a sorting algorithm that uses the divide, conquer, and combine algorithmic paradigm. Divide means partitioning the n-element array to be sorted into two sub-arrays of n/2 elements in each sub-array.

Quick sort works by using a divide-and-conquer strategy. It selects a pivot element and rearranges the elements in such a way that all elements less than pivot appear before it and all elements greater than pivot appear after it.

Radix sort is a linear sorting algorithm that uses the concept of sorting names in alphabetical order.

Heap sort sorts an array in two phases. In the first phase, it builds a heap of the given array. In the second phase, the root element is deleted repeatedly and inserted into an array.

Shell sort is considered as an improvement over insertion sort, as it compares elements separated by a gap of several positions.


Time Complexity of all Sorting Techniques

Time complexity of various sorting algorithms.

S.NoAlgorithmWorst CaseAverage CaseBest Case
1Selection SortO (n 2)O (n 2)O (n 2)
2Bubble SortO (n 2)O (n 2)O (n 2)
3Insertion SortO (n 2)O (n 2)n – 1
4Quick SortO (n 2)log2 nlog2 n
5Heap SortO (n log n)O (n log n)O (n log n)
6Merge SortO (n log n)O (n log n)O (n log n)
7Radix SortO (n 2)O (n log n)O (n log n)
8Shell sortO(n log2n)--