Comparing Sorting Algorithms

Sorting large arrays

If you run the programs given for selection sort and merge sort, you will find that once the numbers are typed in, the result of sorting them is given instantly. Don't bother trying them out with a lot of numbers: even with a thousand numbers, the result of sorting them will appear to be delivered instantaneously. Computers work very quickly!

If we want to try out our programs with a lot of data, it's impractical to type it all out. Real applications may process huge amounts of data, but if we don't have any to hand we could make some up with a random number generator. Here is a program that uses the selection sort method we looked at earlier to sort an array of randomly generated integers of a size the user gives in response to a prompt. It makes use of the class Random which is provided as part of the Java library:

import java.io.*;
import java.util.*;

class SortIntArrayIt3
{

 public static void main(String[] args)
 throws IOException
 // Illustrates using selection sort to sort an array of integers.
 // Generates random numbers to sort to allow sorting a large array
 {
  int n;
  int[] array;
  Random rand = new Random();
  BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
  System.out.print("Enter number of integers : ");
  n = Integer.parseInt(in.readLine());
  array = new int[n];
  for(int i=0; i<n; i++)
     array[i] = rand.nextInt();
  sort(array);
  System.out.println("The array is now sorted");
 }

 public static void sort(int[] a)
 // Sort the contents of array a in ascending numerical order
 {
  for(int i=0; i<a.length-1; i++)
     {
      int temp,pos=i;
      for(int j=i+1; j<a.length; j++)
         if(a[j]<a[pos])
            pos=j;
      temp = a[i];
      a[i] = a[pos];
      a[pos] = temp;
     }
 }

}
The program simply prints out a message to tell us when it has finished sorting the array of numbers. If you run it, you will notice that once you start sorting thousands of integers, there is a noticeable pause before that message is printed. With 10000 integers to sort, it is very apparent; with 20000 there is a lengthy delay; and with 100000 it would finish eventually, but you would probably want to go away and do something else while waiting (note this is the case in the year 2001, when these notes are being written; there has been a history of computer hardware improvement making processing faster, so if your computer is quicker when you read this, just try adding a 0 on the end of the above numbers ...).

Now try replacing the selection sort method with the merge sort method. That's what's done in the program below where the main method is exactly the same as the one above, so all that's different is the sort method:

import java.io.*;
import java.util.*;

class SortIntArrayRec3
{

 public static void main(String[] args)
 throws IOException
 // Illustrates using merge sort to sort an array of integers.
 // Generates random numbers to sort to allow sorting a large array
 {
  int n;
  int[] array;
  Random rand = new Random();
  BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
  System.out.print("Enter number of integers : ");
  n = Integer.parseInt(in.readLine());
  array = new int[n];
  for(int i=0; i<n; i++)
     array[i] = rand.nextInt();
  sort(array);
  System.out.println("The array is now sorted");
 }

 public static void sort(int[] a)
 // Sort the contents of array a in ascending numerical order
 {
  if(a.length>1)
     {
      int i,mid = a.length/2;
      int[] half1 = new int[mid];
      int[] half2 = new int[a.length-mid];
      for(i=0; i<mid; i++)
         half1[i]=a[i];
      for(; i<a.length; i++)
         half2[i-mid]=a[i];
      sort(half1);
      sort(half2);
      int j=0, k=0;
      for(i=0; j<half1.length&&k<half2.length; i++)
         if(half1[j]<half2[k])
            {
             a[i]=half1[j];
             j++;
            }
         else
            {
             a[i]=half2[k];
             k++;
            }
      for(; j<half1.length; i++, j++)
         a[i]=half1[j];
      for(; k<half2.length; i++, k++)
         a[i]=half2[k];
     }
 }

}
With this program, you will find that sorting an array of 10000 integers seems to happen instantaneously, that there is only a short delay when sorting 100000 integers, and that you can sort an array of a million integers without it taking a huge amount of time.

Of course there are ways of accurately timing programs, but just watching the differences between these two is enough to illustrate the point - there can be a big difference in the time taken by two different algorithms to solve the same problem. In this example, the difference is consistent and predictable: on large arrays, merge sort always works more quickly than selection sort.
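For the curious, one crude but serviceable way to time a program from within Java is to read the system clock before and after the sort with System.currentTimeMillis(). The sketch below does this, reusing the selection sort method given earlier; the class name TimeSort and the array size of 20000 are just illustrative choices, not anything from the programs above.

```java
import java.util.Random;

class TimeSort
{

 public static void main(String[] args)
 {
  int n = 20000; // arbitrary size, large enough to show a measurable delay
  int[] array = new int[n];
  Random rand = new Random();
  for(int i=0; i<n; i++)
     array[i] = rand.nextInt();
  long start = System.currentTimeMillis();  // clock reading before sorting
  sort(array);
  long elapsed = System.currentTimeMillis() - start;
  System.out.println("Sorting " + n + " integers took " + elapsed + " ms");
 }

 // Selection sort, exactly as in the notes above
 public static void sort(int[] a)
 {
  for(int i=0; i<a.length-1; i++)
     {
      int temp,pos=i;
      for(int j=i+1; j<a.length; j++)
         if(a[j]<a[pos])
            pos=j;
      temp = a[i];
      a[i] = a[pos];
      a[pos] = temp;
     }
 }

}
```

Replacing the sort method with the merge sort version lets you put rough numbers on the difference you observed by eye. Note that millisecond clock readings are coarse, so very fast runs may report 0 ms.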

Analysing algorithms

Do we have to run programs before we can decide that one algorithm works more quickly than another? No - it is possible to analyse algorithms and use reasoning to show that one is quicker than another.

Rather than consider every step in running a program, algorithm analysis considers the significant operations which are the building blocks of an algorithm. The main things we do in sorting an array are comparing two integers to see which is smaller, and copying integers from one array cell to another or to a variable. Supporting operations, such as incrementing an array index or the work required to call a method, are ignored.

If we consider selection sort: to sort an array of n elements, we go through it n-1 times, each time picking out the smallest element in the unsorted part of the array and swapping it with the first element of the unsorted part. Each time we go through it, we reduce the size of the unsorted part of the array by 1. So the first time round we make n-1 comparisons to find the smallest element in the unsorted part, the second time round n-2, and so on down to 1. So in total we make

(n-1)+(n-2)+...+1

comparisons, which is equal to ½×n×(n-1).

We also make 3 assignments each time round to swap the integers, giving an extra 3×(n-1) significant operations, and so a total of

½×n² + (5/2)×n - 3

significant operations to sort an array of n integers.
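We can check this analysis empirically by adding counters to the selection sort method given earlier, one for comparisons and one for the assignments used in swapping. This is just a sketch for checking the arithmetic; for n = 100 it should report 4950 comparisons (½×100×99) and 297 assignments (3×99), which total 5247 = ½×100² + (5/2)×100 - 3.

```java
import java.util.Random;

class CountSelectionSort
{
 static long comparisons = 0, assignments = 0;

 // Selection sort from the notes, with counters added for the
 // significant operations: element comparisons and swap assignments
 public static void sort(int[] a)
 {
  for(int i=0; i<a.length-1; i++)
     {
      int temp,pos=i;
      for(int j=i+1; j<a.length; j++)
         {
          comparisons++;
          if(a[j]<a[pos])
             pos=j;
         }
      temp = a[i];
      a[i] = a[pos];
      a[pos] = temp;
      assignments += 3;
     }
 }

 public static void main(String[] args)
 {
  int n = 100;
  int[] a = new int[n];
  Random rand = new Random();
  for(int i=0; i<n; i++)
     a[i] = rand.nextInt();
  sort(a);
  System.out.println("comparisons = " + comparisons);  // n*(n-1)/2 = 4950
  System.out.println("assignments = " + assignments);  // 3*(n-1) = 297
 }

}
```

Notice that the counts are the same whatever random numbers are generated: selection sort does the same work on every input of a given size.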

Now let us consider analysing merge sort. A call to merge sort requires copying all the numbers twice: first into the two new arrays, and then back again as they are merged. On the next level of recursion each of the numbers is again copied into a new array and back again, since we have two recursive calls, each working on half the original number of integers. Below this we have four recursive calls, each working on a quarter of the numbers. So how many levels of recursive calls are there altogether? It is the number of times we can divide n by 2 until we get to 1, and, as we discussed in the powers example, the answer is log₂n. So a total of 2×n×log₂n integer copies are done. At each level m of recursion, there are 2ᵐ calls to merge arrays, each of which requires at most (n/2ᵐ)-1 comparisons. This results in a total of at most n-2ᵐ comparison operations being done at each level of recursion. Adding the copies and comparisons up over all log₂n levels gives a total of at most

3×n×log₂n - n + 1

significant operations, roughly 3×n×log₂n, when sorting an array of n integers using merge sort.
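As with selection sort, we can instrument the merge sort method given earlier to count its significant operations. The copy count is exact: for n a power of two it is always 2×n×log₂n, whatever the data. The comparison count depends on the data, since the per-level figure n-2ᵐ is a worst case, so the formula above is an upper bound. The class name and the choice of n = 1024 are just illustrative.

```java
import java.util.Random;

class CountMergeSort
{
 static long copies = 0, comparisons = 0;

 // Merge sort from the notes, with counters added for the
 // significant operations: integer copies and comparisons
 public static void sort(int[] a)
 {
  if(a.length>1)
     {
      int i,mid = a.length/2;
      int[] half1 = new int[mid];
      int[] half2 = new int[a.length-mid];
      for(i=0; i<mid; i++)
         { half1[i]=a[i]; copies++; }
      for(; i<a.length; i++)
         { half2[i-mid]=a[i]; copies++; }
      sort(half1);
      sort(half2);
      int j=0, k=0;
      for(i=0; j<half1.length&&k<half2.length; i++)
         {
          comparisons++;
          if(half1[j]<half2[k])
             { a[i]=half1[j]; j++; copies++; }
          else
             { a[i]=half2[k]; k++; copies++; }
         }
      for(; j<half1.length; i++, j++)
         { a[i]=half1[j]; copies++; }
      for(; k<half2.length; i++, k++)
         { a[i]=half2[k]; copies++; }
     }
 }

 public static void main(String[] args)
 {
  int n = 1024;  // a power of two, so log2(n) is exactly 10
  int[] a = new int[n];
  Random rand = new Random();
  for(int i=0; i<n; i++)
     a[i] = rand.nextInt();
  sort(a);
  System.out.println("copies = " + copies);            // exactly 2*n*log2(n) = 20480
  System.out.println("comparisons = " + comparisons);  // at most n*log2(n)-n+1 = 9217
 }

}
```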

The big-O notation

If you put values of n into the above counts of significant operations, you will see that as n gets large, there are far more operations necessary for selection sort than there are for merge sort. That confirms what we observed when we ran the programs.

Note that as n gets large, for any values of a, b and c, the value of the formula a×n² + b×n + c gets increasingly close to a×n². So for large numbers, only the highest power of n is really significant in determining efficiency. Also, no matter what the values of the constants a and d, for a large enough n the value of a×n² will always exceed the value of d×n×log n (we have already noted in the powers example that the base of the log affects its value only by a constant factor).

So when talking about the efficiency of algorithms, we discount all constant factors and factors which become insignificant for large values of n. We say that selection sort is of order n², or O(n²), while merge sort is of order n×log n, or O(n log n).

Because the letter O is used in this notation, it is called the big-O notation. A more formal definition is that an algorithm is O(f(n)) if there are two constants, k and n₀, such that for any n greater than n₀, the algorithm requires no more than k×f(n) time units to solve a problem of size n.
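To see the two growth rates pull apart, we can simply tabulate the two operation counts derived earlier for increasing n. This sketch restricts n to powers of two so that log₂n is a whole number, as the merge sort formula assumes; the class and method names are just illustrative.

```java
class GrowthComparison
{
 // Selection sort count from above: half*n*(n-1) comparisons + 3*(n-1) assignments
 static long selectionOps(long n) { return n*(n-1)/2 + 3*(n-1); }

 // Merge sort bound from above: 3*n*log2(n) - n + 1 (n a power of two)
 static long mergeOps(long n, int log2n) { return 3*n*log2n - n + 1; }

 public static void main(String[] args)
 {
  System.out.println("      n    selection        merge");
  long n = 2;
  for(int log2n = 1; log2n <= 20; log2n++, n *= 2)
     System.out.printf("%7d  %11d  %11d%n",
                       n, selectionOps(n), mergeOps(n, log2n));
 }
}
```

Selection sort's count is actually smaller for the very smallest arrays, but by n = 32 merge sort's count is already lower, and the gap then widens rapidly, which is exactly the behaviour the big-O notation captures by discarding constant factors.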


Matthew Huntbach

Last modified: 23 November 2001