If we have an array of n items, we can "break" it into two arrays, and then sort each of those arrays. Then we can put the two arrays together by merging them. This means taking an item from the two sorted arrays, at any time whichever is smallest and has yet to be taken, and putting it into our final array.
This is recursive, because we have described sorting in terms of sorting. So we can do it by making a recursive call to our original algorithm. With recursion we always need a base case, a case which is solved without recursion. The base case here is when we are sorting an array of length 1. The idea of an array which has only one element in it is a bit strange, but is allowable in Java. Obviously such an array is already in order because there is only one possible order for it.
This sorting algorithm is known as merge sort. Here is a start at designing a Java method to sort an array of integers using merge sort (we will assume, as we did before, that we are sorting arrays of integers, although the same algorithm could be used to sort an array of anything so long as we have a way of ordering the array contents):
public static void sort(int[] a) { if(a.length>1) { Break a into two halves Sort each of the two halves Merge the two halves back into one } }
we will end up with the situation:
Note that we cannot guarantee the two new arrays will be of identical lengths, since if the original array is of odd length, we will have to break it into two arrays, one of length one greater than the other.
Here is some Java code to do this breaking task:
int i,mid = a.length/2; int[] half1 = new int[mid]; int[] half2 = new int[a.length-mid]; for(i=0; i<mid; i++) half1[i]=a[i]; for(; i<a.length; i++) half2[i-mid]=a[i];I haven't expressed this in terms of a separate method, since that would raise the problem of how to get the method to return two arrays. The code fragment here assumes the initial array is called
a
and names the two new arrays half1
and half2
.
Note that because variable i
is declared outside the
for loops, the value it has when the first for loop finishes (which
must be equal to mid
) remains when the second for loop starts.
Actually, for this algorithm it doesn't matter how the items in the original array are distributed into the two new arrays, so long as they are of equal size (with the possibility that one is one longer than the other, as mentioned above), and each item in the original array is found just once in just one of the new arrays. The following would be a perfect acceptable alternative:
int mid = a.length/2; int[] half1 = new int[mid]; int[] half2 = new int[a.length-mid]; for(int i=0, j=0; i<a.length-1; i+=2,j++) { half1[j]=a[i]; half2[j]=a[i+1]; } if(a.length%2==1) half2[mid]=a[a.length-1];which would lead to the two new arrays in the case above being:
This code fragment shows how a for loop can have multiple variable declarations in the first part of its header, and multiple variable updates in the last part.
sort(half1); sort(half2);In the usual way with recursion, we will assume our recursive call(s) work, and if under this assumption the call that made them works, then the whole recursive algorithm must work. So, assuming the breaking of the array into a front and back half as first proposed, after the recursive calls we have the situation:
Our algorithm does sorting in place, as we did with
selection sort, so these two arrays
will be named half1
and half2
, the recursive
calls will have moved their contents around. Since we want sorting in
place, we now have to copy the contents of half1
and
half2
back into the array a
, making sure
that they keep numerical order. So after this, a
will have
the same contents it initially held, but in sorted order.
The merge part of the algorithm involves having three variables
as indexes to the three arrays a
, half1
and
half2
. We will use the names i
, j
and k
respectively for these. At any point in the merging,
i
will index the cell in a
into which the next
integer from either half1
or half2
will be put,
while j
and k
will index the integers which are
being considered for putting into a
. Note that what is in
a
before the merging, in other words the original integers,
is irrelevant at this point and plays no part in the merging. Since
half1
and half2
are in order, j
and k
can both start off at 0 and be increased by 1 each
time they index the lowest number next to be put in a
.
So here is our initial position (the cells in a
are shown blank
because what is in them is irrelevant):
At this point it can be seen that half2[k]
is less
than half1[j]
so half2[k]
is put into
a[i]
and i
and k
are increased
by one, giving:
Now half1[j]
is less than half2[k]
, so
half1[j]
is put into a[i]
, and
i
and j
are increased by one, giving:
The lines across the diagrams of the arrays are added, similar to the way
we added lines across the diagram of the array when demonstrating the
the selection sort example, to
show that i
can be considered as the index dividing a
into the part to which the sorted integers have been written and the
part to which sorted integers are going to be written, while j
and k
divided half1
and half2
into the parts from which integers have alresdy been copied into
a
and the parts from which integers are going to be copied
into a
.
The next step also takes the next integer from half1
, giving:
This demonstrates it is not the case, as might naïvely be assumed,
that integers are taken alternatively from half1
and
half2
.
Now we have a choice between putting 7 and 8 as the next integer into
a
, and 7 is the lowest of these, so that gives us:
Next we put in the 8, giving:
But now we are in the situation where there is no half1[j]
to compare with half2[k]
, because j
has been
increased to the point where it is equal to half1
's length,
so it no longer indexes a cell. At this point, the main merge loop
stops, but an extra loop is needed to copy the remainder of half2
into a
. So the point at which the main merge loop halts is when
either j
is the equal to length of half1
, or
k
is equal to the length of half2
.
The loop invariant in the merge loop is that each integer that occurs in
the array half1
from half1[0]
up to but not
including half1[j]
, and each integer that occurs in
the array half2
from half2[0]
up to but not
including half2[k]
, also occurs in the section of the
array a
from a[0]
up to but not including
a[i]
, and this section of the array a
is
ordered in numerical ascending order.
Code for merge sort
Having gone through the merge part of the algorithm in detail, we can
now write code for it. Putting it with the previous code, gives the
following complete code for sort
to give sorting in
place implemented by the merge sort algorithm:
public static void sort(int[] a) // Sort the contents of array a in ascending numerical order { if(a.length>1) { int i,mid = a.length/2; int[] half1 = new int[mid]; int[] half2 = new int[a.length-mid]; for(i=0; i<mid; i++) half1[i]=a[i]; for(; i<a.length; i++) half2[i-mid]=a[i]; sort(half1); sort(half2); int j=0, k=0; for(i=0; j<half1.length&&k<half2.length; i++) if(half1[j]<half2[k]) { a[i]=half1[j]; j++; } else { a[i]=half2[k]; k++; } for(; j<half1.length; i++, j++) a[i]=half1[j]; for(; k<half2.length; i++, k++) a[i]=half2[k]; } }The code for this method could be put in the place of the code for the
sort
method using selection sort developed
previously to give a complete class
with a main
method to test it, and the program would compile
and appear to run exactly the same. As the method has the same header,
it is called in the same way, and as it sorts the integers in the array in
ascending order in place, it has exactly the same effect when execution
of the method concludes. But in the computer, the sorting is done in
a completely different way. This shows that you can have more than one
algorithm to solve the same problem.
half1=sort(half1)
takes the
place of sort(half1)
for example.
Below is the code for a constructive sort method using the merge sort
algorithm, contained inside a class which includes a main
method so that it can be demonstrated. The main
method
shows that the original array is unchanged by the sorting:
class SortIntArrayRec2 { public static void main(String[] args) throws IOException // Illustrates constructive merge sort to sort an array of integers { int n; int[] array, array1; BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); System.out.print("Enter number of integers : "); n = Integer.parseInt(in.readLine()); array = new int[n]; for(int i=0; i<n; i++) { System.out.print("Enter integer "+i+": "); array[i] = Integer.parseInt(in.readLine()); } System.out.println("The array you entered is:"); System.out.println(arrayToString(array)); array1 = sort(array); System.out.println("After sorting, the array is:"); System.out.println(arrayToString(array)); System.out.println("The new array is:"); System.out.println(arrayToString(array1)); } private static String arrayToString(int[] a) { String str="["; if(a.length>0) { str+=a[0]; for(int i=1; i<a.length; i++) str+="|"+a[i]; } return str+"]"; } public static int[] sort(int[] a) // Sort the contents of array a in ascending numerical order // and return in a new array { if(a.length>1) { int i,mid = a.length/2; int[] half1 = new int[mid]; int[] half2 = new int[a.length-mid]; for(i=0; i<mid; i++) half1[i]=a[i]; for(; i<a.length; i++) half2[i-mid]=a[i]; half1=sort(half1); half2=sort(half2); int j=0, k=0; int[] b = new int[a.length]; for(i=0; j<half1.length&&k<half2.length; i++) if(half1[j]<half2[k]) { b[i]=half1[j]; j++; } else { b[i]=half2[k]; k++; } for(; j<half1.length; i++, j++) b[i]=half1[j]; for(; k<half2.length; i++, k++) b[i]=half2[k]; return b; } else { int[] b = new int[1]; b[0]=a[0]; return b; } } }Actually, this code does some unnecessary copying, but we'll leave off considering that until we have covered binary search.
Last modified: 8 August 2001