Sorting

Assignment :

Read Chpt. 14.3, Study sort1.f, sort2.f, and sort3.f

New Fortran:

CSHIFT and EOSHIFT intrinsic functions

Now that you have most of the basics of Fortran down, its time to get more serious about programming applications and you can't have a Computer Science class without talking about sorting methods. In this specific discipline, sorting algorithms are very valuable for Compilers and Operating Systems and Systems Utilities. Usage is much lower in other areas of Engineering and Science, but at some time or another you are likely to need to do some sorting of information.

Sorting has been done for over 40 years by hundreds of programmers with much more skill in that area than you or I could develop in any reasonable time. So the best first reaction to dealing with a sorting problem is to look for a public domain subroutine that will do the job. The first place to look for Subprograms that do Scientific, Mathematical, and Engineering tasks is Netlib. Using their search engine, I was able to locate a subroutine called ssort, that applies a modified Singleton Quicksort algorithm. What's that? Check the reference listed in the subroutine if you are interested (adding references in your comments is a very good programming practice). However, if I need a quick fix to a problem, it doesn't matter, as long as I know what it to do with its arguments, to plug it into a problem. For large poorly ordered array's, this subroutine should be significantly faster than the simpler methods that I will introduce here. Take a look at the way ssort changes the order in which it sorts.

We'll be using Netlib again later. If you remember no other single term exactly from this course, remember NETLIB. It will save you a tremendous amount of programming effort on a wide range of applications, both in later courses and in your professional life.

Selection Sort

The Selection Sort is probably the simplest available, and worth remembering for those emergencies when you must patch something together quickly. For a basic example, assume that we have an array "x" containing "n" elements, that we want to sort from high to low by value. Find the element with the maximum value (maxloc function is a good choice for this job), and swap the contents of this element with those of the first element. Now that the first element contains the largest value in the array, scan the remaining elements (2 through "n") for the element with the largest value. Switch the contents of that element with the contents of the second element in the array. Move on to repeat this sequence of search and swap operations, until you are just checking and possibly swapping just the last two elements of the array. To sort in increasing order, use the same process with a minimum check.

Here is a simple example with brackets marking numbers swapped during a pass.

```Beginning Array :   2  3  1  7  9  5
First Pass      :  [9] 3  1  7 [2] 5
Second Pass     :   9 [7] 1 [3] 2  5
Third Pass      :   9  7 [5] 3  2 [1]
```
Two more checks on maximum value follow, but no more swaps result.

The total number of arithmetic operations in a selection sort is always the same for a given array size. For an array with "n" elements the number of computer operations (hence computer time) is proportional to n**2.

The subroutine contained in sort1.f is my implementation of this algorithm. Note the sequence of lines necessary to swap the contents of the two elements within the array. Also notice the second array that I am swapping at the same time. When we run some sample problems, you will see that this can be used to keep track of the original index for an element in the sorted array. If I load the array called "iy" with integers equal to the element indexes, then after sorting a value of 2 in element 5 of "iy" would say that the contents in element 5 of the sorted "x" array were originally in element 2 of "x".

Bubble Sort

In a bubble sort the first two elements are examined and swapped if the second is greater than the first (assuming sort from high to low value). Next the second and third elements are examined and swapped if the third is greater than the second. Note that at this point the value in the second element could have originally been the value in the first element. This proceeds until each consecutive pair has been considered and if necessary, swapped. The end result is that the smallest value in the array has "bubbled" to the last element in the array. Another pass of these swaps follows, but this time the last element in the array need not be considered. These passes continue until either no swaps occur during an entire pass through the array or the pass consists of simply swapping elements one and two.

The total number of arithmetic operations in a bubble sort may be of the order n**2 for an array with "n" elements. In a high to low sort, the worst case would occur when the highest value started in the last element of the array. However, if the array is nearly in order, the bubble sort may complete relatively quickly ( a few times "n" operations).

The following sequence of steps on a small array illustrates a bubble sort. A subroutine to do the job has been provided in sort2.f .

Pass 1

```Base array :   1  5  4  2  3
Swap       :  [5  1] 4  2  3
Swap       :   5 [4  1] 2  3
Swap       :   5  4 [2  1] 3
Swap       :   5  4  2 [3  1]
```
Pass 2

```No Swap :  [5  4] 2  3  1
No Swap :   5 [4  2] 3  1
Swap    :   5  4 [3  2] 1
```
Pass is done because we know the last element is in the right position

Pass 3

```No Swap   :  [5  4] 3  2  1
No Swap   :   5 [4  3] 2  1
```
Pass is done because we know the last two elements are in the right positions

No swaps in pass 3, sort completed.

Insertion Sort

The first step in an insertion sort is to put the first two elements in proper order. For a high to low sort, this involves swapping the first two if the second element has a higher value. Next, the third element is checked. If it is less than the second element, it is left in place. If not, the elements before it are checked for the first one that has a lower value. The value of the third element is inserted before this lower valued one and values of elements are shifted up in the array to produce an expanded ordered list. This pattern of checking for order, and shifting values as needed continues until all elements in the array are checked.

The total number of arithmetic operations in an insertion sort may be of the order n**2 for an array with "n" elements. In a high to low sort, the worst case would occur when the array is originally perfectly ordered from low to high value. However, if the array is nearly in order, the insertion sort will complete relatively quickly ( a few times "n" operations) because few insertion shifts are required.

Examples of the insertion sort are provided in sort3.f and sort3a.f. The latter uses a new Fortran 90 intrinsic function called CSHIFT to handle the shifting of the array elements. The "C" stands for "circular", meaning that any elements that are shifted off of one end of the array are reinserted at the other end of the array in space vacated by other shifted elements. If I had wanted any elements shifted off of the end of the array to vanish (and perhaps specify special treatment for filling vacated elements at the other end), I would have used the EOSHIFT intrinsic function.

Here is a simple example of an insertion sort. In each step square brackets surround the portion of the array that is shifted, and parentheses surround the value that is inserted.

```Base array :   1  3  4  2  5
Shift      :  (3)[1] 4  2  5
Shift      :  (4)[3  1] 2  5
Shift      :   4  3 (2)[1] 5
Shift      :  (5)[4  2  3  1]
```

Using Isolated Subroutines in Unix

I've provided you with a bunch of files containing single subroutines. Here's how you use them. First you need a file with a main program that calls the subroutine "ssort" (or a subroutine or function calling "ssort"). For this sample I've done that with a file named drvsort.f. To compile both the main program and the version of "ssort" contained in sort1.f, simply type the line:

f77 drvsort.f sort1.f

The resulting executable file (a.out) will be the same as if the contents of drvsort.f and sort1.f had been combined in a single file and that file compiled.

I take advantage of this feature with the shell script runit, existing in the same subdirectory (sort) as all of my sample sorting subroutines. It uses a csh "foreach" loop to compile and run drvsort.f with each of the various methods of sorting, verifying that all subroutines produce the same correct results.

It is actually standard practice in a Unix environment (regardless of programming language), to place each program unit (main program, subroutine, function) in its own file. The files are all stored in a subdirectory set aside for that particular program's source code. Normally the names of files match the names of the program units, although the main program is simply named "main.f". One quick way to compile the full program is to change to the program's directory, and type:

f77 *.f

As you become a more frequent user of programming under Unix (regardless of your choice in programming language) you should take the time to master the use of the "make" utility. This will make the creation of an executable program much simpler for you.