# More on Arrays

### New Fortran:

Fortran 90 array arithmetic assignment statements, Array Initialization, Implied DO, Fortran 90 Intrinsics: MAXLOC, MAXVAL, MINLOC, MINVAL, SUM, COUNT WHERE, ELSE WHERE Structure

Once you get into serious use of arrays in engineering and scientific applications, it doesn't take much effort to stretch the abilities of your computer. It's just a question of increasing the resolution of your view of the real world. You begin to worry about machine storage, and machine speed, and try to remember some of the things you once heard about pipelined arithmetic units, vector, and parallel processing. Part of the motivation behind Fortran 90, was to make efficient utilization of such machine features more transparent. This was accomplished through some simple language extensions, and a substantial increase in the set of available intrinsic functions. The details of many time saving features are hidden in the intrinsic functions, where you don't need to worry about them. If you think about vector pipelines in computers, you should realize that simple DO loops operating with array elements are great ways to keep the pipelines fed. The following segment of programming will do a very good job of pumping a stream of numbers into a pipelined multiplier.

```      parameter (n=20)
real a(n),b(n),c(n)
sum=0
do 90 i=1,n
c(i)=a(i)*b(i)
90   continue
```
However, the addition of a few bells and whistles to the DO loop, can make it very difficult to impossible for a compiler to take advantage of pipeline (or parallel) processing. For example, after generating values for elements of "c", I might need to know the maximum value, minimum value, and average value of the elements in "c". A fairly standard algorithm for generating these values within a DO loop is:

```      integer nd,i
parameter (nd=10)
real a(nd),b(nd),c(nd),csum,cmax,cmin,cavg
csum=0.
cmax=-1.e38
cmin=1.e38
do 100 i=1,nd
c(i)=a(i)*b(i)
cmin=min(c(i),cmin)
cmax=max(c(i),cmax)
100    csum=csum+c(i)
cavg = csum/nd
```
Note that a sum is generated by initializing "csum" to zero before the loop, and adding each element of "c" to "csum" as the loop progresses. A maximum value is obtained by initializing "cmax" to such an extremely negative number, that any element of "c" will exceed the first value of "cmax" as the "max" tests proceed in the loop. A similar action is taken starting with a very large value for "cmin" to obtain the minimum of all elements in "c".

Because the generation of "cmin", "cmax", and "csum" depend on values obtained in the last pass through the loop, these operations are not directly appropriate to pipelined, or parallel processing. As a result a compiler will generally give up trying to do anything to "vectorize" the DO loop.

Fortran 90 encourages you to maintain potential for higher computer speed, by introducing a shorthand for the portion of the loop doing the multiplication, and providing intrinsic functions to cover the other operations.

```      integer nd
parameter (nd=10)
real a(nd),b(nd),c(nd),cmax,cmin,cavg
c=a*b
cavg = sum(c)/nd
cmin=minval(c)
cmax=maxval(c)
```
For long programs I prefer a slightly longer form of this notation:

```      c(1:nd)=a(1:nd)*b(1:nd)
cavg = sum(c(1:nd))/nd
cmin=minval(c(1:nd))
cmax=maxval(c(1:nd))
```
This makes it clear to anyone reading your program that you really are operating with arrays, and clearly specifies the range over which the operations occur.

This notation can be extended to include a skip factor (or stride), behaving like the index increment in a DO statement. For example the sum of the odd numbered elements in "c" could be obtained with:

```            csodd = sum(c(1:nd:2))
```
The sum of even elements would be:

```            cseven = sum(c(2:nd:2))
```
For arrays of any significant size it is unlikely that the compiler will generate more efficient machine code from the DO loop construct, than it will using the Fortran 90 intrinsic functions. Hence, it is worth remembering:

SUM - A Fortran 90 intrinsic function to sum the elements of the array provided as its argument. If only the name of the array is provided (e.g. sum(c)) all elements of the array are summed. A selected portion of the array may be summed, if an integer range expression is provided with the array name (.e.g. sum(c(2:10)), sums elements 2 through 10 of c). This function can take two other optional arguments, that will be covered in more detail, when we get to multi-dimensional arrays. It is worth noting here that the "MASK=" option lets you further restrict the elements that you will include in the sum. This is included as an argument with a logical expression involving an array with the same size as the array being summed. For example:
```      cpsum = sum(c,mask=c.gt.0)
```
evaluates the sum of all elements in "c" that are positive, and
```      cnsum = sum(c,mask=c.lt.0)
```
evaluates the sum of all elements in "c" that are negative. The statement
```      cbsum = sum(c,mask=b.gt.8)
```
evaluates the sum of all elements in "c" for which corresponding elements in "b" are greater than 8 (see sums.f for a test program).

MINVAL - A Fortran 90 function to obtain the minimum value of all elements contained in the array provided as the argument to the function. The same optional arguments provided in SUM are available here.

MINLOC - MINVAL gives you the minimum value, but doesn't tell you which element has that value. This Fortran 90 function returns the index of the element in the array, whose value is minimum. If c(1)=1, c(2)=0, and c(3)= -1, then "minloc(c(1:3))" returns the integer "3". Due to requirements for use with multidimensional arrays, the value of this function must be returned to an array. For example "icmin=minloc(c(1:3))" will only work if "icmin" has been declared to be an integer array, and the value "3" will be placed in the first element of "icmin". The MASK, optional argument is available for this function.

MAXVAL - A Fortran 90 function to obtain the maximum value of all elements contained in the array provided as the argument to the function. The same optional arguments provided in SUM, and MINVAL are available here.

MAXLOC - This Fortran 90 function returns the index of the element in the array, whose value is maximum. If c(1)=0, c(2)=1, and c(3)= -1, then "maxloc(c(1:3))" returns the integer "2". The MASK, optional argument is available for this function.
When using the MASK argument in the SUM function, it is often useful to know the number of elements of the array for which the MASK expression is true. This is provided by the COUNT intrinsic function. If array x has 10 values { -1 , -0.5 , 0, 0.5, 1., 1.5, 2.0, 2.5, 3.0, 3.5}, then "count(x .lt. 0.75)" produces the INTEGER result "4".

The MASK argument is very general, and need not include a reference to the array being operated on by the intrinsic function, it just needs to refer to an array with the same length. For example I might have arrays containing information about "n" people with "age" containing ages, and "weight" containing the corresponding weights. The average weight of those from 30 through 39 years old can be obtained from the following statement:

```      avweight= sum(weight(1:n), mask = age.ge.30.and.age.lt.40)/
&       count(age.ge.30.and.age.lt.40)
```
One other general feature of these array intrinsic functions is that the argument need not be a simple array. It can be an arithmetic expression involving arrays with the same lengths. For example from time to time you will have an array of experimental results "ydata" and an array of corresponding theoretical predictions "ytheory". One useful measure of the quality of your theory (and maybe your data) is the standard deviation, which requires a calculation of the sum of the squares of the differences between data and theory.

```      sumsqdf = sum ( (ydata-ytheory)**2)
```
Note that the order of operation here is such that this gives:

sumsqdf = (ydata(1)-ytheory(2))**2 + (ydata(2) - ytheory(2))**2 + ...

As you get into more complex programming tasks, you may find a need to apply the same type of masking features, discussed for the SUM intrinsic function, to array assignment statements. For example how do you prevent a division by zero in an array divide statement?

```      c(1:nd)=a(1:nd)/b(1:nd)
```
This is covered by the WHERE, ELSE WHERE, END WHERE, structure. I can write

```      where (b(1:nd).ne.0) c(1:nd)=a(1:nd)/b(1:nd)
```
I can provide more information for the case that an element of b is zero with

```      dimension a(nd),b(nd),c(nd),iflag(nd)
where(b(1:nd).ne.0)
c(1:nd)=a(1:nd)/b(1:nd)
else where
c=0.
iflag=1
end where
```
You will be tempted to think of WHERE, ELSE WHERE structures as analogs to IF, ELSE IF, but don't carry the comparison too far. The ELSE WHERE statement is really the same as the simple ELSE statement in an IF structure. There is no equivalent to ELSE IF in a WHERE structure. WHERE permits at most two possible actions. For array elements where the mask in the initial WHERE is true, the statements immediately after that WHERE are executed. For elements where the mask is false, statements following any ELSE WHERE are executed (no ELSE WHERE, then nothing is done). For more practice with WHERE, study where.f the end of array2.f.

Many times you do not want to operate on the full range of the array's dimension. Let's take the example that we only want to operate on elements 2 through 9. Elements 1 and 10 may be boundary values in some physical problem. For a standard DO loop, you would be doing something like:

```      real a(10),b(10),c(10)
data  a/1.,2.,3.,4.,5.,6.,7.,8.,9.,10./,b/3*1.,4*2.,3*3./
csum=0.
do 100 i=2,9
c(i)=a(i)+b(i)
100    csum=csum+c(i)
```
In Fortran 90 the above problem is not difficult. Just use the appropriate integer range notation.

```      real, dimension (10) :: a=(/(j,j=1,10)/), b, c
data b/3*1.,4*2.,3*3./
c(2:9)=a(2:9)+b(2:9)
csum=sum.(c(2:9))
```

### Array Initialization

Those of you who have been studying the example's array1.f, array2.f, sums.f, and where.f, have already seen the two methods of array initialization introduced in the two examples above. However, I haven't spelled out the key rules of the game. The simplest and most common is a full specification of the array's contents in a DATA statement. You give the array name with no index specification followed by a data list containing values for every element of the array..

```      data  a/1.,2.,3.,4.,5.,6.,7.,8.,9.,10./,b/3*1.,4*2.,3*3./
```
Constructs like "3*1." say repeat the value "1." in three consecutive array elements. I could have also done something like:

```      data  a(1),a(2),a(3),a(4)/1.,2.,3.,4./
```
or

data (a(j),j=1,4)/1.,2.,3.,4./

to specifically fill individual array elements, but not initialize others.

The curious construct "(a(j),j=1,4)" is an implied DO loop, similar to the one associated with the WRITE statements in array1.f. The implied DO can also be used with direct data initialization in a type statement. For example:

```      real, dimension (10) :: a=(/(j,j=1,10)/)
```
is equivalent to

```     real, dimension (10) :: a=(/1.,2.,3.,4.,5.,6.,7.,8.,9.,10./)
```
I could have generated a more complicated series:

```      real, dimension (3) :: a=(/(j**2+j,j=1,3)/)
```
which is the same as

```      real, dimension (3) :: a=(/2, 6, 12/)
```
Assignment of values within a REAL statement is new to Fortran 90, and this use of the implied DO is restricted to the REAL (don't try it within a value list for a DATA statement). It is also worth mentioning that although a statement

```     real, dimension (10) :: a=(/1,1,1,1,1,1,1,1,1,1/)
```
is valid, the statement

```      real, dimension (10) :: a=(/10*1/)
```
does not work. You can't use the repeat operation of a DATA statement in a REAL assignment.

Remember that an implied DO loop is legal in DATA statement only as part of the list of variables to receive values:

```      data (a(j),j=1,10,2)/5*1./,(a(j),j=2,10,2)/5*0./
```
The above statement sets odd indexed elements of "a" to a value of 1.0 and even indexed elements to a value of 0.0. To see the values of even elements in "a", I can apply an analogous implied DO loop to a WRITE statement.

```      write(*,2000) (a(j),j=2,10,2)
```
Please note that nothing has changed in positioning of DATA and REAL statements, assigning initial values to arrays. They are still non-executable, and must precede any executable statements in the program unit where they are applied.

## Mathematica (optional)

If you are coasting in the Applications Labs, I recommend that you also take a look at the symbolic algebra package called Mathematica, and have provided some sample Mathematica scripts. It is in effect another programming language, but operating at a much higher lever. Fortran really only deals with numbers, but Mathematica will manipulate symbolic (algebraic) expressions. It can handle all of those factorization problems that drove you nuts in high school algebra, and do all the chain rule work to obtain general derivatives of functions. Better yet it can do a wide range of indefinite or definite integrals. You can also use its built in capabilities to solve complicated equations, both algebraically when possible, and numerically when no closed form algebraic solution exists. Later in the semester, you may want to use Mathematica to solve an Ordinary Differential Equation

If you have the time, search the Web for Mathematica Tutorials, download tutorial notebooks, and use them by starting the XWindows interface to Mathematica with the command "mathematica". To solve the problem at hand I recommend that you copy the file mathin into your home directory, and try it by typing "math" at the Unix prompt, and the typing "<<mathin" when you get a Mathematica prompt. To exit Mathematica type "Quit". Once you have a feel for what "mathin" is doing, modify it with "vi" to solve your problem.

You can learn more about Mathematica by looking at the book "Mathematica" by Stephen Wolfram in one of the campus libraries (usually on reserve in the Math Library). You can also pick up a little from my list of basics, or Chris Duffy's quick reference.