Data Types and Data Representation

New Fortran:

DOUBLE PRECISION, CHARACTER, COMPLEX, TYPE, END TYPE (derived data types)

There are six basic Fortran data types: REAL, DOUBLE PRECISION, INTEGER, LOGICAL, CHARACTER, and COMPLEX. In addition, Fortran 90 introduces a structure called the derived type, which begins with a TYPE statement and ends with an END TYPE statement. We've covered REAL, INTEGER, and LOGICAL, and will cover CHARACTER soon. In a nutshell, you can declare "word1" and "word2" as character variables, each holding 16 characters, with

```      character*16 word1,word2
```

or

```      character word1*16, word2*16
```

We won't do anything with COMPLEX variables in this course. However, it's worth knowing that they are there if you ever create a scientific or engineering application requiring complex arithmetic. Fortran uses a pair of numbers enclosed in parentheses to represent the real and imaginary parts of a complex number. To set a variable "cnum" to the value "2.0+3.0i", use the line

```      cnum = (2.0,3.0)
```
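As a minimal sketch of what complex arithmetic looks like in practice, the short program below builds on the assignment above. The names "prod" and "cmag" are my own illustrative choices; real, aimag, conjg, and abs are standard intrinsic functions.

```fortran
      program complex_demo
      implicit none
      ! cnum follows the text; prod and cmag are illustrative names
      complex cnum, prod
      real cmag
      cnum = (2.0,3.0)
      ! real() and aimag() extract the two parts of a complex value
      print *, 'real part      = ', real(cnum)
      print *, 'imaginary part = ', aimag(cnum)
      ! (2+3i)*(2-3i) = 4 + 9 = 13, so prod should be (13.0,0.0)
      prod = cnum*conjg(cnum)
      print *, 'cnum*conjg(cnum) = ', prod
      ! abs() of a complex value is its magnitude, sqrt(13) here
      cmag = abs(cnum)
      print *, 'magnitude = ', cmag
      end program complex_demo
```

List-directed output prints a complex value as a parenthesized pair, the same form used to write the constant.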

This brings us to the DOUBLE PRECISION data type. It is just what its name suggests: a real type with roughly double the precision of the default REAL. If REAL carries about 6 or 7 decimal digits, as it does on most workstations, then DOUBLE PRECISION carries about 15. Review the discussion of internal representation of numbers presented in the second lecture.
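A quick way to see the difference on your own machine is to store the same quotient in both types and count the correct digits. This is only a sketch; the variable names are illustrative and the number of digits printed depends on the compiler.

```fortran
      program third_demo
      implicit none
      real x
      double precision dx
      ! the same quotient, computed and stored at two precisions
      x = 1.0/3.0
      dx = 1.0d0/3.0d0
      ! the run of correct 3s shows how many digits each type holds
      print *, 'REAL             1/3 = ', x
      print *, 'DOUBLE PRECISION 1/3 = ', dx
      end program third_demo
```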

The problem with implementation of Fortran data types is that there is no standard representation for the default INTEGER and REAL data types. I have worked on computers variously using 16, 32, 36, 60, and 64 bits for the basic INTEGER type, and 32, 36, 60, and 64 bits for the basic REAL type. Most of my programming career was on CDC (60 bits) and Cray (64 bits). My programs for Cray gave me around 13-14 decimal digits of precision for a standard REAL variable. When I adapted them to Workstations, I was down to about 6-7 digits for the same data type statements. It was necessary to convert the program to DOUBLE PRECISION variables to obtain comparable results.

One common, but non-standard, Fortran construct attempted to establish some clarity in the kind of INTEGER or REAL representation. The type declaration was followed by "*" and an integer giving the number of bytes desired for the variable. The statement

```      real*8 x,y,z
```

asks for 8 bytes (64 bits) to represent the listed variables. On most Workstations, this is equivalent to DOUBLE PRECISION.

Fortran 90 introduced some important new features to standardize variable representation. The basic idea behind these features is that you tell the compiler how many decimal digits you want for integers, or what precision and range of exponents you want for reals. The compiler then (if possible) assigns an available kind of INTEGER or REAL to meet (or exceed) your needs. In terms of Fortran statements, this is a two-step process. First you create a parameter that will contain the information on the appropriate kind of data. If I want to use integers with up to 9 decimal digits, I use the "selected_int_kind" intrinsic function to set an integer parameter as follows:

```      integer, parameter :: int9 = selected_int_kind(9)
```

Four things are worth noting in the above statement:

1. The argument of "selected_int_kind" is an integer giving the power of ten bounding the largest INTEGER variable of this kind. In the above example, I am asking for integers with absolute values less than ten to the ninth power.
2. The value returned by "selected_int_kind" is not standard from machine to machine. Often it will be the number of bytes required for the job, but don't bet on this being the case.
3. With this function and its relative "selected_real_kind", we are permitted to violate the general prohibition against using intrinsic functions within PARAMETER statements.
4. You are not guaranteed that a given computer can provide the number of digits that you have requested. If it can't, the value returned above by selected_int_kind to "int9" will be -1. You should test for this value, print an error message, and stop if it occurs.
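The test described in the last item can be sketched as follows. The second parameter, "int99", is a hypothetical request that I expect to fail on typical hardware, included only to show the error value.

```fortran
      program check_int_kind
      implicit none
      integer, parameter :: int9 = selected_int_kind(9)
      ! int99 is illustrative: few if any machines have 99 digits
      integer, parameter :: int99 = selected_int_kind(99)
      ! a negative result means no integer kind meets the request
      if (int9 < 0) then
         print *, 'no INTEGER kind holds 9 decimal digits'
         stop
      end if
      print *, 'kind value for 9 digits:  ', int9
      print *, 'kind value for 99 digits: ', int99
      end program check_int_kind
```

Note that the test is applied to the parameter itself. No variable is declared with "int99", since a negative kind value is not legal in a declaration.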

Having set int9, I can declare integer variables of this kind with statements like:

```      integer (kind=int9) i1, i2, i3, i4
```

or

```      integer ( int9 )   i1, i2, i3, i4
```

In general, the KIND specification is contained within parentheses immediately after the type keyword. Use of "kind=" makes the code more readable, but is not required.
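Put together, the two steps look like the sketch below. The values assigned are arbitrary, and "huge" is a standard intrinsic (not discussed above) that returns the largest value the kind can hold, which will usually exceed the 9 digits requested.

```fortran
      program int9_demo
      implicit none
      ! step 1: a parameter holding the kind value
      integer, parameter :: int9 = selected_int_kind(9)
      ! step 2: variables declared with that kind
      integer (kind=int9) i1, i2
      i1 = 999999999
      i2 = -i1
      print *, 'i1, i2 = ', i1, i2
      ! huge() reports the actual upper limit of this kind
      print *, 'largest value of this kind: ', huge(i1)
      end program int9_demo
```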

I can do something similar for REAL variables. The intrinsic function "selected_real_kind" takes two arguments. The first is the number of decimal digits of precision desired, and the second is the largest magnitude required for the exponent of 10. For example, if variables "x", "y", and "z" need 14 digits of precision and will contain values with magnitudes up to 1.0e30, I would set the data type with the lines:

```      integer, parameter :: r14 = selected_real_kind(14,30)
      real (r14) x,y,z
```

When using selected_real_kind within executable statements, you can check to see if your requested representation exists on the machine. If the requested precision is not available, a value of -1 is returned. If the requested exponent range is not available, a value of -2 is returned. If neither the requested precision nor the requested exponent range is available, a value of -3 is returned. A simple test program should be written to check for these results before attempting any unusual precision or exponent requests on a machine.
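A test program of that sort might look like the following sketch. The requests for 100 digits and for an exponent of 10000 are deliberately unreasonable values that I assume exceed what current hardware offers; adjust them to the requests you actually care about.

```fortran
      program check_real_kind
      implicit none
      integer k
      ! a reasonable request: expect a positive kind value
      k = selected_real_kind(14,30)
      print *, '(14,30)     returns ', k
      ! precision too large: expect -1
      k = selected_real_kind(100,30)
      print *, '(100,30)    returns ', k
      ! exponent range too large: expect -2
      k = selected_real_kind(14,10000)
      print *, '(14,10000)  returns ', k
      ! both too large: expect -3
      k = selected_real_kind(100,10000)
      print *, '(100,10000) returns ', k
      end program check_real_kind
```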

Remember that any given computer only has a small number of ways to represent INTEGERs or REALs. Unless you know the details of the machine, you are not likely to get exactly what you request in the selection of kind. You get something that provides range and precision in excess of your request. Fortran 90 provides a number of intrinsic functions to check the actual representation of your numbers. The example kind.f uses most of this set of intrinsic functions. They are:

kind (var)

Returns the integer that the current machine associates with the kind for variable "var". In the above example, the value returned by "kind(x)" would equal the value of the parameter "r14".

precision (var)

Returns the number of decimal digits of precision possible for any REAL or COMPLEX variable with the same kind as "var".

range (var)

Returns the approximate maximum power of ten bounding variables with the same type (REAL, COMPLEX or INTEGER) and kind as "var".

digits(var)

Returns the number of binary digits (bits) available to represent an INTEGER or the mantissa of a REAL variable "var".

maxexponent(var)

Returns the maximum power of 2 available in the internal binary floating point representation for REAL variable "var".

bit_size(var)

Returns the total number of bits in the word (including sign bit) representing an INTEGER "var".
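The sketch below exercises each of these functions. It is not the course's kind.f, just a small stand-in written for illustration, and the numbers it prints will vary from machine to machine.

```fortran
      program kind_inquiry
      implicit none
      integer, parameter :: int9 = selected_int_kind(9)
      integer, parameter :: r14 = selected_real_kind(14,30)
      integer (kind=int9) i
      real (kind=r14) x
      real s
      i = 0
      x = 0.0_r14
      s = 0.0
      ! default REAL, for comparison with the r14 kind
      print *, 'default REAL: kind      = ', kind(s)
      print *, 'default REAL: precision = ', precision(s)
      ! what the r14 request actually delivered
      print *, 'r14: kind        = ', kind(x)
      print *, 'r14: precision   = ', precision(x)
      print *, 'r14: range       = ', range(x)
      print *, 'r14: digits      = ', digits(x)
      print *, 'r14: maxexponent = ', maxexponent(x)
      ! what the int9 request actually delivered
      print *, 'int9: kind     = ', kind(i)
      print *, 'int9: range    = ', range(i)
      print *, 'int9: digits   = ', digits(i)
      print *, 'int9: bit_size = ', bit_size(i)
      end program kind_inquiry
```

Comparing the printed precision and range with the values requested shows how far the delivered representation exceeds the request.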

The end result of these Fortran 90 features is that the old data type DOUBLE PRECISION is obsolete. Avoid it in new programs.

Data Types for Constants

One of the more unpleasant features of FORTRAN is that you must keep track of the kind (single, double, quad precision) of your real constants. For example, when I run the sample program dblconst.f, I get the following results for the double precision variables y1, y2, and y3:

``` For "y1=2.1" we get a value in y1 of  2.09999990463256836
For "y2=2.1d0" we get a value in y2 of  2.10000000000000009
Setting r8=selected_real_kind(13,100), then
For "y3=2.1_r8" we get a value in y3 of  2.10000000000000009
```

When I use a simple constant like "2.1", FORTRAN never tells me, but it assumes that the constant is single precision. That wouldn't be a problem if it exercised more care when using it in a double precision assignment statement. However, when it is used in a way that requires conversion to double precision, every FORTRAN compiler that I have used is remarkably careless: the constant is first rounded to single precision, and the conversion cannot restore the digits that were lost. Notice what the example above does to the internal value of "2.1". Before Fortran 90, when it was important to get good representation of all digits, we replaced "2.1" with "2.1d0" (double precision equivalent of "2.1e0"). Fortran 90 lets you be more flexible in constant declaration. If I use the parameter "r8" to store the kind appropriate for my desired precision, then I can force any constant to have that precision by appending "_r8" (e.g. 2.1_r8, 1.0_r8, 1.375e11_r8). Similar kind designations can be used for integer constants, but the impact is not as noticeable.
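A minimal sketch of the three assignments follows. It is not the course's dblconst.f; I have declared all three variables with the "r8" kind, which I assume matches DOUBLE PRECISION on a typical workstation.

```fortran
      program const_demo
      implicit none
      integer, parameter :: r8 = selected_real_kind(13,100)
      real (kind=r8) y1, y2, y3
      ! single precision constant, converted on assignment
      y1 = 2.1
      ! old-style double precision constant
      y2 = 2.1d0
      ! Fortran 90 constant carrying the r8 kind
      y3 = 2.1_r8
      print *, 'y1 = 2.1    gives ', y1
      print *, 'y2 = 2.1d0  gives ', y2
      print *, 'y3 = 2.1_r8 gives ', y3
      ! a nonzero difference is the error carried in by "2.1"
      print *, 'y1 - y3 = ', y1 - y3
      end program const_demo
```

On a machine where default REAL holds about 7 digits, the difference should be on the order of 1e-7, consistent with the values listed above.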

Why worry about all of this? Sometimes I really want to see more than 7 digits of precision, and can't live with the garbage that is dumped into the final 7 or 8 digits of a constant. However, a more important reason is reproducibility. Many significant engineering applications are executed on a wide range of computers. Part of the installation procedure for the software is execution of test problems. Program users become very nervous or hostile if these test problems give noticeably different results on different computers. They frequently will give noticeable differences in results if the computer controls the contents of some of the trailing digits in your constants.

Derived Type

The new derived data type structure in Fortran 90 permits more complicated data structures. Any mix of REAL, INTEGER, CHARACTER, COMPLEX, and LOGICAL information can be combined under a single variable name, in effect representing a mixed type array. For example, I can create a derived type called "student" with the necessary information.

```      type student
         character*20 last_name
         character*16 first_name
         character*6 comp_id
         integer id
         real score
      end type
```

Then I can create an array to hold this information for class members.

```      type (student) cs(100)
```

To add student information to the array, I can use a line like:

```      cs(55)=student('Smith','John','jas112',222665555,96.1)
```

To extract specific elements of this composite data type I can use references like:

```      sum = sum + cs(i)%score
```

Here I am summing the scores in the class.
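The pieces above can be assembled into a complete program, sketched below. The class size of 3 and the student records other than the first are made-up illustrative data.

```fortran
      program class_scores
      implicit none
      type student
         character*20 last_name
         character*16 first_name
         character*6 comp_id
         integer id
         real score
      end type
      ! a small class; the records are illustrative only
      type (student) cs(3)
      real sum
      integer i
      ! one value per component, in the order declared in the type
      cs(1) = student('Smith','John','jas112',222665555,96.1)
      cs(2) = student('Doe','Jane','jad205',222665556,88.4)
      cs(3) = student('Roe','Rick','rir301',222665557,91.0)
      sum = 0.0
      do i = 1, 3
         sum = sum + cs(i)%score
      end do
      print *, 'class average = ', sum/3.0
      print *, 'first record: ', cs(1)%first_name, cs(1)%last_name
      end program class_scores
```

The structure constructor takes exactly one value for each component of the type, so the argument list must match the type definition in both number and order.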

We will use this a little more later in the semester.