Character Variables


Assignment:

Read Chpt. 12, and 14.3. Study charvar.f, charvr90.f, and associated test problems.

New Fortran:

Concatenation operator (//), Character Substrings, Character Intrinsic functions

If you expect to pursue work in science or engineering for any reason beyond your own curiosity, it is extremely important that you develop your communication skills. The ability to explain your work clearly, both verbally and in writing (two separate skills), will play an exceptionally important role in the level of acceptance and respect that your work receives. This in turn has a significant impact on your salary and rate of professional advancement. The ability to persuasively explain your vision for future projects will have a major impact on your professional and intellectual mobility.

These statements carry into your programming efforts. You can write a program that does the world's best job of solving some particular problem, but it will receive little or no use if the output, including error diagnostics, is difficult to interpret, or the input to the program is difficult to use. CHARACTER variables and associated operations and intrinsic functions are the best tools for clear communication with your program users (including yourself after several months of not looking at the program). They also provide the means by which you can make input as easy as possible. Even if you have no intention of using Fortran throughout your professional career, pay careful attention to the general capabilities and strategies presented here. They carry into other languages.


Fortran 77

The CHARACTER data type and standardized character manipulation features were first introduced in Fortran 77. Before that we worked with Hollerith strings and loaded them into integer variables. This was awkward at best, and not portable between all machines. The lack of portability arose because machines disagreed on the number of characters that fit into a single integer.

You've already seen how to define variables to be of type CHARACTER:

      character*80 line, newline
      character sentence*80, letter*1, string*8, word(20)*16
The last definition creates an array containing 20 elements, each with 16 characters. You can assign values with DATA statements, but must be careful to include spaces to fill out the full length of the variable.

      data letter, word(1) /'a', 'obfuscate       '/
Normal assignments are more forgiving. The statement

      sentence = 'This is important to learn'
automatically fills blanks into the remaining characters of "sentence" after the last "n". The assignment

      word(2) = 'anticholinesterase'
leaves only 'anticholinestera' in "word(2)".

The next important concept with character variables is the substring. I can pull a portion of a character variable out and assign it to another character variable

      word(1)=sentence(1:4)
      i=6
      j=7
      word(2)=sentence(i:j)
      word(3)=sentence((j+2):(j+10))
      word(4)=sentence(j+12:j+13)
The results of each assignment are left justified in the appropriate element of "word", and unused space is filled with blanks.

Now that you know how to disassemble CHARACTER variables, you need to know how to assemble several (concatenate them). This uses the "//" operator. I can combine elements 1 through 4 of word and put them into "line" with

      line = word(1)//word(2)//word(3)//word(4)
and get the contents of line as

This            is              important       to              
Trailing blanks are preserved during concatenation.

CHARACTER Intrinsic Functions

The last concatenation problem can be cleared up with the most useful Fortran 77 character intrinsic function "index". It has the following form and definition:

index (string1,string2)

Looks in character variable (or quoted string) "string1" for the first occurrence of "string2". It returns an integer value giving the character number within "string1" at which the beginning of the first occurrence of "string2" is found. If "string2" is not found, index returns a value of zero.
A much nicer assembly of elements 1 through 4 of word can be accomplished with the following code (assuming at least one trailing blank exists).

      i1 = index(word(1),' ') - 1
      i2 = index(word(2),' ') - 1
      i3 = index(word(3),' ') - 1
      i4 = index(word(4),' ') - 1
      line = word(1)(1:i1)//' '//word(2)(1:i2)//' '//
     &       word(3)(1:i3)//' '//word(4)(1:i4)
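
Here is a minimal sketch that runs the substring, concatenation, and "index" steps together so you can compare the two results. The program name "joinwd" and the brackets printed around each result are my own choices for illustration. The long assignment is split in two to avoid a continuation line, and comment lines begin with "!", which is Fortran 90 syntax that current compilers also accept in fixed-form source.

```fortran
      program joinwd
!     Sketch: split a sentence into words, then rejoin them with
!     and without the trailing blanks of each array element.
      character sentence*80, line*80, word(4)*16
      integer i1, i2, i3, i4, n
      sentence = 'This is important to learn'
      word(1) = sentence(1:4)
      word(2) = sentence(6:7)
      word(3) = sentence(9:17)
      word(4) = sentence(19:20)
!     Plain concatenation keeps all 16 characters of each element.
      line = word(1)//word(2)//word(3)//word(4)
      print *, '[', line(1:64), ']'
!     index finds the first blank, so (1:i) is the word itself.
      i1 = index(word(1), ' ') - 1
      i2 = index(word(2), ' ') - 1
      i3 = index(word(3), ' ') - 1
      i4 = index(word(4), ' ') - 1
      line = word(1)(1:i1)//' '//word(2)(1:i2)//' '
!     n counts the characters placed so far, including two blanks.
      n = i1 + i2 + 2
      line(n+1:) = word(3)(1:i3)//' '//word(4)(1:i4)
      print *, '[', line(1:n+i3+i4+1), ']'
      end
```

The first print shows each word padded out to 16 characters. The second shows the four words separated by single blanks.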

Another useful but less frequently used intrinsic function tells you the length of any given character variable.

len ( string )

Returns an integer value giving the length of the character variable or quoted constant "string". For example, given previous CHARACTER type declarations, "len(word(1))" would return the value 16 and "len(line)" would return 80.
The "len" function is valuable in substring manipulations, and particularly valuable when a CHARACTER variable has been passed through the argument list of a subroutine or function. Normal procedure for such arguments is to use a "*" in the length specification, telling Fortran to pick up and use whatever length is appropriate. For example the subroutine:

      subroutine charvar1(charvar)
      character charvar*(*)
      integer lc
      lc = len(charvar)
      print *, charvar(lc:lc)
      return
      end
prints the last character in the variable "charvar", regardless of the actual length of "charvar".
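
As a sketch of how "len" and an assumed-length argument work together in substring manipulation, the function below searches backward for the last non-blank character. The names "lentst" and "lastnb" are hypothetical, chosen only for this example.

```fortran
      program lentst
!     Sketch: one function handles arguments of any length.
      character short*8, long*40
      integer lastnb
      external lastnb
      short = 'abc'
      long = 'a longer string'
      print *, len(short), lastnb(short)
      print *, len(long), lastnb(long)
      end

      integer function lastnb(str)
!     Position of the last non-blank character; 0 if all blank.
      character str*(*)
      integer i
      lastnb = 0
      do 10 i = len(str), 1, -1
         if (str(i:i) .ne. ' ') then
            lastnb = i
            return
         end if
   10 continue
      end
```

The first call reports a length of 8 with the last non-blank character at position 3. The second reports a length of 40 with the last non-blank character at position 15.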

Characters are actually just represented within the computer as numbers. Most computers agree on the ASCII character set, mapping various characters to the numbers 0 through 127 (uses only 7 bits), but other mappings do exist (usually 8 bit extensions to ASCII or IBM's EBCDIC). You can learn the number associated with a given character or character associated with a number.

ichar (letter)

Returns the integer associated with the character contained in the variable or quoted string "letter".
char (num)

Returns the character associated with the integer contained in the variable or constant "num".

These functions are particularly useful in converting the case of characters. The function "char" was frequently used to obtain odd non-printing characters for comparison purposes. However, this latter function is now better done with the Fortran 90 "achar" function.
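
A minimal sketch of case conversion with "ichar" and "char" follows. It assumes that the lowercase letters occupy consecutive character codes, which is true for ASCII but not for EBCDIC. The program name "upcase" is hypothetical.

```fortran
      program upcase
!     Sketch: convert lowercase letters to uppercase in place.
!     Assumes 'a' through 'z' have consecutive codes (true in ASCII).
      character text*20
      integer i, k, shift
      text = 'Fortran 77 strings'
!     Distance between a lowercase letter and its uppercase twin.
      shift = ichar('A') - ichar('a')
      do 10 i = 1, len(text)
         k = ichar(text(i:i))
         if (k .ge. ichar('a') .and. k .le. ichar('z')) then
            text(i:i) = char(k + shift)
         end if
   10 continue
      print *, text
      end
```

On an ASCII machine this prints FORTRAN 77 STRINGS. Digits and blanks fall outside the tested range and are left alone.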

Using CHARACTER Variables in I/O

I mentioned that FORMATs are stored as character strings. As a result Fortran lets you build formats within your program and use them within an I/O statement. As a crude example look at the following code:

      character*32 form1
      t = 300.2
      form1 = '( ''temperature = '', f7.3)'
      write (*,form1) t
I/O statements are also valuable as internal READs and WRITEs to convert back and forth between INTEGER or REAL variables and their formatted character representations. For example if the CHARACTER variable "one" contains the character string '1.0', then the following read will result in the REAL variable x containing the internal binary representation of the floating point number "1.0".

      read (one,*) x
Alternatively, if I wanted to convert the internal value of the INTEGER variable "ix" into characters within the CHARACTER variable "cnum", the following would work:

      write(cnum, *) ix
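List-directed output leaves leading blanks before the number in "cnum", and the exact field width is compiler dependent, so use an explicit format such as '(i8)' when the position matters. This type of operation is particularly useful in the construction of formats.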
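
The sketch below shows both directions of internal I/O, and uses an internal WRITE to place a repeat count into a format at run time. The program name "bldfmt" and the variables "form", "n", and "v" are my own for illustration.

```fortran
      program bldfmt
!     Sketch: build a FORMAT at run time with an internal WRITE.
      character form*16, one*8
      integer n, i
      real x, v(3)
      one = '1.0'
!     Internal READ: characters to REAL.
      read (one, *) x
      n = 3
      do 10 i = 1, n
         v(i) = x*i
   10 continue
!     Internal WRITE: put the repeat count n into the format text.
      write (form, '(a, i2, a)') '(', n, 'f8.3)'
      print *, form
      write (*, form) (v(i), i = 1, n)
      end
```

The format stored in "form" is "( 3f8.3)". Blanks inside a format specification are ignored, so the space before the 3 does no harm.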

Relational Operators and CHARACTER variables

You will frequently want to check to see if the contents of a CHARACTER variable match a specific quoted string, or another CHARACTER variable. The following statements are valid:

      character string1*8, string2*16
      if (string1.eq.'YES') print *, 'Answer was yes'
      if (string1.eq.string2) print *, 'Strings match'
      if (string1(1:1).ne.'q'.and.string1(1:1).ne.'Q')
     &   print *, 'Don''t quit'
When lengths of variables being compared don't match, the contents of the shorter are padded with blanks at the end to the length of the longer before comparison.

The relational operators .LT., .GT., .LE., .GE. are also valid, and are useful in applications requiring alphabetization.
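
As a rough sketch of alphabetization, the program below sorts four names with a simple exchange sort using ".gt.". The program name "alpha" and the list of names are made up for the example. The ordering follows your machine's character codes, so in ASCII every uppercase letter sorts ahead of every lowercase letter.

```fortran
      program alpha
!     Sketch: alphabetize a short list with .gt. comparisons.
      character name(4)*8, temp*8
      integer i, j
      name(1) = 'pear'
      name(2) = 'apple'
      name(3) = 'fig'
      name(4) = 'banana'
!     Exchange sort: each pass moves the largest name to the end.
      do 20 i = 1, 3
         do 10 j = 1, 4 - i
            if (name(j) .gt. name(j+1)) then
               temp = name(j)
               name(j) = name(j+1)
               name(j+1) = temp
            end if
   10    continue
   20 continue
      print *, (name(i), i = 1, 4)
      end
```

The names print in the order apple, banana, fig, pear.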

Look in charvar.f for more examples of all the character operations and usage discussed here for Fortran 77.


Fortran 90

Fortran 90 introduces a new way to initialize values of CHARACTER variables.

      character*8 ::   string='abcde'
However, the only other major change is the addition of several significant new intrinsic functions, and an important addition to INDEX. A new optional argument "BACK=.TRUE.", has been introduced to force INDEX to perform its search from the back to front of the character string. This argument is also available in two close relatives of INDEX.

scan(string, chars)

Scans "string" from left to right (unless BACK=.TRUE.) for the first occurrence of any character contained in "chars". It returns an integer giving the position of that character, or zero if none of the characters in "chars" have been found. A value of 3 is returned by "scan('function','no')". A value of 8 is returned from "scan( 'function', 'no', back=.true.)".
verify(string, chars)

Scans "string" from left to right (unless BACK=.TRUE.) for the first occurrence of any character not contained in "chars". It returns an integer giving the position of that character, or zero if only the characters in "chars" have been found. A value of 3 is returned by "verify('function','uf')".
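
A common use of "verify" is checking input before you try to convert it. The sketch below tests whether a field holds only digits and blanks, then repeats the calls quoted above. The program name "scnvfy" and the variable "field" are hypothetical.

```fortran
      program scnvfy
      implicit none
      character(len=12) :: field
      integer :: k
      field = '12a45'
!     verify returns 0 only if every character is in the set.
!     The blank is in the set so that trailing padding is accepted.
      k = verify(field, '0123456789 ')
      if (k /= 0) print *, 'bad character at position', k
      print *, scan('function', 'no')
      print *, scan('function', 'no', back=.true.)
      print *, verify('function', 'uf')
      end program scnvfy
```

The bad character is reported at position 3, and the last three lines print 3, 8, and 3.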

The functions "char" and "ichar" are supplemented with two that perform roughly the same function, but always applied to the ASCII character set, letting you work with a known base to generate characters in your machine's set.

achar(num)

Returns the character in your machine's representation (if it exists) corresponding to the character in the ASCII set with position given by the INTEGER variable or constant "num". For example: "achar(9)" returns the Tab character; "achar(65)" returns 'A'; and "achar(97)" returns 'a'.
iachar(letter)

Returns an INTEGER value corresponding to the position in the ASCII character set of the letter in the CHARACTER variable or quoted "letter". For example: "iachar('A')" always returns the INTEGER value 65, regardless of the machine that you use.
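
Because "achar" and "iachar" always use ASCII positions, a case conversion can be written with fixed numbers rather than machine-dependent ones. This is a minimal sketch; the program name "asciit" and the sample text are my own.

```fortran
      program asciit
      implicit none
      character(len=1) :: tab
      character(len=20) :: text
      integer :: i, k
      tab = achar(9)
      text = 'Mixed Case'
      do i = 1, len(text)
         k = iachar(text(i:i))
!        65-90 are 'A'-'Z' in ASCII; adding 32 gives 'a'-'z'.
         if (k >= 65 .and. k <= 90) text(i:i) = achar(k + 32)
      end do
      print *, text
      print *, iachar(tab)
      end program asciit
```

The text prints as "mixed case", and the second line prints 9, the ASCII position of the Tab character.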
Several other new functions provide additional flexibility when dealing with characters.

adjustl(string)

Left justifies characters contained in the CHARACTER variable "string". If
      string1 = '   abcd'
      string2 = adjustl(string1)
then "string2" is just 'abcd' followed by blanks if needed. This can be very useful in preparing some character strings for trimming (see below) and concatenation.
adjustr(string)

Right justifies characters contained in the CHARACTER variable "string". If
      string1 = 'abcd'
      string2 = adjustr(string1)
then for CHARACTER*8 "string1" and "string2", "string2" is four blanks followed by 'abcd'. This is a good feature when creating tabular output, some graphics labels, and strings for later use in a formatted read.
trim (string)

Produces a CHARACTER result that has length equal to that of "string" minus the number of trailing blanks in "string". This is a great way to get rid of troublesome trailing blanks when concatenating two or more CHARACTER variables. If "string1" and "string2" each contain one word followed by blanks, then use "trim(string1)//' '//trim(string2)" to obtain the two basic words separated by a single blank.
len_trim(string)

Produces an INTEGER equal to the length of "string" (len(string)) minus the number of trailing blanks. The result of "string(1:len_trim(string))" is the same as the result of "trim(string)".
repeat(string,ncopy)

Produces a CHARACTER result with length equal to "ncopy" times the length of "string", and containing "ncopy" concatenated copies of "string" ("ncopy" is an INTEGER and "string" a CHARACTER variable). For example "repeat('abc',3)" is 'abcabcabc'. Not something I'd use much, but somebody must need it.
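
The sketch below exercises "adjustl", "adjustr", "trim", "len_trim", and "repeat" on two short strings. The program name "trimex" and the sample names are hypothetical, and the brackets are printed only to make the blanks visible.

```fortran
      program trimex
      implicit none
      character(len=12) :: first, last
      character(len=30) :: full
      first = '   Grace'
      last = 'Hopper'
!     adjustl moves the leading blanks to the end; trim drops them.
      full = trim(adjustl(first))//' '//trim(last)
      print *, '[', trim(full), ']'
      print *, len(full), len_trim(full)
!     adjustr pushes the text to the right end of the 12 characters.
      print *, '[', adjustr(last), ']'
!     repeat builds an underline as long as the trimmed name.
      print *, repeat('-', len_trim(full))
      end program trimex
```

The joined name is 12 characters long inside a 30 character variable, so the second line prints 30 and 12, and the underline is 12 dashes.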
Take a look at charvr90.f for my quick attempt to convert charvar.f to Fortran 90.


Applications of CHARACTERs

As you study charvr90.f and charvar.f think carefully about writing programs to provide as much diagnostic information as possible to the user. What other diagnostic messages do you think are appropriate? Also notice that I've constructed my input processing to allow comments in the input file. For significant problems, the presence of comments in the input is as important as comments in the program source code. It lets you make notes about the contents, and keep track of some of the changes during the history of your attempts to model a system. When I am writing a major application, I almost always read text input files simply with a '(a)' FORMAT, and let the program itself sort out the intended information in the line. This gives much more flexibility for error diagnostics, comments, and special input processing.
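
The following is a rough sketch of that strategy, not the code in charvar.f or charvr90.f. It assumes a "key = value" layout and "!" as the comment character, both invented for this example, and it takes its lines from an array so that it runs without an input file.

```fortran
      program rdinp
      implicit none
      character(len=40) :: lines(4), line, key
      real :: val
      integer :: i, k, ios
      lines(1) = '! pipe geometry'
      lines(2) = 'length = 2.5   ! metres'
      lines(3) = ' '
      lines(4) = 'radius = oops'
      do i = 1, 4
!        A real program would use: read (unit, '(a)') line
         line = lines(i)
!        Blank out everything from the comment character onward.
         k = index(line, '!')
         if (k > 0) line(k:) = ' '
         if (len_trim(line) == 0) cycle
         k = index(line, '=')
         if (k == 0) then
            print *, 'line ', i, ': no "=" found'
            cycle
         end if
         key = adjustl(line(1:k-1))
!        Internal READ converts the text after "=" to a REAL.
         read (line(k+1:), *, iostat=ios) val
         if (ios /= 0) then
            print *, 'line ', i, ': bad value for ', trim(key)
         else
            print *, trim(key), ' = ', val
         end if
      end do
      end program rdinp
```

The comment line and the blank line are skipped, the second line yields a value for "length", and the fourth produces a diagnostic that names both the line number and the offending keyword. Because the program holds the whole line as text, it can report the problem and keep going.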

In the era of point and click computing, you may ask why we need text input files, when everything can be done with menus. The simple answer is that the use of a graphical user interface (GUI) with menus is an excellent way for initial user input, but on large problems you get tired of too much clicking for minor variations on some basic theme. At some point the results of a GUI need to be piped to a text file that can be modified and reused quickly for scoping studies, and that provides full documentation of the initial conditions for your problem.

Another important programming issue that you now need to consider is validation of your program. This becomes both important and complicated when many options exist within your program. To minimize long term suffering, you should always think of test problems as you are developing each special section of the program.






Written and Maintained by John Mahaffy : mahaffy@psu.edu