Introduction to Computers and Computing


Assignment:

Read one or more of the on-line Unix Tutorials.

History

The historical path that leads to this class can be traced to Galileo. He introduced the idea that mathematics can be used to make predictions about the behavior of the physical world. This began the search for better ways to calculate the trajectories of projectiles (cannon shells), the motion of planets, tides, etc. In terms of money and time spent, the cannon shells have tended to dominate over planetary motions for most of the history of computing. Until recently, the largest driving force in the development of hardware and software for modern scientific computing was the design effort for nuclear and thermonuclear weapons. In recent years, it might be argued that the largest market for high-performance hardware is the entertainment industry. A huge amount of high-performance computer power goes into generating special effects for movies and TV, and a rapid upward spiral can be observed in the power of home computer systems, driven largely by a desire for improvement in graphics-oriented programs (games, etc.). Of course, much of the special effects and gaming involves large explosions and massive destruction. The more things change, the more they stay the same.

Another path that has enhanced the development of computing began further back in history. Shortly after people began communicating, they started looking for ways to communicate with some people in a way that most others couldn't understand (secret codes, encryption). Much later, but still long ago, some of those others decided they'd like to know what was being said, and began the process of code breaking. As computers developed, the encrypters realized that computers could be used to generate more secure codes, and the code breakers realized that more computers could be used to break these codes (look at the history of code breaking during World War II). This community has been, and still is, a major driving force behind the development of state-of-the-art computers, if only as a major market for the largest and fastest machines.

I always find it interesting to look at the people behind ideas and inventions. I recommend that you take a look at my list of people in the history of computing, and Lyle Long's very brief History of Computing, or do a little reading on your own at the library.


How a Computer Works

Before you spend a semester learning how to issue instructions to computers, it's a good idea to understand how computers work. This will give you hints on what types of instructions to expect in any computer language, and help you group what you learn in ways that are easier to remember. At a more advanced level, understanding the underlying operation of computers will help you structure your programs in ways that improve their execution speeds.


Key Elements of a Computer

Memory (Storage)

Two classes of information are stored:

  1. Data - Characters, Integers, Floating Point Numbers, ...
  2. Program Instructions to manipulate the data

CPU (microprocessor)

It follows stored program instructions to move, add, subtract, multiply, divide, and compare data. It contains the following elements:

  1. Control unit to process program instructions
  2. Logic units to compare data. Comparisons may simply return a true or false answer (to questions like "Is x less than y"), or they may force the control unit to branch to process a specified instruction if a condition is met.
  3. Arithmetic (integer and hopefully floating point) units to add, subtract, multiply, and divide
  4. Registers for storage of instructions and data currently in use. These have the advantage that their contents are available to the logical and arithmetic units immediately. Contents of main memory take many clock cycles to reach the CPU.
  5. Cache (only on some CPUs) for storage of soon-to-be-used instructions and/or data. Contents are available as quickly or nearly as quickly as contents of registers.

Peripheral Processors

These relieve the CPU from handling special Input and Output operations. The video card on your PC is an example of a Peripheral Processor Unit (PPU).

Bus

It moves data and instructions among the above three (memory, the CPU, and the peripheral processors).

Clock

This is the basic driver for operation of the above units. The clock pulse triggers interpretation and execution of instructions. Most instructions take more than one clock pulse to complete. A 133 MHz personal computer has a CPU clock that pulses 133,000,000 times each second. If such a machine is well designed and programmed, it could in theory do 133,000,000 additions in a second. This seldom happens in practice.


More Details on Computer Structure and Operation

PCs and workstations are getting faster by implementing architectural features that have been in mainframe computers for up to 30 years.

Computers need to deal with information in orderly lumps. The fundamental lump of information is a bit, which contains either the number 0 or 1. This is done with a switch that is either on or off, or an electrical voltage or current that is either high or low. Anything between these on/off, high/low states leads to a potential for ambiguities and detection errors that are not acceptable in computing. This naturally leads to the use of the binary (base 2) number system for arithmetic on computers.
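To make the base 2 idea concrete, here is a minimal sketch in Python (used only for brevity; it is not the language of this class). The helper name to_binary and the default width of 8 bits are choices made for this illustration. It finds the bits of a non-negative integer by repeated division by 2.

```python
def to_binary(n, width=8):
    """Return the base-2 digits of a non-negative integer as a string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    bits = ""
    while n > 0:
        bits = str(n % 2) + bits   # each remainder is the next bit, lowest first
        n //= 2
    return bits.rjust(width, "0")  # pad with leading zeros to the chosen width


print(to_binary(10))    # 00001010
print(to_binary(255))   # 11111111
```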

As we will see, the byte (8 bits) is a nice way to store the information needed to represent the characters we need to output text describing our numerical results. Many computers are structured internally to make accessing data in 8-bit lumps easy. This number of bits lets us represent a set of 256 different "characters", although only 128 are used in the standard ASCII character set (a-z, A-Z, 0-9, various common symbols, and "control" characters). Use of bytes has given rise to common application of the hexadecimal (base 16) number system. Four bits (a "nibble") are needed for each hexadecimal digit, so the contents of any given byte can be represented by a 2 digit hexadecimal number.
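The relation between characters, bytes, and hexadecimal digits can be sketched as follows. This is an illustration in Python, and the function name byte_as_hex is invented for the example. It splits one byte into its two nibbles and looks up one hexadecimal digit for each.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_as_hex(value):
    """Two hexadecimal digits for one byte: one digit per 4-bit nibble."""
    if not 0 <= value <= 255:
        raise ValueError("a byte holds values 0 through 255")
    high = value >> 4      # upper four bits
    low = value & 0x0F     # lower four bits
    return HEX_DIGITS[high] + HEX_DIGITS[low]


for ch in "Az9":
    code = ord(ch)         # the ASCII code stored in the byte
    print(ch, code, byte_as_hex(code))
# A 65 41
# z 122 7A
# 9 57 39
```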

Thirty-two bits (4 bytes) has been the most popular lump for storing numbers. It can store unsigned integers from 0 to 4,294,967,295, signed integers from -2,147,483,648 to +2,147,483,647 (in the usual two's complement representation), or a floating point number with about 7 decimal digits of precision and a big enough exponent to handle most numbers (roughly as small as 10 to the -38 power and as large as 10 to the 38 power).
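These limits can be checked with a short Python sketch. It assumes two's complement integers and the IEEE 754 single precision format, which is what nearly all current machines use. The helper to_float32 is a name made up for this example; it rounds a value to 32 bits by packing and unpacking it with the standard struct module.

```python
import struct

UNSIGNED_MAX = 2**32 - 1     # largest unsigned 32-bit integer
SIGNED_MIN = -(2**31)        # most negative two's complement 32-bit integer
SIGNED_MAX = 2**31 - 1       # largest signed 32-bit integer


def to_float32(x):
    """Round a Python float (64 bits) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


print(UNSIGNED_MAX, SIGNED_MIN, SIGNED_MAX)
# 4294967295 -2147483648 2147483647
print(to_float32(0.1))          # 0.10000000149011612 (about 7 good digits)
print(to_float32(16777217.0))   # 16777216.0 (2**24 + 1 does not fit)
```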

Actually, there are usually more than 32 bits of physical memory for each 32 bits of data. A parity bit is included to indicate an even or odd number of 1's in the basic 32 bits of data. This allows the computer to detect corrupted data, since flipping any single bit changes the parity. Additional check bits let the computer correct a single bit error and detect a double bit error. This detection/correction scheme is common in large fast computers and beginning to appear on personal machines.
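A minimal sketch of the parity idea, in Python, is given below. The function names are invented for this illustration, and real memory hardware does this in circuitry rather than software. It shows that one flipped bit is caught, while two flipped bits slip past a single parity bit.

```python
def parity_bit(word):
    """Parity of a 32-bit word: 1 if the number of 1 bits is odd, else 0."""
    return bin(word & 0xFFFFFFFF).count("1") % 2


def is_corrupted(word, stored_parity):
    """True if the word no longer agrees with the parity bit stored with it."""
    return parity_bit(word) != stored_parity


data = 0b1011                      # three 1 bits, so the parity bit is 1
p = parity_bit(data)
one_flip = data ^ (1 << 7)         # a single bit changed in storage
two_flips = data ^ 0b11            # two bits changed in storage

print(is_corrupted(data, p))       # False
print(is_corrupted(one_flip, p))   # True
print(is_corrupted(two_flips, p))  # False: a double error goes unnoticed
```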

One function of integers in computers is to provide an "address" to the CPU of the starting byte in memory where an important piece of information (instruction or data) is located. You don't tend to see machines with 4 billion bytes of main (chip) memory, so the 32 bit integer is generally more than enough. However, you now frequently see computers with more than 4 billion bytes of hard disk space. This can be used by the computer as a form of virtual memory. Modern scientific and 3-D graphics applications can involve manipulation of far more than 4 billion numbers. Although this is not an insurmountable problem with 32 bit addressing, it drives a need for integers using more than 32 bits. Many new machines are designed to store and use 64 bit integers, providing addressing to more than enough memory for any foreseen application.

Scientific computing has also required more bits to represent floating point numbers. Frequently 7 decimal digits of precision is inadequate. The most common solution to this problem is to represent real numbers with 64 bits. This gives approximately 15 decimal digits of precision, and covers powers of 10 between -308 and 308. One nice feature of Fortran 90 is that it provides intrinsic functions (TINY and HUGE) to tell you the smallest positive and largest numbers that can be stored with any given data representation, and intrinsic functions (such as SELECTED_REAL_KIND) to allow you to specify the decimal precision you need in your calculations.
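As a rough analogue of asking Fortran for TINY and HUGE, Python exposes the properties of its 64 bit floating point type through the standard sys.float_info structure. The sketch below assumes a machine with IEEE 754 double precision, which is the usual case; the printed values would differ on other hardware.

```python
import sys

info = sys.float_info
print(info.dig)   # 15 decimal digits can be stored faithfully
print(info.min)   # 2.2250738585072014e-308, smallest normal positive value
print(info.max)   # 1.7976931348623157e+308, largest finite value

# A change in the 17th significant digit is lost in 64 bits.
print(1.0 + 1e-16 == 1.0)   # True
```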

You build the bus to handle the size of data lumps you want to move. A 32 bit wide bus is now common on PCs, but they are switching to the 64 bit bus to support both larger integers and real numbers, and higher data flow rates for the 32 bit representations.

Note that many clock cycles are needed to get a given piece of data from memory to the CPU. However, memory can be set up so that requests can be overlapped to get one result out of memory per clock cycle after flow has started. This technique is called "banking" memory. This action is complicated on PCs by the fact that the CPU is usually running at a different (faster) clock rate than the bus.
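The effect of overlapping requests can be seen in a toy model, sketched here in Python. Everything in it is an assumption made for illustration: consecutive words are placed in consecutive banks, each bank is busy for a fixed latency once it accepts a request, and at most one new request is issued per cycle. The latency of 8 cycles is a made-up figure, not a measurement of any real machine.

```python
def cycles_to_fetch(n_words, latency, n_banks):
    """Cycle at which the last of n_words consecutive words arrives."""
    bank_free = [0] * n_banks    # first cycle each bank can accept a request
    issue = 0                    # earliest cycle for the next request
    finish = 0
    for address in range(n_words):
        bank = address % n_banks             # consecutive words, consecutive banks
        start = max(issue, bank_free[bank])  # wait if the bank is still busy
        finish = start + latency
        bank_free[bank] = finish
        issue = start + 1                    # one new request per cycle at most
    return finish


print(cycles_to_fetch(100, latency=8, n_banks=1))   # 800
print(cycles_to_fetch(100, latency=8, n_banks=8))   # 107
```

With one bank, each request waits for the previous one to finish. With 8 banks the requests overlap, and after the 8 cycle startup one word arrives per cycle.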

Each add, multiply, and divide operation generally takes more than one clock cycle to complete. However, the associated arithmetic units are usually "segmented", so that a specific portion of the operation is performed in a given segment of the unit during any clock cycle. These units can be "pipelined" so that each segment is operating on the results of a different calculation at any given clock cycle. As an example, in the first cycle two numbers (say 3.0 and 4.0) are fed to the multiplier unit and partially processed by the first segment of the unit. In the second clock cycle the partial results from multiplying 3.0 and 4.0 are passed to the second segment while the first segment picks up two new numbers (say 5.0 and 6.0) to start processing. This continues through as many pairs of numbers as you have available, with the net result that after an initial startup time, you get one multiplication result out of your multiply unit every clock cycle.
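The cycle-by-cycle flow just described can be mimicked with a small Python simulation. It is a sketch only: the function name run_pipeline and the choice of four segments are assumptions for the example, and the "partial processing" in each segment is not modeled, only the movement of operand pairs from one segment to the next.

```python
def run_pipeline(pairs, n_segments=4):
    """Push operand pairs through a segmented multiplier.

    Returns (products, cycles). One pair may enter per clock cycle, and a
    product is ready in the cycle its pair occupies the last segment.
    """
    segments = [None] * n_segments   # segments[0] is the first segment
    waiting = list(pairs)
    products = []
    cycles = 0
    while len(products) < len(pairs):
        cycles += 1
        new_pair = waiting.pop(0) if waiting else None
        segments = [new_pair] + segments[:-1]   # everything advances one segment
        if segments[-1] is not None:
            a, b = segments[-1]
            products.append(a * b)
    return products, cycles


print(run_pipeline([(3.0, 4.0), (5.0, 6.0)]))   # ([12.0, 30.0], 5)
print(run_pipeline([(float(i), 2.0) for i in range(100)])[1])   # 103
```

In this model 100 multiplications take 103 cycles (4 to fill the pipeline, then one result per cycle), compared with 400 cycles if each multiplication had to finish before the next one started.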

Machines that are designed to take full advantage of pipelined CPU units are called vector processors. They are typically designed with extended register sets to handle long strings of numbers (vector registers) or with special hardware to feed the CPU with vectors directly from main memory. Vector registers originally appeared in the early Cray designs. Direct vector feeds from memory were first seen in the CDC Star computer designed by Neil Lincoln, and later in his ETA series of computers. The ability to perform these fast operations on long strings of numbers results in significant run time reductions for a wide range of scientific calculations.

Another strategy for speed-up of the CPU is to keep various units busy at the same time. A well designed chip will, for instance, permit the floating point multiply unit to function, while the floating point adder is working on two other numbers, and the integer adder is processing a pair of integers. This is a simple beginning to parallel processing, but a true parallel machine contains many full CPU chips.

Recognition of the needs for providing data to pipelined or parallel CPUs can have a significant impact on the way you program Scientific Applications.

How a CPU processes Instructions

Kinds of machine instructions

Sample program in machine memory

Here is a list of addresses in memory occupied by a simple program, and a word description of the instructions or data in these locations. When you decipher the purpose of this program, you will begin to see the merits of learning languages like Fortran or C. You may also begin to appreciate how things like registers can significantly speed the execution of a program.

          Address  Instruction
          0  (0)   put contents of next word into register 1
          4  (4)   contains the integer 10
          8  (8)   put contents of next word into register 3
          12 (C)   contains the integer 1
          16 (10)  copy register 1 to register 2
          20 (14)  subtract register 3 from register 1, put
                   result in register 1
          24 (18)  multiply register 1 by register 2, put result
                   in register 2
          28 (1C)  if register 1 greater than register 3, branch
                   to address 20 (hex 14)
          32 (20)  copy register 2 to address FFFF
          36 (24)  branch to address 10000

How do you get all of this started and stopped? The Operating system handles it.

How do you get all of this done without worrying about binary number machine instructions and keeping track of memory addresses?

By looking at the above example, you should begin to appreciate some of the advantages of compilers. Instead of remembering the instruction numbers for subtract and multiply, we just remember symbols that make simple sense like "-" and "*". Instead of keeping track of the exact numerical address in memory where we store each number, we associate these storage locations with simple variable names such as "n", "x", "time", ... . It becomes the compiler's problem to keep track of the memory address where the value associated with the variable "n" is stored, and generate any needed instructions to shuffle that value between memory, registers, and arithmetic units.
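For comparison, here is roughly how the same sequence of steps reads once variable names and operator symbols replace registers and addresses. Python is used for this sketch rather than Fortran, and the names same_steps, n, and result are illustrative choices, not taken from the sample program.

```python
def same_steps(n=10):
    """The sample program's steps, written with names instead of registers."""
    result = n                  # copy register 1 to register 2
    while True:
        n = n - 1               # subtract register 3 (the integer 1)
        result = result * n     # multiply, keeping the running product
        if not n > 1:           # the conditional branch back to the subtract
            break
    return result


print(same_steps())
```

Nothing in this version mentions an address or a register number; deciding where n and result live is left to the language system.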

If you haven't done so already, read the Introduction to the Class for general background information. Read the section on the fundamentals of Fortran and programming principles.



Maintained by John Mahaffy : jhm@cac.psu.edu