Floating Point Numbers


How Computers Represent Real Numbers

Real numbers are stored in computers in a variation of scientific notation. By now you are used to seeing numbers in the form:

6.02 x 10^23

1.533458 x 10^-15

Actually the representation within the computer is always in a form analogous to:

0.602 x 10^24

0.1533458 x 10^-14

The number is arranged so that the digit just to the left of the decimal point is always zero, and the digit just to the right of the decimal point is never zero. The leading part of each of these numbers (0.602 or 0.1533458) is called the mantissa, and the power of 10 (24 or -14) is referred to as the exponent. The total number of significant digits in the mantissa is referred to as the decimal precision of the number. The decimal precision of 0.602 x 10^24 is "3 digits". The decimal precision of 0.1533458 x 10^-14 is "7 digits".
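As a rough illustration of this normalized form, the following Python sketch splits a decimal literal into mantissa, exponent and decimal precision. The function name normalized_parts is made up for this example. It uses the standard decimal module so that the digits are taken exactly as written, and it does not treat zero specially.

```python
from decimal import Decimal

def normalized_parts(text):
    """Split a decimal literal into (mantissa, exponent, precision).

    The mantissa is returned as a string of the form 0.d1d2d3... so that
    the value equals mantissa x 10^exponent.
    """
    sign, digits, exponent = Decimal(text).as_tuple()
    mantissa = "0." + "".join(str(d) for d in digits)
    if sign:
        mantissa = "-" + mantissa
    # Moving the decimal point to the left of all the digits raises the
    # power of ten by the number of digits.
    return mantissa, exponent + len(digits), len(digits)

for literal in ("6.02e23", "1.533458e-15"):
    mantissa, exponent, precision = normalized_parts(literal)
    print(f"{literal} = {mantissa} x 10^{exponent} ({precision} digits)")
```

For the two numbers used in the text this reports 0.602 with exponent 24 (3 digits) and 0.1533458 with exponent -14 (7 digits).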

Of course the computer doesn't deal in decimal numbers internally, so rather than storing a decimal fraction and an exponent of ten, it stores a binary fraction and an exponent of two. For a "single precision" floating point number, this information is stored within a total of 32 bits. The first bit contains the sign of the mantissa (0 for positive and 1 for negative). The next 8 bits store the exponent with a bias such that the binary number 10000000 represents the exponent 1, 10000001 the exponent 2, 01111111 the exponent 0, 01111110 the exponent -1, etc. The remaining 23 bits provide the mantissa, giving an approximate decimal precision of 7 digits. The largest number that can be stored is approximately 3 x 10^38. The smallest positive number is approximately 1 x 10^-38.
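The 1 + 8 + 23 bit layout can be inspected directly. The sketch below is only an illustration: the helper name single_fields is hypothetical, and it assumes the IEEE 754 single precision format with an exponent bias of 127, which is what Python's struct module produces for the "f" format. Special cases (exponent fields of all zeros or all ones) are ignored.

```python
import struct

def single_fields(value):
    """Return (sign, exponent_bits, exponent, fraction_bits) for a single."""
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned integer so the bits can be masked out.
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    sign = word >> 31              # first bit: sign of the mantissa
    biased = (word >> 23) & 0xFF   # next 8 bits: biased exponent
    fraction = word & 0x7FFFFF     # remaining 23 bits: mantissa
    # Assumption: the bias is 127, as in IEEE 754 single precision.
    return sign, f"{biased:08b}", biased - 127, f"{fraction:023b}"

for value in (2.0, 4.0, -8.0):
    print(value, single_fields(value))
```

For 2.0 the exponent field is 10000000 (exponent 1), and for 4.0 it is 10000001 (exponent 2), matching the bias described above.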

For a "double precision" floating point number, the information is stored within a total of 64 bits. The first bit contains the sign of the mantissa (0 for positive and 1 for negative). The next 11 bits store the exponent, and the remaining 52 bits provide the mantissa, giving an approximate decimal precision of 15 digits. The largest number that can be stored is approximately 2 x 10^308. The smallest positive number is approximately 2 x 10^-308.
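These limits can be checked from a running program. The following sketch assumes that Python's float is an IEEE 754 double, which is the case on common platforms but is not guaranteed by the language. The helper to_single is a made-up name for this example.

```python
import struct
import sys

# Double precision limits as reported by the Python runtime.
print("largest double:           ", sys.float_info.max)
print("smallest normalized double:", sys.float_info.min)
print("decimal digits of precision:", sys.float_info.dig)

def to_single(value):
    """Round a Python float to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

# 0.1 has no exact binary representation. The single precision copy
# agrees with the double precision one only in roughly the first
# 7 significant digits.
print(f"double 0.1: {0.1:.17g}")
print(f"single 0.1: {to_single(0.1):.17g}")
```

Comparing the last two lines of output shows where the single precision value starts to diverge from the double precision one.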

Hexadecimal bit patterns of some single precision numbers and the values they represent:

3f800000 +1.000000e+00
40000000 +2.000000e+00
40400000 +3.000000e+00
40800000 +4.000000e+00
40a00000 +5.000000e+00
40c00000 +6.000000e+00
40e00000 +7.000000e+00
41000000 +8.000000e+00
bf800000 -1.000000e+00
c0000000 -2.000000e+00
c0400000 -3.000000e+00
c0800000 -4.000000e+00
c0a00000 -5.000000e+00
c0c00000 -6.000000e+00
c0e00000 -7.000000e+00
c1000000 -8.000000e+00
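A listing like the one above can be regenerated with a few lines of Python. This is a sketch, not necessarily how the original listing was produced. The name single_hex is hypothetical, and the code assumes struct's "f" format is the 32-bit IEEE 754 single.

```python
import struct

def single_hex(value):
    """Return the 32-bit single precision pattern of value as 8 hex digits."""
    # ">f" packs the value as a big-endian 4-byte float, so the hex
    # string reads from the sign bit down to the last mantissa bit.
    return struct.pack(">f", value).hex()

for value in (1.0, 2.0, 3.0, -1.0, -8.0):
    print(f"{single_hex(value)} {value:+e}")
```

Note that each negative number differs from its positive counterpart only in the leading hex digit, because only the sign bit changes.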