Bill's Win32Asm Page

Introduction to Floating Point Programming

This article gives a general overview of the Intel Architecture Floating Point Unit (FPU). It describes the various data types used by the FPU, and also covers the FPU's internal architecture, including all of the internal registers.

FPU Data Formats

The FPU uses 3 different types of data: signed integer, BCD, and floating point. The following table shows the various data types used by the FPU, along with their size and approximate range.

Type            Length    Range
Word Integer    16 bit    -32768 to 32767
Short Integer   32 bit    -2.14e9 to 2.14e9
Long Integer    64 bit    -9.22e18 to 9.22e18
Single Real     32 bit    1.18e-38 to 3.40e38
Double Real     64 bit    2.23e-308 to 1.79e308
Extended Real   80 bit    3.37e-4932 to 1.18e4932
Packed BCD      80 bit    -1e18 to 1e18
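
The integer limits and the single and double precision limits can be cross-checked with a short script. This is a minimal sketch in Python, which is not part of the original article; the helper names signed_range and float_from_hex are made up for illustration. The extended format is left out because Python has no native 80-bit float.

```python
import struct

def signed_range(bits):
    """Smallest and largest two's-complement integers that fit in `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def float_from_hex(hex_bits, fmt):
    """Reinterpret a big-endian hex bit pattern as a float ('>f') or double ('>d')."""
    return struct.unpack(fmt, bytes.fromhex(hex_bits))[0]

print("Word Integer ", signed_range(16))
print("Short Integer", signed_range(32))
print("Long Integer ", signed_range(64))

# Smallest normalized and largest finite magnitudes of the two shorter real formats.
print("Single Real  ", float_from_hex("00800000", ">f"),
      float_from_hex("7F7FFFFF", ">f"))
print("Double Real  ", float_from_hex("0010000000000000", ">d"),
      float_from_hex("7FEFFFFFFFFFFFFF", ">d"))
```

The real ranges in the table are magnitudes of normalized values; each format also has a sign bit, so the same limits apply to negative numbers.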

The following sections describe each of these data types, including how they are internally represented and how they can be defined.

Signed Integers

Positive signed integers are stored in normal format, with the left-most sign bit set to 0. Negative numbers are stored in 2's-complement form, with the left-most sign bit equal to 1. The following listing shows how to define the various integer types, and what their hex representations are:

  0018                  var1   dw        24
  FFFE                  var2   dw        -2
  000004D2              var3   dd        1234
  FFFFFF85              var4   dd        -123
  0000000000002694      var5   dq        9876
  FFFFFFFFFFFFFEBF      var6   dq        -321
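
The 2's-complement encodings can be reproduced outside the assembler. The following is a minimal sketch in Python (not the assembler's own method); twos_complement_hex is a hypothetical helper name. It relies on the fact that masking a negative Python integer yields its 2's-complement bit pattern.

```python
def twos_complement_hex(value, bits):
    """Return the 2's-complement encoding of `value` as bits/4 hex digits."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # Masking keeps the low `bits` bits, which is the 2's-complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits // 4}X")

# Same values as the listing: directive, value, width in bits.
for directive, value, bits in [("dw", 24, 16), ("dw", -2, 16),
                               ("dd", 1234, 32), ("dd", -123, 32),
                               ("dq", 9876, 64), ("dq", -321, 64)]:
    print(twos_complement_hex(value, bits), directive, value)
```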

BCD (Binary Coded Decimal)

BCD numbers are 10 bytes in size. Each number is stored as 18 decimal digits, with 2 digits per byte, in the lower 9 bytes. The highest order byte stores the sign of the number, with the highest order bit of this byte being the sign bit. Note that both positive and negative numbers are stored in true form, and not in complemented form.

The DT directive can be used to define a BCD number. The following listing shows some BCD numbers and their hex representations:

  00000000000000012345         var1   dt    12345
  80000000000000000100         var2   dt    -100
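
Because each BCD digit occupies one hex digit, the hex representation is simply a sign byte followed by the 18 zero-padded decimal digits. The following minimal sketch in Python illustrates this; packed_bcd_hex and packed_bcd_bytes are hypothetical helper names, not part of any assembler.

```python
def packed_bcd_hex(value):
    """Return the 10-byte packed BCD encoding of `value` as 20 hex digits."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    sign_byte = "80" if value < 0 else "00"   # only the top bit carries the sign
    digits = f"{abs(value):018d}"             # true form, never complemented
    return sign_byte + digits

def packed_bcd_bytes(value):
    """Return the bytes as stored in memory (x86 is little-endian)."""
    return bytes.fromhex(packed_bcd_hex(value))[::-1]

print(packed_bcd_hex(12345))
print(packed_bcd_hex(-100))
```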

Floating Point Numbers

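A floating point (or real) number has three parts: a sign bit, a biased exponent, and a significand. The value is stored in binary scientific notation. There are three types of floating point numbers: short (32-bit), long (64-bit), and extended (80-bit). Their fields, from the most significant bit down, are:

Type       Sign    Exponent   Bias     Significand
Short      1 bit   8 bits     127      23 bits (leading 1 implied)
Long       1 bit   11 bits    1023     52 bits (leading 1 implied)
Extended   1 bit   15 bits    16383    64 bits (leading 1 stored)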

Floating point data types can be defined using DD or REAL4 for short, DQ or REAL8 for long, and DT or REAL10 for extended precision floating point numbers. The following listing shows examples of defining various floating point numbers, along with their hex representations:

  C377999A                     var1   dd      -247.6
  40000000                     var2   dd      2.0
  486F4200                     var3   real4   2.45E+5
  4059100000000000             var4   dq      100.25
  3F543BF727136A40             var5   real8   0.001235
  400487F34D6A161E4F76         var6   real10  33.9876
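
These encodings can be checked independently of the assembler. The sketch below is in Python and is an illustration only: real4_hex, real8_hex, and real10_hex are hypothetical helper names. The first two use the standard struct module; real10_hex builds the 80-bit pattern by hand from exact fractions, handles normalized non-zero values only, and assumes round-to-nearest-even.

```python
import struct
from fractions import Fraction

def real4_hex(x):
    """Hex of the 32-bit (short) encoding, most significant byte first."""
    return struct.pack(">f", x).hex().upper()

def real8_hex(x):
    """Hex of the 64-bit (long) encoding, most significant byte first."""
    return struct.pack(">d", x).hex().upper()

def real10_hex(text):
    """Hex of the 80-bit (extended) encoding of a decimal string."""
    value = Fraction(text)
    if value == 0:
        raise ValueError("zero is not handled by this sketch")
    sign = 0x8000 if value < 0 else 0
    value = abs(value)
    # Find the exponent e such that 2**e <= value < 2**(e + 1).
    exponent = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** exponent > value:
        exponent -= 1
    # 64-bit significand with an explicit leading 1; round() is half-to-even.
    significand = round(value * Fraction(2) ** (63 - exponent))
    if significand == 1 << 64:        # rounding carried into a 65th bit
        significand >>= 1
        exponent += 1
    biased = exponent + 16383
    if not 1 <= biased <= 0x7FFE:
        raise ValueError("outside the normalized extended range")
    return f"{sign | biased:04X}{significand:016X}"

print(real4_hex(-247.6), real4_hex(2.0), real4_hex(2.45e5))
print(real8_hex(100.25), real8_hex(0.001235))
print(real10_hex("33.9876"))
```

The extended value is passed as a string so that it is converted exactly, rather than first being rounded to a Python double.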

FPU Data Registers

The FPU is divided into 2 main sections: the control unit and the numeric execution unit. The control unit interfaces the FPU to the main microprocessor. The numeric execution unit is responsible for executing the actual floating point instructions.

Stack Registers

All operands for floating point instructions, as well as all of the results, are stored in an internal stack of eight registers. Instructions either address a specific stack register directly or push and pop values at the top of the stack.

ST(0) refers to the top of the stack. When data is pushed onto the stack, the new data is stored in ST(0), and each existing item moves down by one, so the old ST(0) becomes ST(1). Similarly, when an item is popped from the stack, it is taken from ST(0), and the remaining items move up by one. (Note that ST(0) need not always be physical register 0. The data is not copied on a push or pop; the top-of-stack field in the status register selects which of the eight physical registers is currently ST(0).) For example, after pushing 1.0 and then 2.0, ST(0) holds 2.0 and ST(1) holds 1.0; popping once leaves 1.0 in ST(0).

The stack contains eight 80-bit registers. These always contain 80-bit extended precision floating point numbers. The FPU automatically converts signed integers, BCD, single precision, or double precision numbers to extended form as data is moved between memory and the FPU stack.
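
The push and pop behaviour can be modelled in a few lines. The following is a toy sketch in Python, not a description of the hardware: the class FpuStack and its method names are invented for illustration, and it raises Python exceptions where a real FPU would set the invalid-operation flag in the status register.

```python
class FpuStack:
    """Toy model of the eight-register FPU stack."""

    def __init__(self):
        self.regs = [None] * 8   # physical registers 0..7; None means empty
        self.top = 0             # index of the physical register that is ST(0)

    def push(self, value):
        new_top = (self.top - 1) % 8          # a push decrements the top index
        if self.regs[new_top] is not None:
            raise OverflowError("stack overflow: target register is not empty")
        self.top = new_top
        self.regs[self.top] = value

    def pop(self):
        value = self.st(0)
        self.regs[self.top] = None            # mark the register empty
        self.top = (self.top + 1) % 8         # a pop increments the top index
        return value

    def st(self, i):
        """Read ST(i), counted relative to the current top of the stack."""
        value = self.regs[(self.top + i) % 8]
        if value is None:
            raise LookupError(f"ST({i}) is empty")
        return value

stack = FpuStack()
stack.push(1.0)
stack.push(2.0)
print(stack.st(0), stack.st(1), stack.top)
```

Note that no data moves between registers on a push or pop; only the top index changes, which is why the ST(i) names are relative.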

Status Register

The status register reflects the overall status of the FPU. It is 16 bits wide, and its fields are as follows:

B (bit 15) Busy Bit: Indicates that the FPU is busy executing a task. Newer coprocessors are automatically synchronized with the CPU, so this flag is not normally needed.
C3-C0 (bits 14, 10, 9, 8) Condition Code Bits: Indicate conditions about the coprocessor. These bits have varying meanings, depending on the instruction.
ST (bits 13-11) Top of Stack: Indicates which physical register is currently addressed as the top of the stack. After initialization, this is register 0.
ES (bit 7) Error Summary: Set if any unmasked error bit (PE, UE, OE, ZE, DE, or IE) is set.
PE (bit 5) Precision Error: Indicates that the result could not be represented exactly at the current precision (set in the control register) and was rounded.
UE (bit 4) Underflow Error: Indicates that a non-zero result is too small in magnitude to be represented in the destination format.
OE (bit 3) Overflow Error: Indicates that the result is too large in magnitude to be represented in the destination format.
ZE (bit 2) Zero Error: Indicates that the divisor was 0 while the dividend was finite and non-zero.
DE (bit 1) Denormalized Error: Indicates that at least one of the operands is denormalized.
IE (bit 0) Invalid Error: Indicates stack overflow or underflow, an indeterminate form, or use of a NaN as an operand.

The status register can be accessed using the FSTSW instruction, which copies the register into a word of memory. FSTSW AX can be used to copy the status register into register AX. From there, the contents of the register can be tested. One method of testing these bits is with the TEST instruction. Another method uses SAHF to store the upper byte of the status register (now in AH) in the flags register. SAHF puts C0 into the carry flag, C2 into the parity flag, and C3 into the zero flag.
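
To make the bit positions concrete, the sketch below decodes a status word that has already been copied out with FSTSW. It is a minimal illustration in Python; decode_status_word and flags_after_sahf are hypothetical names, and the bit positions are those of the 80387 and later FPUs (bit 6 is not decoded here).

```python
STATUS_BITS = {"B": 15, "C3": 14, "C2": 10, "C1": 9, "C0": 8, "ES": 7,
               "PE": 5, "UE": 4, "OE": 3, "ZE": 2, "DE": 1, "IE": 0}

def decode_status_word(sw):
    """Split a 16-bit FPU status word into its named fields."""
    fields = {name: (sw >> bit) & 1 for name, bit in STATUS_BITS.items()}
    fields["TOP"] = (sw >> 11) & 7      # 3-bit top-of-stack index, bits 13-11
    return fields

def flags_after_sahf(sw):
    """Model FSTSW AX followed by SAHF: AH is the upper byte of the status word."""
    ah = (sw >> 8) & 0xFF
    return {"CF": ah & 1,               # AH bit 0 = C0
            "PF": (ah >> 2) & 1,        # AH bit 2 = C2
            "ZF": (ah >> 6) & 1}        # AH bit 6 = C3

print(decode_status_word(0x3800))       # TOP = 7, nothing else set
print(flags_after_sahf(0x4000))         # C3 set, so ZF = 1
```

Because C0, C2, and C3 land in the carry, parity, and zero flags, a comparison result can be branched on with the usual unsigned conditional jumps after SAHF.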

The following table describes the meanings of the condition code bits for the instructions FTST, FCOM, FPREM, and FXAM.

Instruction   C3  C2  C1  C0   Meaning
FTST, FCOM    0   0   X   0    ST > Operand
              0   0   X   1    ST < Operand
              1   0   X   0    ST = Operand
              1   1   X   1    ST is not comparable
FPREM         Q1  0   Q0  Q2   Rightmost 3 bits of quotient
              ?   1   ?   ?    Incomplete
FXAM          0   0   0   0    + unnormal
              0   0   0   1    + NAN
              0   0   1   0    - unnormal
              0   0   1   1    - NAN
              0   1   0   0    + normal
              0   1   0   1    + infinity
              0   1   1   0    - normal
              0   1   1   1    - infinity
              1   0   0   0    + 0
              1   0   0   1    Empty
              1   0   1   0    - 0
              1   0   1   1    Empty
              1   1   0   0    + denormal
              1   1   0   1    Empty
              1   1   1   0    - denormal
              1   1   1   1    Empty

Unnormal: Leading bits of the significand are zero.
Denormal: Exponent is at its most negative value.
Normal: Standard floating point form.
NAN: (Not-a-Number) An exponent of all ones and a significand not equal to zero.
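
The condition codes for FCOM/FTST and FXAM can be turned into simple lookups. The sketch below is in Python and is illustrative only; fcom_result and fxam_result are hypothetical names. It assumes the documented Intel encoding, in which C1 carries the sign for FXAM and an equal comparison sets C3 alone.

```python
def condition_codes(sw):
    """Extract (C3, C2, C1, C0) from a 16-bit status word."""
    return (sw >> 14) & 1, (sw >> 10) & 1, (sw >> 9) & 1, (sw >> 8) & 1

# Keys are (C3, C2, C0); C1 is ignored by FCOM and FTST.
FCOM_RESULTS = {(0, 0, 0): "ST > operand",
                (0, 0, 1): "ST < operand",
                (1, 0, 0): "ST = operand",
                (1, 1, 1): "not comparable"}

# Keys are (C3, C2, C0); C1 supplies the sign.
FXAM_CLASSES = {(0, 0, 0): "unnormal", (0, 0, 1): "NaN",
                (0, 1, 0): "normal",   (0, 1, 1): "infinity",
                (1, 0, 0): "0",        (1, 0, 1): "Empty",
                (1, 1, 0): "denormal", (1, 1, 1): "Empty"}

def fcom_result(sw):
    c3, c2, _, c0 = condition_codes(sw)
    return FCOM_RESULTS.get((c3, c2, c0), "undefined")

def fxam_result(sw):
    c3, c2, c1, c0 = condition_codes(sw)
    kind = FXAM_CLASSES[(c3, c2, c0)]
    if kind == "Empty":
        return kind                     # the sign is meaningless for an empty register
    return ("- " if c1 else "+ ") + kind

print(fcom_result(0x4000))
print(fxam_result(0x0600))
```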

Control Register

The control register is used to select the precision, rounding, and infinity control. It is also used to mask the 6 error bits of the status register. Its fields are as follows:

IC (bit 12) Infinity Control: Selects projective or affine infinity. Affine infinity allows positive and negative infinity, while projective assumes a single unsigned infinity. Only the 8087 and 80287 honor this bit; later FPUs always use affine infinity.
  0 = Projective
  1 = Affine
RC (bits 11-10) Rounding Control: Determines the type of rounding:
  00 = Round to nearest or even
  01 = Round down towards minus infinity
  10 = Round up towards positive infinity
  11 = Truncate towards zero
PC (bits 9-8) Precision Control: Sets the precision of the result:
  00 = Single Precision (short)
  01 = Reserved
  10 = Double Precision (long)
  11 = Extended Precision
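PM (bit 5) Precision Error Mask: Determines whether a precision error invokes the exception handler. A 1 masks the error: the FPU sets the status register's PE bit and continues with a default result.
UM (bit 4) Underflow Error Mask: Determines whether an underflow error invokes the exception handler. A 1 masks the error: the FPU sets the status register's UE bit and continues with a default result.
OM (bit 3) Overflow Error Mask: Determines whether an overflow error invokes the exception handler. A 1 masks the error: the FPU sets the status register's OE bit and continues with a default result.
ZM (bit 2) Zero Error Mask: Determines whether a zero error invokes the exception handler. A 1 masks the error: the FPU sets the status register's ZE bit and continues with a default result.
DM (bit 1) Denormalized Error Mask: Determines whether a denormalized error invokes the exception handler. A 1 masks the error: the FPU sets the status register's DE bit and continues with a default result.
IM (bit 0) Invalid Error Mask: Determines whether an invalid operation error invokes the exception handler. A 1 masks the error: the FPU sets the status register's IE bit and continues with a default result.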
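
As an illustration of how these fields pack into one word, the sketch below decodes a control word and changes its rounding mode. It is written in Python with hypothetical helper names (decode_control_word, set_rounding). The example value 0x037F is the control word that FINIT loads on the 80387 and later FPUs; on a real program the word would be read with FSTCW and written back with FLDCW.

```python
ROUNDING = ["nearest", "down", "up", "truncate"]          # RC values 00..11
PRECISION = ["single", "reserved", "double", "extended"]  # PC values 00..11
MASK_BITS = {"IM": 0, "DM": 1, "ZM": 2, "OM": 3, "UM": 4, "PM": 5}

def decode_control_word(cw):
    """Split a 16-bit FPU control word into its named fields."""
    fields = {name: (cw >> bit) & 1 for name, bit in MASK_BITS.items()}
    fields["PC"] = PRECISION[(cw >> 8) & 3]
    fields["RC"] = ROUNDING[(cw >> 10) & 3]
    fields["IC"] = (cw >> 12) & 1
    return fields

def set_rounding(cw, mode):
    """Return `cw` with the RC field (bits 11-10) replaced by `mode`."""
    return (cw & ~0x0C00 & 0xFFFF) | (ROUNDING.index(mode) << 10)

default = 0x037F
print(decode_control_word(default))
print(hex(set_rounding(default, "truncate")))
```

Clearing only the RC field before OR-ing in the new value leaves the precision setting and the error masks untouched, which is the usual pattern when temporarily switching to truncation for a float-to-integer conversion.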

Tag Register

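The tag register is used to indicate the contents of each register of the FPU stack. The tag indicates whether the register is valid, zero, infinity/invalid, or empty. The tag register is 16 bits wide and holds a 2-bit tag for each of the eight physical registers, with bits 1-0 describing register 0 and bits 15-14 describing register 7.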

Tag Register Values:
00 = Valid
01 = Zero
10 = Invalid or Infinity
11 = Empty

The only way to view the tag register is by storing the FPU environment using either the FSTENV or FSAVE instruction. These instructions store the tag register along with the other FPU data. (FLDENV and FRSTOR load it back.)
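
Once the tag word has been extracted from the stored environment, decoding it is a matter of reading two bits per register. The following minimal sketch in Python shows the idea; decode_tag_word is a hypothetical name, and the example values assume the tag layout of two bits per physical register, with register 0 in the lowest bits.

```python
TAG_NAMES = ["Valid", "Zero", "Invalid or Infinity", "Empty"]  # tag values 00..11

def decode_tag_word(tw):
    """Return the tag name for physical registers 0..7 of a 16-bit tag word."""
    return [TAG_NAMES[(tw >> (2 * reg)) & 3] for reg in range(8)]

# 0xFFFF: every register empty, as after FINIT.
print(decode_tag_word(0xFFFF))
# 0x3FFF: physical register 7 holds a valid number, the rest are empty.
print(decode_tag_word(0x3FFF))
```

The tags are indexed by physical register, not by ST(i), so the top-of-stack field of the status register is needed to relate a tag to a stack position.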

Conclusion

This article should have provided a good introduction to the internal registers of the FPU. It should also have described the data formats used by the FPU. The next article will introduce the FPU's instruction set.

April 12, 2001
Copyright (C) 2001 by Bill Tyler (billasm@usa.net)

  asmcode@hotmail.com