For those used to C's flexible data types and convenient arithmetic, understanding the underlying floating-point operations may seem pointless. In an era dominated by visual tools, how many people still care about the so-called low level?
For AfOs, floating-point operations are a very important part of programming, because we may face some moderately complicated calculations. If you are a die-hard asm advocate like me and do not want to take the easy way out with C, you can certainly imagine the pain of computing sin(2.3) with integer operations in asm. In fact, the processor has already prepared a solution for us: floating-point operations. But there is little material on floating-point operations available at the moment, and I believe many people have run into this.
(1) Floating-point numbers
This part is mainly from Bill's article.
Before that, let's look at a few terms:
FPU -> Floating-point unit
BCD -> Binary-coded decimal. A packed decimal number uses four bits to represent each digit 0~9, so one byte holds two decimal digits; for example, 1000 1001 represents 89.
Scientific notation: for what this means exactly, look it up in a junior high or primary school math textbook :)
Floating-point operations use three different kinds of data:
1) Integers, divided into word integers, short integers and long integers.
2) Real numbers, in single precision, double precision and extended precision.
3) Packed decimal numbers (BCD).
The following table gives the length in bits of each type and the approximate range it can represent.
Type              Length     Range
------------------------------------------------------
Word integer      16 bits    -32768 to 32767
Short integer     32 bits    -2.14e9 to 2.14e9
Long integer      64 bits    -9.22e18 to 9.22e18
Single real       32 bits    1.18e-38 to 3.40e38
Double real       64 bits    2.23e-308 to 1.79e308
Extended real     80 bits    3.37e-4932 to 1.18e4932
Packed BCD        80 bits    -1e18 to 1e18
The ranges of double-precision and extended-precision numbers are large enough for general applications!
1) Integers are stored in two's complement form. The two's complement of a positive number is the number itself; that of a negative number is obtained by inverting every bit of its absolute value and then adding 1. The following are examples of actual storage:
0018                 var1 dw 24
FFFE                 var2 dw -2
000004D2             var3 dd 1234
FFFFFF85             var4 dd -123
0000000000002694     var5 dq 9876
FFFFFFFFFFFFFEBF     var6 dq -321
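The storage rule for integers can be checked by hand with a few lines of code. The Python sketch below is only an illustration and not part of the original listing; the helper name twos_complement_hex is my own. It masks the value to the operand width, which gives the same bits as inverting the absolute value and adding 1.

```python
def twos_complement_hex(value: int, bits: int) -> str:
    """Return the two's complement encoding of value as an uppercase hex string."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking to the operand width is equivalent to "invert and add 1" for negatives.
    return format(value & ((1 << bits) - 1), f"0{bits // 4}X")


# dw = 16 bits, dd = 32 bits, dq = 64 bits
for value, bits in [(24, 16), (-2, 16), (1234, 32), (-123, 32), (9876, 64), (-321, 64)]:
    print(f"{value:>6} in {bits} bits: {twos_complement_hex(value, bits)}")
```

Note that decimal 24 is 0018 in hexadecimal, so that is what a dw 24 definition stores.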
2) BCD numbers
In the FPU, a packed BCD number occupies 80 bits, exactly the width of a floating-point register. Its storage format is as follows (bit 79 is the sign; bits 78 to 72 are unused):
 79    72 71                                      0
|  Sign  |           18 decimal digits             |
|________|_________________________________________|
Please look at the following examples:
00000000000000012345 var1 dt 12345
80000000000000000100 var2 dt -100
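To make the packed BCD layout concrete, here is a small Python sketch (an illustration only; the function name packed_bcd_hex is hypothetical). It builds the 10 bytes as they would appear in a listing, most significant byte first: one sign byte followed by 18 decimal digits, one digit per nibble.

```python
def packed_bcd_hex(value: int) -> str:
    """Encode an integer as 80-bit packed BCD, most significant byte first."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    sign_byte = "80" if value < 0 else "00"   # bit 79 is the sign, bits 78..72 are unused
    digits = f"{abs(value):018d}"             # 18 decimal digits, 4 bits each
    return sign_byte + digits


print(packed_bcd_hex(12345))   # value of "dt 12345"
print(packed_bcd_hex(-100))    # value of "dt -100"
```

Because each decimal digit maps to one hex digit, the encoded value reads exactly like the decimal number, padded with zeros to 20 hex digits.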
3) Floating-point numbers. These are a bit more complex and come in three formats.
Single precision:
  31    30       23  22                        0
| Sign | Exponent  |        Significand         |
Double precision:
  63    62       52  51                        0
| Sign | Exponent  |        Significand         |
Extended precision:
  79    78       64  63                        0
| Sign | Exponent  |        Significand         |
Example:
C377999A              var1 dd -247.6
40000000              var2 dd 2.0
486F4200              var3 real4 2.45e+5
4059100000000000      var4 dq 100.25
3F543BF727136A40      var5 real8 0.001235
400487F34D6A161E4F76  var6 real10 33.9876
Both DD and real4 define 4-byte single-precision floating-point numbers in asm.
Both DQ and real8 define 8-byte double-precision floating-point numbers in asm.
Both DT and real10 define 10-byte extended-precision floating-point numbers in asm.
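The single-precision and double-precision encodings can be verified without an assembler. The following Python sketch is only a cross-check under my own helper names (single_hex, double_hex, split_single); it relies on the standard struct module, which packs IEEE 754 single and double values. Python has no 80-bit type, so the extended format is left out.

```python
import struct


def single_hex(x: float) -> str:
    """Hex encoding of x as a 4-byte real (dd / real4), most significant byte first."""
    return struct.pack(">f", x).hex().upper()


def double_hex(x: float) -> str:
    """Hex encoding of x as an 8-byte real (dq / real8), most significant byte first."""
    return struct.pack(">d", x).hex().upper()


def split_single(x: float):
    """Return (sign, biased exponent, fraction) of the single-precision encoding."""
    bits = int.from_bytes(struct.pack(">f", x), "big")
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


print(single_hex(-247.6), single_hex(2.0), single_hex(2.45e5))
print(double_hex(100.25), double_hex(0.001235))
print(split_single(-247.6))   # sign 1, exponent 134 = 127 + 7, 23 fraction bits
```

For -247.6 the biased exponent 134 means the value is 1.xxx times 2 to the power 7, which matches 247.6 lying between 128 and 256.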
(2) The floating-point unit
Functionally, the FPU is divided into two parts: a control unit and an arithmetic unit. The control unit mainly interfaces with the CPU, while the arithmetic unit carries out the actual arithmetic operations.
The FPU contains 8 general-purpose registers, 5 error pointer registers and 3 control registers.
1) The eight general-purpose registers, 80 bits each, form a register stack. All results are kept in this stack, and all data in it is in 80-bit extended-precision format; BCD, integer, single-precision and double-precision values are automatically converted to 80-bit extended precision by the FPU when they are loaded. Note that the top of the stack is written ST(0), followed by ST(1) through ST(7).
It behaves like an ordinary stack, only 80 bits wide. It looks like this:
 _______________________
|         ST(0)         |
|_______________________|
|         ST(1)         |
|_______________________|
|         ......        |
|         ......        |
|         ST(i)         |
|_______________________|
2) Control registers. The FPU has three control registers: the status register, the control register and the tag register.
Status register -> SW
  15   14  13-11   10    9    8    7    6    5    4    3    2    1    0
| B  | C3 | TOP  | C2 | C1 | C0 | ES | SF | PE | UE | OE | ZE | DE | IE |
|____|____|______|____|____|____|____|____|____|____|____|____|____|____|
B: the floating-point unit is busy.
C0-C3: condition codes that reflect the result of a floating-point operation; their meaning depends on the instruction.
TOP: points to the register that is currently the top of the stack (0 after initialization).
ES: error summary; set when any of the unmasked exception bits below it (PE, UE, OE, ZE, DE or IE) is set.
SF: stack fault; the invalid-operation error was caused by stack overflow or underflow.
PE: precision error; the result cannot be represented exactly.
UE: underflow; the result is too small to be represented.
OE: overflow; the result is too large to be represented in the current precision.
ZE: divide-by-zero error.
DE: denormal operand; at least one operand is not normalized.
IE: invalid operation, such as stack overflow or underflow, an invalid operand, etc.
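When reading a status word out of a debugger, it helps to split it into fields mechanically. The Python sketch below is an illustration only; decode_status_word is a hypothetical helper, and the sample value 3804h is made up for the demonstration. The bit positions follow the standard x87 layout, in which bit 6 is the stack-fault flag SF.

```python
EXCEPTION_BITS = ["IE", "DE", "ZE", "OE", "UE", "PE"]   # status word bits 0..5


def decode_status_word(sw: int) -> dict:
    """Split a 16-bit x87 status word into its named fields."""
    return {
        "exceptions": [name for bit, name in enumerate(EXCEPTION_BITS) if (sw >> bit) & 1],
        "SF": (sw >> 6) & 1,      # stack fault
        "ES": (sw >> 7) & 1,      # error summary
        "C0": (sw >> 8) & 1,
        "C1": (sw >> 9) & 1,
        "C2": (sw >> 10) & 1,
        "TOP": (sw >> 11) & 7,    # 3-bit top-of-stack pointer
        "C3": (sw >> 14) & 1,
        "B": (sw >> 15) & 1,      # busy
    }


# Made-up sample: TOP = 7 (one value pushed after FINIT) and a divide-by-zero flag.
print(decode_status_word(0x3804))
```

TOP counts downward: it is 0 after FINIT and becomes 7 after the first load, because a push decrements it modulo 8.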
Control register:
 15-13   12  11-10   9-8   7-6    5    4    3    2    1    0
|      | IC |  RC  | PC  |     | PM | UM | OM | ZM | DM | IM |
|______|____|______|_____|_____|____|____|____|____|____|____|
IC: infinity control; it no longer has any effect on the 486.
RC: rounding control
00 = round to nearest (ties go to the even value)
01 = round toward negative infinity
10 = round toward positive infinity
11 = truncate toward 0
PC: precision control
00 = single precision
01 = reserved
10 = double precision
11 = extended precision
PM~IM: mask the errors indicated by the low 6 bits of the status register. A bit set to 1 masks the corresponding error.
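The control word fields can be decoded in the same mechanical way. The sketch below is an illustration in Python with hypothetical helper names (decode_control_word, with_rounding). It uses 037Fh, the value the control word holds after FINIT, and shows how the RC field would be changed to truncation, which in asm is usually done with FSTCW, an OR on the stored word, and FLDCW.

```python
ROUNDING = {0: "nearest", 1: "toward -infinity", 2: "toward +infinity", 3: "toward zero"}
PRECISION = {0: "single", 1: "reserved", 2: "double", 3: "extended"}
MASK_BITS = ["IM", "DM", "ZM", "OM", "UM", "PM"]   # control word bits 0..5


def decode_control_word(cw: int) -> dict:
    """Split a 16-bit x87 control word into masks, precision and rounding."""
    return {
        "masked": [name for bit, name in enumerate(MASK_BITS) if (cw >> bit) & 1],
        "PC": PRECISION[(cw >> 8) & 3],
        "RC": ROUNDING[(cw >> 10) & 3],
    }


def with_rounding(cw: int, rc: int) -> int:
    """Return cw with the 2-bit RC field (bits 11..10) replaced by rc."""
    return (cw & ~0x0C00) | ((rc & 3) << 10)


print(decode_control_word(0x037F))          # state after FINIT
print(hex(with_rounding(0x037F, 3)))        # same word with truncation selected
```

After FINIT all six errors are masked, precision is extended and rounding is to nearest, which is why simple programs can ignore the control word entirely.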
Tag register:
Every 2 bits describe the state of the corresponding stack register:
 15  14                                      1   0
| tag 7 | ................................. | tag 0 |
|_______|___________________________________|_______|
Meaning:
00 = valid
01 = zero
10 = invalid or infinity
11 = empty
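As with the other two registers, a tag word is easy to decode two bits at a time. This Python sketch is only an illustration (decode_tag_word is a hypothetical name; the label "special" stands for the 10 encoding). Tag i occupies bits 2i+1 and 2i, and after FINIT the tag word is FFFFh, meaning every register is empty.

```python
TAG_NAMES = {0: "valid", 1: "zero", 2: "special", 3: "empty"}


def decode_tag_word(tw: int) -> list:
    """Return the state of registers 0..7 from a 16-bit tag word."""
    return [TAG_NAMES[(tw >> (2 * i)) & 3] for i in range(8)]


print(decode_tag_word(0xFFFF))   # after FINIT: all eight registers empty
print(decode_tag_word(0x3FFF))   # register 7 holds a valid value, the rest are empty
```

The second value is what you would expect after a single load following FINIT, since the first push lands in physical register 7.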
(3) The floating-point instruction set and floating-point program design under MASM
In fact, the most important and most difficult material has already been covered in parts (1) and (2). What follows is included for completeness. If you are new to floating-point instructions, reading the summary below will do no harm. One topic this article does not cover is floating-point exception handling, because it involves protected mode, interrupts, task switching, SEH and more, and I believe introducing it here would only confuse people further. Besides, I am not yet able to explain these topics thoroughly myself.
Floating-point programming is a big topic. I will only briefly describe how to do it in Masm32V7 (/V6). Because the floating-point unit is built into CPUs from the 486 onward, floating-point instructions can be used directly in a program. Here is a small example:
__MASMSTD equ 1
.386p
.model flat, stdcall
option casemap:none ; case sensitive
include c:\hd\hd.h
include c:\hd\mac.h
;;--------------------------------
.data
num1 dq 12345
num2 dq 98765
res dd 0
.data?
buf db 200 dup(?)
;;--------------------------------
.code
__Start:
finit ; initialize the floating-point unit
fild num1 ; load num1
fild num2 ; load num2
fmul ; multiply
fist res ; store the result as an integer
invoke wsprintf, addr buf, CTEXT("Result: %ld"), res
invoke StdOut, addr buf ; console output, so link with /SUBSYSTEM:CONSOLE
invoke StdIn, addr buf, 20
invoke ExitProcess, 0
END __Start
How you use these instructions depends on your own computation and algorithm. Note that FPU registers always hold values in extended-precision format, so the result of an integer computation must be stored back with an integer store instruction at the end to obtain a correct integer value. The FPU performs these conversions automatically.
The floating-point instruction set falls into five categories: data transfer, arithmetic, transcendental functions, comparison, and environment and system control.
I do not want to list the operands and usage of every instruction, because it would take too much effort (I type in pinyin! :D). See the reference materials mentioned at the end of the article for the details; beyond that, I cannot help you.
1) Data transfer instructions
These instructions mainly load data from memory into the floating-point register file; the destination is generally ST(0), the top of the stack. You can see this clearly in a debugger. Note that an instruction ending in P pops the stack after the operation completes, so the former content of ST(1) becomes the content of ST(0). Keep this in mind and you can easily design flexible programs.
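The push and pop behaviour is easier to follow with a toy model. The Python class below is only a sketch of the stack discipline, not of the real hardware: the class name FpuStack and its methods are my own, values are ordinary Python floats rather than 80-bit reals, and only a handful of instructions are modelled.

```python
class FpuStack:
    """Toy model of the x87 register stack; regs[0] plays the role of ST(0)."""

    def __init__(self):
        self.regs = []

    def fld(self, value):
        """Push a value; it becomes ST(0) and the old ST(0) becomes ST(1)."""
        if len(self.regs) == 8:
            raise OverflowError("stack overflow: all 8 registers are in use")
        self.regs.insert(0, float(value))

    def fst(self):
        """Store ST(0) without popping."""
        return self.regs[0]

    def fstp(self):
        """Store ST(0), then pop: the old ST(1) becomes ST(0)."""
        return self.regs.pop(0)

    def fmulp(self):
        """ST(1) = ST(1) * ST(0), then pop, so the product ends up in ST(0)."""
        st0 = self.regs.pop(0)
        self.regs[0] *= st0

    def fxch(self):
        """Exchange ST(0) and ST(1)."""
        self.regs[0], self.regs[1] = self.regs[1], self.regs[0]


fpu = FpuStack()
fpu.fld(12345)      # like fild num1
fpu.fld(98765)      # like fild num2; 12345 is now in ST(1)
fpu.fmulp()         # one value left on the stack
print(fpu.fstp())   # stack is empty again afterwards
```

A popping store at the end leaves the stack empty, which is a good habit: values left behind eventually cause a stack overflow on a later load.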
Load:
FLD pushes a real number onto the stack.
FILD converts a two's complement integer to a real number and pushes it onto the stack.
FBLD converts a BCD number to a real number and pushes it onto the stack.
Store:
FST stores the floating-point number at the top of the stack.
FSTP stores the floating-point number at the top of the stack and pops the stack.
FIST converts the top of the stack to an integer and stores it.
FISTP converts the top of the stack to an integer, stores it and pops the stack.
FBSTP converts the top of the stack to BCD, stores it and pops the stack.
Exchange:
FXCH exchanges the top two stack elements.
Constant load:
FLD1 loads the constant 1.0.
FLDZ loads the constant 0.0.
FLDPI loads the constant pi (= 3.1415926..., accurate enough to be used safely).
FLDL2E loads the constant log2(e).
FLDL2T loads the constant log2(10).
FLDLG2 loads the constant log10(2).
FLDLN2 loads the constant loge(2).
I do not want to list the detailed format of every floating-point instruction, because it is unnecessary! Plenty of material already covers these formats. All floating-point instructions start with F; LD stands for load, ILD for integer load, and BLD for BCD load, so they are easy to remember, and the function of many instructions is obvious from the mnemonic.
2) Arithmetic instructions
Addition:
FADD/FADDP add / add and pop
FIADD integer add
Subtraction:
FSUB/FSUBP subtract / subtract and pop
FSUBR/FSUBRP subtract / subtract and pop, with reversed operands
FISUB integer subtract
FISUBR integer subtract with reversed operands
Multiplication:
FMUL/FMULP multiply / multiply and pop
FIMUL integer multiply
Division:
FDIV/FDIVP divide / divide and pop
FIDIV integer divide
FDIVR/FDIVRP divide / divide and pop, with reversed operands
FIDIVR integer divide with reversed operands
Others:
FABS computes the absolute value.
FCHS changes the sign.
FRNDINT rounds to an integer.
FSQRT computes the square root.
FSCALE scales the top of the stack by a power of 2.
FXTRACT separates the exponent and the significand.
FPREM computes the partial remainder.
FPREM1 computes the partial remainder in IEEE format.
If an instruction has no explicit operands, the default operands are ST(0) and ST(1). For instructions with the R suffix, the normal operand order is reversed; for example, FSUB computes X-Y while FSUBR computes Y-X.
3) Transcendental function instructions
Trigonometric functions:
FSIN computes the sine.
FCOS computes the cosine.
FSINCOS computes the sine and the cosine together.
FPTAN computes the partial tangent.
FPATAN computes the partial arctangent.
Logarithm class:
FYL2X computes y times the base-2 logarithm of x.
FYL2XP1 computes y times the base-2 logarithm of (x+1).
F2XM1 computes (2^x)-1.
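The logarithm instructions look odd until you see how they combine with the load-constant instructions. The Python sketch below models the arithmetic only, using ordinary double-precision floats; every function name in it is my own stand-in for the instruction of the same name. It shows the usual identities: ln(x) = loge(2) * log2(x), log10(x) = log10(2) * log2(x), and x^y = 2^(y * log2(x)), where the exponent is split into an integer part for FSCALE and a fraction for F2XM1.

```python
import math

LN2 = math.log(2.0)       # the constant FLDLN2 loads
LG2 = math.log10(2.0)     # the constant FLDLG2 loads


def fyl2x(y: float, x: float) -> float:
    """Model of FYL2X: y * log2(x)."""
    return y * math.log2(x)


def f2xm1(x: float) -> float:
    """Model of F2XM1: 2^x - 1, with the argument limited to -1..+1."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("F2XM1 argument must lie in -1..+1")
    return 2.0 ** x - 1.0


def ln(x: float) -> float:
    return fyl2x(LN2, x)              # fldln2 / fld x / fyl2x


def log10(x: float) -> float:
    return fyl2x(LG2, x)              # fldlg2 / fld x / fyl2x


def power(x: float, y: float) -> float:
    """x^y for x > 0 via 2^(y*log2(x))."""
    t = fyl2x(y, x)                   # exponent as a base-2 power
    n = round(t)                      # integer part, as FRNDINT would give
    frac = t - n                      # fraction in -0.5..+0.5, valid for F2XM1
    return math.ldexp(f2xm1(frac) + 1.0, n)   # FSCALE multiplies by 2^n


print(ln(10.0), log10(1000.0), power(2.0, 10.0), power(3.0, 2.5))
```

This is why F2XM1 subtracts 1 and only accepts a small argument: it is meant to be one step in such a sequence, not a general exponential.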
4) Comparison instructions
FCOM compare
FCOMP compare and pop
FICOM integer compare
FICOMP integer compare and pop
FTST compare the top of the stack with zero
FUCOM unordered compare
FUCOMP unordered compare and pop
FXAM sets the condition code bits according to the value at the top of the stack.
FSTSW stores the status word.
These instructions set C0~C3 according to the result; the description of the status register above did not cover this in detail. C1 is used to detect stack overflow or underflow. C0 corresponds to CF in EFLAGS and behaves in essentially the same way; C2 corresponds to PF and C3 to ZF. You will often see the following instruction sequence:
FSTSW ax
SAHF
JZ label
Why? Because these instructions copy the high byte of the status word into the low byte of EFLAGS, which puts C0 in the CF position, C2 in the PF position and C3 in the ZF position.
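The bit shuffling done by FSTSW AX followed by SAHF can be traced in a few lines. The Python sketch below is an illustration with hypothetical helper names (sahf_flags, fcom_result). It assumes the usual FCOM convention: C3 C2 C0 = 000 when ST(0) is greater, 001 when it is less, 100 when the operands are equal and 111 when they cannot be compared.

```python
def sahf_flags(status_word: int) -> dict:
    """Flags as SAHF would set them after 'fstsw ax'."""
    ah = (status_word >> 8) & 0xFF        # AH receives the high byte of the status word
    return {
        "CF": ah & 1,                     # AH bit 0 = C0
        "PF": (ah >> 2) & 1,              # AH bit 2 = C2
        "ZF": (ah >> 6) & 1,              # AH bit 6 = C3
    }


def fcom_result(status_word: int) -> str:
    """Interpret the flags the way jp / je / jb / ja would after a compare."""
    flags = sahf_flags(status_word)
    if flags["PF"]:
        return "unordered"                # jp taken: at least one operand is a NaN
    if flags["ZF"]:
        return "equal"                    # je / jz taken
    if flags["CF"]:
        return "less"                     # jb taken: ST(0) < source
    return "greater"                      # ja taken: ST(0) > source


for sw in (0x0000, 0x0100, 0x4000, 0x4500):
    print(f"{sw:04X}h -> {fcom_result(sw)}")
```

The practical consequence is that floating-point compares are followed by the unsigned jumps (JA, JB, JE), not the signed ones.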
5) Environment and system control instructions
FLDCW loads the control word.
FSTCW stores the control word.
FSTSW stores the status word.
FLDENV loads the environment block.
FSTENV stores the environment block.
FSAVE saves the coprocessor state.
FRSTOR restores the coprocessor state.
FINIT initializes the coprocessor.
FCLEX clears the exception flags.
FINCSTP increments the stack pointer.
FDECSTP decrements the stack pointer.
FFREE marks a register as free.
FNOP no operation
FWAIT waits for the floating-point instruction to complete.
I really do not want to go into more detail, because the format and usage of these instructions are described thoroughly in the fphelp.hlp file in the help directory of Masm32V7. There are of course many other lists of instruction formats; I have included mine only for completeness. One harder problem remains: displaying floating-point numbers. Windows has no ready-made function for this and wsprintf can only display integers, but many libraries support it, such as the floating-point development package on the LYB homepage. Of course, you can look into that once you are familiar with the basics.
For debugging floating-point programs, Softice is recommended, because Trw does not support displaying the floating-point stack. There is now an FPU plug-in available online that partly solves the problem, but it is not easy to use. The choice is yours.