The Representation and Operation of Integers in Computers
There are two main encodings for integers. One can represent only non-negative numbers; the other can represent negative numbers, zero and positive numbers. They correspond to the unsigned and signed integer types in C. Java supports only signed integers.

Unsigned encoding is based on ordinary binary notation: the unsigned value of an ω-bit vector is obtained by reading it as a binary number. Writing the ω-bit vector as $\vec{x} = [x_{\omega-1}, x_{\omega-2}, \ldots, x_0]$, unsigned encoding maps it to a non-negative integer:

$$B2U_\omega(\vec{x}) = \sum_{i=0}^{\omega-1} x_i 2^i$$

For example:

$$B2U_4([1011]) = 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 11$$

According to the above formula, the value range of ω-bit unsigned encoding is $[0, 2^\omega - 1]$.
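The unsigned mapping can be sketched in C as a direct sum of bit weights. This is only an illustration: the function names `b2u` and `umax` are hypothetical, and the sketch assumes a vector of at most 32 bits stored in the low bits of a `uint32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low w bits of `bits` as an unsigned number by summing
 * x_i * 2^i. `b2u` is an illustrative name, not a library function. */
uint64_t b2u(uint32_t bits, int w)
{
    assert(w >= 1 && w <= 32);            /* assumption of this sketch */
    uint64_t value = 0;
    for (int i = 0; i < w; i++) {
        uint64_t x_i = (bits >> i) & 1u;  /* bit i of the vector */
        value += x_i << i;                /* add x_i * 2^i */
    }
    return value;
}

/* Largest value representable in w unsigned bits: 2^w - 1. */
uint64_t umax(int w)
{
    assert(w >= 1 && w <= 32);
    return (UINT64_C(1) << w) - 1;
}
```

Calling `b2u(0xB, 4)` walks the bits of 1011 and adds 8, 2 and 1, and `umax(4)` gives the upper end of the 4-bit range.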

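In many cases we need negative values, and the most common computer representation of signed numbers is two's complement. In two's complement encoding, the most significant bit is interpreted as having negative weight. Writing the ω-bit vector as $\vec{x} = [x_{\omega-1}, x_{\omega-2}, \ldots, x_0]$, two's complement encoding maps it to an integer:

$$B2T_\omega(\vec{x}) = -x_{\omega-1} 2^{\omega-1} + \sum_{i=0}^{\omega-2} x_i 2^i$$

For example:

$$B2T_4([1011]) = -1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = -5$$

The value range of ω-bit two's complement is $[-2^{\omega-1}, 2^{\omega-1} - 1]$.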
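A minimal C sketch of the two's complement mapping follows. It differs from a plain binary sum only in that the top bit carries weight -2^(ω-1). The names `b2t`, `tmin` and `tmax` are hypothetical, and the sketch assumes at most 32 bits held in the low bits of a `uint32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low w bits of `bits` as a two's complement number.
 * `b2t` is an illustrative name, not a library function. */
int64_t b2t(uint32_t bits, int w)
{
    assert(w >= 1 && w <= 32);            /* assumption of this sketch */
    int64_t value = 0;
    for (int i = 0; i < w - 1; i++) {
        int64_t x_i = (bits >> i) & 1u;   /* bit i, positive weight 2^i */
        value += x_i * (INT64_C(1) << i);
    }
    int64_t sign_bit = (bits >> (w - 1)) & 1u;
    value -= sign_bit * (INT64_C(1) << (w - 1));  /* negative weight */
    return value;
}

/* Smallest w-bit two's complement value: -2^(w-1). */
int64_t tmin(int w)
{
    assert(w >= 1 && w <= 32);
    return -(INT64_C(1) << (w - 1));
}

/* Largest w-bit two's complement value: 2^(w-1) - 1. */
int64_t tmax(int w)
{
    assert(w >= 1 && w <= 32);
    return (INT64_C(1) << (w - 1)) - 1;
}
```

With ω = 4, the pattern 1011 decodes to -8 + 2 + 1, while 0101 decodes to the same value it has as an unsigned number, because its top bit is 0.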

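C allows casts between different numeric data types, including between signed and unsigned types. Historically the C standard did not require a particular representation for signed numbers, and it leaves the result of converting an out-of-range unsigned value to a signed type implementation-defined. In practice almost all machines use two's complement, and most systems follow the same rule when converting between signed and unsigned numbers of the same word size: the bit pattern is kept unchanged and only its interpretation changes. In terms of values:

$$T2U_\omega(x) = \begin{cases} x + 2^\omega, & x < 0 \\ x, & x \ge 0 \end{cases} \qquad U2T_\omega(u) = \begin{cases} u, & u \le 2^{\omega-1} - 1 \\ u - 2^\omega, & u > 2^{\omega-1} - 1 \end{cases}$$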

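Let's use an example to illustrate. With ω = 16, the signed value -12345 has the bit pattern 0xCFC7. Read as an unsigned number, the same 16 bits represent 53191, which is -12345 + 65536.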
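The same-word-size conversion can be sketched in C for 16-bit values. The helper names `t2u16` and `u2t16` are hypothetical. The sketch relies on `int16_t` being a two's complement type (which `<stdint.h>` guarantees for the exact-width types) and uses `memcpy` to reinterpret the bits, so it does not depend on how a particular compiler handles an out-of-range cast to a signed type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Signed to unsigned, same width. C defines this conversion as the value
 * modulo 2^16, which leaves the two's complement bit pattern unchanged. */
uint16_t t2u16(int16_t x)
{
    return (uint16_t)x;
}

/* Unsigned to signed, same width: copy the 16 bits and read them as
 * two's complement. */
int16_t u2t16(uint16_t u)
{
    int16_t x;
    memcpy(&x, &u, sizeof x);
    /* Cross-check against the value rule: u if u <= TMax, else u - 2^16. */
    int32_t expected = (u <= INT16_MAX) ? (int32_t)u : (int32_t)u - 65536;
    assert(x == expected);
    return x;
}
```

For instance, `t2u16(-12345)` and `u2t16(53191)` are inverse views of the single bit pattern 0xCFC7.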

The conversion rules between signed and unsigned numbers in C, combined with the usual arithmetic conversions, can lead to some strange behavior. If an expression contains both an unsigned and a signed operand, the signed operand is implicitly converted to unsigned. This has little effect on arithmetic operations, but for relational operators such as < and > it produces counterintuitive results. For example, with a 32-bit int, -1 < 0U evaluates to 0 (false) because -1 is converted to 4294967295U, and 2147483647U > -2147483647-1 also evaluates to 0 because the right-hand side is converted to 2147483648U.
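The effect on relational operators can be sketched with three small C functions. All names here are hypothetical, and the internal check assumes the common case where UINT_MAX equals 2 * INT_MAX + 1. Many compilers warn about the mixed comparison in `less_mixed`; that comparison is deliberate here, since it is the behavior being illustrated.

```c
#include <assert.h>
#include <limits.h>

/* What a signed operand becomes before a mixed comparison: its value
 * modulo 2^w, so every negative input lands above INT_MAX. */
unsigned converted_operand(int a)
{
    unsigned u = (unsigned)a;
    assert((a < 0) == (u > (unsigned)INT_MAX));
    return u;
}

/* The comparison as C evaluates it: `a` is implicitly converted to
 * unsigned before `<` is applied. */
int less_mixed(int a, unsigned b)
{
    return a < b;
}

/* A comparison by mathematical value: any negative int is smaller than
 * any unsigned value, otherwise compare as unsigned. */
int less_by_value(int a, unsigned b)
{
    return a < 0 || (unsigned)a < b;
}
```

Comparing -1 with 0U through `less_mixed` reports that -1 is not smaller, whereas `less_by_value` gives the answer a reader would expect. Handling the negative case explicitly is one possible way to avoid the surprise, not the only one.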

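In the example in the previous section, we wrote -2147483647-1 for the minimum value that 32-bit two's complement can represent. The definitions of INT_MAX and INT_MIN in the C standard library header limits.h take the same form: a typical implementation defines INT_MAX as 2147483647 and INT_MIN as (-INT_MAX - 1), rather than writing -2147483648 directly.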

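This is related to how C determines the actual type of an integer constant. The type depends on the constant's value, its radix, its suffix letters, and the ranges of the integer types in the particular C implementation. For constants without a suffix, the candidate types are tried in the following order:

C90, decimal constant: int, long, unsigned long
C90, octal or hexadecimal constant: int, unsigned, long, unsigned long
C99, decimal constant: int, long, long long
C99, octal or hexadecimal constant: int, unsigned, long, unsigned long, long long, unsigned long long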

The type of the constant is the first type in the applicable list that can represent its value without overflow. In addition, the C standard specifies that an integer constant begins with a digit and may carry a prefix that specifies its radix; it has no sign. In other words, the value of an integer constant is always non-negative. If a minus sign appears in front of it, that sign is a unary operator applied to the constant, not part of the constant.

Under the C90 rules with 32-bit int and long, 2147483648 exceeds the maximum that int and long can hold, so it takes the type unsigned long. Negating it with the preceding minus operator yields a result that is still unsigned long. (Under the C99 rules the constant becomes long or long long instead, so the negated result is still not an int.) Likewise, 0x80000000 takes the type unsigned, and it is still unsigned after negation. This is why Tmin is defined in C as -2147483647-1 rather than as -2147483648 or 0x80000000.
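On a C11 compiler, the types of these expressions can be inspected with `_Generic`. The following is a sketch under stated assumptions: a 32-bit int (checked at compile time) and C99 or later typing rules, under which 2147483648 becomes long or long long rather than unsigned long. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

static_assert(INT_MAX == 2147483647, "this sketch assumes a 32-bit int");

/* Evaluate to 1 when the expression has the named type, else 0. */
#define IS_INT(expr)      _Generic((expr), int: 1, default: 0)
#define IS_UNSIGNED(expr) _Generic((expr), unsigned int: 1, default: 0)

/* -2147483647 - 1: both constants fit in int, so the result is an int. */
int safe_form_is_int(void)        { return IS_INT(-2147483647 - 1); }

/* -2147483648: the constant 2147483648 does not fit in int, so the
 * negated expression has a wider (or, in C90, unsigned) type. */
int direct_form_is_int(void)      { return IS_INT(-2147483648); }

/* 0x80000000: a hexadecimal constant that fits in unsigned but not int. */
int hex_form_is_unsigned(void)    { return IS_UNSIGNED(0x80000000); }

/* Negating an unsigned value yields an unsigned value. */
int negated_hex_is_unsigned(void) { return IS_UNSIGNED(-0x80000000); }

/* The safe form has exactly the value of INT_MIN. */
int safe_form_equals_int_min(void) { return (-2147483647 - 1) == INT_MIN; }
```

Only the first spelling produces an expression of type int with the value of Tmin, which matches the form used in limits.h.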

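Converting between data types with different word sizes involves extending or truncating numbers, and only the following rules need to be observed:

Extending an unsigned number: zero extension, in which the new high-order bits are filled with 0.
Extending a two's complement number: sign extension, in which the new high-order bits are filled with copies of the most significant bit.
Truncating a number to k bits: the high-order bits are discarded and the low k bits are kept. For an unsigned number this equals the value modulo 2^k; for a two's complement number the remaining k bits are reinterpreted as two's complement.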
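Extension and truncation between 16-bit and 32-bit values can be sketched in C as follows. The function names are hypothetical. Signed truncation is done through an unsigned intermediate and `memcpy`, so the sketch does not rely on implementation-defined narrowing casts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Zero extension: widening an unsigned value fills the new bits with 0. */
uint32_t zero_extend16(uint16_t u)
{
    return (uint32_t)u;
}

/* Sign extension: widening a signed value copies the sign bit upward,
 * so the numeric value is preserved. */
int32_t sign_extend16(int16_t x)
{
    return (int32_t)x;
}

/* Unsigned truncation to k bits: keep the low k bits, i.e. x mod 2^k. */
uint32_t truncate_unsigned(uint32_t x, int k)
{
    assert(k >= 1 && k <= 32);
    if (k == 32)
        return x;
    return x & ((UINT32_C(1) << k) - 1);
}

/* Signed truncation to 16 bits: keep the low 16 bits, then read them
 * as a two's complement number. */
int16_t truncate_to_int16(int32_t x)
{
    uint16_t low = (uint16_t)(uint32_t)x;  /* well defined: mod 2^16 */
    int16_t result;
    memcpy(&result, &low, sizeof result);
    return result;
}
```

Note that `truncate_to_int16(53191)` changes the value, because 53191 does not fit in 16 signed bits; the low 16 bits 0xCFC7 are simply reinterpreted.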

This article summarizes how integers are represented in computers and how they behave during type conversion and arithmetic. Understanding these points pays off when writing programs: it helps avoid bugs caused by data types, or makes it possible to trace such bugs to their bit-level behavior and fix them. Later, I will write similar summaries for floating-point numbers, characters and strings, in order to thoroughly understand how information is encoded in a computer.

Of course, this article does not cover the mathematical principles and bit-level behavior of integer addition, subtraction, multiplication and division. Generally speaking, the integer arithmetic performed by a computer is modular arithmetic: the finite word size limits the range of representable values, so a result may overflow. In addition, whether the operands are unsigned or two's complement, integer operations have exactly the same or very similar bit-level behavior. For more details and the underlying principles, see the integer arithmetic material in Chapter 2 of "Computer Systems: A Programmer's Perspective" (published in Chinese under a title that translates as "Deep Understanding of Computer Systems").
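As a small illustration of the modular behavior and the shared bit-level addition, here is a hedged C sketch for 32-bit values. The function names are hypothetical. Two's complement addition is carried out on the unsigned bit patterns, because overflow of a signed `+` is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unsigned addition wraps around modulo 2^32. */
uint32_t uadd32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Two's complement addition: add the bit patterns modulo 2^32, then
 * read the 32 result bits as a signed number. */
int32_t tadd32(int32_t a, int32_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    int32_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}

/* Detect overflow from the signs: two positive operands giving a
 * negative sum, or two negative operands giving a non-negative sum. */
int tadd32_overflows(int32_t a, int32_t b)
{
    int32_t s = tadd32(a, b);
    int overflow = (a > 0 && b > 0 && s < 0) || (a < 0 && b < 0 && s >= 0);
    /* Cross-check against exact 64-bit arithmetic. */
    int64_t exact = (int64_t)a + (int64_t)b;
    assert(overflow == (exact < INT32_MIN || exact > INT32_MAX));
    return overflow;
}
```

Adding 1 to the largest 32-bit signed value through `tadd32` wraps to the smallest one, and the bits produced by `tadd32(-1, -1)` are the same bits `uadd32` produces for the corresponding unsigned operands.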