Character types in the C language
Character data in C can be divided into two categories: character data and string data. Character data refers to a single character enclosed in single quotation marks, such as 'a', '2' and '&'. String data refers to a series of characters enclosed in double quotation marks, such as "good", "0132" and "a".
1. Basic type definition
Type specifier: char
2. Storage and value range of character data
The values of character data are the characters of the ASCII character set. One character occupies 1 byte of storage, and what is actually stored is the ASCII code value (that is, an integer value) of the corresponding character.
ASCII (American Standard Code for Information Interchange):
ASCII codes use a specified combination of 7-bit or 8-bit binary numbers to represent 128 or 256 possible characters. The standard ASCII code, also known as the basic ASCII code, uses 7-bit binary numbers to represent all uppercase and lowercase letters, numbers 0 to 9, punctuation marks and special control characters used in American English.
These include:
0 to 31 and 127 (33 characters in total) are control characters or communication-specific characters (the rest are displayable characters). Examples of control characters: LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace) and BEL (bell). Examples of communication-specific characters: SOH (start of heading), EOT (end of transmission), ACK (acknowledgement), etc. The ASCII values 8, 9, 10 and 13 correspond to backspace, tab, line feed and carriage return, respectively. They have no graphic representation of their own, but they affect how text is displayed, depending on the application.
32 to 126 (95 characters in total) are displayable characters (32 is the space), where 48 to 57 are the ten Arabic numerals 0 to 9.
65 to 90 are the 26 uppercase English letters, 97 to 122 are the 26 lowercase English letters, and the rest are punctuation marks and operator symbols.
Also note that when a standard ASCII code is stored in an 8-bit byte, its most significant bit (b7) can be used as a parity bit. Parity checking is a method of detecting errors during code transmission, and it comes in two forms: odd parity and even parity. Odd parity stipulates that the number of 1 bits in a byte of a correct code must be odd; if it is not, the highest bit b7 is set to 1. Even parity stipulates that the number of 1 bits in a byte of a correct code must be even; if it is not, the highest bit b7 is set to 1.
The last 128 codes (128 to 255) are called extended ASCII codes. Many x86-based systems support the use of extended (or "high") ASCII. Extended ASCII uses the 8th bit of each character to identify an additional 128 special symbols, foreign-language letters and graphic symbols.
3. Representation method of character data
Character data is stored in the computer as the binary form of the character's ASCII code value, and one character occupies 1 byte of storage. Because an ASCII code is formally an integer between 0 and 255, character types and integers can be used interchangeably in C. For example, the ASCII code value of the character 'a' is 1100001 in binary and 97 in decimal. The character 'a' is actually stored as the integer 97, so it can take part in arithmetic with integers, be assigned to an integer variable, and be used either as a character or as an integer. When output as a character, the ASCII code value is first converted into the corresponding character and then output; when output as an integer, the ASCII code value itself is output directly.
Grammatically speaking, C provides three character types in total: char, signed char and unsigned char. Each is typically 8 bits long. The value range of signed char is -128 to 127 and that of unsigned char is 0 to 255. Whether plain char is signed or unsigned is implementation-defined, so its range matches one of the other two (it is signed on many common x86 compilers). Because character data is mainly used to process characters, it cannot be combined with the long or short modifiers.
Character data:
A single character enclosed in single quotation marks, such as 'a', '%', ':' and '9'. Constants such as '12' or 'ABC' are not single characters; their value is implementation-defined and they should be avoided.
String data:
A single character or a series of characters enclosed in double quotation marks, such as "good", "0132", "w1" and "a". Please note that "a" is a string, not a character.
To make it easier for a C program to tell where a string ends, the system appends an end marker, the null character '\0', when storing each string; its ASCII code value is 0.
It causes no operation and produces no output, so the number of bytes needed to store a string is the length of the string plus 1.
You can't create a variable of type string in C because "string" is not a type.
By definition, a "string" is "a contiguous sequence of characters terminated by and including the first null character". It is not a data type, but a data format.
A char array can contain a string. A char * can point to a string. Neither of them is a string.
The main points can be summarized as follows:
1. The character type is called char.
2. The character type contains 256 integers in total, and each integer can represent a character (for example, 'd' or '&'); these integers and characters are completely interchangeable.
3. The ASCII code table lists the correspondence between all integers and characters.
'a' 97
'A' 65
'0' 48
4. All lowercase English letters are arranged consecutively in the ASCII code table, with 'a' corresponding to the smallest integer and 'z' corresponding to the largest integer.
All uppercase English letters and Arabic numerals also follow this rule.
5. 'd' - 'a' equals 'D' - 'A'
'd' - 'a' equals '3' - '0' equals 3 - 0.
6. All character data are divided into two groups, each with 128 characters. The correspondence between the first group of characters and integers is the same on all computers, and the integers corresponding to these characters range from 0 to 127.
7. On different computers, the correspondence between the other group of characters and integers may differ. The integers corresponding to these characters may range from -128 to -1 or from 128 to 255.
'\n' line feed character
'\r' carriage return character
'\\' the \ character
'\'' the ' character
'\"' the " character
8. The name of the short integer is short. This type contains 65,536 different integers, half of which are negative and the other half are non-negative. These numbers spread in both directions around the number 0.
9. The long integer type is called long. On our computer this type contains 2 to the 32nd power different integers, half of which are negative and the other half non-negative. These numbers spread in both directions around 0.
10. The integer type is called int. On our computer, int and long are exactly the same.
11. The above types are called signed types.
12. Every signed type has a corresponding unsigned type. The name of an unsigned type is formed by writing unsigned before the name of the signed type (such as unsigned char, unsigned int, etc.).
13. Except that it contains no negative numbers, each unsigned type contains the same number of integers as the corresponding signed type.
14. The ranges of the integer types overlap, and each larger type covers a wider range than the smaller ones.
15. Appending u to a number without a decimal point in a program indicates that the number has an unsigned integer type.
16. C uses floating-point types to represent numbers with a decimal point.
17. Floating-point types are divided into single-precision and double-precision floating-point types.
18. The double-precision floating-point type can record more digits after the decimal point.
19. The name of a single-precision floating-point type is float.
20. The name of a double-precision floating-point type is double.
21. A number with a decimal point in a program is double-precision floating-point by default.
22. If the number with decimal point is followed by f, it means that the type of the number is single-precision floating-point type.
23. New data types can be created in C language. These created data types are called composite data types and need to be created before use.
24. The Boolean type introduced in the C99 specification contains two integers: 0 is called false, 1 is called true, and the type itself is called bool (these names come from <stdbool.h>).
25. Any integer in C can be used as a Boolean value. When used this way, 0 is false and every other integer is true.
26. A program does not need to use the Boolean type; integers can simply be used as Boolean values.
27. Correspondence between data types and placeholders
1. char and unsigned char %c
2. short %hd
3. unsigned short %hu
4. int %d
5. unsigned int %u
6. long %ld
7. unsigned long %lu
8. float %f or %g
9. double %lf or %lg
%f and %lf keep trailing zeros after the decimal point; %g and %lg do not.
28. One of the main differences between different types of storage areas is that they contain different numbers of bytes.
29. The sizeof keyword can be used to calculate the number of bytes contained in a storage area.
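char and unsigned char: 1 byte
short and unsigned short: 2 bytes
int and unsigned int: 4 bytes
long and unsigned long: 4 bytes
float: 4 bytes
double: 8 bytes
These are the sizes on our computer. The C standard fixes only sizeof(char) as 1, and long is 8 bytes on many 64-bit systems.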
30. Anything that can be used as a number can be written in parentheses after the sizeof keyword.
31. A modification of a storage area written inside the parentheses of the sizeof keyword does not actually happen.
/*
 * sizeof keyword demonstration
 */
#include <stdio.h>

int main() {
    int num = 0;
    /* sizeof yields a size_t, which is printed with %zu */
    printf("sizeof(int) is %zu\n", sizeof(int));
    printf("sizeof(num) is %zu\n", sizeof(num));
    printf("sizeof(6 + 7) is %zu\n", sizeof(6 + 7));
    sizeof(num = 10); /* the operand is not evaluated, so num stays 0 */
    printf("num is %d\n", num);
    return 0;
}