What is PID PID, PIDD under linux? What is their relationship and application?
When you run a command under Linux, the system assigns that running instance an ID, called the PID (process ID). The process is also given a set of permissions based on the user who started the command and on what the command does, and what the process is allowed to do depends on those permissions.

A brief introduction to Linux processes

Linux is a multitasking operating system: multiple processes can run at the same time. Readers who know something about computer hardware will know that an ordinary single-CPU computer can execute only one instruction at any given moment, so how does Linux run several processes "simultaneously"? It uses a technique called process scheduling. Each process is first assigned a running time, usually very short, on the order of milliseconds. Then, according to certain rules, one of the processes is chosen to run while the others wait. When the running process uses up its time, exits, or is suspended for some reason, Linux reschedules and selects the next process to run. Because each time slice is so short, from the user's point of view it looks as if many processes are running at once.

In Linux, each process is assigned a data structure called the process control block (PCB) when it is created. The PCB contains a great deal of information needed for scheduling and running the process. The most important item is the process ID (also called the process identifier), a non-negative integer that uniquely identifies a process in the Linux operating system. On the i386 architecture (the one used by PCs), this non-negative integer ranges from 0 to 32767. The process ID is, in effect, the process's identity card number: just as every person's ID number is different, every process's ID is different.

One or more processes can be combined into a process group, and one or more process groups can be combined into a session. This lets us operate on processes in batches: for example, sending a signal to a process group delivers that signal to every process in the group.

Finally, let's use the ps command to see for ourselves how many processes are running on the system:

$ ps -aux

(The following is the output on my computer; yours will almost certainly be different.)

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.1  0.4  1412  520 ?        S    May15   0:04 init [3]
root         2  0.0  0.0     0    0 ?        SW   May15   0:00 [keventd]
root         3  0.0  0.0     0    0 ?        SW   May15   0:00 [kapm-idled]
root         4  0.0  0.0     0    0 ?        SWN  May15   0:00 [ksoftirqd_CPU0]
root         5  0.0  0.0     0    0 ?        SW   May15   0:00 [kswapd]
root         6  0.0  0.0     0    0 ?        SW   May15   0:00 [kreclaimd]
root         7  0.0  0.0     0    0 ?        SW   May15   0:00 [bdflush]
root         8  0.0  0.0     0    0 ?        SW   May15   0:00 [kupdated]
root         9  0.0  0.0     0    0 ?        SW<  May15   0:00 [mdrecoveryd]
root        13  0.0  0.0     0    0 ?        SW   May15   0:00 [kjournald]
root       132  0.0  0.0     0    0 ?        SW   May15   0:00 [kjournald]
root       673  0.0  0.4  1472  592 ?        S    May15   0:00 syslogd -m 0
root       678  0.0  0.8  2084 1116 ?        S    May15   0:00 klogd -2
rpc        698  0.0  0.4  1552  588 ?        S    May15   0:00 portmap
rpcuser    726  0.0  0.6  1596  764 ?        S    May15   0:00 rpc.statd
root       839  0.0  0.4  1396  524 ?        S    May15   0:00 /usr/sbin/apmd -p
root       908  0.0  0.7  2264 1000 ?        S    May15   0:00 xinetd -stayalive
root       948  0.0  1.5  5296 1984 ?        S    May15   0:00 sendmail: accepti
root       967  0.0  0.3  1440  484 ?        S    May15   0:00 gpm -t ps/2 -m /d
wnn        987  0.0  2.7  4732 3440 ?        S    May15   0:00 /usr/bin/cserver
root      1005  0.0  0.5  1584  660 ?        S    May15   0:00 crond
wnn       1025  0.0  1.9  3720 2488 ?        S    May15   0:00 /usr/bin/tserver
xfs       1079  0.0  2.5  4592 3216 ?        S    May15   0:00 xfs -droppriv -da
daemon    1115  0.0  0.4  1444  568 ?        S    May15   0:00 /usr/sbin/atd
root      1130  0.0  0.3  1384  448
root      1131  0.0  0.3  1384  448 tty2     S    May15   0:00 /sbin/mingetty tt
root      1132  0.0  0.3  1384  448 tty3     S    May15   0:00 /sbin/mingetty tt
root      1133  0.0  0.3  1384  448 tty4     S    May15   0:00 /sbin/mingetty tt
root      1134  0.0  0.3  1384  448 tty5     S    May15   0:00 /sbin/mingetty tt
root      1135  0.0  0.3  1384  448 tty6     S    May15   0:00 /sbin/mingetty tt
root      8769  0.0  0.6  1744  812 ?        S    00:08   0:00 in.telnetd: 192.1
root      8770  0.0  0.9  2336 1184 pts/0    S    00:08   0:00 login -- lei
lei       8771  0.1  0.9  2432 1264 pts/0    S    00:08   0:00 -bash
lei       8809  0.0  0.6  2764  808 pts/0    R    00:09   0:00 ps -aux

Apart from the header line, each line above represents one process. The PID column shows the process ID of each process, and the COMMAND column shows the process name or the command line that was invoked in the shell. I won't explain the meaning of the other columns here; interested readers can consult the relevant references.

getpid

In the 2.4.4 kernel, getpid is system call number 20. Its prototype in the Linux function library is:

#include <sys/types.h> /* provides the definition of the pid_t type */
#include <unistd.h>    /* provides the declaration of the function */

pid_t getpid(void);

getpid does something very simple: it returns the process ID of the current process. Consider the following example:

/* getpid_test.c */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("The current process ID is %d.\n", getpid());
    return 0;
}

Careful readers may notice that this program does not include the header file sys/types.h. That is because the program never uses the pid_t type, which is the type of a process ID. In fact, on the i386 architecture (that of an ordinary PC), pid_t is fully compatible with int, so a pid_t value can be handled like any integer, for example by printing it with "%d".

Compile and run the program getpid_test.c:

$ gcc getpid_test.c -o getpid_test
$ ./getpid_test
The current process ID is 1980.

(The number you see will probably be different, which is normal.)

Run it again:

$ ./getpid_test
The current process ID is 1981.

As we can see, although it is the same program, a different process ID is assigned on each run.

fork

In the 2.4.4 kernel, fork is system call number 2. Its prototype in the Linux function library is:

#include <sys/types.h> /* provides the definition of the pid_t type */
#include <unistd.h>    /* provides the declaration of the function */

pid_t fork(void);

From the name fork alone, few people would guess what it does. The fork system call duplicates a process: when a process calls it, two almost identical processes exist once the call completes, and so we obtain a new process. The name is said to come from the way this flow of control resembles the shape of a fork.

In Linux, fork, the call we are introducing here, is the basic way to create a new process (the related vfork and clone calls are variations on the same idea). Some library functions, such as system(), also appear to create new processes, but if you read their source code you will find that they rely on this same mechanism internally. The same is true of the programs we run from the command line: the new process is created by the shell calling fork. fork has some interesting properties, which we can explore with a small program.

/* fork_test.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    /* There is only one process at this point */
    pid = fork();
    /* Now there are two processes running */
    if (pid < 0)
        printf("Error in fork!\n");
    else if (pid == 0)
        printf("I am the child process, and my process ID is %d.\n", getpid());
    else
        printf("I am the parent process, and my process ID is %d.\n", getpid());
    return 0;
}

Compile and run:

$ gcc fork_test.c -o fork_test
$ ./fork_test
I am the parent process, and my process ID is 1991.
I am the child process, and my process ID is 1992.

When reading this program, keep one idea firmly in mind: before the statement pid = fork(), only one process is executing this code, but after it there are two. The code of the two processes is exactly the same, and the next statement each of them executes is the if test on pid.

Of these two processes, the original one is called the "parent process" and the new one the "child process". They differ not only in their process IDs but also in the value of the variable pid, which stores the return value of fork. One of the curious things about fork is that it is called once but returns twice, and it has three possible kinds of return value:

In the parent process, fork returns the process ID of the newly created child process;

In the child process, fork returns 0;

If an error occurs, fork returns a negative value.

There are two possible reasons for fork to fail:

(1) The number of processes has reached the limit set by the system, in which case errno is set to EAGAIN. (2) The system is out of memory, in which case errno is set to ENOMEM. (For the meaning of errno, please refer to the first article in this series.)

fork very rarely fails. When it does, it is usually for the first reason. If the second occurs, the system has no memory left to allocate and is on the verge of collapse, which is rare for Linux.

By now the sharp reader has probably worked out the rest of the code. If pid is less than 0, an error has occurred. If pid == 0, fork returned 0, so the current process is the child and it executes the "I am the child process" printf. Otherwise (else), the current process is the parent and it executes the "I am the parent process" printf. Perfectionists may find this wasteful, because each process contains a statement it will never execute. Don't worry about it too much: many years ago the pioneers of UNIX wrote programs in exactly this way on machines whose memory was unimaginably small by today's standards. With our "massive" memory, we can safely put those few bytes of worry behind us.

Some readers may still have a question: if the child and the parent are almost identical after fork, and new processes are created by fork, then aren't all processes in the system the same? And what do we do when we want to run a new program? Experience with Linux tells us that this problem does not arise in practice. How it is solved is a topic we will leave for detailed discussion later.

exit

In the 2.4.4 kernel, exit is system call number 1. Its prototype in the Linux function library is:

#include <stdlib.h>

void exit(int status);

Unlike fork, exit reveals its purpose in its name: this system call terminates a process. Wherever it appears in a program, as soon as the exit system call is executed the process stops all remaining work, cleans up its data structures, including the PCB, and ends. Consider the following program:

/* exit_test1.c */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    printf("This process will exit!\n");
    exit(0);
    printf("Never show!\n");
}

Run after compilation:

$ gcc exit_test1.c -o exit_test1
$ ./exit_test1
This process will exit!

We can see that the program did not print "Never show!", because the process had already terminated when exit(0) was executed.

The exit system call takes an integer parameter, status, which conveys the state in which the process ended, for example whether it ended normally or abnormally. By convention, 0 means the process ended normally, and any other value means that an error occurred and the process ended abnormally. In practice, we can use the wait system call to receive the child's return value and handle each case accordingly. The details of wait will be introduced later.

exit and _exit

As system calls, _exit and exit are twin brothers. Just how similar they are can be seen in the Linux source code:

#define __NR__exit __NR_exit /* from line 334 of file include/asm-i386/unistd.h */

"__NR_" is the prefix given to each system call number in the Linux source code. Note that the first exit is preceded by two underscores and the second by only one.

At this point, readers who know C and are thinking clearly will say that there is then no difference between _exit and exit. There is a difference, however, and it lies mainly in how the two are defined in the function library. The prototype of _exit in the Linux function library is:

#include <unistd.h>

void _exit(int status);

In contrast, exit() is declared in stdlib.h, while _exit() is declared in unistd.h. Judging by the names, stdlib.h seems to sit at a slightly higher level than unistd.h. So what is the difference between the two functions? Let's look at a flow chart first; the following figure gives a more intuitive picture of how these two calls execute.

As the figure shows, _exit() does the simplest thing possible: it stops the process immediately, releases the memory it was using, and destroys its data structures in the kernel. exit() is a wrapper around this that performs several extra steps before exiting. For this reason, some people argue that exit can no longer be regarded as a pure system call.

The most important difference between exit() and _exit() is that, before invoking the exit system call, exit() checks which files are open and writes the contents of their buffers back to the files. This is the "clearing the I/O buffer" step in the figure.

The Linux standard function library contains a set of functions known as "high-level I/O"; the familiar printf(), fopen(), fread(), and fwrite() are among them. They are also called "buffered I/O", because each open file has a buffer in memory. When a file is read, extra data is read into the buffer, so that the next read can be served directly from memory. Likewise, each write goes only to the buffer in memory. When certain conditions are met (for example, a certain number of characters has accumulated, or a particular character such as a newline or the end-of-file marker EOF is encountered), the contents of the buffer are written to the file in one go. This greatly improves read and write speed, but it also makes programming a little trickier. Some data that we believe has been written to the file may in fact still be sitting in the buffer because those conditions have not been met. If we then terminate the process with _exit(), the data in the buffer is lost. So, to guarantee data integrity, we must use exit().

Consider these two programs:

/* exit2.c */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    printf("Output start\n");
    printf("contents in buffer");
    exit(0);
}

Compile and run:

$ gcc exit2.c -o exit2
$ ./exit2
Output start
contents in buffer

/* _exit1.c */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("Output start\n");
    printf("contents in buffer");
    _exit(0);
}

Compile and run:

$ gcc _exit1.c -o _exit1
$ ./_exit1
Output start

In Linux, standard input and standard output are both treated as files. They are special files, but from the programmer's point of view they are no different from ordinary files that store data on disk. Like any other file, each has its own buffer once it is opened.

Think about why these two programs produce different results in light of the earlier discussion. If you have followed the explanation so far, the conclusion should come easily.

In this article we have gained a preliminary understanding of Linux process management and, on that basis, learned four system calls: getpid, fork, exit, and _exit. In the next article we will look at other system calls related to Linux process management and explore the topic further.

In the previous article we covered the concepts of parent and child processes and learned how to use the exit system call. Few people realize, however, that after a process calls exit it does not disappear immediately; instead it leaves behind a data structure called a zombie. Among the five states of a Linux process, the zombie state is a special one: the process has given up almost all of its memory, has no executable code, and cannot be scheduled. It merely keeps a slot in the process list that records its exit status and other information for another process to collect, and beyond that it occupies no memory at all. Seen this way, although the name sounds cool, a zombie process is far less menacing than its namesake. Real zombies are always frightening, whereas a zombie process has no effect on the system beyond leaving behind some information for others to mourn over.

Readers may still be curious about this new concept, so let's see what a zombie process looks like in Linux.

When a process has exited but its parent has not yet called the wait system call (described later) to collect it, it remains in the zombie state. Using this fact, let's write a small program:

/* zombie.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    pid = fork();
    if (pid < 0) {              /* if an error occurred */
        printf("An error occurred!\n");
    } else if (pid == 0) {      /* if this is the child process */
        exit(0);
    } else {                    /* if this is the parent process */
        sleep(60);              /* sleep for 60 seconds; the parent does nothing meanwhile */
        wait(NULL);             /* collect the zombie process */
    }
    return 0;
}

The sleep function puts the process to sleep for the specified number of seconds. During these 60 seconds the child has already exited while the parent is busy sleeping and cannot collect it, so the child stays in the zombie state for 60 seconds.

Compile this program:

$ gcc zombie.c -o zombie

Run it in the background so that we can go on to execute the next command:

$ ./zombie &
[1] 1577

List the processes in the system:

$ ps -ax
......
 1177 pts/0    S      0:00 -bash
 1577 pts/0    S      0:00 ./zombie
 1578 pts/0    Z      0:00 [zombie]
 1579 pts/0    R      0:00 ps -ax

See the "Z" in the middle? That is the mark of a zombie process: process 1578 is now a zombie.

We have learned that the exit system call makes a process exit, but it only goes as far as turning a normal process into a zombie; it cannot destroy the process completely. Although a zombie has almost no effect on other processes, uses no CPU time, and consumes a negligible amount of memory, having it linger is still unsettling. Moreover, the number of processes in a Linux system is limited, and in some special cases too many zombies can prevent new processes from being created. So how do we get rid of these zombie processes?

First, let's look at where zombie processes come from. Linux and UNIX have always been close relatives, and the concept of the zombie process was inherited from UNIX. The pioneers of UNIX did not design it because they were bored and wanted to annoy other programmers. A zombie process holds a lot of information that is very important to programmers and system administrators. First, how did the process die: did it exit normally, hit an error, or get killed by another process? Second, how much system CPU time and user CPU time did it use in total? How many page faults did it incur, and how many signals did it receive? All of this information is stored in the zombie process. Imagine there were no zombie processes: the moment a process exited, all information about it would vanish, and a programmer or system administrator who needed it could only stare helplessly.

So how do we collect this information and put these zombie processes to rest? That is the job of the waitpid and wait calls that we discuss next. Both collect the information left behind by a zombie process and make the process disappear for good. These two calls are described in detail below.