The example goes as follows. A pigsty holds two pigs, a big one and a little one. There is a pedal on one side of the pigsty; each time it is pressed, a small amount of food drops into a trough on the far side, away from the pedal. Whichever pig presses the pedal gives the other pig the chance to reach the food first. If the little pig presses the pedal, the big pig will have eaten all the food just before the little pig reaches the trough. If the big pig presses the pedal, it still has a chance to run to the trough before the little pig has finished the food, and to compete for the remaining half.
So what strategy will the two pigs adopt? The answer: the little pig will choose the "hitchhiking" (free-riding) strategy and wait comfortably at the trough, while the big pig runs tirelessly between the pedal and the trough for nothing more than a few leftovers.
Why? Because the little pig gets nothing by pressing the pedal but gets food by not pressing it. For the little pig, not pressing is the better choice whether the big pig presses or not. The big pig, in turn, knows that the little pig will not press the pedal. Pressing it himself is better than nobody pressing at all, so he has to do it himself.
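The reasoning above can be checked with a small sketch. The story gives no numbers, so the payoff values below are purely illustrative assumptions, chosen only so that the big pig eats everything when the little pig presses and the little pig gets a fair share when the big pig presses. The names `PAYOFFS`, `dominant_action` and `pure_nash_equilibria` are likewise hypothetical.

```python
from itertools import product

ACTIONS = ("press", "wait")

# Assumed payoffs (big pig, little pig), keyed by (big pig's action, little pig's action).
# These numbers are NOT from the text; they only mimic the story.
PAYOFFS = {
    ("press", "press"): (5, 1),
    ("press", "wait"): (4, 4),
    ("wait", "press"): (9, -1),
    ("wait", "wait"): (0, 0),
}

def payoff(player, own, other):
    """Payoff to `player` (0 = big pig, 1 = little pig) given its own and the other pig's action."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]

def dominant_action(player):
    """Action that is strictly better whatever the other pig does, or None."""
    for mine in ACTIONS:
        if all(payoff(player, mine, opp) > payoff(player, alt, opp)
               for alt in ACTIONS if alt != mine
               for opp in ACTIONS):
            return mine
    return None

def pure_nash_equilibria():
    """Action pairs (big, little) from which neither pig gains by deviating alone."""
    return [
        (big, little)
        for big, little in product(ACTIONS, repeat=2)
        if all(payoff(0, big, little) >= payoff(0, alt, little) for alt in ACTIONS)
        and all(payoff(1, little, big) >= payoff(1, alt, big) for alt in ACTIONS)
    ]

print(dominant_action(1))      # wait
print(dominant_action(0))      # None
print(pure_nash_equilibria())  # [('press', 'wait')]
```

Under these assumed numbers the little pig has a dominant action (wait), the big pig has none, and the only equilibrium is that the big pig presses while the little pig waits, which matches the story.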
The phenomenon of "the little pig lies down while the big pig runs" is caused by the rules of the game in the story. The core indicators of the rules are the amount of food dropped each time and the distance from the pedal to the trough.
If we change the core indicators, will the pigsty still show the same scene of "the little pig lies down while the big pig runs"? Let us try.
Scheme 1: reduction. Only half the original amount of food is dropped. As a result, neither the little pig nor the big pig presses the pedal. If the little pig presses, the big pig finishes the food; if the big pig presses, the little pig finishes the food. Whoever presses is simply donating food to the other, so neither has any motivation to press.
If the goal is to make pigs pedal more, the design of this game rule is obviously a failure.
Scheme 2: increment. Twice the original amount of food is dropped. As a result, both the little pig and the big pig will press the pedal. Whoever wants to eat presses, since the other pig cannot eat all the food in one go. The two pigs are in effect living in a society of relative material abundance, and their sense of competition is weak.
For the designer of the rules, this scheme is costly (a double portion of food each time); moreover, because competition is weak, it does little to make the pigs press the pedal more often.
Scheme 3: reduction plus relocation. Only half the original amount of food is dropped, and at the same time the trough is moved next to the pedal. As a result, both the little pig and the big pig press the pedal as hard as they can. A pig that waits does not eat, and a pig that works more gets more. Each portion is just enough to be eaten up on the spot.
This is the best solution for game designers. The cost is not high, but the harvest is the biggest.
The original story of the "Smart Pig Game" suggests that, for the weaker competitor (the little pig), waiting is the best strategy. For society, however, the allocation of resources is not optimal when the little pig hitchhikes, because the little pig does not take part in the competition. To allocate resources as efficiently as possible, the designers of rules, whether a government or the boss of a company, do not want to see anyone hitchhiking. Whether hitchhiking can be eliminated depends on whether the core indicators of the rules are set properly.
For example, suppose a company's incentive scheme is too generous, with shares and options on top, so that every employee becomes a millionaire. Quite apart from the high cost, the employees' enthusiasm is not necessarily high. This corresponds to the increment scheme of the "Smart Pig Game". If, on the other hand, the reward is weak and shared by everyone (including the "little pigs" who do not work), the big pigs who used to work hard will lose their motivation, just as in the reduction scheme (scheme 1) of the "Smart Pig Game". The best incentive design resembles scheme 3, reduction plus relocation: rewards are not shared by everyone but tied to the individual (for example, commission in proportion to sales). This saves cost for the company, eliminates hitchhiking, and provides an effective incentive.
Many people have never heard the story of the "Smart Pig Game", yet they are consciously using the little pig's strategy. In the stock market, retail investors wait for the big players to push the price up so that they can ride along; in industry, speculative capital waits for a profitable new product to appear and then copies it on a large scale for huge profits; in companies, some people create no value but share in the results; and so on. Those who set the rules of economic management therefore need to understand why changing the indicators of the "Smart Pig Game" changes the outcome.
Then there is the prisoner's dilemma.
The police arrested two suspects, A and B, but did not have enough evidence to charge them. So the police held the suspects separately, interviewed each one alone, and offered both the same options:
If one suspect confesses and testifies against the other (called "betrayal" in the related terminology) while the other stays silent, the one who confessed is released immediately and the silent one is sentenced to 10 years in prison.
If both stay silent (called mutual "cooperation" in the related terminology), each is sentenced to six months in prison.
If both testify against each other ("mutual betrayal"), each is sentenced to two years in prison.
Summarized in the following table:
                             | A stays silent (cooperation)                 | A confesses (betrayal)
B stays silent (cooperation) | Both serve six months                        | A is released immediately; B serves 10 years
B confesses (betrayal)       | A serves 10 years; B is released immediately | Both serve two years
Explanation
Like other examples in game theory, the prisoner's dilemma assumes that each participant (each "prisoner") is self-interested: each seeks the greatest benefit for himself and does not care about the other participant's interests. If a strategy yields a lower payoff than some other strategy in every circumstance, it is called "strictly dominated", and a rational participant will never choose it. In addition, no outside force interferes with individual decisions; participants choose their strategies entirely according to their own wishes.
Which strategy should a prisoner choose to make his own sentence as short as possible? The two prisoners are held in isolation and neither knows the other's choice; even if they could talk, neither could be sure that the other would not go back on his word. From the point of view of individual rationality, the sentence for betraying is always shorter than the sentence for staying silent. Consider how two rational prisoners would reason in this dilemma:
If the other stays silent, betraying gets me released, so I will choose betrayal.
If the other betrays me, I must betray him as well to get the shorter sentence, so I will again choose betrayal.
Both prisoners face the same situation, so their rational reasoning leads to the same conclusion: choose betrayal. Betrayal is the dominant strategy of the two. The only possible Nash equilibrium in this game is therefore that both betray each other and both serve two years in prison.
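The two-step reasoning above can be reproduced mechanically. The following is a minimal sketch that uses the sentences stated in the police offer (in years); the function names are hypothetical and chosen only for illustration.

```python
SILENT, BETRAY = "silent", "betray"
CHOICES = (SILENT, BETRAY)

# (A's sentence, B's sentence) in years, keyed by (A's choice, B's choice).
SENTENCES = {
    (SILENT, SILENT): (0.5, 0.5),
    (SILENT, BETRAY): (10, 0),
    (BETRAY, SILENT): (0, 10),
    (BETRAY, BETRAY): (2, 2),
}

def best_reply_of_a(choice_of_b):
    """A's choice that minimises A's own sentence, given B's choice."""
    return min(CHOICES, key=lambda a: SENTENCES[(a, choice_of_b)][0])

def best_reply_of_b(choice_of_a):
    """B's choice that minimises B's own sentence, given A's choice."""
    return min(CHOICES, key=lambda b: SENTENCES[(choice_of_a, b)][1])

def nash_equilibria():
    """Pairs of choices in which each prisoner is already playing a best reply."""
    return [(a, b) for a in CHOICES for b in CHOICES
            if best_reply_of_a(b) == a and best_reply_of_b(a) == b]

for other in CHOICES:
    print(f"if the other chooses {other}, A's best reply is {best_reply_of_a(other)}")
print(nash_equilibria())  # [('betray', 'betray')]
```

Betrayal is the best reply to either choice of the other prisoner, which is what makes it a dominant strategy here, and mutual betrayal is the only pair of mutual best replies.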
The Nash equilibrium of this game is clearly not the Pareto-optimal outcome that takes the group's interests into account. For the group as a whole, if both participants cooperate and stay silent, each is sentenced to only half a year; the overall result is better than two years each for mutual betrayal. Under the assumptions above, however, both are rational individuals pursuing only their own interests. In equilibrium both prisoners choose to betray, so their sentences are longer than under cooperation and their overall welfare is lower. This is the "dilemma". The example neatly shows that, in a non-zero-sum game, a Nash equilibrium can conflict with Pareto optimality.
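The conflict with Pareto optimality can also be checked directly. The sketch below, again using the sentences from the police offer, treats a shorter sentence as better and lists the outcomes that no other outcome improves on for both prisoners at once; `pareto_dominates` and `pareto_optimal_profiles` are hypothetical helper names.

```python
# (A's sentence, B's sentence) in years, keyed by (A's choice, B's choice).
SENTENCES = {
    ("silent", "silent"): (0.5, 0.5),
    ("silent", "betray"): (10, 0),
    ("betray", "silent"): (0, 10),
    ("betray", "betray"): (2, 2),
}

def pareto_dominates(x, y):
    """True if outcome x is no worse than y for both prisoners and strictly better for at least one."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))

def pareto_optimal_profiles():
    """Choice pairs whose outcome is not Pareto-dominated by any other outcome."""
    return [profile for profile, outcome in SENTENCES.items()
            if not any(pareto_dominates(other, outcome) for other in SENTENCES.values())]

print(pareto_optimal_profiles())
# [('silent', 'silent'), ('silent', 'betray'), ('betray', 'silent')]
```

Mutual betrayal is the only outcome missing from the list: mutual silence is better for both prisoners, yet mutual betrayal is where individually rational play ends up.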
Laying out the basic structure of the game makes the prisoner's dilemma easier to analyze. Experimental economics often uses the general form of this game to study a variety of topics. The following is one way to implement the general form:
There are two participants and a dealer. Each participant holds a set of two cards, one printed "cooperation" and the other "betrayal". Each participant places one card face down in front of the dealer; placing it face down rules out the possibility that a participant knows the other's choice. The dealer then turns over both cards and pays out according to the following rules:
One betrays and the other cooperates: the betrayer gets 5 points (temptation to betray) and the cooperator gets 0 points (sucker's payoff).
Both cooperate: 3 points each (reward for cooperation).
Both betray: 1 point each (punishment for betrayal).
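The dealer's payout rules above amount to a simple lookup. A minimal sketch, in which the function name `settle` and the card labels are illustrative assumptions:

```python
# Points paid to (first participant, second participant) for each pair of cards.
POINTS = {
    ("cooperate", "cooperate"): (3, 3),  # reward for cooperation
    ("cooperate", "betray"): (0, 5),     # sucker's payoff, temptation to betray
    ("betray", "cooperate"): (5, 0),     # temptation to betray, sucker's payoff
    ("betray", "betray"): (1, 1),        # punishment for betrayal
}

def settle(card_one, card_two):
    """Return the points the dealer pays once both face-down cards are turned over."""
    return POINTS[(card_one, card_two)]

print(settle("betray", "cooperate"))  # (5, 0)
```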
The payoffs can be displayed as a payoff matrix, as shown below (in each cell, the first number is the payoff of the participant choosing the row and the second is that of the participant choosing the column):
Payoff matrix for the general form of the prisoner's dilemma
            | Cooperation | Betrayal
Cooperation | 3, 3        | 0, 5
Betrayal    | 5, 0        | 1, 1
Expressed with the symbols T, R, P, S:
            | Cooperation | Betrayal
Cooperation | R, R        | S, T
Betrayal    | T, S        | P, P
Expressed in terms of winning and losing:
            | Cooperation       | Betrayal
Cooperation | Win-win           | Big loss, big win
Betrayal    | Big win, big loss | Lose-lose
Some general conclusions can be drawn from the points obtained from simple games.
Table of the symbols T, R, P, S
Symbol | Points | English term | Meaning                 | Explanation (non-technical)
T      | 5      | Temptation   | temptation to betray    | payoff for betraying alone successfully
R      | 3      | Reward       | reward for cooperation  | payoff when both cooperate
P      | 1      | Punishment   | punishment for betrayal | payoff when both betray
S      | 0      | Sucker's     | sucker's payoff         | payoff for being betrayed alone
Let T (temptation) = temptation to betray, R (reward) = reward for cooperation, P (punishment) = punishment for betrayal, and S (sucker's) = sucker's payoff. In terms of individual scores, the following inequality holds:
T > R > P > S
(Explanation: this follows from 5 > 3 > 1 > 0.)
In terms of the total score, the following inequalities hold:
2R > T + S or 2R > 2P
(Explanation: 2 × 3 > 5 + 0 or 2 × 3 > 2 × 1. Two people who cooperate score 6 points in total, two who betray each other score 2 points in total, and when only one betrays the total is 5 points. Cooperation clearly gives a higher total than betrayal, so for the group as a whole cooperation is the better strategy.)
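As a quick check of the inequalities above, the following sketch evaluates each one for a given set of payoffs. The function name `check_inequalities` is hypothetical, and the second call uses a made-up value T = 7 only to show a case where the individual ordering holds but the group inequality 2R > T + S fails.

```python
def check_inequalities(T, R, P, S):
    """Evaluate the individual and group inequalities for the payoffs T, R, P, S."""
    return {
        "T > R > P > S": T > R > P > S,
        "2R > T + S": 2 * R > T + S,
        "2R > 2P": 2 * R > 2 * P,
    }

print(check_inequalities(T=5, R=3, P=1, S=0))
# {'T > R > P > S': True, '2R > T + S': True, '2R > 2P': True}

print(check_inequalities(T=7, R=3, P=1, S=0))
# {'T > R > P > S': True, '2R > T + S': False, '2R > 2P': True}
```

Note that 2R > 2P is implied by R > P, so it is the comparison 2R > T + S that adds information about the group total.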
A repeated game, or iterated prisoner's dilemma, shifts the participants' attention from T > R > P > S to 2R > T + S; in other words, it helps the participants get out of the dilemma. This formulation is due to Douglas Hofstadter.
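To see why repetition changes the picture, here is a minimal simulation sketch using the point values 5, 3, 1, 0 from the card game. The two strategies, `always_betray` and `tit_for_tat` (cooperate first, then copy the opponent's previous move), are illustrative assumptions that do not appear in the text, as is the helper `play`.

```python
# Points paid to (first participant, second participant) for each pair of moves.
POINTS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "betray"): (0, 5),
    ("betray", "cooperate"): (5, 0),
    ("betray", "betray"): (1, 1),
}

def always_betray(own_history, other_history):
    """Betray in every round."""
    return "betray"

def tit_for_tat(own_history, other_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return other_history[-1] if other_history else "cooperate"

def play(strategy_one, strategy_two, rounds):
    """Play the game repeatedly and return the total points of both participants."""
    history_one, history_two = [], []
    total_one = total_two = 0
    for _ in range(rounds):
        move_one = strategy_one(history_one, history_two)
        move_two = strategy_two(history_two, history_one)
        gain_one, gain_two = POINTS[(move_one, move_two)]
        total_one += gain_one
        total_two += gain_two
        history_one.append(move_one)
        history_two.append(move_two)
    return total_one, total_two

print(play(tit_for_tat, tit_for_tat, 10))      # (30, 30)
print(play(always_betray, always_betray, 10))  # (10, 10)
print(play(tit_for_tat, always_betray, 10))    # (9, 14)
```

Over ten rounds, two participants who keep cooperating each collect 30 points, while a participant who always betrays collects at most 14 against this conditional cooperator. A single betrayal pays 5, but in the long run the total 2R per round is what matters.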
Political example: arms race
In politics, an arms race between two countries can be described as a prisoner's dilemma. Both countries can be said to have two choices: build up their armaments (betrayal) or reach an agreement to reduce weapons (cooperation). Neither country can be sure that the other will honor the agreement, so both end up building up their armaments. Paradoxically, although building up armaments is the "rational" choice for each country, the result is "irrational" (for example, it damages both economies). This can be regarded as a corollary of deterrence theory, that is, deterring an opponent's attack with strong military force in order to keep the peace.
Economic example: tariff war
Two countries can have two choices in tariffs:
Raise tariffs to protect domestic goods. (betrayal)
Reach a tariff agreement with each other and reduce tariffs to facilitate the circulation of their respective commodities. (cooperation)
When one country, for whatever reason, breaks the tariff agreement and raises tariffs unilaterally (betrayal), the other country responds in kind (also betrayal). This leads to a tariff war: each country's goods lose the other's market, and both economies are damaged (the outcome of mutual betrayal). The two countries then reach a new tariff agreement. (The result of the repeated game is the discovery that mutual cooperation is the most profitable.)
Business example: advertising war
There will also be various examples of prisoner's dilemma in business activities. Take the advertising competition as an example.
Two companies compete with each other, and their advertising affects each other: if one company's advertising is better received by customers, it takes away part of the other company's revenue. If both publish advertising of similar quality at the same time, revenue rises little while costs go up. Yet a company that does not improve its advertising will lose business to the other.
The two companies can have two choices:
Reach an agreement with each other to reduce advertising costs. (cooperation)
Increase the cost of advertising, try to improve the quality of advertising and overwhelm the other party. (betrayal)
If the two companies do not trust each other and cannot cooperate, betrayal becomes the dominant strategy and they fall into an advertising war; the increased advertising spending cuts into both companies' profits. This is the prisoner's dilemma. In reality it is hard for two competing companies to reach a cooperation agreement, and most end up in a prisoner's dilemma.
Example: bicycle racing
Competitive strategy in bicycle racing is also a game, and its outcomes can be explained by results from the study of the prisoner's dilemma. In the annual Tour de France, for example, the following happens: until close to the finish line, the riders often advance in one large group (the peloton). They adopt this strategy so that each can avoid falling behind while spending only moderate effort. The rider at the very front works hardest because he faces the wind, so riding at the front is the worst strategy. Typically nobody wants to go to the front at first (mutual betrayal), which slows the whole group down. Then two or more riders usually move to the front and take turns leading for a while to share the wind resistance (mutual cooperation), which raises the speed of the whole group. If one of the riders at the front then tries to stay ahead on his own (betrayal), the other riders and the peloton will catch up with him. Usually the rider who has spent the most time at the front (cooperation) is overtaken at the end by riders from behind (betrayal), because riding in the slipstream of the rider ahead takes much less effort.
Events related to the prisoner's dilemma
Magical thinking
William Poundstone used an example from New Zealand to illustrate the prisoner's dilemma in his book. In New Zealand, newspaper boxes are unattended and unlocked; people who buy a newspaper leave their money and take a copy. Some people may of course take a newspaper without paying (betrayal), but this rarely happens, because everyone realizes that if everyone stole newspapers (mutual betrayal), the result would be inconvenient and harmful for all in the future. What is special about this example is that New Zealanders escape the prisoner's dilemma without the help of any other factor. Nobody watches the newspaper boxes; people follow the rules in order to avoid the consequences of mutual betrayal. This shared reasoning that avoids the prisoner's dilemma is called "magical thinking". [3]
Plea bargaining is not feasible
The conclusion of the prisoner's dilemma is one of the reasons why plea bargaining is banned in many countries. According to the prisoner's dilemma, if one of two suspects is guilty and the other is innocent, the guilty one will confess everything, and even falsely implicate the innocent one (that is, betray), in order to reduce his own sentence. In the worst case, if both are sent to prison, the guilty suspect who confessed receives the shorter sentence and the innocent one the longer sentence.
The tragedy of the commons
Real games often have more than two participants, giving rise to a multi-party prisoner's dilemma. Garrett James Hardin's "The Tragedy of the Commons" is an example: "The tragedy of the commons means that public property belonging to the largest number of people is often the least cared for." Take fishing: the fish on the high seas belong to everyone. Reasoning that "if I do not overfish, others will", fishermen overfish and damage the marine ecosystem. However, whether the multi-party prisoner's dilemma is a sound formulation remains open to discussion, because it can always be decomposed into a set of classic two-party prisoner's dilemmas. In other words, on this view the prisoner's dilemma has only two sides; the so-called multi-party prisoner's dilemma is merely an appearance produced by several two-party dilemmas mixed together.