How to effectively prevent data center system downtime
This article introduces several approaches to preventing MySQL data loss when a server goes down, covering replication, monitoring, and failover as applied in practice. For most applications MySQL is the most critical data store, so how to make MySQL provide an HA service is a problem we have to face. When the master crashes, we need to think about how to lose as little data as possible, how to detect the crash quickly, and how to perform the corresponding failover. Drawing on my recent work on MySQL agent and tooling projects, I will describe the MySQL HA schemes we use in our projects now and plan to use in the future.

Replication

Replication is a good way to make sure MySQL data is not lost, and MySQL provides a powerful replication mechanism. We just need to be aware that, for performance, replication is asynchronous: written data is not propagated to the slaves synchronously, so if the master crashes at that moment we may still lose data. To address this we can use semi-synchronous replication. The idea is simple: when the master finishes processing a transaction, it waits until at least one semi-synchronous slave has acknowledged receiving the event and writing it to its relay log before returning. That way, even if the master crashes, at least one slave has the complete data.

However, semi-synchronous replication cannot guarantee that no data is ever lost: if the master crashes after committing a transaction but before sending it to any slave, data can still be lost. Compared with traditional asynchronous replication, though, semi-synchronous replication greatly improves data safety. More importantly, it is not slow: the author of MHA has said that semi-synchronous replication is used in Facebook's production environment, so I do not think performance is really something to worry about unless your traffic has far surpassed Facebook or Google. As that article mentions, lossless semi-synchronous replication is available starting with MySQL 5.7, which makes the probability of data loss very small.

If you really want to guarantee that no data is lost, a better option at this stage is the MySQL cluster solution Galera, which keeps data safe by writing three copies at the same time. I have no experience with Galera, but I know some companies in the industry use it in production, and performance should not be a problem. However, Galera is quite intrusive to the MySQL code base, which may not suit those who are obsessed with code cleanliness :-)

We can also replicate MySQL data with DRBD. The official MySQL documentation describes this in detail, but I have not adopted this scheme; the author of MHA has written about some problems with using DRBD, for reference. In upcoming projects I will prefer the semi-synchronous replication solution, and consider Galera if the data is really important.
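As a concrete illustration of the semi-synchronous replication discussed above, here is a minimal sketch of how it is usually enabled with the plugins shipped with MySQL 5.5+; the specific timeout value is only an example, not part of the original scheme:

    -- On the master: load the semi-sync master plugin and turn it on.
    INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
    SET GLOBAL rpl_semi_sync_master_enabled = 1;
    -- How long (ms) to wait for a slave ACK before falling back to asynchronous mode.
    SET GLOBAL rpl_semi_sync_master_timeout = 1000;

    -- On each slave that should acknowledge semi-synchronously:
    INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
    SET GLOBAL rpl_semi_sync_slave_enabled = 1;
    -- Restart the I/O thread so it reconnects with semi-sync enabled.
    STOP SLAVE IO_THREAD;
    START SLAVE IO_THREAD;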
Monitor

Above we talked about using replication to lose as little data as possible after the master crashes, but we cannot wait until the master has been down for several minutes before we learn there is a problem. A good set of monitoring tools is therefore essential. When the master fails, the monitor should detect it quickly and take follow-up action, such as e-mailing the administrator or notifying a daemon to fail over quickly.

For monitoring a service we usually use keepalived or heartbeat, so that when the master crashes we can easily switch to a standby machine. But they cannot immediately detect that the service itself has become unavailable. Our company currently uses keepalived, but going forward I would rather use ZooKeeper to handle monitoring and failover for the whole MySQL cluster.

For each MySQL instance we run a corresponding agent. The agent sits on the same machine as the MySQL instance and periodically pings it to check that it is available, and at the same time the agent registers itself on ZooKeeper, keeping that registration alive through the periodic session heartbeat. With this setup we can tell that MySQL has gone down, mainly in the following situations:

1. The machine crashes, so both MySQL and the agent die, and the agent's connection to ZooKeeper drops naturally.
2. MySQL dies, the agent finds that its ping no longer succeeds and actively disconnects from ZooKeeper.
3. The agent dies, but MySQL is still fine.

In all three cases we can assume there is something wrong with the MySQL machine, and ZooKeeper notices immediately. When the agent disconnects from ZooKeeper, ZooKeeper fires the corresponding children-changed event, and the management service watching that event can react accordingly. For example, in the first two cases the management service can fail over automatically; in the third case it may do nothing and simply wait for something like crontab or supervisord on that machine to restart the agent.

The advantage of ZooKeeper is that it makes it easy to monitor the whole cluster: we get the cluster's changes in real time, trigger events to notify the interested services, and coordinate multiple services for the related handling. These are things keepalived or heartbeat either cannot do or can only do with great trouble. The problems with ZooKeeper are that deployment is more complex and that, after a failover, getting the latest database address to the applications is also troublesome. For the deployment problem, we just have to guarantee one agent per MySQL instance; fortunately we have Docker these days, so that is genuinely simple. As for the second problem, the change of database address, it is not really ZooKeeper's problem: we can notify the applications to update their configuration dynamically, use a VIP, or put a proxy in front. Although ZooKeeper has many advantages, if your setup is not complex, say a single master and a single slave, ZooKeeper may not be the best choice, and keepalived may well be enough.

Failover

With the monitor we can watch MySQL conveniently and, after MySQL goes down, notify the corresponding services to carry out failover. Suppose we have a MySQL cluster with A as the master and B and C as slaves. When A fails we need to fail over, so which of B and C should become the new master? The principle is simple: the slave that holds the most recent data from the original master is chosen as the new master. We can tell which slave has the newest data with SHOW SLAVE STATUS; we only need to compare two key fields, Master_Log_File and Read_Master_Log_Pos, which indicate how far into the master's binlog the slave has read. The slave with the larger binlog file index and, within it, the larger position is the one that can be promoted to master.
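As a small illustration of that selection step, here is what the check looks like on each surviving slave; the file name and position shown are invented values:

    -- Run on every surviving slave (B and C in the example above):
    SHOW SLAVE STATUS\G
    -- Among the many fields returned, only two matter for choosing the new master:
    --   Master_Log_File:     mysql-bin.000042   (which of the old master's binlog files was being read)
    --   Read_Master_Log_Pos: 1067               (how far into that file the I/O thread has read)
    -- The slave with the largest (Master_Log_File, Read_Master_Log_Pos) pair has the
    -- most complete copy of the old master's data and is the one to promote.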
We will not discuss here the case where several slaves might be promoted to master. In the previous example, suppose B has been promoted to master; we then need to repoint C at the new master B and start replicating from it. We do that with CHANGE MASTER TO, but how do we know from which file and position in B's binlog to start copying?

GTID

To solve this problem, MySQL 5.6 introduced the concept of the GTID, of the form uuid:gid, where uuid is the MySQL server's uuid, which is globally unique, and gid is an incrementing transaction id. With these two pieces we can uniquely identify a transaction recorded in the binlog. With GTIDs, handling failover becomes very convenient. Continuing the example above, suppose the last GTID of A that B has read is 3e11fa47-71ca-11e1-9e33-c80aa9429562:23, while C's is 3e11fa47-71ca-11e1-9e33-c80aa9429562:15. When C is pointed at the new master B, it can tell B this via the GTID; as long as the event with GTID 3e11fa47-71ca-11e1-9e33-c80aa9429562:15 is found in B's binlog, C can start replicating from the next event. Although searching the binlog this way is a sequential scan, a little crude and inefficient, it is much more convenient than guessing the file name and position ourselves.

Google also had a global transaction ID patch a long time ago, but it only used an incrementing integer; LedisDB borrowed that idea to implement failover. It seems Google is now gradually migrating to MariaDB, whose GTID implementation differs from MySQL 5.6's, which is actually more troublesome. For my MySQL toolset go-mysql, it would mean writing two different sets of code to handle GTIDs, so whether to support MariaDB will depend on how things go.

Pseudo GTID

GTID is a good thing, but it is limited to MySQL 5.6+. Most deployments are still on versions before 5.6, our company is on 5.5, and those databases will not be upgraded to 5.6 for quite a while. So we still need a good mechanism to determine the master's binlog file name and position. At first I intended to study MHA's implementation, which first copies relay logs to fill in the missing events, but I do not trust relay logs very much, and MHA is written in Perl, a language I cannot read at all, so I gave up on digging further. Fortunately I stumbled upon the orchestrator project, which is a really wonderful piece of work. It takes a pseudo-GTID approach, the core of which is the following code:

    create database if not exists meta;
    drop event if exists meta.create_pseudo_gtid_view_event;
    delimiter ;;
    create event if not exists
      meta.create_pseudo_gtid_view_event
      on schedule every 10 second starts current_timestamp
      on completion preserve
      enable
      do
        begin
          set @pseudo_gtid := uuid();
          set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
          prepare st from @_create_statement;
          execute st;
          deallocate prepare st;
        end
    ;;
    delimiter ;

    set global event_scheduler := 1;

This creates an event in MySQL that writes a new uuid into a view every 10 seconds, and that statement is recorded in the binlog. Although we cannot locate an individual event directly the way a GTID can, we can narrow things down to a 10-second interval, which lets us compare two MySQL binlogs within a very small window.
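As a hedged illustration (the binlog file name and the LIMIT are invented for the example), these pseudo-GTID markers can be spotted in a server's binlog with SHOW BINLOG EVENTS:

    -- Each firing of the event above lands in the binlog as a Query event whose
    -- statement is the generated 'create or replace view meta.pseudo_gtid_view ...'
    -- text, containing a fresh uuid.
    SHOW BINLOG EVENTS IN 'mysql-bin.000007' FROM 4 LIMIT 200;
    -- Scanning the Info column for 'pseudo_gtid_unique_val' reveals those markers;
    -- the uuid embedded in each one is what lets us line up the binlogs of two servers.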
Continuing the example above, suppose the last uuid in C's binlog is s1. We find that same uuid in B's binlog at position s2 and then compare the subsequent events one by one. If they do not match, something may be wrong and replication is stopped. Once we have walked through all of C's remaining binlog events, we know the file name and position of the next event in B, and we can point C at that position to start replicating. Using pseudo-GTID requires the slave to enable the log-slave-updates option; since GTID also requires that option, this is entirely acceptable. My own failover tool will later be implemented using this pseudo-GTID approach.

In the book MySQL High Availability, the authors use another GTID-like method: every commit also records a GTID into a table, and the corresponding position can then be looked up through that GTID. But this approach needs support from the application's MySQL client, which I do not like very much, so I will not use it.

Postscript

MySQL HA has always been a deep field; I have only listed some of my recent research here, and the related tools will be implemented in go-mysql as far as possible. After a period of thinking and research I have gained a fair amount of experience, and the MySQL HA I am designing now is different from the earlier one. Later I discovered that the HA scheme I had designed was almost the same as this Facebook article, and having recently chatted with people at Facebook and heard that they are pushing it forward as well, I feel my direction is right.

In the new HA scheme I will fully embrace GTID. GTID appeared precisely to solve the problems of the old replication mechanism, so I will not consider older MySQL versions without GTID; fortunately our project has already upgraded MySQL to 5.6, which fully supports it. Unlike the Facebook article, which modified mysqlbinlog to support the semi-synchronous replication protocol, I have added semi-synchronous replication support to the go-mysql replication library, so the MySQL binlog can be synchronized to another machine in real time. This is probably the only difference between my scheme and Facebook's. Synchronizing only the binlog is certainly faster than a real slave, since there is no step of actually executing the binlog events. For the real slaves we still use the original replication mode rather than semi-synchronous replication. We then monitor the whole cluster and handle failover through MHA. I used to think MHA was hard to understand, but it is in fact a very powerful tool, and if you actually read the Perl you can follow it. MHA has been used and proven in many companies' production environments; using it directly is definitely more cost-effective than writing your own. So going forward I will no longer consider ZooKeeper, nor writing my own agent.
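To make the "fully embrace GTID" part of the new scheme concrete, here is a minimal sketch, with an invented host name, of the settings such a setup relies on and of how a surviving slave gets repointed once a new master has been promoted:

    -- Settings the GTID-based design assumes (MySQL 5.6+); check them on every server:
    SHOW VARIABLES LIKE 'gtid_mode';                  -- should be ON
    SHOW VARIABLES LIKE 'enforce_gtid_consistency';   -- should be ON
    SHOW VARIABLES LIKE 'log_slave_updates';          -- should be ON on the slaves

    -- Repointing a slave at the promoted master no longer needs a file name or position;
    -- GTID auto-positioning works it out ('new-master.example' is a placeholder host).
    STOP SLAVE;
    CHANGE MASTER TO
      MASTER_HOST = 'new-master.example',
      MASTER_AUTO_POSITION = 1;
    START SLAVE;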