Replication
Replication is a good way to keep MySQL data from being lost, and MySQL provides a powerful replication mechanism. We just need to be aware that, for performance reasons, replication is asynchronous: written data is not propagated to the slaves synchronously, so if the master crashes at that moment we may still lose data.
To solve this problem we can use semi-synchronous replication. The principle is simple: after the master finishes processing a transaction, it waits until at least one semi-sync-capable slave acknowledges that it has received the event and written it to its relay log before returning. This way, even if the master crashes, at least one slave holds the complete data.
However, semi-synchronous replication still cannot guarantee that absolutely no data is lost: if the master crashes while committing the transaction and sending it to the slave, data may still be lost. Compared with traditional asynchronous replication, though, semi-synchronous replication greatly improves data safety. More importantly, it is not slow. The author of MHA says Facebook runs semi-synchronous replication in production (here), so I don't think you really need to worry about its performance unless your traffic has completely surpassed Facebook's or Google's. As that article mentions, lossless semi-synchronous replication is available starting with MySQL 5.7, so the probability of losing data is very small.
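For reference, here is a minimal sketch of turning on semi-synchronous replication with the plugins shipped with MySQL 5.5+; the timeout value is just an illustrative choice, and the 5.7 lossless setting is left commented out:
-- on the master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 10000;  -- fall back to async after 10s without an ack
-- on each semi-sync slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;      -- restart the IO thread so the setting takes effect
-- MySQL 5.7 only: wait before the engine commit, i.e. lossless semi-sync
-- SET GLOBAL rpl_semi_sync_master_wait_point = 'AFTER_SYNC';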
If you really want to guarantee that no data is lost at all, a better option at this stage is the MySQL cluster solution Galera, which keeps data safe by writing three copies at the same time. I have no experience with Galera myself, but I know some companies in the industry run it in production, and performance should not be a problem. However, Galera is quite intrusive to the MySQL code base, which may not suit students who are obsessive about the code :-)
We can also use DRBD to replicate MySQL data. The official MySQL documentation covers it in detail, but I did not adopt this scheme. The author of MHA wrote about some problems with using DRBD, which is worth reading for reference.
For my next project I will prefer the semi-synchronous replication solution, and if the data is really important I will consider Galera.
Monitoring
Earlier we talked about using the replication mechanism to avoid losing data as far as possible after the master crashes, but we cannot afford to find out that something is wrong only minutes after the master has gone down. So a good set of monitoring tools is essential.
When the master fails, the monitor can detect it quickly and carry out the follow-up handling, such as notifying the administrator by email or telling a daemon to fail over quickly.
Usually we monitor a service with keepalived or heartbeat, so that when the master crashes we can easily switch over to the standby. But they still cannot detect immediately that the service itself is unavailable. Our company currently uses keepalived, but going forward I would rather use zookeeper to handle monitoring and failover for the whole MySQL cluster.
For each MySQL instance we run a corresponding agent. The agent sits on the same machine as the MySQL instance and periodically pings the instance to check that it is alive; at the same time, the agent registers itself with zookeeper through an ephemeral node. This lets us know when MySQL goes down, mainly in the following cases:
The whole machine crashes: MySQL and the agent die together, and the agent's connection to zookeeper is naturally broken.
MySQL dies: the agent finds that the ping no longer works and actively disconnects from zookeeper.
The agent dies, but MySQL is still alive.
In all three cases we can consider that something is wrong with the MySQL machine, and zookeeper can sense it immediately. When the agent disconnects from zookeeper, zookeeper fires the corresponding children-changed event, and the management service watching that event can handle it accordingly. For example, in the first two cases the management service can fail over automatically, while in the third case it may do nothing and simply wait for crontab, supervisord, or a similar service on that machine to restart the agent.
The advantage of using zookeeper is that it is easy to monitor the whole cluster: we get change information for the whole cluster in real time, trigger events to notify the interested services, and coordinate multiple services for the follow-up handling. These are things that keepalived or heartbeat either cannot do or can only do with a lot of trouble.
The problems with zookeeper are that deployment is more complex, and that after a failover it is also troublesome for the application to obtain the latest database address.
For the deployment problem, we just need to guarantee that each MySQL instance is paired with one agent; fortunately, with docker around these days, that is genuinely simple. As for the second problem, the database address changing, it is not unique to zookeeper: we can notify the application to update its configuration dynamically, use a VIP, or put a proxy in front to solve it.
Although zookeeper has many advantages, if your setup is not complex, say just one master and one slave, zookeeper may not be the best choice; keepalived may well be enough.
Failover
The monitor makes it convenient to watch MySQL and, after MySQL crashes, to notify the relevant services to fail over. Now suppose we have a MySQL cluster with A as the master and B and C as slaves; when A fails we need to fail over, so which of B and C should we choose as the new master?
The principle is simple: the slave that has the most recent data from the original master becomes the new master. We can tell which slave has the latest data from show slave status; we only need to compare two key fields, Master_Log_File and Read_Master_Log_Pos, which indicate how far into the master's binlog the slave has read. The slave with the larger binlog file index and, within the same file, the larger position is the one promoted to master. We will not discuss here the case where several slaves could be promoted.
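As a concrete illustration (the field values below are made up), we would run the following on both B and C and compare the two outputs:
-- run on B and on C, then compare the two results
SHOW SLAVE STATUS\G
-- relevant fields in the output, for example:
--   Master_Log_File:     mysql-bin.000005
--   Read_Master_Log_Pos: 120
-- the slave with the larger (file, position) pair is promoted to master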
In the example above, assuming B has been promoted to master, we need to repoint C to the new master B and start replicating from it. We reset C's master with change master to, but how do we know from which file and which position in B's binlog to start copying?
GTID
To solve this problem, MySQL 5.6 introduced the concept of GTID, which has the form uuid:gid, where uuid is the MySQL server's uuid and is globally unique, and gid is an incrementing transaction id. Together they uniquely identify a transaction recorded in the binlog. With GTID we can handle failover very conveniently.
Still with the earlier example, suppose the last GTID of A that B has read is 3e11fa47-71ca-11e1-9e33-c80aa9429562:23, while C's is 3e11fa47-71ca-11e1-9e33-c80aa9429562:15. When C points to the new master B, it is enough to find the event with GTID 3e11fa47-71ca-11e1-9e33-c80aa9429562:15 in B's binlog; C can then replicate from the event after it. Although the binlog search is sequential, which is a bit inefficient and brute-force, it is far more convenient than guessing the file name and position ourselves.
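A minimal sketch of how C could be repointed at B using GTID auto-positioning; the host, port, and replication account below are made-up placeholders:
-- on C: repoint replication at the new master B
STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'b.example.com',     -- hypothetical address of B
    MASTER_PORT = 3306,
    MASTER_USER = 'repl',              -- hypothetical replication account
    MASTER_PASSWORD = 'repl_password',
    MASTER_AUTO_POSITION = 1;          -- B finds everything after 3e11fa47-...:15 in its own binlog
START SLAVE;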
Google also had a global transaction ID patch a long time ago, but it used only an incrementing integer. LedisDB borrowed its idea to implement failover, but it seems Google is now gradually migrating to MariaDB.
MariaDB's GTID implementation differs from MySQL 5.6's, which is actually rather troublesome: for my MySQL toolset go-mysql it means writing two different sets of code to handle GTID. Whether to support MariaDB later will depend on the situation.
Pseudo GTID
Although GTID is a good thing, it is limited to MySQL 5.6+. Most deployments today still run versions before 5.6; our company is on 5.5, and those databases will not be upgraded to 5.6 for quite some time. So we still need a good mechanism for choosing the binlog file name and position on the new master.
At first I intended to study MHA's implementation, which first copies the missing events from the relay logs to fill the gap, but I do not quite trust relay logs, and MHA is written in perl, a language I cannot read at all, so I gave up on digging further.
Fortunately, I stumbled upon the orchestrator project, which is a really wonderful project. It adopts a pseudo-GTID approach; the core code is as follows.
create database if not exists meta;
drop event if exists meta.create_pseudo_gtid_view_event;
delimiter ;;
create event if not exists
  meta.create_pseudo_gtid_view_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := uuid();
      set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
      prepare st from @_create_statement;
      execute st;
      deallocate prepare st;
    end
;;
delimiter ;
set global event_scheduler := 1;
This creates an event on MySQL that writes a uuid into a view every 10 seconds, and that statement is recorded in the binlog. Although we cannot locate a single event directly the way GTID can, we can narrow things down to a 10-second interval, so we only need to compare the two MySQL binlogs within a very small range.
Continuing the example above: suppose the last pseudo-GTID uuid in C is s1; we find s1 in B's binlog and then compare the subsequent events one by one. If they do not match, something may be wrong and we stop the process. Once we have walked through to C's last binlog event, we know the file name and position of the next event on B, and we can point C at that position to start replicating.
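For illustration, the search on B could be done by hand roughly like this; the binlog file name and position are made up, and this only sketches the idea behind the matching, not orchestrator's actual code:
-- on B: walk the binlogs backwards looking for C's last pseudo-GTID uuid s1
SHOW BINARY LOGS;
SHOW BINLOG EVENTS IN 'mysql-bin.000007';   -- hypothetical file; try earlier files if s1 is not found
-- the match is a Query event whose Info column contains
--   create or replace view meta.pseudo_gtid_view as select '<s1>' ...
-- after checking that the following events match C's, point C at the first event C is missing:
STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'b.example.com',          -- hypothetical address of B
    MASTER_LOG_FILE = 'mysql-bin.000007',   -- hypothetical file on B
    MASTER_LOG_POS = 4213;                  -- hypothetical position of the first missing event
START SLAVE;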
Using pseudo GTID requires the slaves to enable the log-slave-updates option; considering that GTID also requires this option, that is completely acceptable.
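A quick way to check this on a slave, using nothing beyond standard SHOW VARIABLES:
SHOW VARIABLES WHERE Variable_name IN ('log_bin', 'log_slave_updates');
-- both should report ON before relying on pseudo-GTID positions from that slave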
Going forward, my own failover tool will be implemented with this pseudo-GTID approach.
In MySQL High Availability, the authors use yet another GTID-like approach: every commit also records a gtid in a table, and the corresponding position can then be looked up through that gtid. However, this requires support from the business side's MySQL client, so I don't like it much and did not adopt it.
Postscript
MySQL HA has always been deep water. I have only listed some of my recent research here, and I will try to implement the related tools in go-mysql as much as possible.
Update
After a period of thinking and research, I have gained a lot of experience and insight, and the MySQL HA I am designing now is different from the earlier one. I later found that the HA scheme I designed is almost the same as this Facebook article; I have also recently chatted with people at Facebook and heard that they are pushing it forward vigorously themselves, so I feel the direction is right.
In the new HA I will fully embrace GTID. It was introduced precisely to solve the problems of the old replication, so I will no longer consider lower MySQL versions without GTID. Fortunately, our project has already upgraded MySQL to 5.6, which fully supports GTID.
Unlike the fb article, which modifies mysqlbinlog to support the semi-synchronous replication protocol, I support the semi-synchronous replication protocol in the go-mysql replication library, so the binlog can be synchronized to another machine in real time. This is probably the only difference between my scheme and fb's.
Synchronizing only the binlog is naturally faster than a real slave; after all, the binlog path does not have to execute the events. Also, for the real slaves we still use the most primitive asynchronous replication rather than semi-synchronous replication. We then monitor the whole cluster with MHA and let it handle failover.
I used to think MHA was hard to get into, but it is actually a very powerful tool, and if you actually sit down and read the perl you can still understand it. MHA has been used in production by many companies and has been battle-tested; using it directly is definitely more cost-effective than writing one yourself. So going forward I will not consider zookeeper, but I will consider writing my own agent instead.