How does Percona 5.6's flush_master_info guarantee data consistency?

2015-06-23 · zxszcaijin
Category: MySQL/PostgreSQL

First, let's walk through the relevant logic.
Here is the code of flush_master_info:


    int flush_master_info(Master_info* mi, bool force)
    {
      DBUG_ENTER("flush_master_info");
      DBUG_ASSERT(mi != NULL && mi->rli != NULL);
      /*
        The previous implementation was not acquiring locks.
        We do the same here. However, this is quite strange.
      */
      /*
        With the appropriate recovery process, we will not need to flush
        the content of the current log.

        For now, we flush the relay log BEFORE the master.info file, because
        if we crash, we will get a duplicate event in the relay log at restart.
        If we change the order, there might be missing events.

        If we don't do this and the slave server dies when the relay log has
        some parts (its last kilobytes) in memory only, with, say, from master's
        position 100 to 150 in memory only (not on disk), and with position 150
        in master.info, there will be missing information. When the slave restarts,
        the I/O thread will fetch binlogs from 150, so in the relay log we will
        have "[0, 100] U [150, infinity[" and nobody will notice it, so the SQL
        thread will jump from 100 to 150, and replication will silently break.
      */
      mysql_mutex_t *log_lock= mi->rli->relay_log.get_log_lock();

      mysql_mutex_lock(log_lock);

      int err= (mi->rli->flush_current_log() ||
                mi->flush_info(force));

      mysql_mutex_unlock(log_lock);

      DBUG_RETURN(err);
    }

The comment makes this very clear: before flushing the master.info file, the relay log must be flushed first, so that all relay log data still in the cache is guaranteed to be written to the file.

Otherwise (if master.info were flushed first), the following could happen:

1. tx1 is written to the relay log and flushed to disk.
2. tx2 and tx3 are written to the relay log cache but have not yet reached disk, while master.info already records the position after tx3.
3. mysqld crashes, and the cached tx2 and tx3 are lost.
4. After mysqld restarts, the I/O thread resumes from the position recorded in master.info and receives tx4.
5. The relay log ends up containing only tx1 and tx4, so tx2 and tx3 are silently lost.

Therefore, every time the master.info file is flushed, the relay log is flushed first, which guarantees that no events are lost.


Of course, this approach is not perfect either.
If the relay log is flushed successfully but the flush of master.info fails, duplicate events may be written to the relay log after restart and then executed again by the SQL thread.

Fortunately, version 5.6 provides crash-safe tables to address this: instead of writing to files, the positions are written to InnoDB tables to guarantee consistency. If the master info write fails while the relay log write succeeds,
crash recovery rebuilds the master info from the relay_log_info table.
For this to work, it is best to set the following options:

    # crash safe options
    relay_log_recovery = 1
    master_info_repository = TABLE
    relay_log_info_repository = TABLE
For more on why tables are used to guarantee crash safety, see:
http://blog.booking.com/better_crash_safe_replication_for_mysql.html
