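Today the MySQL master for one of our game servers went down: the filesystem switched to read-only mode, the machine came up in safe mode after a reboot, and it returned to normal once fsck had been run. After the server was back, MySQL started normally, but one of the slaves kept reporting a replication error.
Logging in to the slave showed the following: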
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 10.90.13.238
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000949
          Read_Master_Log_Pos: 277562491
               Relay_Log_File: mysql-relay-bin.001616
                Relay_Log_Pos: 277562637
        Relay_Master_Log_File: mysql-bin.000949
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 1
          Exec_Master_Log_Pos: 277562491
              Relay_Log_Space: 277562836
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000949' at 277562491, the last event read from './mysql-bin.000949' at 4, the last byte read from './mysql-bin.000949' at 4.'
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 4
1 row in set (0.00 sec)
The error is:
Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000949' at 277562491, the last event read from './mysql-bin.000949' at 4, the last byte read from './mysql-bin.000949' at 4.'
I had run into this error before but never wrote down the details, so I searched online and consulted a few references.
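The cause is simple. The slave had been replicating continuously right up to the moment the master went down. After the crash, the restarted MySQL on the master opened a new binlog and carried on writing there, but the slave knows nothing about that, so it keeps asking for the binlog file and position it last read.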
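To make "the file and position it last read" concrete, here is a minimal Python sketch that pulls the requested binlog file and offset out of the Last_IO_Error text. The function name and the regular expression are my own assumptions, written against the exact message shown above; other MySQL versions may word the message differently.

```python
import re

# Matches the part of the 1236 message that names what the slave asked for.
# Assumption: the wording is the same as in the error quoted in this post.
_REQUESTED = re.compile(r"the first event '(?P<file>[^']+)' at (?P<pos>\d+)")


def parse_requested_position(last_io_error):
    """Return (binlog_file, position) requested by the slave, or None."""
    match = _REQUESTED.search(last_io_error)
    if match is None:
        return None
    return match.group("file"), int(match.group("pos"))


error = (
    "Got fatal error 1236 from master when reading data from binary log: "
    "'Client requested master to start replication from impossible position; "
    "the first event 'mysql-bin.000949' at 277562491, the last event read "
    "from './mysql-bin.000949' at 4, the last byte read from "
    "'./mysql-bin.000949' at 4.'"
)

print(parse_requested_position(error))  # ('mysql-bin.000949', 277562491)
```

The returned pair is what gets compared against the files actually present on the master in the checks that follow.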
To confirm this, I did the following:
1. Check the master's current position
mysql> show master status\G
*************************** 1. row ***************************
            File: mysql-bin.000950
        Position: 336492640
    Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
2. Check the size and last-modified time of the binlogs on the master:
[root@d1 ~]# ll /data/mysql/mysql-bin.*
-rw-rw---- 1 mysql mysql 1073742473 Nov 17 10:38 /data/mysql/mysql-bin.000944
-rw-rw---- 1 mysql mysql 1073742022 Nov 18 12:44 /data/mysql/mysql-bin.000945
-rw-rw---- 1 mysql mysql 1073745576 Nov 19 15:31 /data/mysql/mysql-bin.000946
-rw-rw---- 1 mysql mysql 1073745324 Nov 21 05:03 /data/mysql/mysql-bin.000947
-rw-rw---- 1 mysql mysql 1073742027 Nov 22 16:09 /data/mysql/mysql-bin.000948
-rw-rw---- 1 mysql mysql  277553623 Nov 23 05:07 /data/mysql/mysql-bin.000949
-rw-rw---- 1 mysql mysql  337157571 Nov 23 18:04 /data/mysql/mysql-bin.000950
-rw-rw---- 1 mysql mysql        133 Nov 23 08:06 /data/mysql/mysql-bin.index
[root@d1 ~]# du /data/mysql/mysql-bin.* -sh
1.1G /data/mysql/mysql-bin.000944
1.1G /data/mysql/mysql-bin.000945
1.1G /data/mysql/mysql-bin.000946
1.1G /data/mysql/mysql-bin.000947
1.1G /data/mysql/mysql-bin.000948
265M /data/mysql/mysql-bin.000949
323M /data/mysql/mysql-bin.000950
4.0K /data/mysql/mysql-bin.index
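From this we can see that 000949 is the last file MySQL was writing when the system crashed, and that a new file, 000950, was created after recovery. Judging by the timestamps and sizes, mysql-bin.000949 would normally have grown to about 1.1G before MySQL rotated to a new file. Here the crash cut the binlog short, so MySQL created a new binlog file when it started up again. Note also that the position the slave is asking for, 277562491, is larger than the 277553623 bytes that mysql-bin.000949 actually holds on disk, which is why the master calls it an impossible position.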
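The size comparison can be written down as a small check. This is only a sketch: the dictionary is copied by hand from the `ll` listing above, the helper name is hypothetical, and treating "position beyond the end of the file" as the trigger for error 1236 is my simplified reading of the message rather than a statement about the server's exact logic.

```python
# Sizes in bytes, copied from the `ll` listing on the master.
binlog_sizes = {
    "mysql-bin.000949": 277553623,
    "mysql-bin.000950": 337157571,
}


def bytes_past_end(binlog_file, requested_pos, sizes):
    """Bytes by which the requested position overshoots the file (0 if inside)."""
    return max(0, requested_pos - sizes[binlog_file])


# The slave asked for mysql-bin.000949 at offset 277562491.
print(bytes_past_end("mysql-bin.000949", 277562491, binlog_sizes))  # 8868
```

A non-zero result means the slave had already received events that never made it into the file on the master's disk before the crash.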
3. Run the following commands on the slave:
mysql> stop slave
    -> ;
Query OK, 0 rows affected (0.00 sec)
mysql> change master to master_host='10.90.13.238', master_user='slave', MASTER_PASSWORD='', MASTER_LOG_FILE='mysql-bin.000950', MASTER_LOG_POS=4;
Query OK, 0 rows affected (0.09 sec)
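This points the slave at the new binlog and its initial position (4, the offset of the first event after the binlog file header). Then start the slave: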
mysql> start slave;
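The repointing in step 3 can also be sketched in Python: derive the name of the binlog that follows the broken one and build the statement to run on the slave. Both helpers are hypothetical, and the sketch assumes the master really did rotate to the next sequence number, which should be confirmed with `show master status` as in step 1. It relies on CHANGE MASTER TO keeping any option that is not named, so only the log coordinates appear.

```python
def next_binlog_name(current):
    """mysql-bin.000949 -> mysql-bin.000950, keeping the zero padding."""
    base, _, seq = current.rpartition(".")
    return "{}.{:0{width}d}".format(base, int(seq) + 1, width=len(seq))


def change_master_statement(log_file, log_pos=4):
    """Build the statement to run on the slave after `stop slave`.

    Position 4 is the first event in a binlog file, right after the header.
    """
    return "CHANGE MASTER TO MASTER_LOG_FILE='{}', MASTER_LOG_POS={};".format(
        log_file, log_pos
    )


new_file = next_binlog_name("mysql-bin.000949")
print(change_master_statement(new_file))
# CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000950', MASTER_LOG_POS=4;
```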
The slave now starts and replicates normally:
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.90.13.238
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000950
          Read_Master_Log_Pos: 336968550
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 52752780
        Relay_Master_Log_File: mysql-bin.000950
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 52752634
              Relay_Log_Space: 336968852
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 31164
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 4
1 row in set (0.00 sec)
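Reading these fields by eye is easy to get wrong, so here is a minimal Python sketch that parses `show slave status\G` text and checks the fields that changed between the broken and the repaired state. The helper names and the choice of which fields count as "healthy" are my own assumptions; the two sample strings are excerpts of the outputs shown in this post.

```python
def parse_status(text):
    """Turn `show slave status\\G` output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, since error messages contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """Both threads running and no I/O or SQL error recorded."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Last_IO_Errno", "0") == "0"
        and fields.get("Last_SQL_Errno", "0") == "0"
    )


before = """
*************************** 1. row ***************************
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
                Last_IO_Errno: 1236
               Last_SQL_Errno: 0
"""

after = """
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 31164
                Last_IO_Errno: 0
               Last_SQL_Errno: 0
"""

print(replication_healthy(parse_status(before)))  # False
print(replication_healthy(parse_status(after)))   # True
```

Seconds_Behind_Master is deliberately left out of the health check: right after the fix the slave is still 31164 seconds behind and catching up, which is expected.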